Feed aggregator

The EJB 3.0 Hibernate Fallacy

Mike Keith - Wed, 2005-04-20 13:35

I am Canadian. I have only been to Britain once, but I still know enough about Britain to pronounce Worcestershire correctly. The real question is what do I do when I hear others saying it incorrectly? Do I just ignore them and go on, inwardly laughing at them and their blunder, or do I correct them?

Well, I confess that I am a corrector. I do tell people how to pronounce Worcestershire correctly, not because I want to appear international or love to correct people, but because if I were them I would want to be told. Basic golden rule type of stuff. And, yes, I also tell people about the long pink thread stuck on their shoulder, the black mark that they rubbed on their cheek after changing the toner, and even on occasion the skirt that is caught in the pantyhose (that one is definitely a little trickier to do properly, though).

Every now and then a statement gets made somewhere that EJB 3.0 is just a copy of Hibernate, or even worse, that EJB 3.0 is Hibernate. Claims like these are typically made in innocence by uninformed people whose vision is obscured by the narrow scope of their own experience. It seems that few people actually know enough to even correct the propagation of this fallacy, and those that do know are not doing it. Rather than hanging my head and feeling guilty about this apparent inconsistency in my character I have determined to right my wrong, or at least do what is within my power to spread the truth. This blog entry is the first step in my repentance process.

The root of the fallacy is that Hibernate was the first free O/R mapping software that came even close to providing enough functionality and features to solve some of the problems that real world applications were facing. Lots of developers caught on to this and began to use Hibernate successfully for prototypes and small projects. For the majority of these developers this was their first and only experience using O/R mapping software to solve the O/R impedance mismatch problem.

Some of these people have never gone on to using the full-blown over-the-counter mapping products and do not even realize that these products exist or that they provide all of the features of Hibernate, and in some cases more. Their understanding of O/R mapping and persistence concepts relates only to Hibernate and the Hibernate APIs.

Cue the mandate to make EJB a useful and relevant specification. Linda DeMichiel, to her credit, realized that as the EJB spec lead she needed to fashion EJB 3.0 after existing and successful products instead of adopting the usual ivory tower approach to specification development. To do this she invited members from Oracle TopLink and JBoss Hibernate, the two most successful O/R mapping products ever, people from the top selling application servers, users/consultants from different backgrounds and eventually SolarMetric, the only JDO vendor with a real customer base as far as I know, to participate in the spec. All of the members are combining to produce a spec that will not only make session beans and MDBs easier to develop and use, but also proffer a persistence standard that will please the user community that had previously rejected the EJB standard in favour of the proprietary POJO persistence vendors.

As we progressed and began specifying the persistence layer it was obvious that we had similar features at the 80-90% functionality level that EJB is trying to achieve. These features include:

EntityManager - A transaction-level artifact that references, maintains identity and manages the objects in a given transaction. JDO calls this a PersistenceManager, Hibernate calls this a Session. TopLink calls this a UnitOfWork. These are all very close in scope, purpose and API.

Named queries - Queries must be able to be pre-defined and bound to a name for later retrieval and execution. These are called named queries in all of TopLink, Hibernate and JDO.

Native queries - Native SQL queries that allow the application to specify the query criteria in SQL. These are called SQL queries in all of TopLink, Hibernate and JDO.

Callback Listeners - The ability to define a class or method that will get invoked when a given event occurs. TopLink calls these event listeners, Hibernate and JDO call them life cycle callbacks.

Detaching/Reattaching objects - Objects can leave the scope of the EntityManager that controls them. They can also be reattached to the same or a different EntityManager through the use of the merge API call on the EntityManager. TopLink offers a series of merge calls, the most basic one being mergeClone. Hibernate has saveOrUpdateCopy and JDO has a couple of flavours of attachCopy call on the PersistenceManager.

O/R Mapping Types - All of the direct and relationship mapping types that are fundamental to mapping object state to relational database tables. These are all supported by Hibernate, TopLink and JDO. I won't go through all of the names (one-to-one, etc.; they are all pretty standard). Some of the names differ a little bit from one product to the other, but the functionality is pretty much the same and what you would expect.

Embedded Objects - Objects that have no persistent identity of their own but depend upon their parent object for identity. JDO calls them embedded objects, TopLink calls them aggregates and Hibernate calls them components.
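To make two of the features above concrete (identity management and merging a detached object), here is a minimal sketch in plain Java. It is not the EJB 3.0 API or any vendor's API: `Customer`, `ToyEntityManager` and `MergeSketch` are hypothetical names invented for illustration, and there is no database or transaction behind them. It only shows the shared idea that one managed instance exists per identity and that merge copies detached state onto that instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity: a plain object with a persistent identity (id).
class Customer {
    final long id;
    String name;

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Toy stand-in for the EntityManager / Session / PersistenceManager /
// UnitOfWork concept. Assumption: in-memory only, no database, no transactions.
class ToyEntityManager {
    private final Map<Long, Customer> managed = new HashMap<>();

    // Start managing an instance; at most one managed instance per identity.
    void persist(Customer c) {
        managed.put(c.id, c);
    }

    // The same id always yields the same managed instance.
    Customer find(long id) {
        return managed.get(id);
    }

    // Copy the state of a detached instance onto the managed one and
    // return the managed instance. The argument itself stays detached.
    Customer merge(Customer detached) {
        Customer target = managed.get(detached.id);
        if (target == null) {
            target = new Customer(detached.id, detached.name);
            managed.put(target.id, target);
        } else {
            target.name = detached.name;
        }
        return target;
    }

    // True only for the exact instance this manager is tracking.
    boolean contains(Customer c) {
        return managed.get(c.id) == c;
    }
}

class MergeSketch {
    public static void main(String[] args) {
        ToyEntityManager em = new ToyEntityManager();
        em.persist(new Customer(1L, "Ada"));

        // Simulates an object that left the manager's scope and was changed.
        Customer detached = new Customer(1L, "Ada L.");
        Customer managed = em.merge(detached);

        // prints "Ada L. true false"
        System.out.println(managed.name + " " + em.contains(managed) + " " + em.contains(detached));
    }
}
```

Returning the managed instance rather than adopting the argument mirrors the "copy" flavour of the merge calls named above (mergeClone, saveOrUpdateCopy, attachCopy); the real products differ in details that this sketch does not attempt to model.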

The list could go on, but hopefully people get the idea. The important features in EJB 3.0 are stock persistence features that anybody that has used multiple persistence products should recognize. The best part is that by standardizing these features the design patterns (actually they are more like "feature patterns", but nobody has written a book about feature patterns, yet :-) will be able to be used and referenced in ways that span products.

Note that I am not comparing the different features offered by these products. That is not what this is about. The point is that there are similarities and that those similarities are getting enshrined in a specification. This is the biggest win for vendors and developers.

Having said this some of the Hibernate/JBoss customers will recognize that there are some similarities as well in some of the API names. This is not a problem for most of us since they represent the feature as well as any other name would, and Gavin happened to have been the first one to write it up and propose it. (Unless there is something actually wrong with a proposed name there is no reason to turn it down.) It doesn't mean that the feature was modelled after Hibernate, just that the guy from Hibernate happened to be the one to propose the name for the feature that everyone already had.

Hibernate 3 users may recognize more similarities than ever because Hibernate has decided to add these to the base Hibernate product and expose them within the core API. From a migration standpoint this may be problematic for them but that is certainly their prerogative and I applaud any product that makes its own proprietary APIs look more like the standard. So as Hibernate evolves it turns out that it is actually modelling itself after EJB 3.0, not the other way around.

Finally, and maybe this is just pride speaking here, but I also have to confess feeling just a tinge of personal insult given that I have expended a substantial portion of my own time and effort toiling over the specification issues. Saying that we simply copied Hibernate would be trivializing that time and work, especially when I know full well that the conclusions that we arrived at are in most cases either the best solution, or the best possible solution given the circumstances. The spec should look a lot like Hibernate, TopLink and JDO. If it didn't then we would not have done a very good job since the whole point of this was to make use of our experience and standardize it.

So next time somebody says or writes that EJB 3.0 is modelled after Hibernate don't just inwardly laugh at them, or roll your eyes and feel sorry for them for their naivete. Please correct them. It's embarrassing for them and they would want you to tell them. I know I would.

Eclipsing Persistence

Mike Keith - Thu, 2005-04-14 10:58

Tools are essential for a technology to mature. Without them it stays in the realm of being accessible only to the experts and usable only by those in the upper experience echelons. I had regular arguments with a friend who repeatedly claimed that O/R mapping tools were not required. In the end he at least conceded that if a product wants to be mainstream it has to have graphical tools to enable the sorts of development that can already be done using APIs and XML configuration. This is critical to being able to support the types of developers who don't like to wallow in XML, or whose managers don't trust them to do so :-).

EJB 3.0 has now reached this stage. With Oracle's recent announcement that it is leading the Eclipse project to provide the EJB 3.0 O/R mapping tools as part of the Web Tools Project a standard persistence tools platform is being formed. This will provide the infrastructure for meeting the EJB 3.0 goal of making this a technology that entry-level developers can understand and feel comfortable using. The learning curve to use EJB just got a lot shorter.

Hard to believe, but some people are still really missing the point. The EJB Persistence API is now set to be the standard for persistence. *All* of the major O/R vendors are on board, participating in the expert group and acknowledging it as the standard. TopLink and Hibernate, the two leading O/R mapping products, already have support for EJB 3.0, and Kodo, the only JDO vendor that has any real market share in JDO-land, is working on it as well. The age of proprietary mapping descriptors is over, at least for the vast majority of applications.

Of course there will always be some proprietary mapping features that go beyond the spec, and the proposal discusses that these will be able to be plugged in by different vendors as they feel so inclined. There will probably always be a need for proprietary features, the trick is just to ensure that they are done in a conscious way and are harnessed in a well-defined application space. Then if the requirement to move to another vendor comes along the difficulty level will be easy to diagnose.

Any way you look at it, the Eclipse proposal is going to be good for EJB 3.0 and for developers. Having a persistence development platform for the most commonly used IDE is going to provide the support that most people want.

Of course at Oracle we are still providing the support within JDeveloper. It will be well-integrated with TopLink and expose all of the deluxe features that make TopLink the coolest and most powerful O/R mapping framework on the planet. :-)

Migration is the key

Mike Keith - Thu, 2005-03-17 12:11

After speaking at TSS on migration I have gotten a few requests from people asking me to write up the presentation in a paper format for people to be able to read and get more details on the subject. I definitely have to do that at some point, but for now Oracle is hosting a couple of webinars on EJB 3 that will help people to start picturing how this can be achieved. The first one is a basic intro to EJB 3 and the second one will be focusing specifically on migration. See the Java Online Seminar site for more details.

I really believe that migration is the key to getting to EJB 3.0. I know this sounds trite, but I mean migration in the implementation sense, not the general sense.

Most consultants/developers/practitioners have to work on and maintain existing systems, but will still want to find ways to incorporate EJB 3 practices and features. If vendors offer a stepwise migration path from existing products towards EJB 3, those features can be integrated into legacy systems, which can then slowly become more and more EJB 3 compliant.

If EJB 3 represents freedom from vendor lock-in then if I were an IT manager (and I'm not, nor do I have any aspirations to be one, so that might actually disqualify my moccasin switching) and I wanted to extend the lifetime of my application then I would obviously want my application to be compliant with the standard. No self-respecting IT professional would ever take on the job of taking an application, ripping it to pieces and rewriting it entirely to a new specification. The chances of succeeding are minimal at best and the chances of regression are probably nigh on 100%. As with any successful migration, moving existing systems to EJB 3 should be done in an incremental fashion.

The granularity of the steps might be arguable, and that is the sort of stuff that I discuss in my migration talks (one of which, incidentally, has been accepted for presentation at JavaOne this year).

Testing EJB's outside the Container -- finally

Mike Keith - Fri, 2005-03-04 16:21

Today was kind of a long-awaited day, not because I have to speak at TheServerSide Java Symposium today, but because I along with a number of other Oracle developers have been anxious to be able to share the progress that we have made on EJB 3.0. While this work has been fun, it has been somewhat frustrating because it could not be released until the timing was right. Today the timing was right, and Oracle announced our new EJB 3.0 technology preview.

This preview is really a landmark release for a few reasons:

1. It is the first commercial application server release that showcases the next generation of standardized persistence.

2. It enables actual unit testability of CMP entities using JUnit or any other test framework outside the server.

3. It provides support for migrating from EJB 2.x to EJB 3.0.

The preview can be downloaded for free here.

EJB 3.0 - It's new, it's hip... and it has interceptors

Mike Keith - Tue, 2005-02-08 21:46

It seems that some people have been complaining that they are in the dark about where EJB 3.0 is at and what is done/still unfinished. If you are amongst this somewhat anxious and concerned gaggle of squawkers then squawk no more. The second EJB 3.0 checkpoint has been reached and Early Access 2 Draft is available for all to read and comment on. Go to the Sun download site to get it while the bits are still hot.

The first thing that the keen observer will notice is that the spec has been split into two documents. The first is the Simplified EJB API for session and message-driven beans that exist uniquely within the J2EE container. The second is the persistence API that will be able to reside both within J2EE and in standalone J2SE mode. While the exact API for accessing stuff like the EntityManager outside of J2EE is not yet included in the spec the packages have been split off into javax.ejb and javax.persistence to accommodate the additional non-managed execution environment. We are working now on filling in the API calls for the J2SE persistence side of entities and would be really happy to get feedback, regardless of your current favorite persistence flavor.

A bunch more stuff has also been added in since the last draft including interceptors, event callbacks and native SQL queries. The O/R mappings have had some ironing as well so not only have some more mappings like LOBs been added but more default values have been added for simpler mapping.

There are lots of other things that we would like to get feedback on, though, as well, so don't keep your thoughts to yourself or your friends. Please comment either here, or to the new feedback alias that is mentioned on the spec (ejb3-edr2-feedback_at_sun.com). Note that the feedback alias has changed from the EA1 alias in an attempt to outrun the spammers. (I'm sure they will find us again, though. They are kind of like the scum in your bathroom drain. It doesn't seem to matter how clean you can get it at any given time, you know it will come back.) Some of the things that we are interested in hearing about are:

- whether the EJB system should be generating a business interface for a session bean or whether the developer should just provide it (or an IDE can generate it)

- if the approach to migration of 2.x session beans to the 3.0 API is appropriate as an adapter pattern

- whether new EJB QL enhancements and/or OR mapping annotations should be made available on 2.x entities as well

So we are looking forward to any and all feedback. Again, it doesn't matter what persistence stripe you are as long as you have a real and evidenced view.

Open source injustice - a new gripe

Mike Keith - Wed, 2005-01-12 22:06

I hate the fact that I am writing yet another in a seemingly interminable list of fashionable commentaries on open source. I suppose the fact that I am continuing to write means that I feel more passion for my gripe than I feel embarrassed for my lack of originality. If I can salvage even a shred of pride in the face of doing this it may come from the fact that at least I have not seen anybody else complain about it. Of course I am probably wrong about that, but if so, good. Maybe lots of people will complain and things will change.

So what gets my knickers in a knot is that when I go to conferences and speak I am almost always not permitted to include product descriptions or say anything that pumps up a product from my company in any way, while any open source project presenter has carte blanche to shill to their heart's desire. This may have made sense in yester-year, when open source really was something that was almost exclusively altruistically-motivated and generously donated by volunteers across the globe. Those days are...umm, well some things have changed a little.

With the maturation of "professional open source" and the evolution of companies that support open source software now playing in the same sales sandboxes, open source products are taking the stage at all the major conferences and shows. Many open source developers are following this lead all the way to the bank. It's the latest thing, and it seems to be working pretty well for them. Now don't get me wrong and think that this is my gripe, cuz it isn't at all. In fact I think it is a good thing both for the developers and companies to be able to make money off of the work that they invested in a project. I congratulate them and I'm very happy for them. They deserve it.

No, what seems unjust to me is when they get to do it in forums that are off limits to the rest of the competition, and this just because their business model is structured slightly differently than the traditional software product sales model. A presenter of an open source product can praise his stuff until he is hoarse and nobody will bat an eye, but the moment a product makes its revenue from product sales, instead of solely on support and services, slides get edited or presentations get rejected. Professional open source seems to have stealthily crept its way onto the marketing stage in places where marketing was previously not allowed to be.

Note that I am not blaming the open source folks for this, because they are only doing what they are allowed to do. I would do the same if I were in their situation. The fault lies with the conference organizers that do not yet seem to have a clue, or just don't care.

Similarly, I am not saying that the original open source'er is extinct. Of course they aren't and I am sure that there are myriads of such people that continue to do what they have always done, in perhaps even larger numbers than ever before. This is still good, but is quite beside the point. These people have been virtually pushed out of the conferences by the popular open source products, many if not most of which are revenue-generating in some way. So what, I guess. Most probably do not even care, anyway. As long as their CVS system stays up and their hosting is free they are probably happy enough.

Me, all I have left is to just keep talking about functionality... in a technology that I know of... that provides all of the deluxe features of a persistence engine... that rhymes with HopLink...

Determining the IP address of the cluster interconnect in 9i

Stefan Roesch - Fri, 2004-12-24 18:36
In Oracle database version 9i there is no way to determine the IP address through a database view. The only way to determine this IP address is with the oradebug command (please keep in mind that oradebug is not a supported product from Oracle, so if there are problems/crashes you are on your own). The oradebug ipc command creates a trace file. The example shows the process: SQL> oradebug setmypid

How to determine SQL statements that cause hard parses

Stefan Roesch - Fri, 2004-12-24 18:24
From a tuning point of view hard parses can be quite limiting to the scalability of a database and an application. If the number of hard parses is high this is a serious problem. To tackle the problem the first step is to determine which SQL statements are causing hard parses. With the column FIRST_LOAD_TIME of the view V$SQL this can be determined. The following example shows how this knowledge

Deleting the wrong cluster interconnect information from the OCR

Stefan Roesch - Wed, 2004-12-22 11:36
The current configuration of the cluster interconnect information can be checked with the oifcfg command. The following example shows this:

$ oifcfg getif
eth0 142.2.166.0 global public
ib0 192.169.1.0 global cluster_interconnect

Let's assume that the address for the cluster interconnect was specified incorrectly. This will prevent cluster communication. The situation can be resolved by

Determining what IP address was configured in the OCR

Stefan Roesch - Tue, 2004-12-21 20:59
In 10gR1 the IP address for the cluster interconnect is determined by default from the OCR (Oracle Cluster Repository). The configured address can be determined with the following command:

# oifcfg getif
eth1 140.87.81.0 global cluster_interconnect

If the database parameter cluster_interconnects is specified, the value that is obtained from OCR is not used.

Javapolis all grown up

Mike Keith - Mon, 2004-12-20 11:48

I just got back from Javapolis in Antwerp and was quite surprised to see how popular it is getting. I confess it was my first time going, but from everything that I have heard it has been a fairly smallish conference that typically hosts on the order of hundreds of people from the Dutch locale. Upon attending I found that there were 1500 people registered and although there is still a large local majority attending, more people are coming from other parts of Europe and even some from North America (such as myself). Admittedly the majority of North American attendees were giving talks, but there were also a few that weren't. In any case I guess Stephan and the rest of the people that worked with him deserve some credit for really making this conference what it has become.

Although I did take my camera it appears that its sole purpose was to keep my socks flat in my suitcase since I never actually used it. I did see some people taking pictures there, so if someone were to look hard enough I am sure a few snaps could be seen.

There were a few talks on persistence there, so I thought I would highlight some of those at least.

The first talk was by our fearless leader and JSR 220 spec lead, Linda DeMichiel. She went over many of the improvements that we have been adding to EJB 3.0 in the expert group. She kept it to a fairly high level, which I think people were able to use to help them swap the whole thing in properly. An hour is not a lot of time to describe much detail about what has taken us a year to put together. It was received as favourably as it has been at other conferences and people are really excited about using it. Right now there is one EJB 3.0 preview with another couple that are just around the release corner. It is cool that we are getting bunches of questions from folks that are actually trying this stuff out. This already gives us a huge advantage over previous versions of EJB.

The least impressive was a talk given by someone from InterSystems talking about how the Cache (pronounced ca-shay') database solved the impedance mismatch problem. This was really just an excuse to talk about how Cache tries to do as many possible things as it can cram into a shippable system without actually doing any of them well. It is an OODB... no, wait, it is an RDB... well it could be just a JDBC front end on an OODB for all I can tell. So then if you decide to do O/R mapping you actually define the class in some kind of zany proprietary grammar which looks like it has to extend some Cache Persistent class. Then the mapping information ends up getting stored in the database itself (which of course many listeners that were not rocket scientists recognized doesn't bode well for portability). Re-doing everything their own way seemed to be their theme. The last company that I remember that bypassed standards and went their own way was Persistence PowerTier and they did so at their peril (are they still around?).

The icing on the cake was when they talked about a tool that they wrote to help migrate people away from their database to another database. This just seemed like a funny thing to offer. It's like saying, "No, we won't use standards as a means to overcome portability concerns, we will just write yet another custom utility that we can offer that will migrate you away from our product." *Bizarre* is the only word that I can think of that describes this.

Patrick Linskey did a JDO 2.0 talk. He is actually on both the JDO 2.0 and EJB 3.0 expert groups and is a competitor of ours as well as a good friend. I didn't actually go to his talk but I know his presentation and am sure that he did it justice. He is a good guy to work with and despite the fact that we work on different products we actually see eye-to-eye philosophically on many of the persistence issues that we encounter in JSR 220.

One of the best talks was given by some colleagues and new friends of mine, Hugo Brand and Marc Meewis. They took the stage and I got to watch as they were the ones doing an advanced persistence talk. They discussed some of the O/R persistence issues that people either don't understand, or don't know enough to even ask about and then went on to discuss how the next impedance mismatch to be solved is in mapping objects to XML (O/X). I was really impressed by both their understanding of the issues and the way that they were able to discuss and properly scope the features of a persistence engine. They did an excellent job of illustrating all of the flexibility of TopLink and the various ways that it goes about solving all of the issues.

I guess it is hard to showcase all of the advanced features that TopLink has developed over the past decade, so they didn't go too far in that direction, which was probably a good thing I suppose. In the end many of the deluxe features are not used that frequently anyway. People like to have them in their back pocket so that if they ever encounter one then they know they can solve it.

What they did instead was differentiate it from other persistence solutions by the completeness of the solution in the entire data integration space. So not only can it read and write objects to RDB, but it can read and write the same objects to XML and other types of data storage, all of which is done using existing and developing standards. They actually showed a demo that mapped the same objects to an RDB and then to XML, showing that TopLink can round trip an object instance from the RDB through to XML and back again.

I was at a conference recently and there was a talk by Bruce Eckel that was entitled something like "Stuff that I find interesting and think that you will too". I kind of think that Gavin King's Hibernate talk turned into that. It was supposed to be sort of a Features in Hibernate3 talk, but I think that Gavin, whose mind is much like his body and can't sit still for long, gets a little tired and bored of talking about the same old things. He inevitably ends up talking about stuff that he has been thinking about recently and uses the talk as a way of vetting those ideas. Of course this is great for us in the EJB group because he happens to have been spending most of his time over the last number of months thinking about EJB stuff. That has meant that the ideas end up getting aired to lots of people at a time, and we can solicit more comments from more people. This time Gavin ended up talking about how JSF and EJB 3.0 can combine, and how the same objects can fit into both worlds. We are trying and hoping for this, but it takes more effort than one might expect to get multiple expert groups to go in the same direction and end up with harmonious specifications.

So that is "Persistence at Javapolis in a nutshell". Of course lots more stuff happened, but I will leave that to other people to blog. I actually spent most of my time hanging around Linda, Gavin, Christian, Patrick, and Cedric when he was available. We got lots of spec work done and had a great time. Had some fantastic food and good dinners with Linda and Gavin and another with Patrick and Howard Lewis Ship. I never really knew Howard that well before so I was glad to have the chance to get to know him better. Also met Bela Ban for the first time, which was good because I have always been interested in similar things as him. Unfortunately we didn't spend too much time talking about distributed communications protocols but I have a feeling that we will get another chance at some point in the future. :-)

Determining what IP address was picked for the cluster interconnect

Stefan Roesch - Fri, 2004-12-17 21:56
In the past it was difficult for a user or DBA to determine which cluster interconnect was picked. It was possible to obtain that information with an oradebug command. With Oracle Version 10gR1 this information is available with the X$ table X$KSXPIA. The following SQL shows which information is returned by querying that view. SQL> SELECT * FROM x$ksxpia; ADDR INDX INST_ID P

OOW on Thursday

Stefan Roesch - Fri, 2004-12-17 17:23
Thursday was my second day at OOW 2004. Yesterday, we presented our first session, everything went fine. Today we are presenting two more sessions: "Project MegaGrid: Performance Management in Large Clusters" and "MegaGrid: Capacity Planning for Large Commodity Clusters". Both of the presentations were a success. If you plan on migrating from a single instance to a grid/cluster environment

OOW 2004 on Wednesday

Stefan Roesch - Thu, 2004-12-16 21:35
Monday and Tuesday I didn't attend OOW, so Wednesday was my first day. Altogether we were presenting 3 sessions. On Wednesday afternoon it was our first session "Project MegaGrid: Deploying large clusters". Abstract: The successful deployment of an enterprise grid computing environment requires careful and thorough planning in order to build a flexible, easily scalable architecture. This

OOW 2004

Stefan Roesch - Sun, 2004-12-05 16:27
Oracle OpenWorld 2004 is just around the corner. I had the pleasure to go to SF on Sunday afternoon to set up the MegaGrid demo. I arrived together with a colleague and received my speaker's pass. But as we learned, with the speaker's pass we are not allowed to enter the demo grounds. We went back to registration and additionally obtained the demogrounds pass. We were lucky, everything went fine and

Enhancing my answer

Mike Keith - Fri, 2004-11-26 23:11

In the recent interview on TSS I was asked a question that went something like this:

8. TopLink has historically been implemented using reflection type approaches. Is that still the case? Or are you doing more with bytecode now?

For a long time I argued that JDO 1.x was doing the wrong thing by pushing people into using byte code enhancement. Note that I personally have nothing against using enhancing technology. It has its good and its bad. However, there are a number of shops that have strong objections to it. They each have their own reasons, ranging from version control and configuration management issues to strict QA processes and product delivery. Denying this is akin to denying that certain development and corporate processes exist, and a technology that does so shuts out an entire customer segment strictly because of its implementation/dev process.

When I was on JSR-243 (the JDO 2.0 expert group) I did my darnedest to convince the group that binary compatibility was really not a feature that was worth keeping. In practice, nobody ever really wants to change vendors without having to recompile their application. In fact, if after changing vendors all they had to do was recompile their code once, then most people I know would shed tears of joy or simply faint in disbelief. Changing persistence vendors is not only a very infrequent occurrence, but is something that people take very seriously. If even a measure of source code compatibility could be guaranteed then most folks would be more than satisfied. The benefits of binary compatibility are simply not worth the costs and are virtually never realized in practice.

The group did finally allow binary compatibility to be an optional thing in JDO 2.0, and although some of the folks kicked and screamed a bit the decision was made to allow both types of implementations. In my opinion the entire enhancement and PersistenceCapable sections of the spec should have been ditched since the premise of binary compatibility was not rooted in reality. The JDO spec could have been hugely simplified if it had removed all of the implementation chapters and focused on the interfaces... but I digress.

So why do I bring all of this up? Well, because some people contacted me and thought that my answer to the interview question above was meant to imply that JDO 2.0 required byte code enhancement. This was not my intention. What I was doing was expressing the same opinion that I have expressed in the past, to the JDO group as well as to others. I don't believe that a specification should require byte code enhancement as an implementation strategy. It may allow it, just not mandate it. I still believe this, and restated it in my answer to the question to show that it has not changed. While TopLink is considering using some enhancement techniques we will always ensure that they not be required. Features that make use of such techniques will be optional.

So, as I mentioned I was not referring to JDO 2.0 when I said this but was specifically referring to my own position statements that I had made earlier (and were directed at JDO 1.x and the ongoing JDO 2.0 discussion at the time). As I mentioned, I think that JDO 2.0 could and should have gone further than it did in that area but that is another subject which is really apart from the subject of this blog. I did not intend for my comments to be taken as FUDing JDO. I understand how they could have been, though, and am sorry if they were taken that way. Hopefully this clarifies it.

One decade and counting

Mike Keith - Sun, 2004-11-21 17:43

This month marks the 10-year anniversary of the birth of TopLink. For those that don't know the history behind TopLink, Don wrote an excellent historical perspective in his usual entertaining style here.

While I have not been associated with TopLink for the full decade I have been around for about half of it (although it seems hard to believe) and it has been quite a ride. Loads of fun and lots of great people, which is actually one of the main reasons why I joined in the first place.

There is a newsletter on OTN that celebrates a decade of TopLink technology. It is unfortunate that stuff is so hard to find on OTN, but the link above should help you navigate to some of the articles there. The one on Preparing for EJB 3.0, written by yours truly, was an attempt to show how EJB 3.0 is moving in the direction of TopLink, and there are also a couple of others that talk about TopLink's caching and XML facilities.

Doug Clark and I also did an e-interview with TheServerSide that commemorated the 10-year mark.

Trailing CSS comments

Mike Keith - Sat, 2004-11-20 14:43

CSS came and went and I realized that I never actually came back to tell people how good it was. Complaining about the keynote is fair to do, but is not representative of the conference at large.

I had a few colleagues ask me about CSS, and I describe it as a small, fairly tight-knit group of smart people that get to meet and talk individually in a spectacular mountain setting.

The conference is actually sponsored and organized by Wayne Kovsky and his family, who are a stellar group of folks. The interesting thing (and I have to confess that I was somewhat surprised by this) was that despite the fact that it was organized by a small group of people it was actually one of the best-organized conferences that I have ever been to. They really do put their heart and soul into the conference, and it shows.

My talks went really well and were pretty well-received. A colleague of mine, Donald Smith, was also speaking and mentioned that he had a full house on his O-X talks as well. I think that people are pretty much done talking about Tiger because there was not as much interest in many of the J2SE 5 talks in general.

Some people that I met for the first time and enjoyed talking to were Bill Dudney and Bruce Eckel. There were some un-named others that I found less enlightening and not as friendly, but to each his/her own.

Anyway, as I mentioned to people that asked me about it, this conference is well worth the trip, and I am already looking forward to next year.

The straw

Mike Keith - Sat, 2004-11-20 13:43

Okay. I admit defeat. I can no longer remain silent.

After a year of resisting the impulse to create a blog something happened to me that I simply could not hold back. It wasn't that I didn't want to blog, only that I was afraid of the time commitment and the responsibility that I was worried I would take upon myself.

But alas, the camel's back got broken today as I attended a keynote by Tim Bray at the Colorado Software Summit. Despite complaining to him afterwards I could not satisfy my frustration about some of the things that he said, and I felt that if I did not let it out then I would be in danger of combusting. This seemed to be the only venue available.

Tim's presentation was a good one, but he is obviously somebody that speaks a lot and has a bunch of polished pieces of material that he bangs together. Being a technical guy, and very accomplished I might add, he likes to bring things to a very technical level.

Where he really burned me was when he started talking about how O-R mapping was broken. Don't get me wrong, I was not angry at that. Everybody knows it is a broken idea, and something that we would rather not have to do. What got my britches bunched was that he proceeded to say how people shouldn't do it. This is not an acceptable solution, being that the only reason why people are doing it is because at this stage they have to. He, himself, said that some things were too late to change, and I really think that this is just one of those things. There is too much data in relational databases, and there are too many people that want to program in Java. They have to do something, and when I stood up during Q&A and told him so, his idea that we all use JDBC was just too naive to be taken seriously. He obviously has not programmed a real-life application lately and the triteness with which he dealt with the problem was indicative of this.

I have to admit, though, that he really did have a very useful and interesting idea for presenting that consisted of a long list of links that he visited in sequence and talked about. With the wireless in the room most people were able to follow the links and bookmark them individually, or the whole page from his website that he was working off of. Really useful as it leaves you with some concrete pointers of the interesting places to go to follow up on the things that he talked about. Turns out that he is a fellow Canadian, too, which I didn't know when I went up to him. Shame.

And so it begins...

Numeric sorting an alphanumeric column

Bar Solutions - Sun, 2001-12-02 18:00

The other day a customer came up to me and said: I have this column that holds numeric data, usually. But when I sort it, it gets all messed up, because it sorts alphanumerically. That is, 10 is listed before 2, etc.

My first suggestion was: well, sort by TO_NUMBER(column) then.

Well, he replied, that can’t be done. Sometimes the column contains alphanumeric data.

Oh, and I don’t want to use PL/SQL since I once learned that switching between the SQL engine and the PL/SQL engine costs a lot of performance.

Well, let’s start by making a simple statement that shows the problem:

WITH t AS
  (          SELECT '1' numval FROM dual
   UNION ALL SELECT to_char(9) FROM dual -- 9
   UNION ALL SELECT '#' FROM dual
   UNION ALL SELECT to_char(10) FROM dual -- 10
   UNION ALL SELECT 'G' FROM dual
   UNION ALL SELECT 'Green' FROM dual
   UNION ALL SELECT 'Yel2low' FROM dual
   UNION ALL SELECT 'Pink' FROM dual
   UNION ALL SELECT 'B' FROM dual
   UNION ALL SELECT '2' FROM dual
   UNION ALL SELECT '2912B' FROM dual
   UNION ALL SELECT 'B2912' FROM dual
  )
SELECT t.*
  FROM t
 ORDER BY t.numval

The output of this query is, as expected, sorted, but alphanumerically:

NUMVAL
-------
#
B
B2912
G
Green
Pink
Yel2low
1
10
2
2912B
9

12 rows selected.

Then I came to think: I want the ordering done numerically, but I cannot use to_number on the column, because there can be alphanumeric values and that will break the query. But what if I remove everything but the digits from the column when sorting? That is where Regular Expressions come in. I am definitely not an expert in the field of Regular Expressions, but I thought I would give it a try. I replace all the letters [:alpha:], punctuation characters [:punct:] and blanks [:blank:] with nothing.

WITH t AS
  (          SELECT '1' numval FROM dual
   UNION ALL SELECT to_char(9) FROM dual -- 9
   UNION ALL SELECT '#' FROM dual
   UNION ALL SELECT to_char(10) FROM dual -- 10
   UNION ALL SELECT 'G' FROM dual
   UNION ALL SELECT 'Green' FROM dual
   UNION ALL SELECT 'Yel2low' FROM dual
   UNION ALL SELECT 'Pink' FROM dual
   UNION ALL SELECT 'B' FROM dual
   UNION ALL SELECT '2' FROM dual
   UNION ALL SELECT '2912B' FROM dual
   UNION ALL SELECT 'B2912' FROM dual
  )
SELECT t.*
  FROM t
 ORDER BY to_number(regexp_replace(numval, '([[:alpha:]]|[[:punct:]]|[[:blank:]])')) NULLS LAST

The output is almost what I want:

NUMVAL
-------
1
Yel2low
2
9
10
B2912
2912B
#
B
G
Green
Pink

12 rows selected.
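
To see why 'Yel2low' lands between 1 and 2, it can help to select the stripped value next to the original. The query below is only a small sketch: it reuses a few of the sample rows from above, and the sort_key alias is a name made up for this illustration.

```sql
-- Show each value next to what is left after stripping letters,
-- punctuation and blanks (sort_key is an illustrative alias only).
WITH t AS
  (          SELECT '1' numval FROM dual
   UNION ALL SELECT 'Yel2low' FROM dual
   UNION ALL SELECT '2912B' FROM dual
   UNION ALL SELECT '#' FROM dual
  )
SELECT t.numval
      ,regexp_replace(t.numval, '([[:alpha:]]|[[:punct:]]|[[:blank:]])') sort_key
  FROM t
```

For 'Yel2low' only the 2 survives, so it is sorted as the number 2. For '#' nothing is left, and since Oracle treats an empty string as NULL, that row ends up at the end because of NULLS LAST.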

While reading about the character classes I came across the [:digit:] class and I knew there was a NOT operator. And that is actually what I want. Replace everything that is not a digit with nothing.

WITH t AS
  (          SELECT '1' numval FROM dual
   UNION ALL SELECT to_char(9) FROM dual -- 9
   UNION ALL SELECT '#' FROM dual
   UNION ALL SELECT to_char(10) FROM dual -- 10
   UNION ALL SELECT 'G' FROM dual
   UNION ALL SELECT 'Green' FROM dual
   UNION ALL SELECT 'Yel2low' FROM dual
   UNION ALL SELECT 'Pink' FROM dual
   UNION ALL SELECT 'B' FROM dual
   UNION ALL SELECT '2' FROM dual
   UNION ALL SELECT '2912B' FROM dual
   UNION ALL SELECT 'B2912' FROM dual
  )
SELECT t.*
  FROM t
 ORDER BY to_number(regexp_replace(numval, '([^[:digit:]])')) NULLS LAST

Same output, less typing (which is always good)

NUMVAL
-------
1
Yel2low
2
9
10
B2912
2912B
#
B
G
Green
Pink

12 rows selected.

But the combined values are not sorted properly yet. If the column contains characters other than digits, the value should just be sorted alphabetically. The current query simply removes all the non-digits before sorting. What I want is this: if removing all the non-digits leaves the length unchanged, sort numerically; otherwise use NULL as the sort key, which puts those values at the end of the result set. My second ordering clause is just the column itself, so all the non-numeric values are sorted 'normally'.

WITH t AS
  (          SELECT '1' numval FROM dual
   UNION ALL SELECT to_char(9) FROM dual -- 9
   UNION ALL SELECT '#' FROM dual
   UNION ALL SELECT to_char(10) FROM dual -- 10
   UNION ALL SELECT 'G' FROM dual
   UNION ALL SELECT 'Green' FROM dual
   UNION ALL SELECT 'Yel2low' FROM dual
   UNION ALL SELECT 'Pink' FROM dual
   UNION ALL SELECT 'B' FROM dual
   UNION ALL SELECT '2' FROM dual
   UNION ALL SELECT '2912B' FROM dual
   UNION ALL SELECT 'B2912' FROM dual
  )
SELECT t.*
  FROM t
 ORDER BY CASE
            WHEN (length(numval) = length(regexp_replace(numval, '([^[:digit:]])'))) THEN
             to_number(regexp_replace(numval, '([^[:digit:]])'))
            ELSE
             NULL
          END NULLS LAST
         ,numval

Resulting in this output:

NUMVAL
-------
1
2
9
10
#
B
B2912
G
Green
Pink
Yel2low
2912B

12 rows selected.

This is a lot more like how I think the customer wants his column to be sorted. Maybe not exactly how he wants it, but it’s a good start, I think.
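
A possible variation, offered as a sketch and not as the only way to do it: rather than comparing lengths before and after the replace, test whether the value consists of digits only with REGEXP_LIKE (a standard Oracle condition, available alongside REGEXP_REPLACE). The sample rows are a subset of the ones used above.

```sql
-- Sort digit-only values numerically, everything else by the column itself.
-- A CASE without ELSE yields NULL, so non-numeric rows go to the end
-- because of NULLS LAST and are then ordered by the second sort key.
WITH t AS
  (          SELECT '1' numval FROM dual
   UNION ALL SELECT to_char(10) FROM dual
   UNION ALL SELECT '#' FROM dual
   UNION ALL SELECT 'Yel2low' FROM dual
   UNION ALL SELECT '2' FROM dual
   UNION ALL SELECT '2912B' FROM dual
  )
SELECT t.*
  FROM t
 ORDER BY CASE
            WHEN regexp_like(t.numval, '^[[:digit:]]+$') THEN
             to_number(t.numval)
          END NULLS LAST
         ,t.numval
```

This evaluates the regular expression once per row instead of twice. The order of the non-numeric rows still depends on the session's sort settings (NLS_SORT), just as in the queries above.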
