In my career I have seen many a developer embark on the journey that eventually leads them to what I would term ‘Object-Relational Mapping as it was originally intended’. After watching this process over and over again, I have reached the conclusion that the path is always the same. Different people on the journey may be in different stages, and some may even stay ‘stuck’ in one stage for several years (or even, sadly, forever in some cases), but the stages along the journey always seem to happen in roughly the same progression.
I know this because it’s the same journey I have made myself, and I see it repeating in other developers regardless of their backgrounds. Like any journey, it helps to know where you’re headed, so I wanted to share my observations on these different stages.
Stage I: The DataSet (the ‘O/R What?’ stage)
It (usually) begins with the .NET DataSet. Seduced by the (relative) ease of draggy-droppy™ designer support that allows painless creation of data-access code, developers are taken in by a technology offering a RAD experience: the ease of a visual design surface, the familiarity of settable properties to control behavior, and a wizard that lets you write comfortable SQL statements in small edit boxes at design-time and even preview the results of your queries before you commit them to code (interestingly, by clicking ‘OK’ rather than actually coding any queries, of course).
Statements like “Hey, I just built my whole data-access layer in 15 minutes!” are a common occurrence during this stage. You rejoice in the (relative) ease with which you can do really ‘cool’ stuff like data-bind to form controls just by setting a few properties. You benefit (at least during the initial development stages) from the framework for your application being part of a single, (reasonably) coherent, self-reinforcing ecosystem of interconnected parts designed with each other (mostly) in mind.
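For anyone who has (mercifully) forgotten what sits underneath all that designer magic, here is a minimal hand-written equivalent of the sort of code it generates for you; the connection string, table, and grid names below are invented for illustration:

// A rough sketch of the kind of data access the DataSet designer produces
// (connection string, table, and grid names are hypothetical placeholders).
using System.Data;
using System.Data.SqlClient;

public class CustomersDataAccess
{
    public DataSet GetCustomers()
    {
        var results = new DataSet();
        using (var connection = new SqlConnection(@"Server=.;Database=Shop;Integrated Security=SSPI"))
        using (var adapter = new SqlDataAdapter("SELECT * FROM Customer", connection))
        {
            adapter.Fill(results, "Customer"); // open, query, and hydrate in one call
        }
        return results;
    }
}

// ...after which data-binding really is just a property assignment:
// customersGrid.DataSource = dataAccess.GetCustomers().Tables["Customer"];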
“This is the way data access was always supposed to be!”, you cheer.
Stage II: Generated Code (the ‘A class for every table, and every table with its class’ stage)
But then one day, usually when your underlying database changes a lot or when someone suggests a significant new set of functionality be added to your application, the DataSet hangover starts to set in. You begin to get concerned that all that DataSet code that used to ‘just compile perfectly’ has started to raise errors and return arcane error messages only a mother could love from parts of the application that you didn’t even know had code in them in the first place. “My app won’t compile because of some error in my ‘designer.cs’ file. I didn’t even know I *had* a ‘designer.cs’ file in my project!” is a familiar refrain often heard at the beginning of this stage. You’re in .NET DataSet Maintenance Hell, a place you have to spend some time in yourself to truly appreciate.
So you start to look around for a data-access approach that will shield you and your code from what seem to you to be reasonably minor changes in the database. You look for a solution that will save you when the Customer table gets renamed to Customers without your needing to lift a finger.
Your quest for a better approach leads you to begin exploring the world of Object-Relational Mapping frameworks that so many people seem to be talking about. But even if you limit your research to the most popular O/RM frameworks, there are at least 5-10 of them offering a huge and varying array of features, they all seem really complicated, and most of them are open source, so the documentation tends to be lacking compared to the professional (though often also unapproachable) documentation that surrounded the .NET DataSet world with which you are already familiar. But you still need to solve your .NET DataSet pain, so you keep at it.
You discover the crack-like high that comes from code-generation tools that can slave your class design to the database schema, regenerating your code every time your database changes. You notice that the code generated by this approach isn’t nearly as obtuse and opaque as the .NET DataSet code (and it doesn’t come with a “this code was generated by a tool, don’t edit it by hand” annotation intended to frighten you away from looking at it in the first place). You even learn to extend the generated code by hand using tricks like partial classes, although even as you’re doing this you can hardly believe they actually added a whole keyword to the language just to support the needs of the visual tooling.
Some of these code-generation approaches are completely integrated into the O/RM frameworks; others are third-party or OSS engines with templates designed to support your selected O/RM. But they all work in mostly the same way: classes named the same as your tables and properties named the same as your fields are created at the press of a single button by interrogating and reverse-engineering your database schema.
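The generated output looks more or less like the following (the class and property names here are hypothetical, but the shape is the same whichever generator you pick), with the partial-class trick keeping your hand-written additions out of the generator’s way:

// Customer.generated.cs -- regenerated from the schema; hand edits will be lost
public partial class Customer
{
    public int CustomerId { get; set; }
    public string Name { get; set; }
    public string Address { get; set; }
}

// Customer.cs -- your hand-written extensions, safe from the generator
public partial class Customer
{
    public bool HasAddress()
    {
        return !string.IsNullOrEmpty(Address);
    }
}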
“Hey, I just built my whole data-access layer in 1 second!” is commonly heard during this stage; one second being, after all, considerably faster than the 15 minutes the DataSet designer required. And with the code-gen in place, you find you hardly miss the visual tooling that made the DataSet so attractive.
“This is the way data access was always supposed to be!”, you cheer.
Stage III: Mapping DTOs to Rich Domain Models (the ‘O/O Mapping’ stage)
But then one day, usually as your projects start to get more complex, you realize that you need classes in your object model that don’t relate directly to the tables in your database. You start to build a richer object model designed around the needs of your application rather than the needs of your database. Slowly, a more fully-featured, more intricately designed, finer-grained object model emerges from your software designs, one that includes classes that make sense for solving your software problems but don’t match well with your database design. Sometimes data from two or three different tables needs to find its way into a single object in your model, and you’ve just discovered what was really meant by the “Object-Relational Impedance Mismatch” all those people kept talking about in those discussion groups you read when researching O/RMs in the first place.
But you’re addicted to the crack-like high of code-generation, so you try to have the best of both worlds: you code-gen a set of classes against your database schema, call them ‘Data Transfer Objects’ or ‘DTOs’, and then start the arduous process of aggregating and composing multiple DTOs into the Domain objects that need them. You begin this process slowly by hand, thinking “how hard could this be?”, and before you know it you’re writing reams of brittle object-property-assignment code like…
DomainCustomer.Address = AddressDTO.Address;       // copied by hand, one property at a time...
DomainCustomer.CustomerType = TypeDTO.Type;        // ...for every property, on every entity
// ...and so on, ad nauseam
Sure, you’re now spending a lot of time mapping objects to objects (via tedious, hand-written code of exactly the sort the code-generator was supposed to save you from in the first place, no less), but you are still able to leverage your code-generation approach, so it feels like the best of both worlds: auto-generated classes that map to the database structure, and yet also a whole separate collection of classes that you can point to and say “This is my domain model, driven by the needs of the user-base and the problem-space. They consume the DTOs when needed to persist their values to the database.”
“This is the way data access was always supposed to be!”, you cheer.
Stage IV: Mapping Rich Domain Models to the database (the ‘eureka moment’ stage)
One day you notice that there is a lot of repetition in the object-property-assignment code you keep writing over and over again to hydrate data into your Domain Model objects. You also notice that every time your database changes even slightly and you re-generate your DTO classes in response, the brittle property-assignment code you wrote breaks again. And there’s so much of it! And it’s scattered all over the place in your class definitions! “What a mess!”, you realize.
So like all good developers faced with an annoying infrastructure challenge, you start to dream of a framework you could build that would make all this repetitive code just ‘go away’. “What I really need now,” you reason, “is an Object-Object-Mapper, a sort of ‘O/OM’ that will let me configure the relations between my DTOs and my Domain Model objects without having to write and maintain all the messy assignment code.” But like all framework and library developers, when you start to really think about the complexity of what you’re considering building, the challenges start to mount. Collections mapping, read-only properties on Domain Objects, identity values, and a whole host of other challenges adorn your list of impediments for your new framework.
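A first cut at such a framework might be the naive reflection-based copier sketched below (a hypothetical illustration, not a real library); collections, read-only properties, and identity values are exactly where it starts to fall apart:

using System;

public static class NaiveObjectMapper
{
    // Copy every same-named, same-typed, writable property from source to target.
    public static void Map(object source, object target)
    {
        var targetProperties = target.GetType().GetProperties();
        foreach (var sourceProperty in source.GetType().GetProperties())
        {
            var match = Array.Find(targetProperties, p =>
                p.Name == sourceProperty.Name &&
                p.PropertyType == sourceProperty.PropertyType &&
                p.CanWrite);
            if (match != null)
            {
                match.SetValue(target, sourceProperty.GetValue(source, null), null);
            }
        }
    }
}

// Usage: NaiveObjectMapper.Map(AddressDTO, DomainCustomer);
// ...which works right up until you hit a collection, a read-only property, or an identity value.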
So you look around at the landscape and slowly begin to realize that the class of software you actually need has already been built, but you haven’t been using it as it was intended at all. What you really need to solve this problem is an Object-Relational Mapper, the very tool you have already been using for years!
But instead of hampering the power of your O/RM framework by needlessly shackling it to a code-generation tool, you realize that it’s time to throw off the artificial security blanket of generated code slaved to your database schema and instead use the O/RM tool to map your database directly to your rich Domain Object Model classes.
Free from the encumbrance of code-generated DTO classes, your single object model is free to express the needs of your software in a much more robust and flexible manner, just as it did in the previous stage. But now you are also free of the need to do O/O mapping between your Domain Objects and your DTO classes. The O/RM tool takes care of persisting data from your objects to the database and hydrating those same objects with data from the database when needed. The tool is doing for you what it was originally designed to accomplish: freeing you from maintaining brittle plumbing code for data access anywhere in your own applications.
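With NHibernate, for example, the day-to-day experience looks roughly like the sketch below; the sessionFactory, the mapped Customer class, and its Rename method are illustrative assumptions (the class-to-table wiring lives in a mapping file, not in your code):

// Assumes an already-configured NHibernate ISessionFactory and a mapped Customer class.
using (var session = sessionFactory.OpenSession())
using (var transaction = session.BeginTransaction())
{
    // Hydration: NHibernate assembles the Customer from however many tables
    // the mapping spans -- no DTOs, no hand-written assignment code.
    var customer = session.Get<Customer>(customerId);

    customer.Rename("Initech, LLC"); // work with the rich domain model...

    transaction.Commit(); // ...and NHibernate flushes the changes back to the database.
}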
“This is the way data access was always supposed to be!”, you cheer.
And you would be right. Welcome to the end of your journey.
Yes, that pretty much sums up how I progressed… good observations ;).
Steve, what is your opinion of the MyGeneration tool and template you used in your videos? I’m in the process of designing a brownfield data-access layer via NHibernate and would like to skip the stage 2 and stage 3 issues you mention here. What’s the best way to get to stage 4 bliss if the database structure is already fairly cemented? There are roughly 100 tables in this database.
@Chris:
That’s a really great question, and I think it points to something that I didn’t really touch on in my post but that I firmly believe: each of these stages may be entirely appropriate in different contexts (yes, even the .NET DataSet stage has its place in certain solutions) 🙂
Realistically, all of the examples that I used to illustrate each of these stages sort of assume a greenfield situation rather than a brownfield one. IMHO, if your database is already built and you are also in a brownfield application situation (e.g., there is an already-built application that depends upon this existing database), then the ‘code-gen DTO’ approach seems perfectly reasonable to me as a way to jump-start the introduction of an O/RM tool.
In such a context, you already have an existing application (presumably with existing logic, objects, structure, etc.) that is already well-fixed in terms of its modeling of your problem domain (and hopefully also SOLVES the problems it’s designed to address); the ‘bliss’ (nice term, BTW) provided by Stage 4 is not really going to be achieved in this kind of situation, since you are likely not re-engineering a new rich domain model into your application at the same time that you’re re-engineering your data-access layer.
If it’s just your database that’s brownfield and not your application, then the pre-existence of a database doesn’t in any way preclude your approaching the problem from the ‘Stage 4’ perspective. In fact, this is one of the areas where many O/RM tools (NHib included, OF COURSE 🙂 ) can truly shine: mapping an existing database schema to a new object model, where the database schema was designed with one set of concerns in mind and the object model is designed with a different set.
By listing the Stages as a progression, I think I may have left the impression that only ‘enlightened’ developers achieve Stage 4 (like it’s some kind of Buddhist ladder or something) and that the higher stages are somehow ‘better’ in every context. Like I said, ALL of these approaches are valid in different contexts, and so considering your situation before deciding how to approach a solution is of course critical here.
As for the MyGeneration template from the screencast, we have used it on any number of projects (where the code-gen approach made sense) with great success, so if this approach makes sense for your situation, don’t take this post of mine as an effort to suggest that your approach is invalid.
So I guess I am in a hybrid stage 2/3 and heading to stage 4. I have this email in my inbox with a discount for upgrading our CodeSmith installs to V5, and I don’t think that we need it anymore. NHibernate is on our horizon.
Excellent post!
I have been working with NHib for about two weeks now and think I have managed to complete most of the journey; I am now looking down at the green pastures of Stage IV. What I fear is that when I reach said pastures, while the grass may be greener, I will be greeted by the not-so-pleasant potpourri of the need for significant ORM-specific subject-matter expertise.
Moving away from my crappy metaphor (pun intended), what I am afraid of is that generating a rich and meaningful domain model (particularly in a brownfield situation) will require such expertise with NHib or another ORM that your tool becomes a new career path instead of a means to an end.
I have already stumbled across a few “thou shalt not” scenarios in my short experience with NHib (mostly due to ignorance and bad design, I’m sure) that seem to indicate that the further you get from the bread-and-butter one-class-one-table world, the more likely you are to be confronted by the potentially steep learning curve of ORM-specific expertise, and by maintainability/extensibility issues when your ORM guru leaves.
Am I wrong in assuming that the greater the disparity between your data model and your domain model, the greater the required ORM expertise? Have you had issues making your chosen ORM bend to the needs of your desired domain model, or am I worried about nada…
@Wa:
I actually think that your observations are indeed on-target. As your Domain Model diverges more and more from your relational model, the corresponding disconnect increases, and thus the effort needed to map the two back to each other (usually via an O/RM) can get quite significant.
If your database already exists and you have good reason to radically diverge your Domain Model from it, then a tool like NHibernate may not be the right one. IBatis (and IBatis.NET) is a tremendously powerful alternate approach that offers a much more flexible relational <--> object mapping experience than even Hibernate/NHibernate provides. If your Domain Model really bears *ZERO* resemblance to your relational model, then this might make more sense for your situation.
It’s a spectrum, though, along which you need to decide on a target ROI and do a solid cost-benefit analysis. Like all good software design questions, the answer is ‘it depends’, and it’s up to you to decide what’s right for your project, of course. But in general, yes: you can design yourself into a real corner if you’re not careful, developing a Domain Model in complete isolation from any idea of how it will be persisted.
Remember what I always tell people: just because your *objects* are persistence-ignorant doesn’t mean that *you* or your *design* should be too 🙂
What is the 5th stage, then? We have been through all 4 and realized that at the end of the day we needed the SQL being generated by the ORM layer to be tweaked to our requirements for performance and data accuracy. We had developed a very rich object model that captured all our business requirements, one that had grown out of what was originally an ORM Domain or Entity object.
So we wrote a layer that would take the SQL and hydrate our objects directly through ADO.NET. We were kind of back to basics, but we wrote a tool to do all the work for us: it generated the code and managed the mappings, instead of our writing another framework or layer. It was the ideal ORM solution our team was looking for. Things were finally manageable and under control. We got rid of all the configuration and mapping-properties code that we had collected in the previous stages and just generated the code from the tool whenever our DAL needed work. The objects became oblivious to their persistence, and the generated DAL code acted as a service to read and persist objects to fulfill specific business requirements and use cases.
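In sketch form, generated hydration code of this kind typically looks something like the following (the SQL, class, and column names are invented to illustrate the pattern, and a Customer class with settable Id and Name properties is assumed; this is not actual tool output):

using System.Collections.Generic;
using System.Data.SqlClient;

public static class CustomerReader
{
    public static IList<Customer> LoadCustomers(string connectionString)
    {
        var customers = new List<Customer>();
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("SELECT CustomerId, Name FROM Customer", connection))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Ordinal-based, column-by-column mapping: tedious to write
                    // by hand, trivial for a generator to emit.
                    customers.Add(new Customer
                    {
                        Id = reader.GetInt32(0),
                        Name = reader.GetString(1)
                    });
                }
            }
        }
        return customers;
    }
}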
You can go check it out at Orasis Mapping Studio
@Adnan
I realize your post was just a product plug… but you know that most ORMs will allow you to use your own SQL, right? So just use the ORM’s generated SQL until you need to eke out a little more performance, and then replace that one instance with your hand-crafted SQL. You could even use SPROCs *gasp* (sorry Steve).
@Ryan:
The right tool (for data access or any job, for that matter) is the one that “makes simple stuff easy, makes complex stuff simple, and makes difficult stuff possible”. An ORM that provides you the ‘safety valve’ of falling back to hand-crafted SQL or even (yes!) sprocs if/when you need them, for the small percentage of times that the tool isn’t doing exactly what you need, is absolutely the right choice (and it’s no surprise to any reader of this blog that NHibernate does all of that!).
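As a concrete example of that safety valve, NHibernate lets you hand it raw SQL and still get mapped objects back via ISession.CreateSQLQuery; a quick sketch (the SQL text and the mapped Customer class are assumptions for illustration):

// Hand-crafted SQL through NHibernate, still returning fully-mapped objects:
var customers = session
    .CreateSQLQuery("SELECT * FROM CUSTOMERS WHERE REGION = :region")
    .AddEntity(typeof(Customer))
    .SetString("region", "EMEA")
    .List<Customer>();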
Besides, the world already has an OSS product that “maps SQL queries to objects”. It’s called IBATIS (http://ibatis.apache.org/), and in case anyone is interested in pursuing this approach (which, to be fair, can be quite useful in certain cases), check out either IBATIS (Java) or IBATIS.NET (.NET) and dig into it: it’s a really powerful tool to have in your toolbox, and it doesn’t rely AT ALL on any code-gen to get its work done. I’ve used it in the past for situations where a ‘traditional’ ORM isn’t appropriate/needed/possible, and it works quite well for what it is/does.
It’s not clear to me at all how Orasis Mapping Studio is an improvement on this already-existing (and quite mature) OSS project. If anyone knows the answer, please let me know.
@Ryan:
Sorry, that reply (of course) was really intended for Adnan 🙂
@sbohlen and @Ryan
Thanks for your questions.
Orasis is not different from other ORMs in achieving the same results, i.e., database abstraction and mapping objects to databases. How it achieves that is completely different. I would just say that while the rest are all frameworks, with maybe some GUI to get you going, Orasis is not a framework! It is a database-access-layer development IDE that you use to build and manage queries and mappings to your objects (using reflection and query metadata). When you say “Build”, it then calls its internal code generator to generate pure ADO.NET code to achieve your mapping results. You can then use the code or the generated assembly in your project directly. And what developer would not want the ability to actually run and test their DAL and mappings before calling them in their application layer? Orasis Mapping Studio has a built-in tester to execute your code and SQL and let you see your results before you even call it from your app.
A download is better than a thousand words. Try it and see for yourself. It is a revolutionary database-access development technology!
@Adnan:
In the spirit of open-mindedness and being willing to have my assumptions challenged, I will put it on the list to investigate and let you know my thoughts.
It may indeed be solving a problem I didn’t know I had 🙂
Thanks!
I have been using Orasis Mapping Studio 2009 and have had a great experience and great results. I launch it on my desktop, do all of my database mappings, and reference the .csproj from my Visual Studio solution. Very easy to use, and it saves me tons of time. I have tried ORMs, and not only do they not always perform, some queries take forever to implement. Why not use it and get done in minutes? I have better things to do than writing database code.
Here is our blog http://www.orasissoftware.com/blog for more info about the mapping software and podcasts!