In my career I have seen many a developer embark on the journey that eventually leads them to what I would term ‘Object-Relational Mapping as it was originally intended’. After watching this process over and over again, I have reached the conclusion that the path is always the same. Different people on the journey may be in different stages, and some may even stay ‘stuck’ in one stage for several years (or even, sadly, forever in some cases), but the stages along the journey always seem to happen in roughly the same progression.
I know this because it’s the same journey I have made myself, and I see it repeating in other developers regardless of their backgrounds. Like any journey, it helps to know where you’re headed, so I wanted to share my observations on these different stages.
Stage I: The DataSet (the ‘O/R What?’ stage)
It (usually) begins with the .NET DataSet. Seduced by the (relative) ease of draggy-droppy™ designer support that allows painless creation of data-access code, developers are taken in by a technology offering a RAD experience: the ease of a visual design surface, the familiarity of settable properties to control behavior, and a wizard that lets you write comfortable SQL statements in small edit boxes at design-time and even preview the results of your queries before you commit them to code (interestingly, by clicking ‘OK’ rather than actually coding any queries, of course).
Statements like “Hey, I just built my whole data-access layer in 15 minutes!” are a common occurrence during this stage. You rejoice in the (relative) ease with which you can do really ‘cool’ stuff like data-bind to form controls just by setting a few properties. You benefit (at least during the initial development stages) from the framework for your application being part of a single, (reasonably) coherent, self-reinforcing ecosystem of interconnected parts designed with each other (mostly) in mind.
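For anyone who has (mercifully) forgotten what sits underneath all that designer magic, here is a minimal hand-written equivalent of the sort of code it generates for you; the connection string, table, and grid names below are invented for illustration:

// A rough sketch of the kind of data access the DataSet designer produces
// (connection string, table, and grid names are hypothetical placeholders).
using System.Data;
using System.Data.SqlClient;

public class CustomersDataAccess
{
    public DataSet GetCustomers()
    {
        var results = new DataSet();
        using (var connection = new SqlConnection(@"Server=.;Database=Shop;Integrated Security=SSPI"))
        using (var adapter = new SqlDataAdapter("SELECT * FROM Customer", connection))
        {
            adapter.Fill(results, "Customer"); // open, query, and hydrate in one call
        }
        return results;
    }
}

// ...after which data-binding really is just a property assignment:
// customersGrid.DataSource = dataAccess.GetCustomers().Tables["Customer"];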
“This is the way data access was always supposed to be!”, you cheer.
Stage II: Generated Code (the ‘A class for every table, and every table with its class’ stage)
But then one day, usually when your underlying database changes a lot or when someone suggests a significant new set of functionality be added to your application, the DataSet hangover starts to set in. You begin to get concerned that all that DataSet code that used to ‘just compile perfectly’ has started to raise errors and return arcane error messages only a mother could love from parts of the application that you didn’t even know had code in them in the first place. “My app won’t compile because of some error in my ‘designer.cs’ file. I didn’t even know I *had* a ‘designer.cs’ file in my project!” is a familiar refrain often heard at the beginning of this stage. You’re in .NET DataSet Maintenance Hell, a place you have to spend some time in yourself to truly appreciate.
So you start to look around for a data-access approach that will shield you and your code from what seem to you to be reasonably minor changes in the database. You look for a solution that will save you when the Customer table gets renamed to Customers without your needing to lift a finger.
Your quest for a better approach leads you to begin exploring the world of Object-Relational Mapping frameworks that so many people seem to be talking about. But even if you limit your research to the most popular O/RM frameworks, there are at least 5-10 of them offering a huge and varying array of features, they all seem really complicated, and most of them are open source, so the documentation tends to be lacking compared to the professional (though often also unapproachable) documentation that surrounded the .NET DataSet world with which you are already familiar. But you still need to solve your .NET DataSet pain, so you keep at it.
You discover the crack-like high that comes from code-generation tools that can slave your class design to the database schema, regenerating your code every time your database changes. You notice that the code generated by this approach isn’t nearly as obtuse and opaque as the .NET DataSet code (and it doesn’t come with a “this code was generated by a tool, don’t edit it by hand” annotation intended to frighten you away from looking at it in the first place). You even learn to extend the generated code by hand using tricks like partial classes, although even as you’re doing this you can hardly believe they actually added a whole keyword to the language just to support the needs of the visual tooling.
Some of these code-generation approaches are completely integrated into the O/RM frameworks; others are third-party or OSS engines with templates designed to support your selected O/RM. But they all work in mostly the same way: classes named the same as your tables and properties named the same as your fields are created at the press of a single button by interrogating and reverse-engineering your database schema.
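The generated output looks more or less like the following (the class and property names here are hypothetical, but the shape is the same whichever generator you pick), with the partial-class trick keeping your hand-written additions out of the generator’s way:

// Customer.generated.cs -- regenerated from the schema; hand edits will be lost
public partial class Customer
{
    public int CustomerId { get; set; }
    public string Name { get; set; }
    public string Address { get; set; }
}

// Customer.cs -- your hand-written extensions, safe from the generator
public partial class Customer
{
    public bool HasAddress()
    {
        return !string.IsNullOrEmpty(Address);
    }
}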
“Hey, I just built my whole data-access layer in 1 second!” is commonly heard during this stage; one second being, after all, considerably faster than the 15 minutes the DataSet designer required. And with the code-gen in place, you find you hardly miss the visual tooling that made the DataSet so attractive.
“This is the way data access was always supposed to be!”, you cheer.
Stage III: Mapping DTOs to Rich Domain Models (the ‘O/O Mapping’ stage)
But then one day, usually as your projects start to get more complex, you realize that you need classes in your object model that don’t relate directly to the tables in your database. You start to build a richer object model designed around the needs of your application rather than the needs of your database. Slowly, a more fully-featured, more intricately designed, finer-grained object model emerges from your software designs, one that includes classes that make sense for solving your software problems but don’t match well with your database design. Sometimes data from two or three different tables needs to find its way into a single object in your model, and you’ve just discovered what was really meant by the “Object-Relational Impedance Mismatch” all those people kept talking about in those discussion groups you read when researching O/RMs in the first place.
But you’re addicted to the crack-like high of code-generation, so you try to have the best of both worlds: you code-gen a set of classes against your database schema, call them ‘Data Transfer Objects’ or ‘DTOs’, and then start the arduous process of aggregating and composing multiple DTOs into the Domain objects that need them. You begin this process slowly by hand, thinking “how hard could this be?”, and before you know it you’re writing reams of brittle object-property-assignment code like…
DomainCustomer.Address = AddressDTO.Address;       // copied by hand, one property at a time...
DomainCustomer.CustomerType = TypeDTO.Type;        // ...for every property, on every entity
// ...and so on, ad nauseam
Sure, you’re now spending a lot of time mapping objects to objects (via tedious, hand-written code of exactly the sort the code-generator was supposed to save you from in the first place, no less), but you are still able to leverage your code-generation approach, so it feels like the best of both worlds: auto-generated classes that map to the database structure, and yet also a whole separate collection of classes that you can point to and say “This is my domain model, driven by the needs of the user-base and the problem-space. They consume the DTOs when needed to persist their values to the database.”
“This is the way data access was always supposed to be!”, you cheer.
Stage IV: Mapping Rich Domain Models to the database (the ‘eureka moment’ stage)
One day you notice that there is a lot of repetition in the object-property-assignment code you keep writing over and over again to hydrate data into your Domain Model objects. You also notice that every time your database changes even slightly and you re-generate your DTO classes in response, the brittle property-assignment code you wrote breaks again. And there’s so much of it! And it’s scattered all over the place in your class definitions! “What a mess!”, you realize.
So like all good developers faced with an annoying infrastructure challenge, you start to dream of a framework you could build that would make all this repetitive code just ‘go away’. “What I really need now,” you reason, “is an Object-Object-Mapper, a sort of ‘O/OM’ that will let me configure the relations between my DTOs and my Domain Model objects without having to write and maintain all the messy assignment code.” But like all framework and library developers, when you start to really think about the complexity of what you’re considering building, the challenges start to mount. Collections mapping, read-only properties on Domain Objects, identity values, and a whole host of other challenges adorn your list of impediments for your new framework.
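A first cut at such a framework might be the naive reflection-based copier sketched below (a hypothetical illustration, not a real library); collections, read-only properties, and identity values are exactly where it starts to fall apart:

using System;

public static class NaiveObjectMapper
{
    // Copy every same-named, same-typed, writable property from source to target.
    public static void Map(object source, object target)
    {
        var targetProperties = target.GetType().GetProperties();
        foreach (var sourceProperty in source.GetType().GetProperties())
        {
            var match = Array.Find(targetProperties, p =>
                p.Name == sourceProperty.Name &&
                p.PropertyType == sourceProperty.PropertyType &&
                p.CanWrite);
            if (match != null)
            {
                match.SetValue(target, sourceProperty.GetValue(source, null), null);
            }
        }
    }
}

// Usage: NaiveObjectMapper.Map(AddressDTO, DomainCustomer);
// ...which works right up until you hit a collection, a read-only property, or an identity value.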
So you look around at the landscape and slowly begin to realize that the class of software you actually need has already been built, but you haven’t been using it as it was intended at all. What you really need to solve this problem is an Object-Relational Mapper, the very tool you have already been using for years!
But instead of hampering the power of your O/RM framework by needlessly shackling it to a code-generation tool, you realize that it’s time to throw off the artificial security blanket of generated code slaved to your database schema and instead use the O/RM tool to map your database directly to your rich Domain Object Model classes.
Free from the encumbrance of code-generated DTO classes, your single object model is free to express the needs of your software in a much more robust and flexible manner, just as it did in the previous stage. But now you are also free of the need to do O/O mapping between your Domain Objects and your DTO classes. The O/RM tool takes care of persisting data from your objects to the database and hydrating those same objects with data from the database when needed. The tool is doing for you what it was originally designed to accomplish: freeing you from maintaining brittle plumbing code for data access anywhere in your own applications.
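With NHibernate, for example, the day-to-day experience looks roughly like the sketch below; the sessionFactory, the mapped Customer class, and its Rename method are illustrative assumptions (the class-to-table wiring lives in a mapping file, not in your code):

// Assumes an already-configured NHibernate ISessionFactory and a mapped Customer class.
using (var session = sessionFactory.OpenSession())
using (var transaction = session.BeginTransaction())
{
    // Hydration: NHibernate assembles the Customer from however many tables
    // the mapping spans -- no DTOs, no hand-written assignment code.
    var customer = session.Get<Customer>(customerId);

    customer.Rename("Initech, LLC"); // work with the rich domain model...

    transaction.Commit(); // ...and NHibernate flushes the changes back to the database.
}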
“This is the way data access was always supposed to be!”, you cheer.
And you would be right. Welcome to the end of your journey.
Yes, that pretty much sums up how I progressed… good observations ;).
Steve, what is your opinion of the MyGeneration tool and template you used in your videos? I’m in the process of designing a brownfield data-access layer via NHibernate and would like to skip the stage 2 and stage 3 issues you mention here. What’s the best way to get to stage 4 bliss if the database structure is already fairly cemented? There are roughly 100 tables in this database.
@Chris:
That’s a really great question, and I think it points to something that I didn’t really touch on in my post but that I firmly believe: each of these stages may be entirely appropriate in different contexts (yes, even the .NET DataSet stage has its place in certain solutions) 🙂
Realistically, all of the examples that I used to illustrate each of these stages sort of assume a greenfield situation rather than a brownfield one. IMHO, if your database is already built and you are also in a brownfield application situation (e.g., there is an already-built application that depends upon this existing database), then the ‘code-gen DTO’ approach seems perfectly reasonable to me as a way to jump-start the introduction of an O/RM tool.
In such a context, you already have an existing application (presumably with existing logic, objects, structure, etc.) that is already well-fixed in terms of its modeling of your problem domain (and hopefully also SOLVES the problems it’s designed to address); the ‘bliss’ (nice term, BTW) provided by Stage 4 is not really going to be achieved in this kind of situation, since you are likely not re-engineering a new rich domain model into your application at the same time that you’re re-engineering your data-access layer.
If it’s just your database that’s brownfield and not your application, then the pre-existence of a database doesn’t in any way preclude your approaching the problem from the ‘Stage 4’ perspective. In fact, this is one of the areas where many O/RM tools (NHib included, OF COURSE 🙂 ) can truly shine: mapping an existing database schema to a new object model, where the database schema was designed with one set of concerns in mind and the object model is designed with a different set.
By listing the Stages as a progression, I think I may have left the impression that only ‘enlightened’ developers achieve Stage 4 (like it’s some kind of Buddhist ladder or something) and that the higher stages are somehow ‘better’ in every context. Like I said, ALL of these approaches are valid in different contexts, and so considering your situation before deciding how to approach a solution is of course critical here.
As for the MyGeneration template from the screencast, we have used it on any number of projects (where the code-gen approach made sense) with great success, so if this approach makes sense for your situation, don’t take this post of mine as an effort to suggest that your approach is invalid.
So I guess I am in a hybrid stage 2/3 and heading to stage 4. I have this email in my inbox with a discount for upgrading our CodeSmith installs to V5, and I don’t think that we need it anymore. NHibernate is on our horizon.
Excellent post!
I have been working with NHib for about two weeks now and think I have managed to complete most of the journey; I am now looking down at the green pastures of Stage IV. What I fear is that when I reach said pastures, while the grass may be greener, I will be greeted by the not-so-pleasant potpourri of the need for significant ORM-specific subject-matter expertise.
Moving away from my crappy metaphor (pun intended), what I am afraid of is that generating a rich and meaningful domain model (particularly in a brownfield situation) will require such expertise with NHib or another ORM that your tool becomes a new career path instead of a means to an end.
I have already stumbled across a few “thou shalt not” scenarios in my short experience with NHib (mostly due to ignorance and bad design, I’m sure) that seem to indicate that the further you get from the bread-and-butter one-class-one-table world, the more likely you are to be confronted by the potentially steep learning curve of ORM-specific expertise, and by maintainability/extensibility issues when your ORM guru leaves.
Am I wrong in assuming that the greater the disparity between your data model and your domain model, the greater the required ORM expertise? Have you had issues making your chosen ORM bend to the needs of your desired domain model, or am I worried about nada…
@Wa:
I actually think that your observations are indeed on-target. As your Domain Model diverges more and more from your relational model, the corresponding disconnect increases, and thus the effort needed to map the two back to each other (usually via an O/RM) can get quite significant.
If your database already exists and you have good reason to radically diverge your Domain Model from it, then a tool like NHibernate may not be the right one. IBatis (and IBatis.NET) is a tremendously powerful alternate approach that offers a much more flexible relational <--> object mapping experience than even Hibernate/NHibernate provides. If your Domain Model really bears *ZERO* resemblance to your relational model, then this might make more sense for your situation.
It’s a spectrum, though, along which you need to decide on a target ROI and do a solid cost-benefit analysis. Like all good software design questions, the answer is ‘it depends’, and it’s up to you to decide what’s right for your project, of course. But in general, yes: you can design yourself into a real corner if you’re not careful, developing a Domain Model in complete isolation from any idea of how it will be persisted.
Remember what I always tell people: just because your *objects* are persistence-ignorant doesn’t mean that *you* or your *design* should be too 🙂
What is the 5th stage, then? We have been through all 4 and realized that at the end of the day we needed the SQL being generated by the ORM layer to be tweaked to our requirements for performance and data accuracy. We had developed a very rich object model that captured all our business requirements, one that had grown out of what was originally an ORM Domain or Entity object.
So we wrote a layer that would take the SQL and hydrate our objects directly through ADO.NET. We were kind of back to basics, but we wrote a tool to do all the work for us: it generated the code and managed the mappings, instead of our writing another framework or layer. It was the ideal ORM solution our team was looking for. Things were finally manageable and under control. We got rid of all the configuration and mapping-properties code that we had collected in the previous stages and just generated the code from the tool whenever our DAL needed work. The objects became oblivious to their persistence, and the generated DAL code acted as a service to read and persist objects to fulfill specific business requirements and use cases.
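In sketch form, generated hydration code of this kind typically looks something like the following (the SQL, class, and column names are invented to illustrate the pattern, and a Customer class with settable Id and Name properties is assumed; this is not actual tool output):

using System.Collections.Generic;
using System.Data.SqlClient;

public static class CustomerReader
{
    public static IList<Customer> LoadCustomers(string connectionString)
    {
        var customers = new List<Customer>();
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("SELECT CustomerId, Name FROM Customer", connection))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Ordinal-based, column-by-column mapping: tedious to write
                    // by hand, trivial for a generator to emit.
                    customers.Add(new Customer
                    {
                        Id = reader.GetInt32(0),
                        Name = reader.GetString(1)
                    });
                }
            }
        }
        return customers;
    }
}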
You can go check it out at Orasis Mapping Studio
@Adnan
I realize your post was just a product plug… but you know that most ORMs will allow you to use your own SQL, right? So just use the ORM’s generated SQL until you need to eke out a little more performance, and then replace that one instance with your hand-crafted SQL. You could even use SPROCs *gasp* (sorry Steve).
@Ryan:
The right tool (for data access or any job, for that matter) is the one that “makes simple stuff easy, makes complex stuff simple, and makes difficult stuff possible”. An ORM that provides you the ‘safety valve’ of falling back to hand-crafted SQL or even (yes!) sprocs if/when you need them, for the small percentage of times that the tool isn’t doing exactly what you need, is absolutely the right choice (and it’s no surprise to any reader of this blog that NHibernate does all of that!).
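As a concrete example of that safety valve, NHibernate lets you hand it raw SQL and still get mapped objects back via ISession.CreateSQLQuery; a quick sketch (the SQL text and the mapped Customer class are assumptions for illustration):

// Hand-crafted SQL through NHibernate, still returning fully-mapped objects:
var customers = session
    .CreateSQLQuery("SELECT * FROM CUSTOMERS WHERE REGION = :region")
    .AddEntity(typeof(Customer))
    .SetString("region", "EMEA")
    .List<Customer>();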
Besides, the world already has an OSS product that “maps SQL queries to objects”. It’s called IBATIS (http://ibatis.apache.org/), and in case anyone is interested in pursuing this approach (which, to be fair, can be quite useful in certain cases), check out either IBATIS (Java) or IBATIS.NET (.NET) and dig into it: it’s a really powerful tool to have in your toolbox, and it doesn’t rely AT ALL on any code-gen to get its work done. I’ve used it in the past for situations where a ‘traditional’ ORM isn’t appropriate/needed/possible, and it works quite well for what it is/does.
It’s not clear to me at all how Orasis Mapping Studio is an improvement on this already-existing (and quite mature) OSS project. If anyone knows the answer, please let me know.
@Ryan:
Sorry, that reply (of course) was really intended for Adnan 🙂
@sbohlen and @Ryan
Thanks for your questions.
Orasis is not different from other ORMs in achieving the same results, i.e., database abstraction and mapping objects to databases. How it achieves that is completely different. I would just say that while the rest are all frameworks, with maybe some GUI to get you going, Orasis is not a framework! It is a database-access-layer development IDE that you use to build and manage queries and mappings to your objects (using reflection and query metadata). When you say “Build”, it then calls its internal code generator to generate pure ADO.NET code to achieve your mapping results. You can then use the code or the generated assembly in your project directly. And what developer would not want the ability to actually run and test their DAL and mappings before calling them in their application layer? Orasis Mapping Studio has a built-in tester to execute your code and SQL and let you see your results before you even call it from your app.
A download is better than a thousand words. Try it and see for yourself. It is a revolutionary database-access development technology!
@Adnan:
In the spirit of open-mindedness and being willing to have my assumptions challenged, I will put it on the list to investigate and let you know my thoughts.
It may indeed be solving a problem I didn’t know I had 🙂
Thanks!
I have been using Orasis Mapping Studio 2009 and have had a great experience and great results. I launch it on my desktop, do all of my database mappings, and reference the .csproj from my Visual Studio solution. Very easy to use, and it saves me tons of time. I have tried ORMs, and not only do they not always perform, some queries take forever to implement. Why not use it and get done in minutes? I have better things to do than writing database code.
Here is our blog http://www.orasissoftware.com/blog for more info about the mapping software and podcasts!