ObjectSpaces: The Matrix is not Enough
In response to my earlier post, “Spanning the Matrix”, Ralfs Sudelbücher has something to say about building O-R mapping engines in “Spans in ObjectSpaces are not Enough.”
He makes the point that it is not enough to describe which associated objects and collections should be retrieved along with the primary objects of the query; you should also be able to specify exactly which properties of an object are retrieved, to avoid pulling back too much data.
I have to agree that this would be desirable in many particular scenarios. However, as long as the objects and schema remain static throughout your application, it has a variety of drawbacks.
1) Selectively populating data for an object has a big downside if you ever intend to pass this information to another part of your application or to someone else's component. The object itself does not encapsulate the semantics you implied by restricting certain fields, so another piece of code that sees instances of the same object may very well assume all of the data is accurate. Even if the omitted properties are encoded using something equivalent to NULL, it is not evident from the object whether the data is really NULL in the database or was just omitted from the query.
2) Behaviors built into the object may depend on data being available. For example, a read-only property that calculates its value from other fields could not work if a query were allowed to omit those fields.
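Both drawbacks can be seen in a few lines. This is a minimal sketch in Python, not ObjectSpaces code; the Customer class, its fields, and the amount_due property are hypothetical names invented for the illustration, and None stands in for a database NULL.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    # Hypothetical entity; None plays the role of a database NULL.
    name: Optional[str] = None
    discount: Optional[float] = None  # nullable column in this sketch
    balance: Optional[float] = None

    @property
    def amount_due(self) -> float:
        # Read-only derived value that quietly depends on two other fields.
        return self.balance * (1 - (self.discount or 0.0))


# Fully loaded row whose discount really is NULL in the database.
full = Customer(name="Ada", discount=None, balance=100.0)

# Partially loaded row: the query asked only for the name.
partial = Customer(name="Ada")

# Drawback 1: code that sees only the object cannot tell the two cases apart.
print(full.discount is None, partial.discount is None)

# Drawback 2: the derived property breaks on the partial object.
try:
    partial.amount_due
except TypeError as exc:
    print("amount_due failed:", exc)
```

Nothing on the partial instance records which fields were requested, which is exactly the information a downstream consumer would need.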
The only rational thing to do is not to omit properties during a query, but to project your data into a new strongly typed object definition that correctly describes the result set you want. You would basically be re-encapsulating your data, and the result would likely not have any behaviors associated with it at all.
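A projection of this kind might look like the following. It is a sketch under stated assumptions, written in Python with an in-memory SQLite database rather than with any ObjectSpaces API; the customer table, the CustomerNameRow type, and the load_customer_names function are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CustomerNameRow:
    """Projection type: exactly the columns the query returns, no behavior."""
    customer_id: int
    name: str


def load_customer_names(conn: sqlite3.Connection) -> List[CustomerNameRow]:
    # Only the two projected columns ever leave the database.
    rows = conn.execute(
        "SELECT customer_id, name FROM customer ORDER BY customer_id"
    )
    return [CustomerNameRow(customer_id, name) for customer_id, name in rows]


# Hypothetical schema and sample data for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, name TEXT, discount REAL, balance REAL)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [(1, "Ada", None, 100.0), (2, "Grace", 0.1, 250.0)],
)

names = load_customer_names(conn)
print(names)
```

Because CustomerNameRow has no balance or discount attribute at all, code that receives it cannot mistake it for a fully populated customer.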
You really have to decide whether you want your objects to be bastions of behavior that describe fixed semantics over your entire database schema, possibly encoding strong relationships between properties as well as stronger criteria such as cardinality. That puts you in a world where your objects are truly mirrors of the database state, in the truest object-persistence world view. Or you have to decide that your object data is really just a projection of this meta model, that the results you obtain in your application code are merely the resulting data, and that there is no strong tie back to the semantics of the database. You can only have consistency at one of these two extremes. There is no middle ground.
So are your objects data, or are your data objects?
You decide.
Matt
Comments
- Anonymous
March 23, 2004
My understanding is that the situation described above matches the issue of reporting quite well. IMO, reporting is not where OO shines, so using it as an example of the drawbacks of ObjectSpaces is like saying a screwdriver isn't good at hammering in nails, although the need to hammer nails is commonly agreed.
This is not meant to slam Ralfs in any way; I just think that too often developers look at a new tool or technique and try to do everything with it. "But does it do the dishes?" ObjectSpaces is good for what it was meant to do.
- Anonymous
March 25, 2004
I think spans are fine if you (the app developer) know that you'll want to pull data into, for example, customers and orders. If you don't, then lazy loading is the way to go. I'm hoping that the way Ospaces are built, you can specify lazy loading in your schema and class definition, and use it or spans depending on how you structure your query. I haven't quite figured out what magic is hap'nin under the hood for lazy loading. Let's say we have an order table and an order catalog table. Our order table has a column, order_catalog_id, which is a foreign key into the catalog table. When I get an order, I'd like to be able to lazy load the row from the catalog. My first query runs and pulls in the order. Does my catalog ID column get placed into a property of my order class? I then try to access order.catalog.name. Does it next use a query that runs based on the catalog_id field that I have stashed? Or does it have to join back to the order table? In WebObjects EOF, this is handled properly by storing a "fault" in the order class-catalog relationship, and resolving that at late load time.
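One way the stashed-key approach could work is sketched below. This is Python with an in-memory SQLite database, not a description of what ObjectSpaces or EOF actually does; the Order and Catalog classes are hypothetical, the order table is named orders here because ORDER is an SQL keyword, and the foreign key is assumed to be NOT NULL.

```python
import sqlite3


class Catalog:
    def __init__(self, catalog_id, name):
        self.catalog_id = catalog_id
        self.name = name


class Order:
    def __init__(self, conn, order_id, order_catalog_id):
        self._conn = conn
        self.order_id = order_id
        # The foreign key is kept on the order when the order is first loaded.
        self.order_catalog_id = order_catalog_id
        self._catalog = None  # stands in for the unresolved "fault"

    @property
    def catalog(self):
        # Resolve on first access with a single-table lookup by the stashed
        # key; no join back to the orders table is needed.
        # Assumption: order_catalog_id is NOT NULL, so a row always exists.
        if self._catalog is None:
            row = self._conn.execute(
                "SELECT catalog_id, name FROM order_catalog "
                "WHERE catalog_id = ?",
                (self.order_catalog_id,),
            ).fetchone()
            self._catalog = Catalog(*row)
        return self._catalog


# Hypothetical schema and sample data for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_catalog (catalog_id INTEGER PRIMARY KEY, name TEXT)"
)
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, order_catalog_id INTEGER NOT NULL)"
)
conn.execute("INSERT INTO order_catalog VALUES (7, 'Spring 2004')")
conn.execute("INSERT INTO orders VALUES (1, 7)")

# First query: pulls in the order only, including its foreign key.
order_id, fk = conn.execute(
    "SELECT order_id, order_catalog_id FROM orders WHERE order_id = 1"
).fetchone()
order = Order(conn, order_id, fk)

# Second query runs here, on first access to the relationship.
print(order.catalog.name)
```

Later accesses to order.catalog return the already loaded object, so the lookup runs once per order.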
Of course, I'd like to be able to intervene during this process to allow a couple of improvements. For instance, I'd like to cache all or parts of my relatively static catalog table, and use that instead of the query. I'd also like to be able to specify an additional flavor of lazy loading and/or span: one that runs in a background thread, where I have information that I want to present to the user immediately, and anticipate the next big click.
- Anonymous
April 05, 2004
Lazy loading...
If your authority is the database and the objects model it, then the sort of collection (array, array list, etc) that you choose for related rows (e.g. children) is arbitrary, isn't it?
So then, once we have generics, why not have a DelayedLoadList<MyChildClass> that is loaded on demand?
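A rough idea of what such a list could look like, sketched in Python with type hints instead of .NET generics. Only the DelayedLoadList name comes from the comment above; the loader callback, the is_loaded property, and the sample data are assumptions made for the illustration.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


class DelayedLoadList(Sequence[T]):
    """Read-only list whose contents are fetched on first use."""

    def __init__(self, loader: Callable[[], List[T]]) -> None:
        self._loader = loader  # hypothetical callback that runs the query
        self._items: Optional[List[T]] = None

    def _ensure_loaded(self) -> List[T]:
        # Run the loader once and keep the result for later calls.
        if self._items is None:
            self._items = list(self._loader())
        return self._items

    @property
    def is_loaded(self) -> bool:
        return self._items is not None

    def __getitem__(self, index):
        return self._ensure_loaded()[index]

    def __len__(self) -> int:
        return len(self._ensure_loaded())


calls = []


def load_children() -> List[str]:
    # Stand-in for a database query; records each time it runs.
    calls.append("query")
    return ["child-1", "child-2"]


children = DelayedLoadList(load_children)
print(children.is_loaded)             # nothing has been fetched yet
print(len(children), list(children))  # first use triggers the load
print(len(calls))                     # number of times the loader ran
```

Deriving from Sequence supplies iteration and membership tests on top of __getitem__ and __len__, so callers can treat it like any other read-only list.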
Life is easier when you can dictate some of the data structures that the implementor can use... or at least what they should use to get the best functionality.
Of course, if you load a list of X where condition (A) is true, then it might be ideal to load all children of X where (parent key in condition (A)).
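That access pattern amounts to two queries that share the same condition, instead of one query per parent. A minimal sketch in Python with an in-memory SQLite database follows; the parent and child tables, the region column used as condition (A), and the function name are all hypothetical.

```python
import sqlite3

# Hypothetical schema and sample data for the sketch.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parent (parent_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE child (
        child_id INTEGER PRIMARY KEY, parent_id INTEGER, label TEXT);
    INSERT INTO parent VALUES (1, 'west'), (2, 'east'), (3, 'west');
    INSERT INTO child VALUES
        (10, 1, 'a'), (11, 1, 'b'), (12, 2, 'c'), (13, 3, 'd');
    """
)


def load_parents_with_children(conn, region):
    # Query 1: the parents that satisfy condition (A).
    parents = {
        parent_id: {"parent_id": parent_id, "region": reg, "children": []}
        for parent_id, reg in conn.execute(
            "SELECT parent_id, region FROM parent "
            "WHERE region = ? ORDER BY parent_id",
            (region,),
        )
    }
    # Query 2: every child whose parent key satisfies the same condition,
    # fetched in one round trip rather than once per parent.
    child_rows = conn.execute(
        "SELECT child_id, parent_id, label FROM child "
        "WHERE parent_id IN (SELECT parent_id FROM parent WHERE region = ?) "
        "ORDER BY child_id",
        (region,),
    )
    for _child_id, parent_id, label in child_rows:
        parents[parent_id]["children"].append(label)
    return list(parents.values())


result = load_parents_with_children(conn, "west")
print(result)
```

The number of queries stays at two no matter how many parents match, which is the property that makes this attractive compared with loading each parent's children separately.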
I always imagined that the best way is to allow casual queries, but to focus the skilled/disciplined workers on designing access patterns that define how a network of objects is loaded.