Objects in Axum

When we talk about Axum as a programming language, we make the point that it is not an object-oriented language, but that it is still object-aware. What do we mean by this, and is it really true that you cannot define objects with Axum?

What we mean is that the core concept of Axum is not the “object” of object-oriented programming, but agents and domains. These could be viewed as objects, of course, but they have so many constraints placed on them that any fan of OO programming would object to calling Axum an OO language. Calling it one would also obscure the central point that we are trying to make. On the other hand, being a .NET language means floating on a sea of objects, so Axum must be aware of the underlying platform and its central paradigm, which is inescapably object-oriented.

Then, we usually say that “in fact, you can’t even define a class in Axum,” as if to prove the point that it’s not OO. This is true: there is no way to define classes. However, there is a way to define types, which we call “schema.” In C++ jargon, a schema type would be called a POD, something that is less than a full-fledged object. We’ve heard from some that “schema” isn’t a good name for this, so if you have a better name that also works well as a language keyword, we’re all ears.

A schema is a .NET class that contains only public properties, side-effect-free methods, and a new kind of member called a ‘rule.’

Schema types are intended for use as payload definitions for channel communication, and thus are guaranteed to be deeply cloneable. The compiler generates the clone code, which is about 100x faster than reflection-based cloning. Schema instances are also guaranteed to be serializable and therefore automatically suitable for inter-process communication.

Why do we need this? If you are familiar with distributed programming, a schema is just a data-transfer object type, but with language support. The original reason for DTOs was to cut down on round-trips across the network, since calling setters and getters on a remote object wasn’t really feasible. For Axum, the reason is somewhat different: it is another constraint placed on objects. We simply cannot trust that types implementing ICloneable do so in a deep fashion (there’s no formal requirement that they do).

We could have built a deep-clone runtime capability based on reflection, but that would be orders of magnitude slower than the compiler-generated clone that having language support allows us to rely on.
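To make the shallow-versus-deep distinction concrete, here is a minimal sketch in Python rather than Axum. The `Order` and `LineItem` types are hypothetical and exist only for this illustration, and Python's `copy` module stands in for the cloning that a schema would provide.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class LineItem:
    sku: str
    quantity: int


@dataclass
class Order:
    customer: str
    items: List[LineItem] = field(default_factory=list)


original = Order("Contoso", [LineItem("A-100", 1)])

# A shallow clone copies the Order itself but shares the nested LineItem objects.
shallow = copy.copy(original)
# A deep clone copies the whole object graph.
deep = copy.deepcopy(original)

# Mutate the original after cloning.
original.items[0].quantity = 99

shallow_sees_change = shallow.items[0].quantity == 99  # nested state is shared
deep_sees_change = deep.items[0].quantity == 99        # nested state is isolated
```

Only the deep copy is safe to hand to another party, which is the guarantee a type that merely implements a clone method cannot give on its own.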

A simple Address schema for US addresses:

schema Address
{
    required String StreetAddress;
    required String City;
    required String State;
    required String ZipCode;

    rules
    {
        require ! String.IsNullOrEmpty(StreetAddress);
        require ! String.IsNullOrEmpty(City);
        require ! String.IsNullOrEmpty(State);
        require ! String.IsNullOrEmpty(ZipCode);

        require State.Length == 2;
        require ZipCode.Length == 5 || ZipCode.Length == 10;
    }
}

Specifying rules for a schema is entirely optional, but it can be useful both for the runtime enforcement it provides and for the additional information it gives the reader of the source code. The rules are enforced when you send data to a channel port; they may only involve calls to methods that are known to be side-effect free.
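The idea of checking rules at the moment of sending can be sketched in Python as an analogy. This is not Axum code: the `send` function and the use of `queue.Queue` as a stand-in for a channel port are assumptions of this sketch, as is the choice to enqueue a deep copy.

```python
import copy
import queue
from dataclasses import dataclass


@dataclass
class Address:
    street_address: str
    city: str
    state: str
    zip_code: str

    def check_rules(self) -> None:
        """Mirror the Address rules; raise ValueError on the first violation."""
        for name in ("street_address", "city", "state", "zip_code"):
            if not getattr(self, name):
                raise ValueError(f"{name} must not be empty")
        if len(self.state) != 2:
            raise ValueError("state must be a two-letter code")
        if len(self.zip_code) not in (5, 10):
            raise ValueError("zip_code must have 5 or 10 characters")


def send(port: queue.Queue, payload: Address) -> None:
    """Hypothetical send: validate first, then enqueue a deep copy."""
    payload.check_rules()
    port.put(copy.deepcopy(payload))


port = queue.Queue()
send(port, Address("1 Microsoft Way", "Redmond", "WA", "98052"))

try:
    # "Washington" violates the two-letter state rule, so nothing is enqueued.
    send(port, Address("1 Microsoft Way", "Redmond", "Washington", "98052"))
    rejected = False
except ValueError:
    rejected = True
```

Because the check reads fields and never writes them, it stays within the side-effect-free restriction described above.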

Schemas are versionable, meaning that the version of the schema used to write a serialized object and the version used to de-serialize it don’t have to be exactly the same. When de-serializing a schema instance from a stream, only the required properties need to be found in the stream; the schema may also contain a number of optional properties, which, if not present, will be given default values.

If an optional field is present in the stream but not recognized by the target schema type, the data is stored in a private data structure so that the instance can be re-serialized without losing the information.
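The versioning behavior described here (required properties must be present, optional ones get defaults, unrecognized ones survive a round trip) can be sketched in Python. This is an illustration under stated assumptions, not the Axum serializer: the stream is modeled as a plain dictionary, and `VersionedAddress`, the `country` property, and the `geo_hint` field are all hypothetical.

```python
from typing import Any, Dict

REQUIRED = ("street_address", "city", "state", "zip_code")
OPTIONAL_DEFAULTS = {"country": "US"}  # hypothetical optional property


class VersionedAddress:
    def __init__(self, fields: Dict[str, Any], unknown: Dict[str, Any]):
        self.fields = fields
        self._unknown = unknown  # unrecognized data, kept for re-serialization

    @classmethod
    def deserialize(cls, stream: Dict[str, Any]) -> "VersionedAddress":
        missing = [name for name in REQUIRED if name not in stream]
        if missing:
            raise ValueError(f"missing required properties: {missing}")
        fields = {name: stream[name] for name in REQUIRED}
        # Optional properties fall back to defaults when absent from the stream.
        for name, default in OPTIONAL_DEFAULTS.items():
            fields[name] = stream.get(name, default)
        # Anything this version does not recognize is set aside, not dropped.
        unknown = {k: v for k, v in stream.items() if k not in fields}
        return cls(fields, unknown)

    def serialize(self) -> Dict[str, Any]:
        return {**self.fields, **self._unknown}


# Data written by a newer schema version that added a field we do not know.
newer_writer = {
    "street_address": "1 Microsoft Way",
    "city": "Redmond",
    "state": "WA",
    "zip_code": "98052",
    "geo_hint": "47.64,-122.13",
}
addr = VersionedAddress.deserialize(newer_writer)
round_tripped = addr.serialize()
```

An older reader can therefore sit between two newer components without silently stripping the data they exchange.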

Schema types are really simple – everything (except the type itself) is public, methods must be side-effect free, and the property definitions look like fields, i.e. you don’t get to define the implementation. Schema rules are invoked by the runtime.

We’ve discussed internally whether schema instances ought to be immutable, a property that would have all kinds of nice implications, but the code in the CTP that we hope to announce soon on this blog does not treat them as immutable. This is one area where getting feedback would be very valuable to us – should our transfer objects be immutable? In J2EE, for example, there is no strong recommendation one way or the other, but I’m thinking we should be a bit more specific.

I’m also thinking that we need to add compiler-generated equality and hash-key functions to make sure that schemas have value rather than reference equality semantics. Clearly, schema types are by no means a finalized concept…
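For readers unfamiliar with the distinction, value equality with a matching hash function can be sketched in Python, where a frozen dataclass generates both. This is an analogy only; making the type frozen (immutable) is a choice of this sketch, because Python generates a hash for frozen dataclasses, and says nothing about what Axum will decide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    street_address: str
    city: str
    state: str
    zip_code: str


a = Address("1 Microsoft Way", "Redmond", "WA", "98052")
b = Address("1 Microsoft Way", "Redmond", "WA", "98052")

same_value = a == b             # generated __eq__ compares field by field
same_object = a is b            # still two distinct instances
same_hash = hash(a) == hash(b)  # generated __hash__ agrees with __eq__
deduplicated = {a, b}           # a set keeps only one of two equal values
```

With reference equality, the two addresses above would compare as different even though every field matches, which is rarely what you want from a transfer object.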

Niklas Gustafsson
Axumite

Comments

  • Anonymous
    April 21, 2009
    A big definite yes for immutability from me. The more the better. And please include it in the CTP!

  • Anonymous
    April 21, 2009
    It should have value-type semantics where local modifications are not visible to others unless communicated back (at which point the rules are run).  Of course, it should be fast as reference-type values for passing around. In other words, it should be a bit like memory-transacted objects. Since you're in control of the whole data flow experience, I would think this would be easy to accomplish.  Each schema has a corresponding "journal" helper that keeps track of locally modified fields.  When it comes time to send the schema instance to someone else, the journal creates a new immutable instance using the original fields and the modified fields tracked by the journal.  This approach provides the benefit that instance fields can be read and written to as if the instance were mutable.  Also, it creates clear semantics on when the schema rules/invariants are triggered: namely when the journal creates the new instance.  This transition could be triggered at transmission time or some other boundary (e.g. a "modify" keyword similar to "using" in C# inside which an instance becomes journaled). Just some food for thought. :)

  • Anonymous
    April 21, 2009
    a definite no for immutability. because objects are cloned when they cross channels there is no risk in having a concurrent access/mutation of an object. the application can still treat a schema as immutable of course. what about having a keyword immutable that can be applied to schema definitions? immutability is very very restrictive in this case because just about every piece of data would be immutable. in order to update a single field you have to clone the instance and assign it to your reference variable. that is very unproductive. there are just too many cases where immutability is too limiting. again, you can treat objects as immutable in your code if you like (but not the other way round).

  • Anonymous
    April 21, 2009
    How about making mutability optional? I can see some times where you'd want an immutable DTO in which case it behaves more like a ValueObject but in the uses I have, it will be modified as it passes through the tiers but thread safety will not be a concern in the scenario. I would absolutely hate to have large object graphs cloned every time I set a property. It would be more ideal if the person defining the schema could simply apply an "immutable schema Address" or something like "readonly" even and so on. It would be better to give us the option on that particular behavior or I wouldn't be able to use it if it produces only immutable schemas.

  • Anonymous
    April 21, 2009
    Our object graphs represent many pieces of data. It will pass through a chain of functions that populate pieces of it according to their rules and need before it is actually transferred and after it is transferred. True, there is one function that sets the data, but it is broken into many sub functions that populate portions of it. Sometimes it makes it into the queue and when dequeued, evaluated, used by an operation in some way, and then a status changes and it goes into another queue for further operation. DTO's are not immutable (or rather I don't see in its spec that it must be). If it is therefore optional, then why force it? Because our volume is millions of these transactions daily and sometimes hourly, it is not feasible to make copies every time we set a property when there are hundreds of them between the objects in the graph (customer, order[s], payment[s], etc.). Refactoring is not an option. It is what it is. Our responses must be nearly instantaneous too. So there's no need to clone post 500k object graphs every single operation. It is perfectly acceptable to pass through as many setter functions as we want, pass it through the wire, and change it some more without having to clone it (for memory and performance reasons). I just optimized an unreliable function that cloned before writes and brought the process to its knees. Not cloning fixed it beautifully. One more thing to keep in mind, is that every operation is stateless, so there is never anything else sharing it in our scenario. Threading is not an issue here. And where it is, we've dealt with it in other ways. So it still isn't an issue even then. It may smell funny to you, but it is the nature of our operation. Because Transfer Objects are not required to be immutable one way or the other per definition, then it should not be forced upon us.
If MS leaves it optional to make immutable, then you are free to make all your schemas immutable, and I am free to leave mine mutable per our operational requirements/design/legacy implementation and all the compatibility that must come with it. Usually, I make my DT/VO's immutable by design, but I see no good reason why it must be that way all the time.  If you want immutable, use ValueObject instead.  It is immutable by definition.

  • Anonymous
    April 21, 2009
    i cannot believe how fixated some people are around forcing everyone to best practices. immutability has its places but is not appropriate everywhere. repeat the previous sentence for any of the following: singleton, factory, 3-tier architecture, web service ... there is no solution that fits everyone. are you stating that 95% of all software projects could be done more productively with mandatory immutability? "And no one's "forcing" Axum on you, just like you're not "forced" to use the CLR if you really think you want to modify memory directly". the goal of the axum team is to gain adoption not hinder it. your comment is pointless in this regard. "Might I suggest that your needs sound very very very specific." what is specific about constructing a dto across multiple methods? this scenario alone would be enough to justify optional immutability.

  • Anonymous
    April 22, 2009
    Tobi, if the main goal of the Axum team was to gain adoption, then they'd just build a drag n' drop designer. I mean, come on, YAGNI really plays a big part here. And you completely missed the point of my comment about Shawn's needs being very specific and took it totally out of context - I was clearly talking about his argument of requiring mutability to support his 140MB/s architecture. I'm not sure exactly what you mean by "constructing a dto across multiple methods", but you might want to look into the Builder pattern (and the Builder itself could be mutable or immutable, depending on whatever floats your boat).

  • Anonymous
    April 22, 2009
    I meant to say 50k DTO's serialized, not 500k. That was a typo, sorry about that. Your comments are interesting. You state that you have not heard a good case for supporting mutable DTO's. This sounds like a thinly veiled attempt to proclaim that you don't believe there is a good case; or a case that any system could benefit from mutable DTO's. Yet such cases exist, they just don't satisfy your requirements for acceptability. Microsoft solicited feedback and I'm just providing mine. Our system is constructed with the notion of mutability in our DTO's and we do not suffer for it. We do not have maintenance issues with it, or any heartache. It works fine. One could argue that our projects were constructed even before Design Patterns were understood (by the team), but even if it were constructed with absolute guidance from the Patterns, the DTO pattern does not require mutability one way or the other and we could have ended up with a solution that used mutability. So why should Microsoft force upon us an ideology into a Design Pattern/Methodology that doesn't require the ideology? I see it as no problem to support both paradigms to satisfy those who both need it or don't need it. The projects I work on have been in place years longer than I've been with the company and they will not change to support mutability without major overhauls, given the way we manipulate data through the various tiers and stages of operation. Our DTO's must carry their full state everywhere they go, since we have no other shared state. As they pass along, their state changes. I would not like to be forced to apply every change against a copy of the DTO when the "old" copy will never again be used or referenced by anything else, under any circumstance. I realize that atomic data types are immutable in .NET, but why should the encapsulating object also be?
Regardless, the Axum project looks to be a very efficient way to describe objects compared to what we do today and I'd like to benefit from that in the future. If Microsoft wants widespread adoption of this toolkit, then they might do well not to force a design ideology based on a design pattern that does not mandate a particular level of isolation or behavior. This project will not have my support or endorsement if it won't help to solve any of my problems or will require me to scrap what we have and change 9 years of development. I suspect I'm not alone on this. I'm not arguing that I believe in absolute mutability. I believe there is a place for both. So if feasible, Microsoft should support both models. Let's you and I agree to disagree, but let Microsoft understand that both sides of the camp would like to benefit from this toolkit in the future.

  • Anonymous
    April 22, 2009
    Shawn, I appreciate that it works for you to have mutable DTOs. However, I think it goes against the intent of the pattern from a business perspective. Now, before you jump in and say that the pattern doesn't give guidance either way, well that's totally true, but nor does it give guidance on a whole range of other design issues, like for example whether you could be making DB calls from your DTO - however, you and I both know that such a strategy would be against the intent of the pattern. So I'd prefer to keep the conversation around why you'd want to have mutable state from a business perspective, rather than a "the pattern doesn't say it can't be mutable" perspective. The main issue I have with allowing state changes on a DTO is that those state changes begin to leak domain logic and encourage developers to begin using DTOs as domain objects, which I'm sure we'd both agree definitely goes against the intent of the pattern. Now we both only have our own experiences to go off, so for me I've seen examples where developers were modifying fields in DTOs that completely bypassed the validation scheme because the validation was in the domain objects and in the DTO builder (which is where it should be). These DTOs were then forwarded to another service that began to break in hard-to-track down ways. By encouraging developers to think of DTOs as just a data contract between service endpoints, which is what they are intended for, rather than objects that you can perform state changes on, you get a big win and your codebase stays clean (on top of the shareability, etc that's been discussed before). If you've got some good guidance about how to separate state changes that have business meaning as opposed to whatever state changes you had in mind for your DTOs (and ensure that DTO changes are somehow valid), then that would be good to hear about too.

  • Anonymous
    April 23, 2009
    The degree of passion and interest in this topic has been fascinating to watch here on the sidelines. I want to thank you all for being so clear and giving us both sides of the argument! This kind of feedback is exactly what can take Axum from an incubation project to a solid product, so please stay involved in the blog and download the bits when we make them available (hopefully shortly).

  • Anonymous
    April 23, 2009
    Instead of "schema" how about "model"?

  • Anonymous
    April 24, 2009
    One would have to ask, even if the design of an existing project is flawed, does Microsoft only expect Axum to be used in new projects (so they can be designed to work with Axum's way of doing things), or also to enhance existing projects? Axum cannot provide much enhancement if the existing project must be completely rethought and reimplemented. So the issue of right or wrong, being extremely subjective to the architect of the project, can also be defined in terms of what Axum will allow you to do. If its scope is narrow, then it will only solve a narrow set of problems and will be the wrong tool for the job for many existing projects that don't fit the definition. I am not a proponent of doing it wrong. But I don't think mutability is a black and white issue. If Axum will only ever be intended to define messages only during their transport stage, then perhaps it will not be useful to enhance our projects because our schema "instances" are also used as entities when not in transport, during construction and after transport.

  • Anonymous
    April 24, 2009
    Will an Axum schema instance be the message itself or just a message payload?

  • Anonymous
    April 24, 2009
    Let me ask another question: What if schema instances were mutable, but were "true" transfer objects in the sense that once you sent an instance, it would no longer be accessible within the domain of the sender? Object ownership transfer, in other words...

  • Anonymous
    April 24, 2009
    I can live with immutability being enforced at the point where the message is transported. On the receiving side, I receive a clone anyway. However, we still modify the payload after reception. At that point, you're saying it is read-only. I can live with that, though not ideal, if I can very easily create a new "unlocked" clone based on it (ideally the message will give me an unlocked clone without using reflection internally). Then I can continue to modify the new object as I see fit until it becomes a message again. This scenario isn't ideal, but it is better than a pure read-only after constructor approach. Performance won't suffer terribly. And everyone gets what they want :) Though preferably, I'd like to be able to unlock the locked or specify it not be locked at all. But from your comments, it will be locked and I can live with that as long as I can get an unlocked clone without having to build it myself or use reflection (which is too darn slow for our environment) given our volume of workload.

  • Anonymous
    April 24, 2009
    Niklas, our scenarios differ. I see exactly what you are getting at. We will pass our object into some type of RPC via load balancer (hence our stateless nature) and if we require a mutated object, it comes back as a result and so will be a separate assignment on a clone anyway. But that "clone" should still be read-only. The original at this point is "forgotten". All other times the call is one-way and we'll never need the object again from the sending side. Both scenarios are equally as common for us. The second is the one I had in mind during my lengthy discussions. It forms our backend (where I spend my career) or various integration services and object ownership transfer, as you put it, is the norm. The customer facing projects however, use the first scenario.

  • Anonymous
    April 25, 2009
    Regarding name choice: I've used the term 'Spec' (short for Specification) so far when writing (meta)types that define structure and rules to be implemented by/adhered to by other types.

  • Anonymous
    April 27, 2009
    Making schemata immutable and generating Equals() and GetHashCode() at compile-time both strike me as desirable features for a purpose built language like Axum. I am more than willing to deal with the inconveniences of immutability for the guarantees that come with it.

  • Anonymous
    May 06, 2009
    The subject of immutability sparks intense interest among the people who follow our blog, as is evident

  • Anonymous
    May 06, 2009
    Design Patterns as External DSLs

  • Anonymous
    May 10, 2009
    What about "datatype" or simply "data" (as replacement for "schema")?