More help needed from community: Do you rely on string truncation?

Previously the X++ language allowed the LEFT and RIGHT keywords on definitions of string variables. That is no longer valid X++. However, we still allow specifying an explicit string length. The X++ runtime implicitly truncates strings on assignment, both in direct assignments and when passing parameters. Consider the following example:

{
    str 5 s;
    s = '1234567890';
}

The resulting string will contain only the first 5 characters, i.e. '12345'. Normally the user would not explicitly provide the length of the string in this way, but the behavior is exactly the same when using an extended data type that has a particular length. One example is the SysGroup type, which is defined as having length 10 in the properties for the extended data type. If you run

{
    SysGroup g = '1234567890ABCDEFGHIJ';
    print g;
}

you will get only the first 10 characters printed. The remaining ones are lost to truncation.

The compiler is currently very weak in its support for these extended datatypes (EDTs). There is no validation that you are not assigning apples to oranges (so to speak). It would be natural for the compiler to respect the hierarchy that is expressed in the extended data types as is the case for class hierarchies, but that semantic check is not currently done: The compiler will base its checking on the base types of the EDTs. Introducing extra checks for the EDT hierarchies would be pleasing from a correctness perspective, but it would be an unreasonably daunting task for the eco-system to clean up the errors that would be the result of adding this check. On the other hand, using the EDTs does allow for a degree of documentation and customization.
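
As a rough illustration of the missing check, the sketch below assigns one string-based EDT to another, unrelated one. It assumes the CustAccount and VendAccount extended data types are available in the application; any two EDTs whose base type is str should behave the same way.

```xpp
{
    // Assumption: CustAccount and VendAccount are string-based EDTs
    // present in the application being compiled.
    CustAccount custAccount = '4000';
    VendAccount vendAccount;

    // Accepted by the compiler: only the base types (str and str)
    // are compared, not the EDT hierarchy.
    vendAccount = custAccount;
    print vendAccount;
}
```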

It is quite easy for the X++ interpreter to provide these truncation semantics; it is not as easy to do in managed code. Doing so would compromise performance and introduce considerable complexity into both the generation of the IL code and its execution.

Consider the following example, in which the interpreter and IL will provide different results:

{
    str 5 s = '1234';       // Interpreter: '1234';  IL: '1234'
    s += 'ABCDE';           // Interpreter: '1234A'; IL: '1234ABCDE'
    s = strRem(s, '1');     // Interpreter: '234A';  IL: '234ABCDE'
}

In IL, no truncation would take place when assignments are made to bound strings, neither in direct assignments nor when passing parameters. Truncation would take effect only when the string is persisted to the database.
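
Code that genuinely needs a fixed length could truncate explicitly instead of relying on the declaration. The following is a minimal sketch of that idea using the built-in subStr function; the length 5 is only an example value, and whether this pattern suits a given code base is a judgment call.

```xpp
{
    str s = '1234';         // no declared length, so nothing is truncated implicitly

    s += 'ABCDE';           // '1234ABCDE'
    s = subStr(s, 1, 5);    // explicit truncation to the first 5 characters: '1234A'
    print s;
}
```

Because the truncation is spelled out in the code, the result does not depend on how the runtime treats declared string lengths.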

The question for you to answer is: Do you rely on these semantics in your code? Would your code run well if all string types were equal and unbound? Do we need to muddy the waters for IL by introducing the same behavior in IL?

Comments

  • Anonymous
    March 16, 2010
    No.

  • Anonymous
    March 16, 2010
    No

  • Anonymous
    March 16, 2010
    No.  Bad idea to be relying on this anyway really.

  • Anonymous
    March 16, 2010
    The only reason I can see that it is needed is for select statements since you are not allowed to use unbounded strings.  That said, I don't think I ever used it in production quality code.  

  • Anonymous
    March 17, 2010
    No

  • Anonymous
    March 17, 2010
    Go ahead, do it. I remember wrong use of extended data types truncating arguments to a standard method. Bad! I have a faint memory of using the truncation once, when reading a fixed column text file. I did not bother to do the substr. But I could be persuaded. The declaration "str 5 x;" should give a best practice warning of course. It is rarely used and would be an indication of problems.

  • Anonymous
    March 17, 2010
    No. Anyway if someone uses a string-based EDT to specify a variable type then I can hardly imagine he would concatenate any other strings to this variable.

  • Anonymous
    March 24, 2010
    It's a good idea to do this. I don't remember any instance where we did this.

  • Anonymous
    March 24, 2010
    It would be a great feature to have in managed code too, of course. But I personally don't use it, and I have not noticed anyone else using it either. Basically no one relies on it in X++, and many of the people who develop in X++ do not even know about this feature.

  • Anonymous
    March 25, 2010
    Nope, can't see any good reason to rely on this form of string truncation.

  • Anonymous
    March 28, 2010
    No

  • Anonymous
    March 31, 2010
    Since I wouldn't rely on string truncation it would be helpful if the compiler issued a warning similar to assigning a real to an int.

  • Anonymous
    April 07, 2010
    Seems reasonable, as long as the kernel truncates before updating/inserting data.  Otherwise, it may result in sql errors like: "String or binary data would be truncated.".

  • Anonymous
    June 09, 2010
    I can only think of one place where strings are concatenated like that, and it's subprojects. The code to generate a subproject ID adds formatted numbers to the end of the parent ID, and ranges have '*' appended to include all subprojects. Even there, I have no idea what happens when the project id becomes too long and characters are lost from the end of it.

  • Anonymous
    July 01, 2010
    As an experienced programmer I am of course aware of this truncation mechanism and I'm pretty sure that I did use this behaviour before here and there. Of course such things do matter a lot when doing equals comparisons or as set/map keys. So I would like to have the same behaviour as before for X++ code, otherwise old code will probably break in some cases. Of course I'm also aware of factors of performance and understand the nature of the question. Are there plans to remove the X++ interpreter? (For example by changing it to compile to managed code also?) If not, then what about keeping X++ the way it was before, but introducing a new semantic for managed code languages (and telling developers to do truncation by some left$ or whatever call if they need to)? So old code would not break and for the future there will be no additional performance hit for managed code?