The JScript Type System, Part Five: More On Arrays In JScript .NET

Article
11/12/2003

As I was saying the other day, CLR arrays and JScript arrays are totally different beasts. It is hard to imagine two things being so different and yet both called the same thing. Why did the CLR designers and the JScript designers start with the same goal, to create an array system, and come up with completely different implementations?

Well, the CLR implementers knew that dense, nonassociative, hard-typed arrays are easy to make fast and efficient. Furthermore, such arrays encourage the programmer to keep homogeneous data in strictly bounded tables. That makes large programs that do lots of data manipulation easier to understand. Thus, languages such as C++, C# and Visual Basic have arrays like this, and thus they are the basic built-in array type in the CLR.

Sparse, associative, soft-typed arrays are not particularly fast but they are far more dynamic and flexible than Visual Basic-style arrays. They make it easy to store heterogeneous data in any table without worrying about picky details like exactly how big that table is. In other words, they are scripty. Languages such as JScript and Perl have arrays like this.
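To make "scripty" a little more concrete, here is a small sketch in plain JScript (it should also run unchanged in a modern JavaScript engine). The variable names are mine and purely illustrative.

```javascript
// A JScript array needs no declared size or element type.
var table = [];

table[0] = "hello";        // a string in slot 0
table[1000] = 42;          // a number far past the end; slots 1..999 are never created
table["verified"] = true;  // a named (associative) entry on the same object

var length = table.length;       // 1001: one more than the highest numeric index
var hasHole = !(500 in table);   // true: slot 500 does not exist at all
var kinds = [typeof table[0], typeof table[1000], typeof table["verified"]];
// kinds is ["string", "number", "boolean"]
```

Note that the named entry does not count towards length; only the numeric indices do.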

JScript .NET has both very dynamic, scripty arrays and stricter CLR arrays, making it suitable for both rapid development of scripts and programming in the large. But as I said, making these two very different kinds of arrays work well together is not trivial.

JScript .NET supports the creation of multidimensional hard-typed arrays. As with single-dimensional arrays, the array size is not part of the type. To annotate a variable as containing a hard-typed multidimensional array, follow the type with brackets containing commas. For example, to annotate a variable as containing a two-dimensional array of Strings you would say:

var multiarr : String[,];

The number of commas between the brackets plus one is equal to the rank of the array. (By this definition if there are no commas between the brackets then it is a rank-one array, as we have already seen.)

A multidimensional array is allocated with the new keyword as you might expect:

multiarr = new String[4,5];
multiarr[0,0] = "hello";

Notice that hard-typed array elements are always accessed with a comma-separated list of integer indices. There must always be exactly one index for each dimension in the array. You can't use the ragged array syntax [0][0].
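For contrast, this is roughly what the ragged (array-of-arrays) style looks like with ordinary JScript arrays. It is a sketch in plain JScript with made-up variable names, not hard-typed JScript .NET code; the point is only that each row is an independent array with its own length, which is why the [0][0] syntax belongs to that world and not to rectangular arrays.

```javascript
// A "two-dimensional" JScript array is really an array whose elements are arrays.
var ragged = [];
ragged[0] = ["a", "b", "c"];  // row 0 has three elements
ragged[1] = ["d"];            // row 1 has one element
ragged[2] = [];               // row 2 is empty

var first = ragged[0][0];     // "a": index the outer array, then the inner one
var rowLengths = [ragged[0].length, ragged[1].length, ragged[2].length];
// rowLengths is [3, 1, 0]: there is no single "width" for the table

var missing = ragged[1][2];   // undefined: row 1 simply has no slot 2
```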

There are certain situations in which you know that a variable or function argument will refer to a hard-typed CLR array but you do not actually know the element type or the rank, just that it is an array. Should you find yourself in one of these (rather rare) situations there is a special annotation for a CLR array of unknown type and rank:

var sysarr : System.Array;
sysarr = new String[4,5];
sysarr = new double[10];

As you can see, a variable of type System.Array may hold any CLR array of any type and rank. However, there is a drawback. Variables of type System.Array may not be indexed directly because the rank is not known. This is illegal:

var sysarr : System.Array;
sysarr = new String[4,5];
sysarr[1,2] = "hello"; // ILLEGAL, System.Arrays are not indexable

Rather, to index a System.Array you must call the GetValue and SetValue methods with an array of indices:

var sysarr : System.Array;
sysarr = new String[4,5];
sysarr.SetValue("hello", [1,2]);

The rank and size of a System.Array can be determined with the Rank, GetLowerBound and GetUpperBound members.
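If it helps to see why an index list and a known rank go together, here is a toy model in plain JScript of a rectangular array whose rank is only discovered at run time. Everything in it (makeRectArray, the lowercase member names, the flat row-major storage) is my own illustration and not the CLR's actual implementation, and it assumes every lower bound is zero, which real CLR arrays do not guarantee.

```javascript
// Hypothetical model: a rectangular array described by a list of dimension sizes.
function makeRectArray(dims) {
  var size = 1;
  for (var d = 0; d < dims.length; d++) size *= dims[d];
  var cells = new Array(size);  // one flat block holding every element

  // Convert a list of indices into a single offset, row-major.
  function offset(indices) {
    if (indices.length !== dims.length)
      throw new Error("expected " + dims.length + " indices, got " + indices.length);
    var pos = 0;
    for (var d = 0; d < dims.length; d++) {
      if (indices[d] < 0 || indices[d] >= dims[d])
        throw new Error("index out of range in dimension " + d);
      pos = pos * dims[d] + indices[d];
    }
    return pos;
  }

  return {
    rank: dims.length,
    getLowerBound: function (d) { return 0; },           // assumption: zero-based
    getUpperBound: function (d) { return dims[d] - 1; },
    getValue: function (indices) { return cells[offset(indices)]; },
    setValue: function (value, indices) { cells[offset(indices)] = value; }
  };
}

// Usage: the caller never needs to know the rank in advance.
var arr = makeRectArray([4, 5]);
arr.setValue("hello", [1, 2]);

var bounds = [];
for (var d = 0; d < arr.rank; d++)
  bounds.push(arr.getLowerBound(d) + ".." + arr.getUpperBound(d));
// bounds is ["0..3", "0..4"]
```

The walk over the dimensions at the end is the kind of loop you would write against a real System.Array using Rank, GetLowerBound and GetUpperBound.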

Thinking about this a bit now, I suppose that we could have detected at compile time that a System.Array was being indexed, and constructed the call to the getter/setter appropriately for you, behind the scenes. But apparently we didn't. Oh well.

Next time: mixing and matching JScript and CLR arrays.

Comments

Anonymous
November 12, 2003
So far the only potential downside I see in this design is that you cannot write a generic function that works for both JScript arrays and CLR arrays if the arrays have a rank higher than 1. I assume the following function will work on both types of arrays:

function sum(arr, length) {
  var result = 0;
  for (var i = 0; i < length; ++i)
    result += arr[i];
  return result;
}

It would have been even nicer if CLR arrays in JScript were somehow made to support the length property if they were one-dimensional. Anyway, you could not write such a function if arr was, say, two-dimensional.

BTW, I can't resist: using the BeyondJS JavaScript library you could write the above code as:

var sum = arr.fold("+");
Anonymous
November 13, 2003
> I assume the following function will work on both types of arrays:

Indeed.

> you cannot write a generic function that works for both JScript arrays and CLR arrays if the arrays have a rank higher than 1

Yep, but there are no JScript arrays with rank higher than one, so basically this is saying that you can't write a generic function that handles arrays of different ranks. But wait a minute, that is what System.Array is for! That is, for those rare cases where you don't know the rank at compile time.

> Would have been even nicer if CLR arrays in JScript were somehow made to support the length property if they were one dimensional.

Dude, wait for it. I said I'd discuss interoperability in my NEXT blog! :-)

> var sum = arr.fold("+");

I assume that your fold operator calls eval if the thing passed in is not a function object?
Anonymous
November 13, 2003
> Yep, but there are no JScript arrays with rank higher than one

Technically you are correct, but practically you simply create an array of arrays. And the resulting syntax looks just like C++ or Java. That was my point actually, that for a 2D JScript array you write a[1][2] while for a CLR array you write a[1,2].

> that is what System.Array is for

You misunderstood me. I wasn't looking to write a function that would work for any rank. I was looking for a function that would work for, say, a 2D JScript array and a 2D CLR array. While I fully understand the reasons you chose the indexing syntax used for multi-dimensional CLR arrays, I simply pointed out that as a result they are not polymorphic with multi-dimensional JScript arrays.

> Dude, wait for it.

You caught me, I'm the impatient type ;-)

> I assume that your fold operator calls eval if the thing passed in is not a function object?

BeyondJS implements a mechanism of converting strings to functions:

"+".toFunction() will generate a binary function
"!".toFunction() will generate a unary function

You can also do "-".toFunctionUnary() or "-".toFunctionBinary() to control which version is generated. Here is the implementation:

String.prototype.toFunctionUnary = function() {
  eval("function unary(op) { return " + this + " op; }");
  unary.op = this.valueOf();
  return unary;
};

String.prototype.toFunctionBinary = function() {
  eval("function binary(op1, op2) { return op1 " + this + " op2; }");
  binary.op = this.valueOf();
  return binary;
};

String.prototype.toFunction = function() {
  return ",!,~,++,--,new,delete,typeof,void,".indexOf("," + this + ",") > -1 ?
    this.toFunctionUnary() :
    this.toFunctionBinary();
};
Anonymous
November 13, 2003
Yeah, there's no interoperation between ragged arrays and two-d arrays. But there is no interoperation between ragged CLR arrays and two-d CLR arrays either! Ragged arrays and rectangular arrays are pretty much separate concepts. In fact, the whole notion of rank of a ragged array is ill-defined: you can have a ragged array that is 3-d in some axes, 2-d in others, 1-d in still others, etc. There is no sensible notion of "rank", so making them interoperate is more trouble than it's worth.

Your implementation is pretty slick. (A less functional but perhaps more performant approach would be to generate all the unary and binary operator functions once and put them in a lookup table, rather than searching that string every single time and reconstructing the function object every single time.)
Anonymous
November 13, 2003
You are quite correct about both points.

With regard to ragged arrays: I always found it amusing that C++ employs the exact same syntax for accessing ragged and contiguous arrays. So a[1][2] would generate wildly different code based on the definition of a. OTOH it did buy you that polymorphic behavior I mentioned before.

With regard to BeyondJS, our motivation was always functionality, with performance a consideration but not more. Anyway, fold generates the function once, and then applies it iteratively to all the members. So the performance hit of generating a new function every time is relatively minor when compared to the cost of the loop.
