DynamicDataTable, Part 1

Let’s get started by doing “the simplest thing that could possibly work”.

    public class DynamicDataTable : DynamicObject {
        private readonly DataTable _table;

        public DynamicDataTable(DataTable table) {
            _table = table;
        }
    }

For now, we’ll use a DataTable for the actual storage and a DynamicObject to provide an implementation of IDynamicMetaObjectProvider. What can we accomplish with this? Well, quite a lot, actually – in a very real sense, we’re only limited by our imagination.

GetMember

The first ability we want is to be able to extract a column out of the data table; given a DynamicDataTable “foo”, the expression “foo.Bar” should give us something enumerable that represents the data in the column. The DLR describes this operation as “get member”, and DLR-based languages implement a GetMemberBinder in order to bind a dynamic “get member” operation.

DynamicObject makes it very easy for us to handle the GetMemberBinder. We simply override the virtual method TryGetMember and implement the behavior that we want. The binder has two properties: Name, which indicates the name of the member that is being bound, and IgnoreCase. You can reasonably expect that case-sensitive languages like C#, Ruby and Python will set IgnoreCase to false, while VB will set it to true.

        private DataColumn GetColumn(string name, bool ignoreCase) {
            if (!ignoreCase) {
                return _table.Columns[name];
            }
            for (int i = 0; i < _table.Columns.Count; i++) {
                if (_table.Columns[i].ColumnName.Equals(name, StringComparison.InvariantCultureIgnoreCase)) {
                    return _table.Columns[i];
                }
            }
            return null;
        }

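        public override bool TryGetMember(GetMemberBinder binder, out object result) {
            // Go through GetColumn so the binder's IgnoreCase setting is honored.
            var c = GetColumn(binder.Name, binder.IgnoreCase);
            if (c == null) {
                // Not a column: let the language binder apply its default behavior.
                return base.TryGetMember(binder, out result);
            }
            // Copy the column into an array typed like the column itself.
            // Note: a cell holding DBNull cannot be stored in a typed array.
            var a = Array.CreateInstance(c.DataType, _table.Rows.Count);
            for (int i = 0; i < _table.Rows.Count; i++) {
                a.SetValue(_table.Rows[i][c], i);
            }
            result = a;
            return true;
        }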

Here I’ve chosen to return an Array whose elements are typed identically to the column’s original data type. That’s because it’s very easy to create an Array of a particular type and to set its individual elements from the System.Objects that we can get from the DataRow.
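To see the “get member” path end to end, here is a minimal sketch of a C# caller. It repeats a trimmed copy of the class so that it runs on its own; the GetMemberDemo class, the BuildSampleTable helper and the sample data are hypothetical names of mine, not part of the downloadable source.

```csharp
using System;
using System.Data;
using System.Dynamic;

// Trimmed copy of the class from this post, just enough to run the example.
public class DynamicDataTable : DynamicObject {
    private readonly DataTable _table;

    public DynamicDataTable(DataTable table) {
        _table = table;
    }

    private DataColumn GetColumn(string name, bool ignoreCase) {
        if (!ignoreCase) {
            return _table.Columns[name];
        }
        for (int i = 0; i < _table.Columns.Count; i++) {
            if (_table.Columns[i].ColumnName.Equals(name, StringComparison.InvariantCultureIgnoreCase)) {
                return _table.Columns[i];
            }
        }
        return null;
    }

    public override bool TryGetMember(GetMemberBinder binder, out object result) {
        var c = GetColumn(binder.Name, binder.IgnoreCase);
        if (c == null) {
            return base.TryGetMember(binder, out result);
        }
        var a = Array.CreateInstance(c.DataType, _table.Rows.Count);
        for (int i = 0; i < _table.Rows.Count; i++) {
            a.SetValue(_table.Rows[i][c], i);
        }
        result = a;
        return true;
    }
}

public static class GetMemberDemo {
    // Hypothetical sample data: two rows with an Int32 column named "Bar".
    public static DataTable BuildSampleTable() {
        var table = new DataTable();
        table.Columns.Add("Bar", typeof(int));
        table.Rows.Add(1);
        table.Rows.Add(2);
        return table;
    }

    public static void Main() {
        dynamic foo = new DynamicDataTable(BuildSampleTable());
        object column = foo.Bar;                 // bound at runtime via TryGetMember
        Console.WriteLine(column.GetType());     // System.Int32[]
        foreach (int value in (int[])column) {
            Console.WriteLine(value);
        }
    }
}
```

Because the array is typed like the column, the caller can cast it to int[] and work with it without unboxing each element by hand.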

By factoring out GetColumn into a separate method, I’ve made it easy to change just this logic. We might want, for instance, to allow a symbol name like “hello_world” to match the column named “hello world”.
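As a rough illustration of that idea, the sketch below tries the exact name first and then retries with underscores replaced by spaces. It is written as a standalone static helper so that it can run by itself; ColumnLookup and FindColumn are hypothetical names, and the use of ordinal string comparison is my assumption rather than something taken from the original code.

```csharp
using System;
using System.Data;

public static class ColumnLookup {
    // Hypothetical variant of GetColumn: an exact name match wins, otherwise
    // a column whose name equals the symbol with '_' replaced by ' ' is used.
    public static DataColumn FindColumn(DataTable table, string name, bool ignoreCase) {
        var comparison = ignoreCase ? StringComparison.OrdinalIgnoreCase : StringComparison.Ordinal;
        string spaced = name.Replace('_', ' ');
        DataColumn fallback = null;
        foreach (DataColumn column in table.Columns) {
            if (column.ColumnName.Equals(name, comparison)) {
                return column;
            }
            if (fallback == null && column.ColumnName.Equals(spaced, comparison)) {
                fallback = column;
            }
        }
        return fallback;
    }

    public static void Main() {
        var table = new DataTable();
        table.Columns.Add("hello world", typeof(string));
        DataColumn found = FindColumn(table, "hello_world", false);
        Console.WriteLine(found != null ? found.ColumnName : "(not found)");
    }
}
```

Preferring the exact match keeps a real column named “hello_world” from being shadowed by one named “hello world”.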

Non-dynamic members

What if I want to directly access other properties of the DataTable like the “Rows” DataRowCollection? The design of the DLR makes this easy. If you don’t handle a binding operation yourself, it’s possible to fall back to a default behavior implemented by the language-provided binder. And for VB, C#, Python and Ruby, the fallback behavior is to treat the object like a normal .NET object and to access its features via Reflection. That’s why it’s useful to call base.TryGetMember instead of throwing an exception when the column name can’t be found.

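So if we implement a trivial “Rows” property, a reference to DynamicDataTable.Rows will return DataTable.Rows even when the GetMember is performed dynamically at runtime. (In the released .NET Framework 4, DynamicObject lets the language binder look for statically declared members first, so this property is returned even if the table has a column named “Rows”.)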

        public DataRowCollection Rows {
            get { return _table.Rows; }
        }
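Here is a small sketch of that fallback seen from a C# caller. The class is again a trimmed copy so that the example is self-contained, and its TryGetMember is simplified to return the DataColumn itself; FallbackDemo and CountRowsDynamically are hypothetical names used only for illustration.

```csharp
using System;
using System.Data;
using System.Dynamic;

public class DynamicDataTable : DynamicObject {
    private readonly DataTable _table;

    public DynamicDataTable(DataTable table) {
        _table = table;
    }

    // Ordinary .NET property, reachable through the binder's default behavior.
    public DataRowCollection Rows {
        get { return _table.Rows; }
    }

    public override bool TryGetMember(GetMemberBinder binder, out object result) {
        var c = _table.Columns[binder.Name];
        if (c == null) {
            // Not a column: decline and leave the decision to the language binder.
            return base.TryGetMember(binder, out result);
        }
        result = c; // simplified for this sketch: hand back the column itself
        return true;
    }
}

public static class FallbackDemo {
    public static int CountRowsDynamically(DataTable table) {
        dynamic foo = new DynamicDataTable(table);
        DataRowCollection rows = foo.Rows; // resolved to the ordinary property
        return rows.Count;
    }

    public static void Main() {
        var table = new DataTable();
        table.Columns.Add("Bar", typeof(int));
        table.Rows.Add(1);
        table.Rows.Add(2);
        table.Rows.Add(3);
        Console.WriteLine(CountRowsDynamically(table)); // 3
    }
}
```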

SetMember

The next interesting thing we want to be able to do is to set a column on the DataTable whether or not it already exists. The DLR describes this operation as “set member”, and defines a corresponding SetMemberBinder to perform the binding operation. Like the GetMemberBinder, this class has two properties: Name and IgnoreCase.

We want to be able to set the column either to a single repeated constant value or to a list of values. But there are lots of different lists we might like to support: for instance, lists, collections or even plain IEnumerables. Let’s make some decisions about the semantics of the SetMember operation on our type:

  • If the object’s type implements IEnumerable and the object isn’t a System.String, then we’ll treat it like an enumeration. Otherwise, we’ll treat it like a single value.
  • If it’s an IEnumerable<T> we’ll use the generic type as our DataType. For a plain IEnumerable, the DataType will be System.Object.
  • If the object does not implement IEnumerable (or the object is a System.String) then the DataType will be the object’s actual RuntimeType.

For an enumeration, we’ll read items into a temporary array until we reach the number of rows in the table. If the enumeration ends before then, we’ll raise an error. If items still remain once the array is full, we’ll also raise an error.

The specific behavior of our implementation for each of these types isn’t very important. What is important is that we’ve identified all the types that we expect we might get, and have identified the logic we’re going to implement for those types. Now, on to the code!

        public override bool TrySetMember(SetMemberBinder binder, object value) {
            Type dataType;
            IEnumerable values = (value is string) ? null : (value as IEnumerable);
            bool rangeCheck = (values != null);
            
            if (values != null) {
                dataType = GetGenericTypeOfArityOne(value.GetType(), typeof(IEnumerable<>)) ?? typeof(object);
            } else {
                values = ConstantEnumerator(value);
                dataType = (value != null) ? value.GetType() : typeof(object);
            }
            
            object[] data = new object[_table.Rows.Count];
            var nc = values.GetEnumerator();
            int rc = _table.Rows.Count;
            for (int i = 0; i < rc; i++) {
                if (!nc.MoveNext()) {
                    throw new ArgumentException(String.Format("Only {0} values found ({1} needed)", i, rc));
                }
                data[i] = nc.Current;
            }
            if (rangeCheck && nc.MoveNext()) {
                throw new ArgumentException(String.Format("More than {0} values found", rc));
            }
            
            var c = GetColumn(binder.Name, binder.IgnoreCase);
            if (c != null && c.DataType != dataType) {
                _table.Columns.Remove(c);
                c = null;
            }
            if (c == null) {
                c = _table.Columns.Add(binder.Name, dataType);
            }
            
            for (int i = 0; i < rc; i++) {
                _table.Rows[i][c] = data[i];
            }
            return true;
        }

(GetGenericTypeOfArityOne and ConstantEnumerator are methods whose names are pretty self-explanatory – and whose implementations can be found in the downloadable source code.)
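If you don’t have the download handy, the following is one plausible way to write the two helpers. Treat it as a sketch under my own assumptions: the signatures are inferred from how TrySetMember calls them, the bodies are not taken from the downloadable source, and the SetMemberHelpers class exists only to make the example runnable.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class SetMemberHelpers {
    // Assumed behavior: returns T when 'type' is, or implements, genericType<T>;
    // returns null when no such construction is found.
    public static Type GetGenericTypeOfArityOne(Type type, Type genericType) {
        if (type.IsGenericType && type.GetGenericTypeDefinition() == genericType) {
            return type.GetGenericArguments()[0];
        }
        foreach (Type candidate in type.GetInterfaces()) {
            if (candidate.IsGenericType && candidate.GetGenericTypeDefinition() == genericType) {
                return candidate.GetGenericArguments()[0];
            }
        }
        return null;
    }

    // Assumed behavior: an endless sequence that repeats a single value.
    // The caller is expected to stop after reading as many items as it needs.
    public static IEnumerable ConstantEnumerator(object value) {
        while (true) {
            yield return value;
        }
    }

    public static void Main() {
        // List<int> implements IEnumerable<int>, so the element type is Int32.
        Console.WriteLine(GetGenericTypeOfArityOne(typeof(List<int>), typeof(IEnumerable<>)));
        // ArrayList only implements the non-generic IEnumerable, so this is null.
        Console.WriteLine(GetGenericTypeOfArityOne(typeof(ArrayList), typeof(IEnumerable<>)) == null);
    }
}
```

An endless sequence is safe here only because TrySetMember skips the “too many values” check for constants (rangeCheck is false in that case), so it never asks for more than one item per row.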

Armed with these two methods, our type now supports all of the operations we need to implement the sample program described in Part 0 of this series. A version of the complete source code can be downloaded from this location.

In Part 2, we’ll add the ability to perform numerical operations between columns. See you then!
