The MGraph Structure
[This content is no longer valid. For the latest information on "M", "Quadrant", SQL Server Modeling Services, and the Repository, see the Model Citizen blog.]
Values in the Microsoft code name “M” modeling language are represented internally in the “M” compiler as instances of MGraph data structures.
You may need to understand MGraph structures if you are creating a domain specific language (DSL) or making extensions to the SQL Server Modeling CTP tool set.
When you create a DSL using the code name “Intellipad” tool and the available command-line tools, the output is by default an MGraph data structure, which can be compiled by the “M” compiler and loaded into the repository.
Often DSLs do not generate their output data as “M” data at all, but instead use some other format, such as XML or a format specific to the DSL. In this case, you must write a program that does the following:
Creates a parser for the DSL that parses the input and generates an MGraph data structure as output.
Parses the MGraph data structure and converts it to the desired format.
To do this, use classes and methods in the System.Dataflow and Microsoft.M namespaces.
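The two steps above can be sketched in outline. The following Python example is only an illustration of the flow and does not use the System.Dataflow or Microsoft.M APIs. The line-based DSL, the function names parse_dsl and graph_to_xml, and the XML element names are all assumptions made for the example, and a plain dictionary stands in for the MGraph data structure.

```python
import xml.etree.ElementTree as ET


def parse_dsl(text):
    """Step 1: parse a hypothetical line-based DSL into a graph-like structure.

    Each line has the form `<collection> <label>=<value> ...`. The result maps
    a collection label to a list of entities (dicts of label -> value).
    """
    graph = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        collection, *fields = line.split()
        entity = dict(f.split("=", 1) for f in fields)
        graph.setdefault(collection, []).append(entity)
    return graph


def graph_to_xml(graph):
    """Step 2: walk the structure and convert it to the desired format (XML)."""
    root = ET.Element("graph")
    for collection, entities in graph.items():
        coll = ET.SubElement(root, collection)
        for entity in entities:
            row = ET.SubElement(coll, "entity")
            for label, value in entity.items():
                ET.SubElement(row, label).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical DSL input; values stay as strings in this sketch.
source = """
Employees FirstName=Jennifer LastName=Jones Dept=10
Employees FirstName=Richard LastName=Jones Dept=15
"""
xml_text = graph_to_xml(parse_dsl(source))
print(xml_text)
```

Keeping the parse step and the conversion step separate mirrors the division described above: the intermediate structure is the only thing the two halves share.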
The MGraph object model can represent arbitrary graph-shaped data, and thus can represent data structures beyond what can presently be represented in “M”. This topic restricts itself to data in the form of collections of entities, the most common construct used in modeling relational databases, because collections of entities generate T-SQL CREATE TABLE statements when compiled.
The Internal Structure of an MGraph Value
MGraph values consist of nodes and directed edges.
Nodes represent data values, and can be atomic (leaf) nodes or composite nodes. An atomic node represents the value contained in one column of one row in a SQL Server table. Composite nodes do not contain values, but point to other nodes, and can represent a row in a table or the table itself.
Edges specify the relationships between nodes. For example, a node that represents a table has outgoing edges whose target nodes are the columns in the table. Edges can be labeled. For edges whose target node is atomic, the label represents the column name in the table. Edges can also be unlabeled. If a table node has edges connected to the rows in the table, those edges are often unlabeled.
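The node and edge concepts described above can be modeled with a very small data structure. The following Python sketch is a conceptual stand-in, not the MGraph object model: the Node and Edge classes and their members are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union

Atom = Union[str, int, float, bool]


@dataclass
class Node:
    """A node is atomic when it holds a value, composite when it only has edges."""
    value: Optional[Atom] = None
    edges: List["Edge"] = field(default_factory=list)

    @property
    def is_atomic(self) -> bool:
        return self.value is not None

    def add(self, target: "Node", label: Optional[str] = None) -> "Node":
        """Attach an outgoing edge; label=None produces an unlabeled edge."""
        self.edges.append(Edge(label, target))
        return self


@dataclass
class Edge:
    """A directed edge to a target node, with an optional label."""
    label: Optional[str]
    target: Node


# One row: a composite node with labeled edges to atomic nodes.
row = Node().add(Node("Jennifer"), "FirstName").add(Node(10), "Dept")

# The table: a composite node with an unlabeled edge to the row.
table = Node().add(row)
```

In this sketch the labeled edges of the row play the role of column names, and the unlabeled edge from the table to the row matches the description of table-to-row edges above.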
To illustrate these concepts with an example, consider the following “M” representation of an Employees table.
// Populate employees with some people
Employees => {
    { FirstName => 'Jennifer', LastName => 'Jones', Dept => 10 },
    { FirstName => 'Richard', LastName => 'Jones', Dept => 15 },
    { FirstName => 'Charlotte', LastName => 'Jones', Dept => 5 }
}
This example shows the following details:
There are four labeled edges: Employees, FirstName, LastName, and Dept. Edge labels appear to the left of the binding operator (=>). In this example, the labeled edges that point to atomic nodes correspond to T-SQL column names. The labeled edge Employees corresponds to a T-SQL table name.
MGraph supports the “M” intrinsic types for atomic node values. In addition, an atomic node can contain a reference to another node, as a possible mechanism for modeling foreign keys.
MGraph supports verbatim text with the @ delimiter.
MGraph supports comments using the //, /*, and */ delimiters.
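To make the correspondence between edge labels and T-SQL names concrete, the following Python sketch derives a CREATE TABLE statement from the Employees data. It is an illustration only: the create_table function, the SQL_TYPES mapping, and the NOT NULL choice are assumptions for this example, and the statement that the “M” compiler actually generates may differ.

```python
# The Employees collection from the example, held as plain Python data.
employees = {
    "Employees": [
        {"FirstName": "Jennifer", "LastName": "Jones", "Dept": 10},
        {"FirstName": "Richard", "LastName": "Jones", "Dept": 15},
        {"FirstName": "Charlotte", "LastName": "Jones", "Dept": 5},
    ]
}

# Assumed type mapping for this sketch, not the mapping used by the "M" compiler.
SQL_TYPES = {str: "nvarchar(max)", int: "int"}


def create_table(label, rows):
    """Use the collection label as the table name and entity labels as column names."""
    first = rows[0]  # column names and types are inferred from the first entity
    columns = ",\n".join(
        f"    [{name}] {SQL_TYPES[type(value)]} NOT NULL"
        for name, value in first.items()
    )
    return f"CREATE TABLE [{label}] (\n{columns}\n);"


ddl = create_table("Employees", employees["Employees"])
print(ddl)
```

The label on the outer edge becomes the table name and the labels on the edges to atomic nodes become column names, which is the mapping described in the list above.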
The MGraph API
For a comprehensive discussion, see MGraph Object Model.
The types and methods required to traverse an MGraph are mostly contained in the System.Dataflow namespace. When you need to create a parser for your DSL, the Microsoft.M namespace is useful.
The types most used are:
MGraph values are stored in Graph Stores. The system provides two default store implementations, and the Graphstore type may be used if you want to extend the capabilities of the default graph stores.
There are a number of Graph Reader and Writer types that can be used to map between graphs and serial streams.
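The idea of a writer that maps a graph to a serial stream can be sketched as follows. This Python example does not use the Graph Reader or Writer types. The format_value and write_graph functions are hypothetical, the output only mirrors the textual form of the Employees sample in this topic, and string escaping is deliberately left out.

```python
import io


def format_value(value):
    """Render an entity or an atomic value in the textual form used by the sample."""
    if isinstance(value, dict):
        # Entity: each labeled edge becomes `Label => value`.
        fields = ", ".join(f"{k} => {format_value(v)}" for k, v in value.items())
        return "{ " + fields + " }"
    if isinstance(value, str):
        # Escaping of embedded quotes is omitted in this sketch.
        return "'" + value + "'"
    return str(value)


def write_graph(label, entities, stream):
    """Write a labeled collection of entities to any text stream."""
    stream.write(f"{label} => {{\n")
    stream.write(",\n".join("    " + format_value(e) for e in entities))
    stream.write("\n}\n")


buffer = io.StringIO()
write_graph("Employees", [{"FirstName": "Jennifer", "Dept": 10}], buffer)
print(buffer.getvalue())
```

Because write_graph accepts any object with a write method, the same function can target a file, a network stream wrapper, or an in-memory buffer as shown here.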