Data Table
Important
Support for Machine Learning Studio (classic) will end on 31 August 2024. We recommend you transition to Azure Machine Learning by that date.
Beginning 1 December 2021, you will not be able to create new Machine Learning Studio (classic) resources. Through 31 August 2024, you can continue to use the existing Machine Learning Studio (classic) resources.
- See information on moving machine learning projects from ML Studio (classic) to Azure Machine Learning.
- Learn more about Azure Machine Learning.
ML Studio (classic) documentation is being retired and may not be updated in the future.
Data Table Class
A dataset is data that has been uploaded to Machine Learning Studio (classic) so that it can be used in the modeling process. Even if you upload data in another format, or specify a storage format such as CSV, ARFF, or TSV, the data is implicitly converted to a DataTable
object whenever used by a module in an experiment.
Note
Applies to: Machine Learning Studio (classic) only
Similar drag-and-drop modules are available in Azure Machine Learning designer.
The dataset is based on the .NET Data Table
Column types
A DataTable
consists of a collection of columns with associated metadata. These columns implement the IArray
interface. Columns of data in Machine Learning Studio (classic) are understood to be one-dimensional arrays – that is, vectors.
The .NET Array class implements these generic interfaces: System.Collections.Generic.IList<T>
, System.Collections.Generic.ICollection<T>
, and System.Collections.Generic.IEnumerable<T>
.
Columns of types int
, double
, and Boolean
are typically represented as numeric dense arrays. If a dense column contains missing values, it will handled either as a missing values array or as a nullable object dense array.
Columns containing strings are handled as object dense arrays. If there are missing values, the missing values are represented either as nulls or as the type MissingValuesObjectArray<string>
.
For more information, see Array Class (MSDN Library).
Getting columns in a DataTable
You can get a column by calling the GetColumn
method on the DataTable. The GetColumn
method has two overloads:
GetColumn(<Int64>)
gets a column by its index.GetColumn(<string>)
gets a column by its name.
Other interfaces in Studio (classic)
This section also describes the following interfaces for Machine Learning Studio (classic):
Type | Description |
---|---|
ICluster interface | The ICluster interface defines the structure of clustering models. |
IFilter interface | The IFilter interface defines the structure of digital signal processing filters applied to an entire series of numerical values. Filters can be created and then saved and applied to a new series. |
ILearner interface | The ILearner interface provides a generic structure for defining and saving analytical models, excluding some special types such as clustering models. |
ITransform interface | The ITransform interface provides a generic structure for defining and saving transformations. You can create an iTransform using Machine Learning Studio (classic) and then apply the transformation to new datasets. |