Add Columns
Important
Support for Machine Learning Studio (classic) will end on 31 August 2024. We recommend you transition to Azure Machine Learning by that date.
Beginning 1 December 2021, you will not be able to create new Machine Learning Studio (classic) resources. Through 31 August 2024, you can continue to use the existing Machine Learning Studio (classic) resources.
- See information on moving machine learning projects from ML Studio (classic) to Azure Machine Learning.
- Learn more about Azure Machine Learning.
ML Studio (classic) documentation is being retired and may not be updated in the future.
Adds a set of columns from one dataset to another.
Category: Data Transformation / Manipulation
Note
Applies to: Machine Learning Studio (classic) only
Similar drag-and-drop modules are available in Azure Machine Learning designer.
Module overview
This article describes how to use the Add Columns module in Machine Learning Studio (classic) to concatenate two datasets.
You combine all columns from the two datasets that you specify as inputs to create a single dataset. If you need to concatenate more than two datasets, use several instances of Add Columns.
When combining two datasets that contain a different number of rows, we recommend using the Join Data module, which supports outer joins on a common key column.
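To illustrate why a key-based outer join is usually safer than side-by-side concatenation when row counts differ, here is a minimal Python sketch of a full outer join. It is not how Join Data is implemented. The `outer_join` function, the list-of-dicts row layout, and the sample data are all assumptions made for illustration, and the sketch assumes non-empty inputs, unique key values, and no shared column names other than the key.

```python
def outer_join(left_rows, right_rows, key):
    """Full outer join of two lists of row dicts on a shared key column.

    Illustrative only: assumes non-empty inputs, unique key values, and
    no column names in common apart from the key itself.
    """
    left_cols = [c for c in left_rows[0] if c != key]
    right_cols = [c for c in right_rows[0] if c != key]
    left_by_key = {row[key]: row for row in left_rows}
    right_by_key = {row[key]: row for row in right_rows}

    # Keep left keys in order, then append keys that appear only on the right.
    keys = list(left_by_key) + [k for k in right_by_key if k not in left_by_key]

    joined = []
    for k in keys:
        row = {key: k}
        for col in left_cols:
            # None stands in for a missing value when the key has no match.
            row[col] = left_by_key.get(k, {}).get(col)
        for col in right_cols:
            row[col] = right_by_key.get(k, {}).get(col)
        joined.append(row)
    return joined


# Hypothetical sample data: three customers, two labels, one unmatched key each way.
customers = [
    {"id": 1, "name": "Ana"},
    {"id": 2, "name": "Ben"},
    {"id": 3, "name": "Cy"},
]
labels = [{"id": 2, "churn": 0}, {"id": 4, "churn": 1}]

result = outer_join(customers, labels, "id")
for row in result:
    print(row)
```

Because rows are matched on `id` rather than on position, the label for customer 2 lands on the correct row, which positional concatenation would not guarantee.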
How to configure Add Columns
Add the Add Columns module to your experiment.
Connect the two datasets that you want to concatenate. If you want to combine more than two datasets, you can chain together several instances of Add Columns.
It is possible to combine two datasets that have a different number of rows. The output dataset is padded with missing values for each row that is absent from the smaller source dataset.
You cannot choose individual columns to add. All the columns from each dataset are concatenated when you use Add Columns. Therefore, if you want to add only a subset of the columns, use Select Columns in Dataset to create a dataset with the columns you want.
Run the experiment.
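The behavior described in the steps above can be sketched in plain Python. This is an analogy, not the module's implementation. The `add_columns` and `select_columns` helpers, the dict-of-lists dataset layout, and the use of `None` for a missing value are all assumptions made for illustration. The sketch assumes the two inputs share no column names.

```python
def select_columns(dataset, names):
    """Keep only the named columns (a stand-in for Select Columns in Dataset)."""
    return {name: dataset[name] for name in names}


def add_columns(left, right):
    """Concatenate two column-oriented datasets side by side.

    Every column from both inputs is kept. Shorter columns are padded with
    None so that all columns end up with the same number of rows.
    Assumes the two datasets have no column names in common.
    """
    columns = {**left, **right}  # left columns first, then right columns
    n_rows = max((len(values) for values in columns.values()), default=0)
    return {
        name: list(values) + [None] * (n_rows - len(values))
        for name, values in columns.items()
    }


# Hypothetical sample data.
features = {
    "age": [34, 51, 29],
    "income": [52000, 61000, 48000],
    "zip": ["98101", "98052", "98004"],
}
labels = {"churned": [0, 1]}  # one row fewer than the feature dataset
ids = {"id": [1, 2, 3]}

# Add only a subset: select the wanted columns first, then concatenate.
subset = select_columns(features, ["age", "income"])
combined = add_columns(subset, labels)
print(combined)

# More than two datasets: chain several calls, one per pair.
chained = add_columns(add_columns(ids, subset), labels)
print(list(chained))
```

Note that the padding lines rows up by position only, so the third row receives a missing `churned` value whether or not that is the row whose label is actually unknown.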
Results
After the experiment has run:
- To see the first rows of the new dataset, right-click the output of Add Columns and select Visualize.
- To save and name the concatenated dataset, right-click the output and select Save as Dataset.
The number of columns in the new dataset equals the sum of the columns of both input datasets.
If the two input datasets contain columns with the same name, a numeric suffix is added to the name of the column from the dataset connected to the right input. For example, if there are two instances of a column named TargetOutcome, the column from the right dataset is renamed TargetOutcome (1).
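The column count and renaming rules above can be sketched as follows. The `merge_column_names` function is hypothetical. The source only documents the `TargetOutcome (1)` case, so incrementing the suffix when a suffixed name is itself already taken is an assumption rather than confirmed module behavior.

```python
def merge_column_names(left_names, right_names):
    """Return the output column names for a side-by-side concatenation.

    Left names are kept unchanged. A right name that collides with an
    existing name gets a numeric suffix such as "Name (1)".
    Assumption: the suffix keeps incrementing until the name is unique.
    """
    taken = set(left_names)
    merged = list(left_names)
    for name in right_names:
        candidate = name
        suffix = 1
        while candidate in taken:
            candidate = f"{name} ({suffix})"
            suffix += 1
        taken.add(candidate)
        merged.append(candidate)
    return merged


left = ["CustomerId", "TargetOutcome"]
right = ["TargetOutcome", "Score"]

merged = merge_column_names(left, right)
print(merged)
# ['CustomerId', 'TargetOutcome', 'TargetOutcome (1)', 'Score']

# The output has as many columns as both inputs combined.
print(len(merged) == len(left) + len(right))
```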
Examples
For examples of how Add Columns is used in an experiment, see the Azure AI Gallery:
- Customer relationship prediction: A column that contains labels is combined with a feature dataset.
- Breast cancer detection: Datasets that contain features are cleaned and then combined by using Add Rows, Add Columns, and Join Data.
Expected inputs
Name | Type | Description |
---|---|---|
Left dataset | Data Table | Left dataset |
Right dataset | Data Table | Right dataset |
Output
Name | Type | Description |
---|---|---|
Combined dataset | Data Table | Combined dataset |
Exceptions
Exception | Description |
---|---|
Error 0003 | An exception occurs if one or more input datasets is null or empty. |
Error 0017 | An exception occurs if one or more specified columns has a type that is unsupported by the current module. |
For a list of errors specific to Studio (classic) modules, see Machine Learning Error codes.
For a list of API exceptions, see Machine Learning REST API Error Codes.