Data Transformation - Scale and Reduce
Important
Support for Machine Learning Studio (classic) will end on 31 August 2024. We recommend you transition to Azure Machine Learning by that date.
Beginning 1 December 2021, you will not be able to create new Machine Learning Studio (classic) resources. Through 31 August 2024, you can continue to use the existing Machine Learning Studio (classic) resources.
- See information on moving machine learning projects from ML Studio (classic) to Azure Machine Learning.
- Learn more about Azure Machine Learning.
ML Studio (classic) documentation is being retired and may not be updated in the future.
This article describes the modules in Machine Learning Studio (classic) that can help you work with numerical data. For machine learning, common data tasks include clipping, binning, and normalizing numerical values. Other modules support dimensionality reduction.
Note
Applies to: Machine Learning Studio (classic) only
Similar drag-and-drop modules are available in Azure Machine Learning designer.
Modeling numerical data
Tasks such as normalizing, binning, or redistributing numerical variables are an important part of data preparation for machine learning. The modules in this group support the following data preparation tasks:
- Grouping data into bins of varying sizes or distributions.
- Removing outliers or changing their values.
- Normalizing a set of numeric values into a specific range.
- Creating a compact set of feature columns from a high-dimensional dataset.
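To make the first three tasks concrete, here is a minimal Python sketch of clipping, min-max normalization, and equal-width binning. It uses only the standard library and is an illustration of the general techniques, not the implementation used by the Studio (classic) modules. The function names (`clip_values`, `min_max_normalize`, `equal_width_bins`), the sample data, and the chosen thresholds are all hypothetical.

```python
def clip_values(values, lower, upper):
    """Clamp each value into the closed range [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value a bin index in 0..n_bins-1 using equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0 for _ in values]
    # The maximum value would land in bin n_bins, so cap it at the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical column with one extreme value.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

clipped = clip_values(data, 1.0, 5.0)   # the outlier is pulled down to the upper threshold
scaled = min_max_normalize(clipped)     # values now lie in [0, 1]
bins = equal_width_bins(clipped, 2)     # two bins of equal width over [1, 5]

print(clipped)
print(scaled)
print(bins)
```

Clipping before normalizing is a deliberate ordering in this sketch: a single extreme value would otherwise compress the remaining values into a narrow part of the normalized range.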
Related tasks
- Select relevant and useful features to use in building the model: Use the Feature Selection or Fisher Linear Discriminant Analysis modules.
- Select features based on counts of the values: Use the Learning with Counts module.
- Remove or replace missing values: Use the Clean Missing Data module.
- Replace categorical values with numerical values that are derived from calculations: Use the Replace Discrete Values module.
- Compute a probability distribution for discrete or numerical columns: Use the Evaluate Probability Function module.
- Filter and transform digital signals and waveforms: Use the Filter module.
List of modules
The Data Transformation - Scale and Reduce category includes the following modules:
- Clip Values: Detects outliers, and then clips or replaces their values.
- Group Data into Bins: Puts numerical data into bins.
- Normalize Data: Rescales numeric data to constrain dataset values to a standard range.
- Principal Component Analysis: Computes a set of features that have reduced dimensionality for more efficient learning.
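As a rough illustration of the idea behind principal component analysis, the following Python sketch centers a small dataset, computes its covariance matrix, and finds the first principal component. It uses power iteration, a simple technique chosen here for brevity; this is an assumption for illustration and not necessarily how the Principal Component Analysis module computes its result. All function names and the sample data are hypothetical.

```python
import math


def center(rows):
    """Subtract each column's mean so every column has mean zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (divides by n - 1) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iterations=200):
    """Approximate the dominant eigenvector of cov by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # zero covariance: no direction of variance to find
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """Reduce each row to a single score along the given component."""
    return [sum(a * b for a, b in zip(row, component)) for row in centered]


# Hypothetical two-column dataset in which the second column is twice the first.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]

centered = center(rows)
pc1 = first_component(covariance(centered))
scores = project(centered, pc1)  # one feature column instead of two

print(pc1)
print(scores)
```

Because the two columns in this sample are perfectly correlated, a single component captures all of the variance, which is the sense in which the projected scores form a more compact feature set.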