Convert to ARFF
Important
Support for Machine Learning Studio (classic) will end on 31 August 2024. We recommend you transition to Azure Machine Learning by that date.
Beginning 1 December 2021, you will not be able to create new Machine Learning Studio (classic) resources. Through 31 August 2024, you can continue to use the existing Machine Learning Studio (classic) resources.
- See information on moving machine learning projects from ML Studio (classic) to Azure Machine Learning.
- Learn more about Azure Machine Learning.
ML Studio (classic) documentation is being retired and may not be updated in the future.
Converts data input to the attribute-relation file format (ARFF) used by the Weka toolset.
Category: Data Format Conversions
Note
Applies to: Machine Learning Studio (classic) only
Similar drag-and-drop modules are available in Azure Machine Learning designer.
Module overview
This article describes how to use the Convert to ARFF module in Machine Learning Studio (classic) to convert datasets and results to the attribute-relation file format used by the Weka toolset. This format is known as ARFF.
The ARFF data specification for Weka supports multiple machine learning tasks, including data preprocessing, classification, and feature selection. In this format, data is organized by entities and their attributes, and is contained in a single text file. You can find details of the Weka file format in the Technical notes section.
In general, conversion to the Weka file format is required only if you want to use both Machine Learning Studio (classic) and Weka, and intend to move your training data back and forth between them.
For more information about the Weka toolset, see this Wikipedia article: Weka (machine learning)
Warning
You cannot overwrite an existing ARFF file in Azure Storage.
How to use Convert to ARFF
Add the Convert to ARFF module to your experiment. You can find this module in the Data Format Conversions category in Machine Learning Studio (classic).
Connect it to any module that outputs a dataset.
Run the experiment, or click the Convert to ARFF module, and click Run selected.
Results
To create a copy of the data in a local folder, double-click the output of Convert to ARFF, and select the Download option.
If you do not specify a folder, a default file name is applied and the file is saved in the local Downloads library.
Note
This module does not support export to Python or R code.
Examples
There are no examples specific to this format in the Azure AI Gallery. However, these experiments demonstrate other types of format conversion:
Color-Based Image Compression: Exports the datasets used for each portion of the analysis to files for reproducibility and use on other analytics platforms.
Cross Validation for Binary Classification sample: Exports the results of cross validation to files so that the results for multiple models can be compared by using a tool such as Excel.
Technical notes
This section contains implementation details, tips, and answers to frequently asked questions.
Example of ARFF format
This section provides an example of how a typical dataset would look when converted to ARFF.
Typically, an ARFF data file consists of two sections: a header that defines the data source and schema, and a data section that contains the actual entities and their attribute values.
ARFF header
The header for an ARFF file defines the list of the attributes (in columns) and their data types. The header can also contain multiple comment lines that describe the data source or any other notes.
% Source: Iris dataset, UCI
% 0 = Iris-setosa, 1 = Iris-virginica
@RELATION iris
@ATTRIBUTE sepal_length NUMERIC
@ATTRIBUTE sepal_width NUMERIC
@ATTRIBUTE petal_length NUMERIC
@ATTRIBUTE petal_width NUMERIC
@ATTRIBUTE class {0, 1}
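As a rough illustration of how a header like this one can be assembled, the following Python sketch builds the header lines from a relation name and a list of columns. The function name `build_arff_header` and its parameters are hypothetical; they are not part of Machine Learning Studio (classic) or Weka. The sketch assumes only NUMERIC and nominal attributes and does no quoting of names.

```python
def build_arff_header(relation, attributes, comments=()):
    """Return ARFF header text (hypothetical helper, not a Studio or Weka API).

    attributes is a list of (name, kind) pairs, where kind is either a type
    string such as "NUMERIC" or a list of nominal values such as [0, 1].
    """
    # Comment lines start with '%' and come before the declarations.
    lines = [f"% {comment}" for comment in comments]
    lines.append(f"@RELATION {relation}")
    for name, kind in attributes:
        if isinstance(kind, str):
            lines.append(f"@ATTRIBUTE {name} {kind}")
        else:
            # Nominal attributes list their allowed values inside braces.
            values = ", ".join(str(value) for value in kind)
            lines.append(f"@ATTRIBUTE {name} {{{values}}}")
    return "\n".join(lines)


header = build_arff_header(
    "iris",
    [
        ("sepal_length", "NUMERIC"),
        ("sepal_width", "NUMERIC"),
        ("petal_length", "NUMERIC"),
        ("petal_width", "NUMERIC"),
        ("class", [0, 1]),
    ],
    comments=["Source: Iris dataset, UCI"],
)
print(header)
```

Keeping the attribute list in column order matters, because the data rows carry no column names of their own.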
Tip
If the dataset you are converting does not have column names, use the Edit Metadata module to add column names before converting to ARFF.
ARFF data
The data section consists of comma-separated values, and looks very much like a CSV file without column headings.
@DATA
5.1,3.5,1.4,0.2,0
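The data section can be produced with an ordinary CSV writer, as in the minimal Python sketch below. The helper name `build_arff_data` is hypothetical, and the sketch assumes that values contain no commas, quotes, or missing entries; it is not how the module itself writes the file.

```python
import csv
import io


def build_arff_data(rows):
    """Return the @DATA section for an iterable of row sequences.

    Hypothetical helper: assumes plain values that need no quoting.
    """
    buffer = io.StringIO()
    buffer.write("@DATA\n")
    # One comma-separated line per row, with no column headings.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


data_section = build_arff_data([[5.1, 3.5, 1.4, 0.2, 0]])
print(data_section, end="")
```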
For additional information about this file format, see the Weka Wiki page: ARFF (developer version).
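To inspect a downloaded ARFF file outside Weka, a simplified reader can separate the header from the data. The Python sketch below is an assumption-laden illustration: `parse_arff` is a hypothetical name, and the parser ignores quoted values, sparse rows, and missing-value markers, so it suits only simple files like the Iris example in this section.

```python
def parse_arff(text):
    """Split ARFF text into (relation, attributes, rows).

    Simplified sketch: no quoting, sparse rows, or missing-value handling.
    """
    relation = None
    attributes = []  # (name, type string) pairs in column order
    rows = []        # each row is a list of strings
    in_data = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if in_data:
            rows.append(line.split(","))
            continue
        keyword = line.split(None, 1)[0].upper()
        if keyword == "@RELATION":
            relation = line.split(None, 1)[1]
        elif keyword == "@ATTRIBUTE":
            _, name, kind = line.split(None, 2)
            attributes.append((name, kind))
        elif keyword == "@DATA":
            in_data = True  # every following line is a data row
    return relation, attributes, rows


sample = """% Source: Iris dataset, UCI
@RELATION iris
@ATTRIBUTE sepal_length NUMERIC
@ATTRIBUTE sepal_width NUMERIC
@ATTRIBUTE petal_length NUMERIC
@ATTRIBUTE petal_width NUMERIC
@ATTRIBUTE class {0, 1}
@DATA
5.1,3.5,1.4,0.2,0
"""

relation, attributes, rows = parse_arff(sample)
print(relation, len(attributes), rows)
```

Values are returned as strings; converting them to numbers based on each attribute's declared type is left out to keep the sketch short.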
Current ARFF version
Machine Learning Studio (classic) saves ARFF files by using the ARFF 3.0 format.
Expected inputs
Name | Type | Description |
---|---|---|
Dataset | Data Table | Input dataset |
Outputs
Name | Type | Description |
---|---|---|
Results dataset | Arff | Output dataset |