Hi Ajam, Meraj
Welcome to the Microsoft Q&A Forum, and thank you for posting your query here!
To read all CSV files from a registered folder data asset in an Azure ML notebook and process each one individually, follow these steps:
- Use MLClient to look up the folder data asset, and the mltable library to read its files.
- List the CSV files in the folder.
- Read each CSV file into a DataFrame.
- Remove the specific column you don't need.
- Save the modified DataFrame back as a CSV file.
Here's a sample code snippet to illustrate this:
import os

import mltable
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Initialize MLClient from the workspace config (config.json)
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Get the data asset (folder)
data_asset = ml_client.data.get(name="<name_of_asset>", version="<version>")
folder_path = data_asset.path

# List all CSV files in the folder.
# Note: os.listdir only works on a local or mounted path. If data_asset.path
# is a cloud URI (for example azureml://...), download or mount the folder first.
csv_files = [f for f in os.listdir(folder_path) if f.endswith('.csv')]

# Process each CSV file
for csv_file in csv_files:
    file_path = os.path.join(folder_path, csv_file)
    # Read the file into a pandas DataFrame
    df = mltable.from_delimited_files(paths=[{'file': file_path}]).to_pandas_dataframe()
    # Drop the unwanted column
    df = df.drop(columns=['<column_to_remove>'])
    # Save the result next to the original with a "modified_" prefix
    df.to_csv(os.path.join(folder_path, f'modified_{csv_file}'), index=False)
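If you want to try the list, read, drop, save loop on a local folder before running it against the data asset, here is a minimal sketch. It is not Azure ML specific: the standard `csv` module stands in for the DataFrame step, and the function name `drop_column_from_csvs` and the separate output folder are assumptions made for illustration. Writing to a separate folder is a design choice so that a second run does not pick up the `modified_` files as input.

```python
import csv
from pathlib import Path


def drop_column_from_csvs(src_dir, dst_dir, column):
    """Copy every CSV in src_dir to dst_dir without the given column.

    Hypothetical helper for illustration. Returns the list of written
    paths. Files that do not contain the column are copied unchanged.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    # Only top-level *.csv files are processed (no recursion).
    for csv_path in sorted(src.glob("*.csv")):
        with csv_path.open(newline="") as f_in:
            reader = csv.DictReader(f_in)
            # Keep every header except the one to remove.
            kept = [name for name in (reader.fieldnames or []) if name != column]
            rows = [{name: row[name] for name in kept} for row in reader]
        out_path = dst / f"modified_{csv_path.name}"
        with out_path.open("w", newline="") as f_out:
            writer = csv.DictWriter(f_out, fieldnames=kept)
            writer.writeheader()
            writer.writerows(rows)
        written.append(out_path)
    return written
```

For example, `drop_column_from_csvs("data", "data_out", "<column_to_remove>")` would write one `modified_<name>.csv` into `data_out` for each CSV file found in `data`.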
For more details, kindly refer to the link below: access-your-data-in-a-notebook
Hope this helps. Do let us know if you have any further queries.
If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".
Thank You.