- The escape character `\` is automatically added when writing a JSON array to a file as a string, because JSON strings require special characters (like `"` or `\`) to be escaped.
- The debug output does not show the escape character because ADF is not writing the data to a file at that point; it is only evaluating it in memory.
- The output you see, like `"[\"all.be\"]"`, appears because ADF treats the array `["all.be"]` as a string and then escapes it for JSON serialization.
Option 1: Use a Mapping Data Flow to Format the Output Properly
- Create a Derived Column:
  - Ensure the value you are adding (e.g., `["all.be"]`) is not treated as a string but as a JSON array. Instead of directly typing `["all.be"]`, use an expression that generates an array, for example `byName(["all.be"])`. This will create a JSON array without treating it as a string.
- Set the Sink Settings:
  - In the Sink transformation, configure the output file's File Pattern to write as JSON objects, not as strings.
  - Choose `Single document` or `Document per line` (depending on your requirement) in the JSON settings.
- Write to the Output File:
  - When writing, ADF will serialize the array directly, avoiding unnecessary escape characters or extra quotes.
Option 2: Use a Post-Processing Script
If you still see escape characters in the output JSON file, you can use a post-processing script in Azure Functions, Azure Logic Apps, or a Script activity in ADF to clean up the file.
For example:
- Write a small Python script in a Script activity:

```python
import json

# Read the raw file contents
with open('output.json', 'r') as file:
    data = file.read()

# Parsing the JSON resolves the escape characters and extra quotes
clean_data = json.loads(data)

# Write the cleaned structure back out
with open('clean_output.json', 'w') as file:
    json.dump(clean_data, file, indent=4)
```
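If the file actually holds a doubly-encoded value (a JSON string that itself contains JSON, e.g. `"[\"all.be\"]"`), a single parse only recovers the inner string. A defensive variant of the same idea (a sketch, with a helper name of my choosing, not part of the original script):

```python
import json

def clean_json(raw: str):
    """Decode JSON, unwrapping one extra level of string-encoding if present."""
    data = json.loads(raw)
    if isinstance(data, str):
        # The file contained a JSON *string* holding JSON: decode once
        # more to recover the actual array.
        data = json.loads(data)
    return data
```

With this, `clean_json('"[\\"all.be\\"]"')` and `clean_json('["all.be"]')` both return the list `['all.be']`.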
Option 3: Modify Sink Dataset JSON Settings
- Ensure the Sink dataset in ADF is set to write arrays as native JSON types.
- Configure the following:
- File format: JSON
- Serialization settings: Make sure "Write as array" is selected (if available).
- Compression: None (if you're debugging and want to inspect the raw output).
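For reference, in a Copy activity the corresponding knob is the sink's `filePattern` in the JSON write settings. A hedged sketch of the relevant fragment (the surrounding activity definition is omitted; verify the exact property names against your ADF version):

```json
"sink": {
    "type": "JsonSink",
    "formatSettings": {
        "type": "JsonWriteSettings",
        "filePattern": "arrayOfObjects"
    }
}
```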
Option 4: Use Replace Logic in Data Flow Expression
You can try the `replace` function to preprocess the string before writing it. For instance:

`replace(byName(["all.be"]), '\\', '')`
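Note that blindly stripping every backslash can still leave malformed output (and would corrupt any data that legitimately contains backslashes). This quick Python check illustrates the pitfall on the escaped value from above:

```python
# File content with the unwanted escaping (Python literal form):
s = '"[\\"all.be\\"]"'

# Removing every backslash leaves the outer quotes in place, so the
# result is still not the clean array ["all.be"]:
naive = s.replace('\\', '')   # '"["all.be"]"'
```

Parsing the value (as in Option 2) is generally safer than string replacement.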
Option 5: JSON Flatten Tool (Advanced Data Processing)
If you want finer control over how your JSON is written (such as removing outer quotes), consider using a Python/Databricks notebook or other external processing tools.