Use CMS claims data transformations (preview) in healthcare data solutions
[This article is prerelease documentation and is subject to change.]
With CMS claims data transformations (preview), you can ingest, store, and analyze claims data in CMS (Centers for Medicare & Medicaid Services) CCLF (Claim and Claim Line Feed) format. To learn more about the capability and understand how to deploy and configure it, see:
- Overview of CMS claims data transformations (preview)
- Deploy and configure CMS claims data transformations (preview)
Understand the transformation mechanism
The claims data transformation pipeline ingests claims files in either native or compressed format into the lakehouse. The end-to-end transformation follows these high-level consecutive steps:
- Ingest the claims files into OneLake
- Organize the claims files in OneLake
- Extract claims data into the bronze lakehouse
- Convert claims data to FHIR NDJSON files
- Transform claims data into FHIR flattened tables in the bronze lakehouse
- Transform claims data into FHIR relational tables in the silver lakehouse
Run the claims data transformations pipeline
Ensure you complete the steps in Set up claims sample data before running the claims data transformations pipeline.
To transform the claims data from the bronze lakehouse to the silver lakehouse, open the healthcare#_msft_clinical_claims_cclf_data_transformation data pipeline and select Run.
After the pipeline runs successfully, open the ExplanationOfBenefit table in the silver lakehouse to view the transformed data.
Usage considerations
Review these key points before using the CMS claims data transformations (preview) capability.
Spark version
The notebooks are preconfigured to run with Spark runtime version 1.2 (Spark 3.4, Delta 2.4) by default. Ensure you maintain this setting at the environment level. To learn more, see Reset Spark runtime version in the Fabric workspace.
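As a quick sanity check, a notebook cell can compare the Spark version of the active session against the expected 3.4 line. The following is a minimal sketch, not part of the shipped notebooks. The helper name `matches_expected_runtime` is hypothetical, and the sketch assumes that comparing the major and minor version components is sufficient.

```python
def matches_expected_runtime(spark_version: str, expected: str = "3.4") -> bool:
    """Return True when the major.minor part of spark_version equals expected.

    Hypothetical helper for illustration; compares only major and minor.
    """
    parts = spark_version.strip().split(".")
    return ".".join(parts[:2]) == expected


# Example with a literal version string.
print(matches_expected_runtime("3.4.1"))
```

In a Fabric notebook, the session's version string is available as `spark.version`, so the call would be `matches_expected_runtime(spark.version)`.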
File extension
The uploaded CCLF files must follow the extension format: *.T1000001 to *.T1000009. Files with incorrect extensions move to the Failed folder in the bronze lakehouse.
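To catch naming problems before upload, you can check file names locally. The following is a minimal sketch. It assumes the allowed range means an extension of `.T100000` followed by a single digit from 1 to 9, which is one reading of the rule above. The function name `has_valid_cclf_extension` is hypothetical and isn't part of the pipeline.

```python
import re

# Assumption: valid extensions are .T1000001 through .T1000009,
# that is, ".T100000" followed by one digit in the range 1-9.
_CCLF_EXTENSION = re.compile(r"\.T100000[1-9]$")


def has_valid_cclf_extension(file_name: str) -> bool:
    """Return True when file_name ends with an extension in the assumed range."""
    return _CCLF_EXTENSION.search(file_name) is not None


# Example: list the candidate files that would fail the check.
candidates = ["claims_a.T1000001", "claims_b.T1000010", "claims_c.txt"]
rejected = [name for name in candidates if not has_valid_cclf_extension(name)]
print(rejected)
```

This check is only a local convenience. The pipeline's own validation remains the authority on which files move to the Failed folder.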
Record length
A record length mismatch in CCLF files occurs when one or more records deviate from the required fixed-length format. This mismatch can cause data misalignment, incomplete data capture, or processing errors. Files with records that don't meet the expected length move to the Failed folder in the bronze lakehouse.
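A local check can surface mismatched records before you upload a file. The following is a minimal sketch. It assumes each record occupies one line and that you supply the expected length for that file type yourself. The function name `find_length_mismatches` and the sample values are hypothetical, and the sketch doesn't reproduce the pipeline's own validation logic.

```python
from typing import Iterable, List, Tuple


def find_length_mismatches(
    lines: Iterable[str], expected_length: int
) -> List[Tuple[int, int]]:
    """Return (line_number, actual_length) for each record of the wrong length.

    Line endings are stripped before measuring, so only the record
    content counts toward the length.
    """
    mismatches = []
    for number, line in enumerate(lines, start=1):
        record = line.rstrip("\r\n")
        if len(record) != expected_length:
            mismatches.append((number, len(record)))
    return mismatches


# Example with made-up records and an expected length of 5 characters.
sample = ["ABCDE\n", "ABC\n", "ABCDEF\n", "12345\n"]
print(find_length_mismatches(sample, expected_length=5))
```

To check a real file, pass an open file handle as `lines`, together with the record length documented for that CCLF file type.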