How to use CXRReportGen Healthcare AI model to generate grounded findings
Important
Items marked (preview) in this article are currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.
Important
The healthcare AI models are intended for research and model development exploration. The models are not designed or intended to be deployed in clinical settings as-is nor for use in the diagnosis or treatment of any health or medical condition, and the individual models’ performances for such purposes have not been established. You bear sole responsibility and liability for any use of the healthcare AI models, including verification of outputs and incorporation into any product or service intended for a medical purpose or to inform clinical decision-making, compliance with applicable healthcare laws and regulations, and obtaining any necessary clearances or approvals.
In this article, you learn how to deploy CXRReportGen as an online endpoint for real-time inference and issue a basic call to the API. The steps you take are:
- Deploy the model to a self-hosted managed compute.
- Grant permissions to the endpoint.
- Send test data to the model, and receive and interpret the results.
CXRReportGen - grounded report generation model for chest X-rays
Radiology reporting demands detailed image understanding, integration of multiple inputs (including comparisons with prior imaging), and precise language generation, making it an ideal candidate for generative multimodal models. CXRReportGen generates a list of findings from a chest X-ray study and also performs a grounded report generation, or grounding, task. That is, the CXRReportGen model also incorporates the localization of individual findings on the image. Grounding enhances the clarity of image interpretation and the transparency of AI-generated text, which ultimately improves the utility of automated report drafting.
The following animation demonstrates the conceptual architecture of the CXRReportGen model, which consists of an embedding model paired with a general reasoner large language model (LLM).
The CXRReportGen model combines a radiology-specific image encoder with a large language model and takes as inputs a more comprehensive set of data than many traditional approaches. The input data includes the current frontal image, the current lateral image, the prior frontal image, the prior report, and the indication, technique, and comparison sections of the current report. These additions significantly enhance report quality and reduce incorrect information, ultimately demonstrating the feasibility of grounded reporting as a novel and richer task in automated radiology.
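For orientation, these inputs map one-to-one onto the request columns documented in the request schema later in this article. The following sketch shows a maximal input row; the field values are placeholders:

```python
# The full set of inputs CXRReportGen accepts, named as in the request schema
# later in this article. The request examples below pass subsets of these columns.
full_input_row = {
    "frontal_image": "<base64-encoded current frontal image>",
    "lateral_image": "<base64-encoded current lateral image>",
    "prior_image": "<base64-encoded prior frontal image>",
    "indication": "<indication section of the current report>",
    "technique": "<technique section of the current report>",
    "comparison": "<comparison section of the current report>",
    "prior_report": "<text of the prior report>",
}
```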
Prerequisites
To use the CXRReportGen model, you need the following prerequisites:
A model deployment
Deployment to a self-hosted managed compute
The CXRReportGen model can be deployed to our self-hosted managed inference solution, which lets you customize and control all the details of how the model is served. You can deploy the model through the catalog UI (in AI Foundry or Azure Machine Learning studio) or deploy programmatically.
To deploy the model through the UI:
1. Go to the catalog.
2. Search for CxrReportGen and select the model card.
3. On the model's overview page, select Deploy.
4. If given the option to choose between serverless API deployment and deployment using a managed compute, select Managed Compute.
5. Fill out the details in the deployment window.

   Note

   For deployment to a self-hosted managed compute, you must have enough quota in your subscription. If you don't have enough quota available, you can use our temporary quota access by selecting the option I want to use shared quota and I acknowledge that this endpoint will be deleted in 168 hours.

6. Select Deploy.
To deploy the model programmatically, see How to deploy and inference a managed compute deployment with code.
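If you prefer code over the portal, the following is a minimal sketch of a programmatic deployment with the Azure Machine Learning Python SDK. The endpoint name, model ID, and GPU instance type are illustrative placeholders; check the model card in the catalog for the exact values available to you.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineEndpoint, ManagedOnlineDeployment
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(DefaultAzureCredential())

# Create the endpoint that will host the model
endpoint = ManagedOnlineEndpoint(name="cxrreportgen-endpoint", auth_mode="key")
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

# Deploy the model from the catalog to the endpoint. The model URI and the
# GPU SKU below are placeholders; adapt them to your registry and quota.
deployment = ManagedOnlineDeployment(
    name="cxrreportgen-deployment",
    endpoint_name=endpoint.name,
    model="azureml://registries/azureml/models/CxrReportGen/labels/latest",
    instance_type="Standard_NC6s_v3",
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(deployment).result()
```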
Work with a grounded report generation model for chest X-ray analysis
In this section, you consume the model and make basic calls to it.
Use REST API to consume the model
Consume the CXRReportGen report generation model as a REST API, using simple POST requests or by creating a client as follows:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
ml_client_workspace = MLClient.from_config(credential)
In the deployment configuration, you choose the authentication method. This example uses Azure Machine Learning token-based authentication. For more authentication options, see the corresponding documentation page. Also note that the client is created from a configuration file that's created automatically for Azure Machine Learning virtual machines (VMs). Learn more on the corresponding API documentation page.
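If you're running outside an Azure Machine Learning VM, where that configuration file isn't created for you, you can construct the client explicitly instead. A minimal sketch; the subscription, resource group, and workspace values are placeholders:

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Construct the client explicitly when no config.json file is available.
ml_client_workspace = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<WORKSPACE_NAME>",
)
```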
Make basic calls to the model
Once the model is deployed, use the following code to send data and retrieve a list of findings and corresponding bounding boxes.
import base64
import json

# Assumed helper: read raw image bytes from disk.
def read_image(image_path):
    with open(image_path, "rb") as f:
        return f.read()

input_data = {
    "frontal_image": base64.encodebytes(read_image(frontal_path)).decode("utf-8"),
    "lateral_image": base64.encodebytes(read_image(lateral_path)).decode("utf-8"),
    "indication": indication,
    "technique": technique,
    "comparison": comparison,
}

data = {
    "input_data": {
        "columns": list(input_data.keys()),
        # IMPORTANT: modify the index as needed, for example [0, 1, 2] for multiple inputs
        "index": [0],
        "data": [
            list(input_data.values()),
        ],
    }
}

# Create the request JSON file
request_file_name = "sample_request_data.json"
with open(request_file_name, "w") as request_file:
    json.dump(data, request_file)

# endpoint_name and deployment_name are the names you chose when deploying the model
response = ml_client_workspace.online_endpoints.invoke(
    endpoint_name=endpoint_name,
    deployment_name=deployment_name,
    request_file=request_file_name,
)
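The invoke call returns the response body as a string. The following is a minimal sketch of parsing it, assuming the response schema documented later in this article:

```python
import json

# Each finding pairs a text string with either None or a list of normalized
# bounding boxes in the form [x_min, y_min, x_max, y_max].
findings = json.loads(response)["output"]
for text, boxes in findings:
    if boxes is None:
        print(f"{text} (not localized)")
    else:
        print(f"{text} -> {len(boxes)} bounding box(es): {boxes}")
```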
Use CXRReportGen REST API
The CXRReportGen model assumes a simple single-turn interaction where one request produces one response.
Request schema
The request payload is a JSON-formatted string containing the following parameters:

| Key | Type | Required/Default | Description |
| --- | --- | --- | --- |
| `input_data` | `[object]` | Y | An object containing the input data payload |

The `input_data` object contains the following fields:
| Key | Type | Required/Default | Allowed values | Description |
| --- | --- | --- | --- | --- |
| `columns` | `list[string]` | Y | `"frontal_image"`, `"lateral_image"`, `"prior_image"`, `"indication"`, `"technique"`, `"comparison"`, `"prior_report"` | The list of column names that maps each entry of a data row to a model input. |
| `index` | `list[integer]` | Y | 0 - 10 | The list of indices enumerating the input rows, one entry per row in `data`. You're limited by the GPU RAM on the VM that hosts CxrReportGen, and by how much data can be passed in a single POST request, which depends on the size of your images. Therefore, it's reasonable to keep the number of rows under 10. Check the model logs if you get errors when passing multiple inputs. |
| `data` | `list[list[string]]` | Y | `""` | The list of input rows passed to the model. The length of the list matches the length of `index`. Each row is a list of strings whose order and meaning are defined by the `columns` parameter. Text inputs are passed as plain strings; image inputs are the image bytes encoded using base64 and decoded as a UTF-8 string. |
Request example
A simple inference request for a list of findings from a single frontal image, with no indication provided:
{
"input_data": {
"columns": [
"frontal_image"
],
"index":[0],
"data": [
["iVBORw0KGgoAAAANSUhEUgAAAAIAAAACCAYAAABytg0kAAAAAXNSR0IArs4c6QAAAARnQU1BAACx\njwv8YQUAAAAJcEhZcwAAFiUAABYlAUlSJPAAAAAbSURBVBhXY/gUoPS/fhfDfwaGJe///9/J8B8A\nVGwJ5VDvPeYAAAAASUVORK5CYII=\n"]
]
}
}
A more complex request passing the frontal image, lateral image, indication, and technique:
{
"input_data": {
"columns": [
"frontal_image",
"lateral_image",
"indication",
"technique"
],
"index":[0],
"data": [
["iVBORw0KGgoAAAANSUhEUgAAAAIAAAACCAYAAABytg0kAAAAAXNSR0IArs4c6QAAAARnQU1BAACx\njwv8YQUAAAAJcEhZcwAAFiUAABYlAUlSJPAAAAAbSURBVBhXY/gUoPS/fhfDfwaGJe///9/J8B8A\nVGwJ5VDvPeYAAAAASUVORK5CYII=\n",
"iVBORw0KGgoAAAANSUhEUgAAAAIAAAACCAYAAABytg0kAAAAAXNSR0IArs4c6QAAAARnQU1BAACx\njwv8YQUAAAAJcEhZcwAAFiUAABYlAUlSJPAAAAAbSURBVBhXY/gUoPS/fhfDfwaGJe///9/J8B8A\nVGwJ5VDvPeYAAAAASUVORK5CYII=\n",
"Cough and wheezing for 5 months",
"PA and lateral views of the chest were obtained"]
]
}
}
Response schema
The response payload is a JSON-formatted string containing the following fields:

| Key | Type | Description |
| --- | --- | --- |
| `output` | `list[list[string, list[list[float]]]]` | The list of findings. Each finding is an item in the list, represented by a list that contains the finding text and a list of associated bounding boxes (or null if the finding isn't localized). Each bounding box is represented by a list of four coordinates in the order `x_min`, `y_min`, `x_max`, `y_max`. Each coordinate value is normalized between 0 and 1; to obtain coordinates in the space of the image for rendering or processing, multiply these values by the image width or height accordingly. |
Response example
A response containing the list of findings; bounding boxes accompany the findings that are localized on the image:
{
"output": [
["The heart size is normal.", null],
["Lungs demonstrate blunting of both costophrenic angles.", [[0.005, 0.555, 0.965, 0.865]]],
["There is an area of increased radiodensity overlying the left lower lung.", [[0.555, 0.405, 0.885, 0.745]]],
["Healed fractures of the left fourth, fifth, sixth, seventh, and eighth posterior ribs are noted.", [[0.585, 0.135, 0.925, 0.725]]]
]
}
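Because the returned coordinates are normalized, you need to scale them by the image dimensions before rendering. The following is a minimal sketch using Pillow (an illustrative choice; the file paths and the `findings` variable parsed earlier are assumptions):

```python
from PIL import Image, ImageDraw

# Draw each returned bounding box on the original frontal image.
image = Image.open("frontal.png").convert("RGB")
draw = ImageDraw.Draw(image)
width, height = image.size

for text, boxes in findings:  # `findings` parsed from the response, as shown earlier
    for box in boxes or []:
        x_min, y_min, x_max, y_max = box
        # Scale normalized [0, 1] coordinates to pixel coordinates
        draw.rectangle(
            [x_min * width, y_min * height, x_max * width, y_max * height],
            outline="red",
            width=2,
        )

image.save("frontal_annotated.png")
```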
Supported image formats
The deployed model API supports images encoded in PNG or JPEG formats. For optimal results, we recommend using uncompressed/lossless PNGs with 8-bit monochromatic images.
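If your source images aren't already 8-bit monochrome PNGs, you can convert them before encoding. A minimal sketch with Pillow (an illustrative choice; the source path is a placeholder):

```python
import base64
import io

from PIL import Image

# Convert an arbitrary input image to an 8-bit grayscale PNG in memory and
# base64-encode it, matching the recommended input format.
image = Image.open("source_image.jpg").convert("L")  # "L" = 8-bit grayscale

buffer = io.BytesIO()
image.save(buffer, format="PNG")
encoded = base64.encodebytes(buffer.getvalue()).decode("utf-8")
```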
Learn more from samples
CXRReportGen is a versatile model that can be applied to a range of chest X-ray reporting tasks. For more examples, see the following interactive Python notebook:
- Deploying and Using CXRReportGen: Learn how to deploy the CXRReportGen model and integrate it into your workflow. This notebook also covers bounding-box parsing and visualization techniques.