Troubleshoot Manufacturing data solutions (preview)

Important

Some or all of this functionality is available as part of a preview release. The content and the functionality are subject to change.

This troubleshooting guide helps you identify and resolve common issues that can arise during and after the deployment of the Manufacturing data solutions service.

Debug failed deployment

To check whether your deployment succeeded, look at the status of the resource. If the status of your Manufacturing data solutions resource is Failed, go to the managed resource group for that resource. The managed resource group uses the naming convention MDS-{your-deployment-name}-MRG-{UniqueID}. You can also find the managed resource group in the Resource JSON details.

Note

The minimum access required is the Reader role for the specified Managed Resource group.

Screenshot showing Deployment Resource JSON.

Go to the specified resource group, select the Deployments tab from the left-hand menu, and review any deployment errors that occurred.

Screenshot showing post deployment errors.

A BadRequest error code indicates that the deployment values you sent don't match the values that Resource Manager expects. Check the inner message for more details.

Deployment can fail due to various reasons as listed in the following table:

| Error code | Meaning | Mitigation |
| --- | --- | --- |
| Failed to assign role to UMI: <your-UMI>. Check if UMI has the required roles to assign roles. | Missing permissions for the User Managed Identity used | If the User Assigned Managed Identity doesn't have the Owner role assigned at the subscription scope, the role assignments fail and cause the deployment to fail. The User Managed Identity's service principal also needs to have the Owner role assigned to itself. |
| DeploymentQuotaExceeded | Insufficient quotas and limits for Azure services | Azure resources have quotas and limits for various Azure services, such as Azure Data Explorer and OpenAI. Make sure you have enough quota for all the resources before you start the Manufacturing data solutions deployment. |
| NoRegisteredProviderFound / SubscriptionNotRegistered / MissingSubscriptionRegistration | Required Resource Providers aren't registered | One of the prerequisites for the deployment is that you register the Resource Providers listed in the deployment guide. Ensure that all the resource providers are properly registered before you start the deployment. |
| Only one User Assigned Managed Identity is supported | You can't add multiple User Managed Identities in Manufacturing data solutions | For more information, see common Azure Resource Manager (ARM) deployment errors. |

Note

For any other intermittent issues, try redeploying the solution once. If the problem persists, contact the Microsoft Support Team.

Debug custom entity CRUD

To check the status of the create entity API, perform the following steps:

  1. Call the POST API. The response includes an Operation-Location header that contains a URL. For example, the following request registers a Device entity.

     curl -X POST "https://{serviceUrl}/mds/service/entities" \
     -H "Authorization: Bearer {BEARER_TOKEN}" \
     -H "Content-Type: application/json" \
     -d '{
         "name": "Device",
         "columns": [
             {
                 "name": "id",
                 "description": "A unique id",
                 "type": "String",
                 "mandatory": true,
                 "semanticRelevantFlag": true,
                 "isProperNoun": false,
                 "groupBy": true,
                 "primaryKey": true
             },
             {
                 "name": "description",
                 "description": "Additional information about the equipment",
                 "type": "String",
                 "mandatory": false,
                 "semanticRelevantFlag": true,
                 "isProperNoun": false,
                 "groupBy": false,
                 "primaryKey": false
             },
             {
                 "name": "hierarchyScope",
                 "description": "Identifies where the exchanged information",
                 "type": "String",
                 "mandatory": false,
                 "semanticRelevantFlag": true,
                 "isProperNoun": false,
                 "groupBy": false,
                 "primaryKey": false
             },
             {
                 "name": "equipmentLevel",
                 "description": "An identification of the level in the role-based equipment hierarchy.",
                 "type": "Enum",
                 "mandatory": false,
                 "semanticRelevantFlag": true,
                 "isProperNoun": false,
                 "groupBy": false,
                 "primaryKey": false,
                 "enumValues": [
                     "enterprise",
                     "site",
                     "area",
                     "workCenter",
                     "workUnit",
                     "processCell",
                     "unit",
                     "productionLine",
                     "productionUnit",
                     "workCell",
                     "storageZone",
                     "Storage Unit"
                 ]
             },
             {
                 "name": "operationalLocation",
                 "description": "Identifies the operational location of the equipment.",
                 "type": "String",
                 "mandatory": false,
                 "semanticRelevantFlag": true,
                 "isProperNoun": false,
                 "groupBy": false,
                 "primaryKey": false
             },
             {
                 "name": "operationalLocationType",
                 "description": "Indicates whether the operational",
                 "type": "Enum",
                 "mandatory": false,
                 "semanticRelevantFlag": true,
                 "isProperNoun": false,
                 "groupBy": false,
                 "primaryKey": false,
                 "enumValues": [
                     "Description",
                     "Operational Location"
                 ]
             },
             {
                 "name": "assetsystemrefid",
                 "description": "Asset/ERP System of Record Identifier for Equipment (AssetID)",
                 "type": "Alphanumeric",
                 "mandatory": false,
                 "semanticRelevantFlag": false,
                 "isProperNoun": false,
                 "groupBy": false,
                 "primaryKey": false
             },
             {
                 "name": "messystemrefid",
                 "description": "Manufacturing system Ref ID",
                 "type": "Alphanumeric",
                 "mandatory": false,
                 "semanticRelevantFlag": false,
                 "isProperNoun": false,
                 "groupBy": false,
                 "primaryKey": false
             },
             {
                 "name": "temperature",
                 "description": "temperature of device",
                 "type": "Double",
                 "mandatory": false,
                 "semanticRelevantFlag": true,
                 "isProperNoun": false,
                 "groupBy": false,
                 "primaryKey": false
             },
             {
                 "name": "pressure",
                 "description": "pressure of device",
                 "type": "Double",
                 "mandatory": false,
                 "semanticRelevantFlag": true,
                 "isProperNoun": false,
                 "groupBy": false,
                 "primaryKey": false
             }
         ],
         "tags": {
             "ingestionFormat": "Batch",
             "ingestionRate": "Hourly",
             "storage": "Hot"
         },
         "dtdlSchemaUrl": "https%3A%2F%2Fl2storage.blob.core.windows.net%2Fcustomentity%2FDevice.json",
         "semanticRelevantFlag": true
     }'
    

    The response populates the Operation-Location header with the URL that you use to check the status of the operation.

  2. Decode the URL from the Operation-Location header returned by the previous call, and then send a GET request to that URL to check the status.

    curl -X GET "https://{serviceUrl}/mds/service/entities/status/RegisterEntity/efd66e795001492fb67def8a94336384" \
    -H "Authorization: Bearer {BEARER_TOKEN}"
    
    Check: API Response
    Action: Ensure you receive a 202 response from the API, and check the status by using the Operation-Location header. Perform a GET request on the URL provided in the header and verify that the status in the response payload indicates success.
    Path: https://{serviceUrl}/mds/service/entities

    {
        "createdDateTime": "2024-05-27T03:22:50.1367676Z",
        "resourceLocationUrl": "https://{serviceUrl}/mds/service/entities/Device",
        "status": "Succeeded"
    }
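The decode-and-poll steps above can be sketched in Python. This is a minimal illustration, not part of the service's tooling: the function names, the polling interval, and the attempt limit are hypothetical, and treating Succeeded, Failed, and Faulted as terminal statuses is an assumption based on the statuses named in this guide.

```python
import json
import time
from urllib.parse import unquote
from urllib.request import Request, urlopen

# Statuses named in this guide; treating them as terminal is an assumption.
TERMINAL_STATUSES = {"Succeeded", "Failed", "Faulted"}


def decode_operation_location(header_value: str) -> str:
    """Return the URL-decoded value of the Operation-Location header."""
    return unquote(header_value.strip())


def fetch_status(status_url: str, bearer_token: str) -> dict:
    """Send a GET request to the status URL and return the parsed JSON payload."""
    request = Request(status_url, headers={"Authorization": f"Bearer {bearer_token}"})
    with urlopen(request) as response:
        return json.load(response)


def wait_for_operation(status_url, bearer_token, fetch=fetch_status,
                       interval_seconds=5.0, max_attempts=12):
    """Poll until the status is terminal or the attempts run out (hypothetical limits)."""
    payload = {}
    for attempt in range(max_attempts):
        payload = fetch(status_url, bearer_token)
        if payload.get("status") in TERMINAL_STATUSES:
            break
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    return payload
```

The `fetch` parameter exists so that the polling loop can be exercised without a network call. In normal use you would pass only the decoded URL and your token.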

To debug the error, check the response generated by the API. For more instructions, see Entity registration Updation.

Troubleshoot via logs

Note

The instructions are written for a Technical/IT professional with familiarity with Azure services. If you are not familiar with these technologies, you may need to work with your IT professional to complete the steps.

Important

Make sure that you have at least the Log Analytics Reader role on the deployed Log Analytics workspace instance.

  1. Go to Application Insights Resource in the Managed Resource Group.

    Screenshot of the Azure portal showing Application Insights.

  2. Select Logs under the Monitoring section on the left-hand side.

    Screenshot of the Azure portal showing Logs section in Application Insights.

  3. Run the KQL queries from the following checks in Application Insights:

    For cases where the CRUD API returned a Failed status:

    Check: API Response
    Action: Ensure you receive a 202 response from the API, and check the status by using the Operation-Location header. Perform a GET request on the URL provided in the header and check whether the status is Failed.
    Path: https://{serviceUrl}/mds/service/entities

    Check: Application Insights Check (Redis)
    Action: To check whether it's a Redis issue, run the following query in Application Insights:
    traces | sort by timestamp | where message contains "Error while creating entry in Redis"
    If it's a Redis issue, check the actual exception and verify that the Redis resource is up and running.
    Expected result: Should return a single entry if an entity was registered within the specified time.

    Check: Application Insights Check (Cosmos DB)
    Action: To check whether it's a Cosmos DB issue, run the following query in Application Insights:
    traces | sort by timestamp | where message contains "Error occurred while updating the status in Cosmos"
    If it's a Cosmos DB issue, check the inner exception and verify that Cosmos DB is up and running.
    Expected result: Should return a single entry if an entity was registered within the specified time.

    For cases where the CRUD API returned a Faulted status:

    Check: API Response
    Action: Ensure you receive a 202 response from the API, and check the status by using the Operation-Location header. Perform a GET request on the URL provided in the header and check whether the status is Faulted.
    Path: {serviceUrl}/mds/service/entities

    Check: Application Insights Check
    Action: This status occurs when another job is already processing this entity. Run the following query in Application Insights:
    traces | sort by timestamp | where message contains "Another job is already processing this entity"
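If you export trace messages for offline review, the checks above can be approximated with a small Python helper. This is a sketch only: the message substrings come from the queries in this guide, while the function names and the component labels are hypothetical. The match ignores case, because the KQL contains operator is case-insensitive.

```python
# Substrings taken from the queries in this guide, mapped to hypothetical labels.
KNOWN_TRACE_PATTERNS = {
    "Error while creating entry in Redis": "Redis",
    "Error occurred while updating the status in Cosmos": "Cosmos DB",
    "Another job is already processing this entity": "Concurrent job",
}


def classify_trace(message: str) -> str:
    """Return the label of the first known pattern found in the message."""
    lowered = message.lower()
    for pattern, label in KNOWN_TRACE_PATTERNS.items():
        if pattern.lower() in lowered:
            return label
    return "Unknown"


def build_trace_query(pattern: str) -> str:
    """Build the KQL query text used in this guide for a message substring."""
    return f'traces | sort by timestamp | where message contains "{pattern}"'
```

For example, `build_trace_query("Error while creating entry in Redis")` returns the Redis query text shown in the Failed-status check.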

Commonly encountered errors

| Scenario | Condition | Response code | Response body detail |
| --- | --- | --- | --- |
| Entity Already Exists | Attempting to register an entity that already exists | 400 | {"Detail": "An entity already exists with the name Device. Please retry with a different name"} |
| Invalid DTDL Schema Location | Providing a location where the DTDL schema isn't present | 400 | {"Detail": "Failed to get the DTDL schema from URL. Error: Response status code does not indicate success: 404 (The specified blob does not exist.)."} |
| Invalid DTDL Schema URL | Providing an invalid dtdlSchemaUrl | 400 | {"Detail": "Failed to get the DTDL schema from URL. Error: Response status code does not indicate success: 404 (The specified resource does not exist.)."} |
| Response of GET job status API is Faulted | This response occurs when another job is already processing this entity | 202 | {"createdDateTime": "2024-05-27T03:22:50.1367676Z", "resourceLocationUrl": "https://{serviceUrl}/mds/service/entities/Device", "status": "Faulted"} |
| Response of GET job status API is Failed | Case 1: Redis error: Error while creating entry in Redis. Case 2: Cosmos error: Error occurred while updating cosmos | 202 | {"createdDateTime": "2024-05-27T03:22:50.1367676Z", "resourceLocationUrl": "https://{serviceUrl}/mds/service/entities/Device", "status": "Failed"} |

Troubleshoot data mapping

Initial checks

For assistance in verifying that the mapping is done correctly, contact the Microsoft Support Team at mdssupport@microsoft.com. Additionally, ensure you understand the mapping by reviewing the examples provided on the mapping page.

Troubleshoot data ingestion

Initial checks

  1. Use the following API to compare the actual twin and relationship counts with the expected counts:

    curl -X GET "https://{serviceUrl}/mds/service/query/ingestionStatus?operator=BETWEEN&endDate=2024-05-20&startDate=2024-05-01" \
        -H "Authorization: Bearer {BEARER_TOKEN}"
    

    For more detailed information, see Ingestion Status API.

  2. Check that you get a healthy response from Manufacturing data solutions Health check API.

    curl -H "Authorization: Bearer {BEARER_TOKEN}" "https://{serviceUrl}/mds/service/health"
    

    Output:

    {
        "message": "Manufacturing data solutions is ready to serve the requests.",
        "setupStatus": "Succeeded",
        "setupInfo": {
            "registerEntityStatus": "Succeeded"
        },
        "errorMessage": [],
        "id": "f26dcdeb-6453-4521-b28e-610171db3997",
        "version": 1
    }
    

Debug ingestion validation errors

Note

The minimum access required is Workspace member role for accessing the validation errors in Fabric.

Batch ingestion

When files are uploaded to the lakehouse, they undergo validation checks.

For batch ingestion, the first 200 validation failures per entity are recorded. These errors are stored in a subfolder named ValidationErrors within the specified lake path provided by the customer. A CSV file is then created in this folder, where the errors are recorded following the format provided in the sample CSV file.

During reingestion, a new file is created with a new timestamp, and any validation errors encountered are recorded in this new file.

Files in the ValidationErrors folder aren't cleaned up automatically. Each reingestion creates new files with the latest timestamp, so you're responsible for deleting older files that you no longer need.

Sample CSV file

"File Name","Entity Name","File Type" , "Error Message", "Record"
"Operations Request_20240501134049.csv", "Operations Request", "CSV","DataTypeMisMatch: Cell Value(s) do(es) not adhere to the agreed upon data contract",""ulidoperationsrequest3762901per","YrbidoperationsRequest3762901", "descriptionoperationsRequest732483", "Production", "hierarchyscopeoperationsRequest577992", "2024-02-13T06:00:04.12", "2019-06-13T00:00:00","priorityoperationsRequest646820", "aborted""
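To see which entities and error messages dominate a batch validation file, you can summarize it with a short Python sketch. The function name is hypothetical, and the column names are taken from the sample header above. The sketch reads only the Entity Name and Error Message columns, because a Record value that contains unescaped quotes, as in the sample row, might split into extra columns.

```python
import csv
import io
from collections import Counter


def summarize_validation_errors(csv_text: str) -> Counter:
    """Count validation errors per (entity name, error message) pair."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    # Header cells can carry stray spaces, so strip them before the lookup.
    header = [name.strip() for name in next(reader)]
    entity_index = header.index("Entity Name")
    message_index = header.index("Error Message")
    counts = Counter()
    for row in reader:
        if len(row) <= max(entity_index, message_index):
            continue  # skip blank or truncated rows
        counts[(row[entity_index].strip(), row[message_index].strip())] += 1
    return counts
```

Calling `summarize_validation_errors(text).most_common(5)` on the contents of a downloaded file lists the five most frequent entity and message pairs.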

Stream ingestion

When streaming data is sent to Manufacturing data solutions, errors are captured in a single file per day at the validation layer.

On performing multiple reingestions, errors are appended in the same file.

File size is limited to 10 MB. If a file reaches this limit, no more errors are written to it for that day.

Sample file

"TimeStamp", "Entity Name", "DataFormat" , "ValidationErrorDataCategory" ,"Error Message", "Properties","Relationships"
"20240501130605", "Operations Request", "JSON", "Property", "Invalid property value as there is mismatch with its type ", "{"id":"0128341001001","description":"EXTRUDE","hierarchyScope":"BLOWNLINE06","startTime":"2023-09-24T02:43","endTime":"2023-09-24T19:23:48","requestState":"completed"}", "[]"
"20240501130623", "Equipment Property", "JSON", "Property", "Mandatory Property Cannot be null or empty", "{"id":"","description":"Address of the hierarchy scope object","value":"Contoso Furniture  HQ  Hyderabad  India","valueUnitOfMeasure":"NA","externalreferenceid":"externalreferenceid7ET2VF39EH","externalreferencedictionary":"externalreferencedictionaryVfMZahPNcL","externalreferenceuri":"https://www.bing.com/"}", "[]"
"20240501130624", "Operations Request", "JSON", "Property", "Invalid property value as there is mismatch with its type ", "{"id":"0128341001001","description":"EXTRUDE","hierarchyScope":"BLOWNLINE06","startTime":"2023-09-24T02:43","endTime":"2023-09-24T19:23:48","requestState":"completed"}", "[]"
"20240501130625", "", "JSON", "Relationship", "Provide a valid entity name in the relationship", "{}", "[{"targetEntityName":"","keys":{"id":"FHnAddress1","value":"NA","valueUnitOfMeasure":"NA"},"relationshipType":null}]"
"20240501130627", "", "JSON", "Relationship", "Provide a valid entity name in the relationship", "{}", "[{"targetEntityName":"","keys":{"id":"FHnAddress1","value":"NA","valueUnitOfMeasure":"NA"},"relationshipType":null}]"
"20240501130628", "Equipment Property", "JSON", "Property", "Mandatory Property Cannot be null or empty", "{"id":"","description":"Ownership Type of the estabslishment","value":"Private Limited Company","valueUnitOfMeasure":"NA","externalreferenceid":"externalreferenceid6U5CFU5OVT","externalreferencedictionary":"externalreferencedictionarylxzpGxVYEN","externalreferenceuri":"https://www.bing.com/"}", "[]"
"20240501130630", "", "JSON", "Relationship", "Provide a valid entity name in the relationship", "{}", "[{"targetEntityName":"","keys":{"id":"EZWProducts Handled1","value":"NA","valueUnitOfMeasure":"NA"},"relationshipType":null}]"
"20240501134049", "Operations Request", "JSON", "Property", "Invalid property value as there is mismatch with its type ", "{"id":"0128341001001","description":"EXTRUDE","hierarchyScope":"BLOWNLINE06","startTime":"2023-09-24T02:43","endTime":"2023-09-24T19:23:48","requestState":"completed"}", "[]"

OPC UA stream ingestion

When streaming OPC UA data or metadata is sent to Manufacturing data solutions, errors are captured in a single file per day at the validation layer.

On performing multiple reingestions, errors are appended in the same file. The ValidationErrorDataCategory field indicates whether the issue is with metadata or data.

File size is limited to 10 MB. If a file reaches this limit, no more errors are written to it for that day.

"TimeStamp", "DataFormat" , "ValidationErrorDataCategory" ,"Error Message", "DataMessage"
"20240513112048", "OPCUA", "OpcuaData", "[MDS_Ingestion_Response_Code : MDS_IG_OPCUA_MappingDataFieldsNotMatching] : Ingestion failed as Field key not matching with field in metadata with namespaceUri", "{"DataSetWriterId":200,"Payload":{"new key":{"Value":"Contoso Furniture  HQ  Hyderabad  Indianew"}},"Timestamp":"2023-10-10T09:27:39.2743175Z"}"
"20240508115927", "OPCUA", "OpcuaMetadata", "[MDS_Ingestion_Response_Code : MDS_IG_OPCUAMetadataPublisherIdMissing]: Validation failed as publisherId is missing", "{"MessageType":"ua-metadata","PublisherId":"","MessageId":"600","DataSetWriterId":200,"MetaData":{"Name":"urn:assembly.seattle;nsu=http://opcfoundation.org/UA/Station/;i=403","Fields":[{"Name":"Address Hyderabad","FieldFlags":0,"BuiltInType":6,"DataType":null,"ValueRank":-1,"MaxStringLength":0,"DataSetFieldId":"b6738567-05eb-4b48-8cd4-f38c9b120129"}],"ConfigurationVersion":null}}"
"20240508120004", "OPCUA", "OpcuaMetadata", "[MDS_Ingestion_Response_Code : MDS_IG_OPCUAMetadataPublisherIdMissing]: Validation failed as publisherId is missing", "{"MessageType":"ua-metadata","PublisherId":"","MessageId":"600","DataSetWriterId":200,"MetaData":{"Name":"urn:assembly.seattle;nsu=http://opcfoundation.org/UA/Station/;i=403","Fields":[{"Name":"Address Hyderabad","FieldFlags":0,"BuiltInType":6,"DataType":null,"ValueRank":-1,"MaxStringLength":0,"DataSetFieldId":"b6738567-05eb-4b48-8cd4-f38c9b120129"}],"ConfigurationVersion":null}}"
"20240508120329", "OPCUA", "OpcuaMetadata", "[MDS_Ingestion_Response_Code : MDS_IG_OPCUAMetadataPayloadDeserializationFailed] : OPCUAMetadataIngestionFailed as JsonDeserialization to MetadataMessagePayload payload failed", "{"MessageType":"ua-metadata","PublisherId":"publisher.seattle","MessageId":"602","MetaData":{"Name":"urn:assembly.seattle;nsu=http://opcfoundation.org/UA/Station/;i=405","Fields":[{"Name":"Address Peeniya","FieldFlags":0,"BuiltInType":6,"DataType":null,"ValueRank":-1,"MaxStringLength":0,"DataSetFieldId":"b6738567-05eb-4b48-8cd4-f38c9b120129"}],"ConfigurationVersion":null}}"
"20240508121318", "OPCUA", "OpcuaMetadata", "[MDS_Ingestion_Response_Code : MDS_IG_OPCUAMetadataPayloadDeserializationFailed] : OPCUAMetadataIngestionFailed as JsonDeserialization to MetadataMessagePayload payload failed", "{"MessageType":"ua-metadata","PublisherId":"publisher.seattle","MessageId":"602","MetaData":{"Name":"urn:assembly.seattle;nsu=http://opcfoundation.org/UA/Station/;i=405","Fields":[{"Name":"Address Peeniya","FieldFlags":0,"BuiltInType":6,"DataType":null,"ValueRank":-1,"MaxStringLength":0,"DataSetFieldId":"b6738567-05eb-4b48-8cd4-f38c9b120129"}],"ConfigurationVersion":null}}"
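Because each OPC UA error message embeds an MDS_Ingestion_Response_Code tag, you can tally the codes in a validation file without parsing the full rows. The following Python sketch assumes the tag format shown in the sample above; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches "[MDS_Ingestion_Response_Code : <code>]" with optional spaces around the colon.
RESPONSE_CODE_PATTERN = re.compile(r"\[MDS_Ingestion_Response_Code\s*:\s*(\w+)\]")


def count_response_codes(lines):
    """Count how often each ingestion response code appears in the given lines."""
    counts = Counter()
    for line in lines:
        match = RESPONSE_CODE_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

You can pass an open file object directly, for example `count_response_codes(open(path, encoding="utf-8"))`, and then look up the resulting codes in the error tables in this guide.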

Debug other ingestion errors

Troubleshoot via logs

Note

The instructions are written for a technical/IT professional with familiarity with Azure services. If you are not familiar with these technologies, you may need to work with your IT professional to complete the steps.

Make sure that you have at least the Log Analytics Reader role on the deployed Log Analytics workspace instance.

  1. Go to the Application Insights resource in the managed resource group.

    Screenshot with Azure portal showing Application Insights.

  2. Select Logs under the Monitoring section on the left-hand side.

    Screenshot with Azure portal showing Logs section in Application Insights.

  3. Run the following KQL query in Application Insights to get more information.

traces
| where message contains "ingestion failed"

Example of Batch ingestion csv row failure in Application Insights.

For Twins

[05:46:50 ERR] Ingestion failed for row: [rbsawmill1,KupSawMill1,Production Line for wood sawing operations,Large Wood Saw Medium Wood Saw,workUnit,ProductionLine,Description,assetsystemrefidYKUQX8MDDD,messystemrefidBA3XM6UOAL,dkjsfhdks] at 04/29/2024 05:46:50 for entity Equipment with filename : sampleLakehouse.Lakehouse/Files/dpfolder/Equipment_202404291105102034.csv due to There is a mismatch in total number of columns defined.

For Relationships

[12:09:41 ERR] Ingestion failed for row: [Person,dhyardhandler11Test,,bbhandlinglocationyard3511,hasValuesOf] at 04/30/2024 12:09:41 with filename : sampleLakehouse.Lakehouse/Files/dpfolder/Person_Mapping_202403190947077573.csv due to Source Type and Target Type cannot be empty for the record.
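When you export many of these log lines, a regular expression can pull out the row, entity, file name, and reason. The following Python sketch is based only on the two example lines above, so the pattern is an assumption about the log format, and the function name is hypothetical.

```python
import re

# Pattern inferred from the two example log lines; the entity part is optional.
LOG_PATTERN = re.compile(
    r"Ingestion failed for row: \[(?P<row>.*?)\] at "
    r"(?P<timestamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})"
    r"(?: for entity (?P<entity>.+?))? with filename : "
    r"(?P<filename>\S+) due to (?P<reason>.*)"
)


def parse_ingestion_failure(line: str):
    """Return the parsed fields of a failure log line, or None if it doesn't match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Split the bracketed row into cells; empty cells are kept as empty strings.
    fields["cells"] = fields.pop("row").split(",")
    return fields
```

For relationship rows, the entity field is None because the example log line doesn't include an entity name.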

Commonly encountered errors

| Error code | Meaning | Mitigation |
| --- | --- | --- |
| MDS_IG_FileOnlyHasHeaderInformation | The file only has header | Ensure that the file contains data in addition to the header. |
| MDS_IG_InvalidHeader | Column Names in the header don't match with registered columns | Check and correct the column names in the file header. |
| MDS_IG_NotAllMandatoryColumnsArePresent | Header Validation Error: Mandatory Column is Missing | Include all mandatory columns defined during entity registration in the file. |
| MDS_IG_PrimaryKeyColumnNotPresent | Header Validation Error: PrimaryKey is missing | Ensure that the PrimaryKey, which is a mandatory column, is present in the entity file. |
| MDS_IG_ColumnCountRowCountMismatch | There's a mismatch in total number of columns defined | Ensure that the number of columns in the file matches the expected count. |
| MDS_IG_MandatoryColumnNotNullNotEmpty | Cell Value(s) at Mandatory Column(s) are either null or empty | Provide non-null and nonempty values for mandatory columns. |
| MDS_IG_DataTypeMisMatch | DataTypeMisMatch: Cell Value(s) doesn't adhere to the agreed upon data contract | Ensure that cell values conform to the specified data types. |
| MDS_IG_PrimaryKeyValueNotPresent (487) | PrimaryKey value is missing | Ensure that the PrimaryKey value, which is mandatory, is present and not null or empty. |
| MDS_IG_MandatoryPropertiesOfEntityAreMissingInRelationships (489) | Mandatory Properties of entity are missing in relationships | Include all mandatory properties in the entity relationships. |
| MDS_IG_InvalidPropertyValue (490) | Invalid property value as there's mismatch with its type | Ensure that property values match their specified types. |
| MDS_IG_ProvideAValidTargetEntityNameInTheRelationship (493) | Provide a valid entity name in the relationship | Specify a valid target entity name in the relationship. |
| MDS_IG_TargetEntityNameIsNotRegisteredInDMM (494) | Target entity name isn't registered in Manufacturing data solutions | Register the target entity in Manufacturing data solutions before defining relationships. |
| MDS_IG_PrimaryKeyPropertiesOfTargetEntityAreMissingInRelationship (495) | Properties marked as primary in target entity are missing in relationship | Include all primary key properties in the relationship. |
| MDS_IG_MandatoryPropertyCannotBeNullOrEmpty (496) | Mandatory Property can't be null or empty | Provide non-null and nonempty values for mandatory properties. |
| MDS_IG_CSVMappingFileSourceAndTargetShouldNotBeEmpty (499) | Source Type and Target Type can't be empty for the record | Provide valid source and target types in the mapping file. |
| MDS_IG_CSVMappingNumberOfCellsDoNotMatchNumberOfColumns (501) | Row data count isn't matching with the header count of the mapping file | Ensure that the number of cells in each row matches the number of columns in the mapping file header. |
| MDS_IG_EntityNameIsNotRegisteredInDMM (502) | Entity isn't registered in Manufacturing data solutions | Register the entity in Manufacturing data solutions before ingestion. |

| Error code | Meaning | Mitigation |
| --- | --- | --- |
| MDS_IG_OPCUAInvalidDataNoRelatedTargetTwinExist | [MDS_Ingestion_Response_Code : MDS_IG_OPCUAInvalidDataNoRelatedTargetTwinExist] Unable to update OPCUA telemetry data as no related twin or target twin exist | Ensure that a related twin or target twin exists before attempting to update OPCUA telemetry data. |
| MDS_IG_OPCUAMetadataPublisherIdMissing | [MDS_Ingestion_Response_Code : MDS_IG_OPCUAMetadataPublisherIdMissing]: Validation failed as publisherId is missing | Include a valid publisherId in the metadata before reingesting the data. |
| MDS_IG_OPCUAMetadataDataSetWriterIdIdMissing | [MDS_Ingestion_Response_Code : MDS_IG_OPCUAMetadataDataSetWriterIdIdMissing]: Validation failed as DatasetWriterId is missing | Provide a valid DatasetWriterId in the metadata before reingesting the data. |
| MDS_IG_OPCUAMetadataUnsupportedMessageType | [MDS_Ingestion_Response_Code : MDS_IG_OPCUAMetadataUnsupportedMessageType]: Validation failed as provided MessageType isn't supported. The supported MessageTypes are ua-data or ua-metadata. | Ensure the MessageType is either ua-data or ua-metadata before reingesting the data. |
| MDS_IG_OPCUAMetadataNameMissing | [MDS_Ingestion_Response_Code : MDS_IG_OPCUAMetadataNameMissing]: Validation failed as Metadata Name is null or empty | Provide a non-null and nonempty Metadata Name before reingesting the data. |
| MDS_IG_OPCUAMetadataFieldsMissing | [MDS_Ingestion_Response_Code : MDS_IG_OPCUAMetadataFieldsMissing]: Validation failed as Metadata fields can't be empty | Ensure that Metadata fields are populated with valid entries before reingesting the data. |
| MDS_IG_OPCUA_MappingMissing | [MDS_Ingestion_Response_Code : MDS_IG_OPCUA_MappingMissing]: Ingestion failed due to missing mapping | Ensure that the mapping is uploaded before sending telemetry events. |

Start fresh

If you need to start Manufacturing data solutions with a clean slate, you can delete all the ingested data by using the Cleanup API. After the cleanup, you can reingest the data without interference from previously ingested datasets.

Debug response of health API

This section covers issues that you might encounter with the health API.

Unable to access health API

After the deployment is completed, you can call the health API to check whether the system is in a ready state. Authentication or authorization issues can occur when you invoke the health API. Ensure that you created the app registration as described in the prerequisites and that it has the necessary roles assigned.

To invoke any of the Manufacturing data solutions APIs, you must fetch the access token for the App ID specified during deployment and use it as the Bearer token.

Failed status in health API response

The Manufacturing data solutions health API can sometimes return a response with a status of Failed. The response also contains the actual error message. Depending on the error message, you might need to contact the Microsoft Support Team.

Troubleshoot control plane issues

Initial checks

Check metrics emitted by Azure resources using Azure Monitor

Diagnostic settings are enabled by default to capture extra metrics for the infrastructure resources. These metrics are collected for the following resources and sent to the Log Analytics workspace that's provisioned as part of the deployment. These settings are enabled for all Manufacturing data solutions SKUs.

  1. Azure Data Explorer
  2. Event Hubs
  3. Function App
  4. Azure Kubernetes Service
  5. Cosmos DB
  6. Azure OpenAI
  7. App Service Plan
  8. Azure Redis Cache

Note

The retention period for these Metrics in Log Analytics workspace is 30 days.

How to view metrics in Azure Monitor

Follow the guidance on Azure Monitor in Microsoft Learn.

Debugging control plane issues

Commonly encountered scenarios

How to change network settings

  • Ensure you have at least the Contributor role on the resource. Go to the Networking tab for the required resource and add your IP address.

How to grant database access

Prerequisite: Your IP address is added in networking settings.

  1. Ensure you have the role required to access the resource. For example, to access a storage account, ensure you have the Storage Blob Data Reader or Storage Blob Data Contributor role on the storage account. The same requirement applies to Cosmos DB.

  2. For Azure Data Explorer (ADX), you would need Database Reader permissions to query the data. For detailed instructions, refer to Manage Cluster Permissions.

Troubleshoot copilot

Initial checks

Try to fetch information from a table by running Copilot Query API.

POST https://{serviceUrl}/mds/copilot/v3/query?api-version=2024-06-30-preview
{
    "ask": "string" // natural language question
}
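The request above can be assembled in Python as follows. This is a sketch under stated assumptions: the function name and the host name in the usage note are hypothetical, and the Bearer token header follows the general guidance in this article for calling Manufacturing data solutions APIs. The function only builds the request; pass the result to urllib.request.urlopen to send it.

```python
import json
from urllib.request import Request

API_VERSION = "2024-06-30-preview"  # value from the request line in this guide


def build_copilot_request(service_url: str, bearer_token: str, question: str) -> Request:
    """Build (but don't send) a POST request for the Copilot Query API."""
    url = f"https://{service_url}/mds/copilot/v3/query?api-version={API_VERSION}"
    body = json.dumps({"ask": question}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )
```

For example, `build_copilot_request("contoso.example", token, "How many devices are registered?")` returns a request whose body is the JSON object with the single ask field.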

Debug Copilot issues

Troubleshoot via logs

Note

The instructions are written for a technical/IT professional with familiarity with Azure services. If you are not familiar with these technologies, you may need to work with your IT professional to complete the steps.

Make sure that you have at least the Log Analytics Reader role on the deployed Log Analytics workspace instance.

  1. Run the following KQL in Application Insights with updated start and end time to get more information.
    let startTime = datetime('2024-07-04T00:00:00');
    let endTime = datetime('2024-07-05T00:00:00');
    let reqs = requests
        | where timestamp between (startTime .. endTime)
        | where cloud_RoleInstance startswith "aks-dmmcopilot-"
        | where url has 'copilot/v3/query'
        | project operation_Id, timestamp, resultCode, duration;
    let exc = exceptions
        | where timestamp between (startTime .. endTime)
        | where cloud_RoleInstance startswith "aks-dmmcopilot-"
        | distinct operation_Id, outerMessage
        | summarize exception_list = make_list(outerMessage, 10) by operation_Id;
    let copilot_traces = traces
        | where timestamp between (startTime .. endTime)
        | where cloud_RoleInstance startswith "aks-dmmcopilot-";
    let instructions = copilot_traces
        | where message has 'The instruction id used in prompt is '
        | extend instructions = parse_json(customDimensions)['Instructionid']
        | project operation_Id, instructions;
    let alias = copilot_traces
        | where message has 'AliasDictionary -'
        | extend alias = tostring(parse_json(customDimensions)['aliasDictionary'])
        | where isnotempty(alias)
        | summarize aliases = take_any(alias) by operation_Id
        | project operation_Id, aliases;
    let retries = copilot_traces
        | where message has 'Retry attempt {autoCorrectCount} out of'
        | summarize arg_max(timestamp,customDimensions) by operation_Id
        | project retries = toint(parse_json(customDimensions)['autoCorrectCount']), operation_Id;
    let total_vector_search_ms = copilot_traces
        | where message has 'Time taken to do similarity search on collection'
        | summarize total_vector_search_ms = sum(todouble((parse_json(customDimensions)['timeTaken']))) by operation_Id
        | project  total_vector_search_ms, operation_Id;
    let query = copilot_traces
        | where message startswith 'User input -'
        | extend query = tostring(parse_json(customDimensions)['input'])
        | extend intent = iff(message contains 'invalid', 'invalid', 'valid')
        | distinct operation_Id, query, intent;
    let tokens = copilot_traces
        | where message has 'Prompt tokens:' and message has 'Total tokens:'
        | extend prompt_tokens = toint(parse_json(customDimensions)['PromptTokens'])
        | extend completion_tokens = toint(parse_json(customDimensions)['CompletionTokens'])
        | summarize make_list(prompt_tokens), make_list(completion_tokens) by operation_Id;
    let total_tokens = copilot_traces
        | where message has 'Prompt tokens:' and message has 'Total tokens:'
        | extend total_tokens = toint(parse_json(customDimensions)['TotalTokens'])
        | summarize total_token = sum(total_tokens) by operation_Id;
    let kql = copilot_traces
        | where message has 'Generated Graph KQL Query -'
        | extend KQLquery = parse_json(customDimensions)['kqlQuery']
        | summarize kqls = make_list(KQLquery) by operation_Id
        | project operation_Id, kqls;
    let kql_sanitized = copilot_traces
        | where message has 'Sanitized Graph KQL Query -'
        | extend KQLquery = parse_json(customDimensions)['kqlQuery']
        | summarize s_kqls = make_list(KQLquery) by operation_Id
        | project operation_Id, s_kqls;
    reqs
    | extend operation_id_req = operation_Id
    | join kind=leftouter exc on $left.operation_id_req == $right.operation_Id
    | join kind=leftouter query on $left.operation_id_req == $right.operation_Id
    | join kind=leftouter instructions on $left.operation_id_req == $right.operation_Id
    | join kind=leftouter alias on $left.operation_id_req == $right.operation_Id
    | join kind=leftouter tokens on $left.operation_id_req == $right.operation_Id
    | join kind=leftouter total_tokens on $left.operation_id_req == $right.operation_Id
    | join kind=leftouter kql on $left.operation_id_req == $right.operation_Id
    | join kind=leftouter kql_sanitized on $left.operation_id_req == $right.operation_Id
    | join kind=leftouter retries on $left.operation_id_req == $right.operation_Id
    | join kind=leftouter total_vector_search_ms on $left.operation_id_req == $right.operation_Id
    | project
        timestamp,
        operation_Id,
        resultCode,
        query,
        duration,
        intent,
        exception_list,
        instructions,
        aliases,
        list_prompt_tokens,
        list_completion_tokens,
        total_token,
        kqls,
        s_kqls,
        retries,
        total_vector_search_ms
    | order by timestamp desc
  2. To run the query, go to the Application Insights resource in the Managed Resource Group.

    Screenshot with Azure portal showing Application Insights.

  3. Select Logs under the Monitoring section in the left-hand navigation.

    Screenshot with Azure portal showing Logs section in Application Insights.
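
The start and end times at the top of the query are the only values that need to change between runs. The following is a minimal Python sketch for generating those two let statements for a recent time window. The helper name kql_time_window is hypothetical, and the sketch assumes the end time you pass in is already in UTC.

```python
from datetime import datetime, timedelta, timezone


def kql_time_window(end: datetime, hours: int) -> str:
    """Return the two `let` statements that open the query above,
    covering the `hours` leading up to `end` (hypothetical helper).

    Assumes `end` is expressed in UTC.
    """
    start = end - timedelta(hours=hours)
    fmt = "%Y-%m-%dT%H:%M:%S"
    return (
        f"let startTime = datetime('{start.strftime(fmt)}');\n"
        f"let endTime = datetime('{end.strftime(fmt)}');"
    )


# Reproduces the window used in the query above (24 hours ending 2024-07-05).
window = kql_time_window(datetime(2024, 7, 5, tzinfo=timezone.utc), hours=24)
print(window)
```

Paste the printed lines over the first two lines of the query before running it in Logs.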

Commonly encountered issues

Alias dictionary

| Error Code | Error Message | Description |
|---|---|---|
| MDSCP5001 | Alias info not found related to the ID: {id} | The system was unable to find any alias information for the provided ID. This error can occur if the ID is incorrect or if the alias hasn't been created yet. |
| MDSCP5002 | Error saving custom alias with the key {key} | There was an error while attempting to save the custom alias with the specified key. This error could be due to a conflict with an existing alias or a temporary issue with the database. |
| MDSCP5003 | Custom Alias Dictionary with the {id} is Internal. Hence, can't be deleted. | The alias dictionary identified by the given ID is marked as internal and can't be deleted. Internal dictionaries are protected to ensure system integrity. |

Example Query

| Error Code | Error Message | Description |
|---|---|---|
| MDSCP4001 | Example query doc with exampleId {exampleId} not found | The system couldn't find any example query document associated with the provided exampleId. This error can occur if the ID is incorrect or the example hasn't been created yet. |
| MDSCP4002 | The given query has admin command {Admin Command} at index {Index} | The provided query contains an admin command at the specified index, which isn't allowed. Remove the admin command and try again. |
| MDSCP4003 | {ExampleId} is already in use | The example ID provided is already in use by another query. Use a unique example ID. |
| MDSCP4004 | Example query contains Reserved Keyword Internal | The example query contains the reserved keyword 'Internal', which isn't allowed. Remove or replace the reserved keyword. |
| MDSCP4005 | An example with the same {UserQuestion}, {SampleQuery}, {LinkedInstructions}, and {exampleId} already exists | An example query with the same user question, sample query, linked instructions, and example ID already exists in the system. |
| MDSCP4006 | Example query or queries contains duplicate IDs | The provided payload contains duplicate example IDs. Each example ID must be unique. |
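
Duplicate IDs (MDSCP4006) can be caught before a payload is submitted. The following is a minimal Python sketch of such a pre-check. It assumes each example is a dictionary with an exampleId key; that payload shape and the helper name are assumptions for illustration, not the documented schema.

```python
from collections import Counter


def duplicate_example_ids(examples: list[dict]) -> list[str]:
    """Return the example IDs that appear more than once, sorted.

    Assumes (hypothetically) that each example carries its ID under
    the "exampleId" key.
    """
    counts = Counter(example["exampleId"] for example in examples)
    return sorted(example_id for example_id, n in counts.items() if n > 1)


# Illustrative payload; the IDs are made up.
payload = [
    {"exampleId": "downtime-by-line"},
    {"exampleId": "scrap-rate"},
    {"exampleId": "downtime-by-line"},
]
duplicates = duplicate_example_ids(payload)
print(duplicates)
```

An empty list means every example ID in the payload is unique.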

Instruction

| Error Code | Error Message | Description |
|---|---|---|
| MDSCP3001 | An example with ID {exampleId} has the same linked instructions | An existing example with the provided example ID has the same linked instructions. Ensure that the linked instructions are unique or modify the existing example. |
| MDSCP3002 | Can't find the following linked instruction ID registered: {Instruction Ids} | The system couldn't find the provided instruction IDs. Verify the IDs and try again. |
| MDSCP3003 | Validation failed for Instruction request. \nError: {Message} | The instruction request failed validation. For more information, see the provided error message and correct the request accordingly. |
| MDSCP3004 | No Instruction with InstructionId: {instructionId} and Version: {version} is present | The system couldn't find any instruction with the provided instruction ID and version. Verify the ID and version and try again. |
| MDSCP3005 | No Instruction with InstructionId : {0} and Version : {1} is present. | The system couldn't find the instruction with the provided instruction ID. Verify the ID and try again. |
| MDSCP3006 | Internal Instruction with the {instructionId} can't be deleted | The instruction identified by the given ID is marked as internal and can't be deleted. Internal instructions are protected to ensure system integrity. |
| MDSCP3007 | Instruction can't be deleted as it's referenced by others | The instruction can't be deleted because it's currently referenced by other entities. Remove the references before attempting to delete the instruction. |
| MDSCP3008 | Can't delete bulk delete instruction custom versions, make sure deleteAll flag is true. Exercise caution. | Custom versions of an instruction can't be bulk deleted unless the deleteAll flag is set to true. Use this flag with caution. |
| MDSCP3010 | This instruction has only one version. Deleting it removes the instruction entirely, but it's referenced by other instructions. To completely delete this instruction, use the deleteAll endpoint and set forceDelete to true. | This instruction has only one version. Deleting it removes the instruction entirely, but it's referenced by other instructions. To completely delete this instruction, use the deleteAll endpoint and set forceDelete to true. |

Operation

| Error Code | Error Message | Description |
|---|---|---|
| MDSCP1001 | Operation ID doesn’t exists | The system couldn't find any operation associated with the provided operation ID. Verify the ID and try again. |

Query

| Error Code | Error Message | Description |
|---|---|---|
| MDSCP2001 | Query has Invalid Intent | The user query doesn't relate to the Manufacturing Operations Management Domain. Ensure the query is relevant to the domain. |
| MDSCP2002 | Expected KQL isn't valid | The expected KQL (Kusto Query Language) is invalid. Correct the KQL syntax and try again. |
| MDSCP2003 | Validate Test details not found | The system couldn't find any validation test details corresponding to the provided test ID. Verify the ID and try again. |
| MDSCP2004 | Unable to generate valid KQL for the user Ask | The system was unable to generate a valid KQL for the user's query. Review the query and try again. |
| MDSCP2005 | Include Summary isn't specified when working with Conversation | The 'Include Summary' flag is mandatory when working with Conversation API. Specify the flag and try again. |
| MDSCP2007 | Test cases Validation failure | The provided test cases are empty. Provide valid test cases and try again. |
| MDSCP2008 | Test Query “Aks” is empty | The query for test case is empty. Provide a valid query and try again. |
| MDSCP2009 | Query KQL is empty | The expected KQL for the test case isn't specified. Provide a valid KQL and try again. |
| MDSCP2010 | Test summary request invalid | The test summary request is invalid. Review the request and try again. |
| MDSCP2011 | Conversation ID isn't present in the header | The conversation ID is missing from the request header. Include the conversation ID and try again. |
| MDSCP2012 | Query has cast exception | There was an invalid cast exception in the resultant KQL. Review the query and correct any type mismatches. |
| MDSCP2013 | API has throttled | The API has encountered a throttling issue with OpenAI. Try again later or reduce the frequency of requests. |
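
For MDSCP2013, "try again later" can be automated with exponential backoff on the client side. The following is a minimal Python sketch of that pattern. It assumes the response is a dictionary that carries the error code under an errorCode key; that response shape, the helper name, and the default delays are assumptions for illustration, not documented service behavior.

```python
import time
from typing import Callable

THROTTLED = "MDSCP2013"  # "API has throttled" in the table above


def call_with_backoff(
    send: Callable[[], dict],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `send` until it returns a response that isn't throttled.

    Assumes (hypothetically) that a throttled response is a dict whose
    "errorCode" value is MDSCP2013. Waits base_delay * 2**attempt seconds
    between attempts and gives up after `max_attempts` calls.
    """
    response: dict = {}
    for attempt in range(max_attempts):
        response = send()
        if response.get("errorCode") != THROTTLED:
            return response
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return response  # still throttled after the final attempt


# Simulated service: throttled twice, then succeeds. Delays are recorded
# instead of slept so the example runs instantly.
responses = iter([{"errorCode": THROTTLED}, {"errorCode": THROTTLED}, {"result": "ok"}])
delays: list[float] = []
result = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(result, delays)
```

Passing the sleep function as a parameter keeps the retry logic testable without real waiting; in production the default time.sleep applies.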

Feedback

| Error Code | Error Message | Description |
|---|---|---|
| MDSCP6001 | Feedback Request is Invalid | The feedback request is invalid. Review the request and ensure all required fields are correctly filled. |
| MDSCP6002 | Operation ID is invalid | The provided operation ID is invalid. Verify the ID and try again. |
| MDSCP2003 | Invalid value for the Feedback | The feedback contains an invalid value. Review the feedback and provide a valid value. |
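
The error codes in the tables above are grouped by their thousands digit (1xxx for Operation through 6xxx for Feedback), which can help route an error to the right table. The following is a minimal Python sketch of that lookup. The helper name is hypothetical and the mapping is a heuristic inferred from the tables: for example, MDSCP2003 is listed under both Query and Feedback, and this lookup reports it as Query.

```python
import re

# Thousands digit of the code -> section heading, as grouped in the tables above.
CATEGORIES = {
    "1": "Operation",
    "2": "Query",
    "3": "Instruction",
    "4": "Example Query",
    "5": "Alias dictionary",
    "6": "Feedback",
}


def category_for(error_code: str) -> str:
    """Map a code such as MDSCP4003 to its section, or "Unknown"."""
    match = re.fullmatch(r"MDSCP(\d)\d{3}", error_code.strip())
    if match is None:
        return "Unknown"
    return CATEGORIES.get(match.group(1), "Unknown")


for code in ("MDSCP4003", "MDSCP2013", "BadRequest"):
    print(code, "->", category_for(code))
```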

Create a support e-mail ticket

Select Support in the left-hand pane in Solution Center. Select Microsoft Cloud for Manufacturing (Preview), and then select Data Solutions and Factory Operations Agent (Preview) under Solution Area. Select the support email template to send a support request to the Manufacturing data solutions team.

Screenshot showing how to request support through Solution Center

You can also create a support ticket from the announcements in Solution Center in the Microsoft Cloud for Manufacturing Home page. Select Send Email to send a support request to the Manufacturing data solutions team.