Write back validation errors
Important
Some or all of this functionality is available as part of a preview release. The content and the functionality are subject to change.
This feature improves visibility into ingestion validation errors in the lakehouse, so customers can troubleshoot and resolve issues more effectively.
The key benefits of this feature include:
- Detailed error reporting: Provides information on which records failed during ingestion, the specific validation errors, and the entities involved.
- Improved troubleshooting: Empowers customers with the necessary context to identify and resolve ingestion validation failures quickly.
- Enhanced debugging: Offers granular insights into validation issues, reducing the time and effort required for debugging.
Batch ingestion
When files are uploaded to the lakehouse, they undergo validation checks. For batch ingestion, only the first 200 validation failures per entity are written. These errors are stored in a subfolder named ValidationErrors within the lake path specified by the customer.
A CSV file is then created in this folder, where the errors are recorded following the format provided in the sample CSV file. During reingestion, a new file is created with a new timestamp, and any validation errors encountered are recorded in this new file.
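Because only the first 200 failures per entity are written, an entity that shows exactly 200 rows in the error file may have more failures that weren't recorded. The following Python sketch parses rows in the layout of the sample file in this section and lists the entities that reached the limit. The regular expression, the function names, and the dictionary keys are illustrative assumptions inferred from the sample only; they aren't part of the product.

```python
import re
from collections import Counter

# Assumed row layout, inferred only from the sample file in this section:
# four simple quoted fields, then the failed record as the last quoted field.
BATCH_ROW = re.compile(r'^"([^"]*)", "([^"]*)", "([^"]*)", "([^"]*)", "(.*)"$')

# Limit stated in this article: the first 200 validation failures per entity.
MAX_ERRORS_PER_ENTITY = 200


def parse_batch_row(line):
    """Parse one data row; return None for lines that don't fit the layout
    (for example, the irregularly spaced header row in the sample)."""
    match = BATCH_ROW.match(line.strip())
    if match is None:
        return None
    file_name, entity_name, file_type, error_message, record = match.groups()
    return {
        "file_name": file_name,
        "entity_name": entity_name,
        "file_type": file_type,
        "error_message": error_message,
        "record": record,
    }


def entities_at_limit(lines):
    """Return the entities whose recorded error count reached the limit."""
    rows = (parse_batch_row(line) for line in lines)
    counts = Counter(row["entity_name"] for row in rows if row is not None)
    return sorted(
        name for name, count in counts.items() if count >= MAX_ERRORS_PER_ENTITY
    )


# Shortened version of the sample data row (the record field is truncated here).
sample_row = (
    '"Operations Request_20240501134049.csv", "Operations Request", "CSV", '
    '"DataTypeMisMatch: Cell Value(s) do(es) not adhere to the agreed upon data contract", '
    '""ulidoperationsrequest3762901per", "aborted""'
)
parsed = parse_batch_row(sample_row)
print(parsed["entity_name"], "|", parsed["error_message"].split(":")[0])
```

The record field is kept as raw text because the sample doesn't escape its inner quotation marks, so a standard CSV parser might not split it reliably.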
A sample batch ingestion CSV file is shown as follows:
"File Name", "Entity Name", "File Type" , "Error Message", "Record"
"Operations Request_20240501134049.csv", "Operations Request", "CSV", "DataTypeMisMatch: Cell Value(s) do(es) not adhere to the agreed upon data contract", ""ulidoperationsrequest3762901per", "YrbidoperationsRequest3762901", "descriptionoperationsRequest732483", "Production", "hierarchyscopeoperationsRequest577992", "2024-02-13T06:00:04.12", "2019-06-13T00:00:00", "priorityoperationsRequest646820", "aborted""
Stream ingestion
The following section describes how validation errors are handled during stream ingestion:
When streaming data is sent to Manufacturing data solutions, the validation layer captures the errors in a single file per day. When data is reingested multiple times, the errors are appended to the same file. The file size is limited to 10 MB. After the file reaches this size, no more errors are written to it.
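To inspect the daily stream error file programmatically, each row can be split into its fields and the embedded JSON payloads decoded. The following Python sketch assumes the row layout of the sample stream ingestion file in this section, including the unescaped quotation marks around the JSON values. The regular expression, function name, and dictionary keys are hypothetical and chosen for illustration only.

```python
import json
import re

# Assumed layout, inferred only from the sample file: five simple quoted
# fields, then the Properties JSON object and the Relationships JSON array.
STREAM_ROW = re.compile(
    r'^"([^"]*)", "([^"]*)", "([^"]*)", "([^"]*)", "([^"]*)", "(.*)", "(.*)"$'
)


def parse_stream_row(line):
    """Parse one data row; return None for lines that don't fit the layout.

    json.loads raises ValueError if a payload field isn't valid JSON.
    """
    match = STREAM_ROW.match(line.strip())
    if match is None:
        return None
    timestamp, entity, data_format, category, message, props, rels = match.groups()
    return {
        "timestamp": timestamp,
        "entity_name": entity,
        "data_format": data_format,
        "category": category,
        "error_message": message.strip(),
        "properties": json.loads(props),
        "relationships": json.loads(rels),
    }


# One of the relationship error rows from the sample file.
sample_row = (
    '"20240501130625", "", "JSON", "Relationship", '
    '"Provide a valid entity name in the relationship", "{}", '
    '"[{"targetEntityName":"","keys":{"id":"FHnAddress1","value":"NA",'
    '"valueUnitOfMeasure":"NA"},"relationshipType":null}]"'
)
row = parse_stream_row(sample_row)
print(row["category"], row["relationships"][0]["keys"]["id"])
```

This approach relies on the payloads being compact JSON, so that the separator between fields (a quotation mark, a comma, a space, and a quotation mark) doesn't occur inside a payload. If a property value contains that sequence, the split is wrong and a more robust parser is needed.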
A sample stream ingestion CSV file is shown as follows:
"TimeStamp", "Entity Name", "DataFormat" , "ValidationErrorDataCategory" ,"Error Message", "Properties", "Relationships"
"20240501130605", "Operations Request", "JSON", "Property", "Invalid property value as there is mismatch with its type ", "{"id":"0128341001001","description":"EXTRUDE","hierarchyScope":"BLOWNLINE06","startTime":"2023-09-24T02:43","endTime":"2023-09-24T19:23:48","requestState":"completed"}", "[]"
"20240501130623", "Equipment Property", "JSON", "Property", "Mandatory Property Cannot be null or empty", "{"id":"","description":"Address of the hierarchy scope object","value":"Contoso Furniture HQ Hyderabad India","valueUnitOfMeasure":"NA","externalreferenceid":"externalreferenceid7ET2VF39EH","externalreferencedictionary":"externalreferencedictionaryVfMZahPNcL","externalreferenceuri":"https://www.bing.com/"}", "[]"
"20240501130624", "Operations Request", "JSON", "Property", "Invalid property value as there is mismatch with its type ", "{"id":"0128341001001","description":"EXTRUDE","hierarchyScope":"BLOWNLINE06","startTime":"2023-09-24T02:43","endTime":"2023-09-24T19:23:48","requestState":"completed"}", "[]"
"20240501130625", "", "JSON", "Relationship", "Provide a valid entity name in the relationship", "{}", "[{"targetEntityName":"","keys":{"id":"FHnAddress1","value":"NA","valueUnitOfMeasure":"NA"},"relationshipType":null}]"
"20240501130627", "", "JSON", "Relationship", "Provide a valid entity name in the relationship", "{}", "[{"targetEntityName":"","keys":{"id":"FHnAddress1","value":"NA","valueUnitOfMeasure":"NA"},"relationshipType":null}]"
"20240501130628", "Equipment Property", "JSON", "Property", "Mandatory Property Cannot be null or empty", "{"id":"","description":"Ownership Type of the estabslishment","value":"Private Limited Company","valueUnitOfMeasure":"NA","externalreferenceid":"externalreferenceid6U5CFU5OVT","externalreferencedictionary":"externalreferencedictionarylxzpGxVYEN","externalreferenceuri":"https://www.bing.com/"}", "[]"
"20240501130630", "", "JSON", "Relationship", "Provide a valid entity name in the relationship", "{}", "[{"targetEntityName":"","keys":{"id":"EZWProducts Handled1","value":"NA","valueUnitOfMeasure":"NA"},"relationshipType":null}]"
"20240501134049", "Operations Request", "JSON", "Property", "Invalid property value as there is mismatch with its type ", "{"id":"0128341001001","description":"EXTRUDE","hierarchyScope":"BLOWNLINE06","startTime":"2023-09-24T02:43","endTime":"2023-09-24T19:23:48","requestState":"completed"}", "[]"
OPC UA stream ingestion
The following section describes how validation errors are handled during OPC UA stream ingestion. A sample OPC UA stream ingestion CSV file is shown as follows:
"TimeStamp", "DataFormat" , "ValidationErrorDataCategory" ,"Error Message", "DataMessage"
"20240508115927", "OPCUA", "OpcuaMetadata", "[MDS_Ingestion_Response_Code : MDS_IG_OPCUAMetadataPublisherIdMissing]: Validation failed as publisherId is missing", "{"MessageType":"ua-metadata","PublisherId":"","MessageId":"600","DataSetWriterId":200,"MetaData":{"Name":"urn:assembly.seattle;nsu=http://opcfoundation.org/UA/Station/;i=403","Fields":[{"Name":"Address Hyderabad","FieldFlags":0,"BuiltInType":6,"DataType":null,"ValueRank":-1,"MaxStringLength":0,"DataSetFieldId":"b6738567-05eb-4b48-8cd4-f38c9b120129"}],"ConfigurationVersion":null}}"
"20240508120004", "OPCUA", "OpcuaMetadata", "[MDS_Ingestion_Response_Code : MDS_IG_OPCUAMetadataPublisherIdMissing]: Validation failed as publisherId is missing", "{"MessageType":"ua-metadata","PublisherId":"","MessageId":"600","DataSetWriterId":200,"MetaData":{"Name":"urn:assembly.seattle;nsu=http://opcfoundation.org/UA/Station/;i=403","Fields":[{"Name":"Address Hyderabad","FieldFlags":0,"BuiltInType":6,"DataType":null,"ValueRank":-1,"MaxStringLength":0,"DataSetFieldId":"b6738567-05eb-4b48-8cd4-f38c9b120129"}],"ConfigurationVersion":null}}"
"20240508120329", "OPCUA", "OpcuaMetadata", "[MDS_Ingestion_Response_Code : MDS_IG_OPCUAMetadataPayloadDeserializationFailed] : OPCUAMetadataIngestionFailed as JsonDeserialization to MetadataMessagePayload payload failed", "{"MessageType":"ua-metadata","PublisherId":"publisher.seattle","MessageId":"602","MetaData":{"Name":"urn:assembly.seattle;nsu=http://opcfoundation.org/UA/Station/;i=405","Fields":[{"Name":"Address Peeniya","FieldFlags":0,"BuiltInType":6,"DataType":null,"ValueRank":-1,"MaxStringLength":0,"DataSetFieldId":"b6738567-05eb-4b48-8cd4-f38c9b120129"}],"ConfigurationVersion":null}}"
"20240508121318", "OPCUA", "OpcuaMetadata", "[MDS_Ingestion_Response_Code : MDS_IG_OPCUAMetadataPayloadDeserializationFailed] : OPCUAMetadataIngestionFailed as JsonDeserialization to MetadataMessagePayload payload failed", "{"MessageType":"ua-metadata","PublisherId":"publisher.seattle","MessageId":"602","MetaData":{"Name":"urn:assembly.seattle;nsu=http://opcfoundation.org/UA/Station/;i=405","Fields":[{"Name":"Address Peeniya","FieldFlags":0,"BuiltInType":6,"DataType":null,"ValueRank":-1,"MaxStringLength":0,"DataSetFieldId":"b6738567-05eb-4b48-8cd4-f38c9b120129"}],"ConfigurationVersion":null}}"
Error samples during ingestion
This section provides examples of common validation errors encountered during ingestion.
Operations Request - Invalid Date
In record 2, the startTime value is incorrect because it doesn't adhere to the ISO 8601 date and time format (the month is given as 13).
PrimaryKey,id,description,operationsType,hierarchyScope,startTime,endTime,Priority,requestState
ytidoperationsrequest5110081,hOSidoperationsRequest5110081,descriptionoperationsRequest271466,Quality,hierarchyscopeoperationsRequest728864,**2009-09-15T00:00:00**,2023-02-11T00:00:00,priorityoperationsRequest763271,suspended
ulidoperationsrequest3762901per,YrbidoperationsRequest3762901,descriptionoperationsRequest732483,Production,hierarchyscopeoperationsRequest577992,**2024-13-02T06:00:04.12**,2019-06-13T00:00:00,priorityoperationsRequest646820,aborted
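A date check of this kind can be run on a file before it's uploaded. The following Python sketch uses the standard library's datetime.fromisoformat as a stand-in validator; the function name is hypothetical, and the service's own validation rules might be stricter or more lenient than this check.

```python
from datetime import datetime


def is_iso_datetime(value):
    """Return True if the standard library can parse value as ISO 8601.

    Note: before Python 3.11, fromisoformat accepts only a subset of
    ISO 8601, so some valid values may be rejected on older versions.
    """
    try:
        datetime.fromisoformat(value)
    except ValueError:
        return False
    return True


# startTime values from the two sample records (bold markers removed).
print(is_iso_datetime("2009-09-15T00:00:00"))     # True
print(is_iso_datetime("2024-13-02T06:00:04.12"))  # False: month 13 is invalid
```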
Material Actual - Data Type Mismatch
assemblyRelationship is an enum that accepts only two values (Permanent and Transient), but in the first two records its value is given as 23, which is an integer.
PrimaryKey,id,description,hierarchyScope,storageLocation,storageLocationType,assemblyType,assemblyRelationship,materialUse,quantity,quantityUnitOfMeasure,inventorysystemrefid,messystemrefid
vjidmaterialactual2493431,kBcidmaterialActual2493431,descriptionmaterialActual839502,hierarchyscopematerialActual741019,storagelocationmaterialActual712468,Equipment,Physical,**23**,Co-Product produced,458,quantityunitofmeasurematerialActual438966,inventorysystemrefidmaterialActual128310,messystemrefidmaterialActual353029
npidmaterialactual50598441,yNXidmaterialActual50598441,descriptionmaterialActual506341,hierarchyscopematerialActual112393,storagelocationmaterialActual24980,Physical asset,Physical,**23**,Inventoried,546,quantityunitofmeasurematerialActual430351,inventorysystemrefidmaterialActual122697,messystemrefidmaterialActual73601
vgidmaterialactual67115741,ECAidmaterialActual67115741,descriptionmaterialActual336100,hierarchyscopematerialActual176517,storagelocationmaterialActual594702,Description,Physical,**Transient**,Material produced,109,quantityunitofmeasurematerialActual143474,inventorysystemrefidmaterialActual968522,messystemrefidmaterialActual209334
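An enum check like this one can be approximated before upload. The following Python sketch models the two allowed values named in this example; the class and function names are hypothetical, and the check assumes that values are case-sensitive, which the source doesn't state.

```python
from enum import Enum


class AssemblyRelationship(Enum):
    """Allowed assemblyRelationship values, as listed in this example."""
    PERMANENT = "Permanent"
    TRANSIENT = "Transient"


def is_valid_assembly_relationship(value):
    """Return True if value is one of the allowed enum values."""
    try:
        AssemblyRelationship(value)
    except ValueError:
        return False
    return True


# assemblyRelationship values from the three sample records.
for value in ["23", "23", "Transient"]:
    print(value, is_valid_assembly_relationship(value))
```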
Equipment Property - PrimaryKey Value Not Present
In record 2, the PrimaryKey column is empty.
PrimaryKey,id,description,value,valueUnitOfMeasure,externalreferenceid,externalreferencedictionary,externalreferenceuri
gbaddresscontosofurniturehqhyderabadindia1,ImOAddress1,Address of the hierarchy scope object,Contoso Furniture HQ Hyderabad India,NA,externalreferenceidOYR570XAR3,externalreferencedictionaryKHovxtwSkM,https://www.bing.com/
**,** cVBAddress1,Address of the hierarchy scope object,Contoso Furniture Production Plant Peeniya Bangalore KA India,NA,externalreferenceidFQLO8SKSHH,externalreferencedictionaryyjrGGqJXRv,https://www.bing.com/
yvownershiptypeprivatelimitedcompany1,MHiOwnership Type1,Ownership Type of the estabslishment,Private Limited Company,NA,externalreferenceidTCKMGZ5AJR,externalreferencedictionaryeWioLlyBJX,https://www.bing.com/
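Missing primary keys can be found before upload by scanning the file. The following Python sketch uses the standard csv module and reports the record numbers whose PrimaryKey value is empty. The function name is hypothetical, and the sample rows are the ones from this example with the bold markers removed.

```python
import csv
import io


def rows_missing_primary_key(csv_text, key_column="PrimaryKey"):
    """Return 1-based record numbers whose key column is empty or missing."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        number
        for number, row in enumerate(reader, start=1)
        if not (row.get(key_column) or "").strip()
    ]


sample = (
    "PrimaryKey,id,description,value,valueUnitOfMeasure,externalreferenceid,"
    "externalreferencedictionary,externalreferenceuri\n"
    "gbaddresscontosofurniturehqhyderabadindia1,ImOAddress1,"
    "Address of the hierarchy scope object,Contoso Furniture HQ Hyderabad India,NA,"
    "externalreferenceidOYR570XAR3,externalreferencedictionaryKHovxtwSkM,"
    "https://www.bing.com/\n"
    ",cVBAddress1,Address of the hierarchy scope object,"
    "Contoso Furniture Production Plant Peeniya Bangalore KA India,NA,"
    "externalreferenceidFQLO8SKSHH,externalreferencedictionaryyjrGGqJXRv,"
    "https://www.bing.com/\n"
)
print(rows_missing_primary_key(sample))  # [2]
```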
Equipment Class - Header Mismatch
The headers don't match the columns defined in the entity. In this example, hierarchyScopeScope is an invalid header.
PrimaryKey,id,description,**hierarchyScopeScope**,equipmentLevel,assetsystemrefid,messystemrefid
laenterprise1,XwIEnterprise1,An enterprise is a collection of sites and areas and represents the top level of a role based equipment hierarchy. The enterprise is responsible for determining what products will be manufactured at which sites they will be manufactured and in general how they will be manufactured.,Site,enterprise,assetsystemrefidXFPJK9TWHQ,messystemrefidZN0YFVYR8I
jusite1,NksSite1,A site is a physical, geographical, or logical grouping determined by the enterprise. It may contain areas production lines process cells and production units. The Level 4 functions at a site are involved in local site management and optimization. Site planning and scheduling may involve work centers or work units within the areas.Area,site,assetsystemrefid2D806IW1FP,messystemrefidVFIKR7QKB6
tparea1,qeXArea1. An area is a physical,geographical or logical grouping determined by the site. It may contain work centers such as process cells production units production lines and storage zones. Most Level 3 functions occur within the area. The main production capability and geographical location within a site usually identify areas.Production Line Storage Zone,area,assetsystemrefidT69FF1NWMY,messystemrefid9BC8T88NMY
viworkcenter1,mjKWorkCenter1,Production lines and work cells are the lowest levels of equipment typically scheduled by the Level 4 or Level 3 functions for discrete manufacturing processes,Work Cell,workCenter,assetsystemrefid4QEER8NCWF,messystemrefidHIKQTMSV8K
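A header check can be sketched as a comparison between the file's header row and the columns expected for the entity. In the following Python example, the expected column list for Equipment Class is an assumption: it's the sample header with hierarchyScopeScope replaced by hierarchyScope, and it isn't taken from the entity definition. The function name is also hypothetical.

```python
# Assumed column list for Equipment Class (not taken from the entity
# definition): the sample header with the invalid name corrected.
EXPECTED_EQUIPMENT_CLASS_COLUMNS = [
    "PrimaryKey",
    "id",
    "description",
    "hierarchyScope",
    "equipmentLevel",
    "assetsystemrefid",
    "messystemrefid",
]


def header_problems(header_line, expected_columns):
    """Return (unexpected, missing) column names for a CSV header line."""
    actual = [name.strip() for name in header_line.split(",")]
    unexpected = [name for name in actual if name not in expected_columns]
    missing = [name for name in expected_columns if name not in actual]
    return unexpected, missing


# Header row from this example (bold markers removed).
header = (
    "PrimaryKey,id,description,hierarchyScopeScope,equipmentLevel,"
    "assetsystemrefid,messystemrefid"
)
unexpected, missing = header_problems(header, EXPECTED_EQUIPMENT_CLASS_COLUMNS)
print("Unexpected:", unexpected)
print("Missing:", missing)
```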