Event Hubs data connection (Preview)
Azure Event Hubs is a big data streaming platform and event ingestion service. Azure Synapse Data Explorer offers continuous ingestion from customer-managed Event Hubs.
The Event Hubs ingestion pipeline transfers events to Azure Synapse Data Explorer in several steps. You first create an event hub in the Azure portal. You then create a target table in Azure Synapse Data Explorer into which data in a particular format is ingested using the given ingestion properties. The Event Hubs connection needs to know how to route events to the target table. Selected system properties can be embedded in the data according to the event system properties mapping. Finally, you create a connection to the event hub and send events. The connection can be managed through the Azure portal, programmatically with C# or Python, or with an Azure Resource Manager template.
For general information about data ingestion in Azure Synapse Data Explorer, see Azure Synapse Data Explorer data ingestion overview.
Data format
Data is read from the event hub in the form of EventData objects.
See supported formats.
Note
Event Hubs doesn't support the .raw format.
Data can be compressed using the GZip compression algorithm. Specify Compression in the ingestion properties.
- Data compression isn't supported for compressed formats (Avro, Parquet, ORC).
- Custom encoding and embedded system properties aren't supported on compressed data.
Ingestion properties
Ingestion properties instruct the ingestion process: where to route the data and how to process it. You can specify ingestion properties for event ingestion using EventData.Properties. You can set the following properties:
Property | Description |
---|---|
Table | Name (case sensitive) of the existing target table. Overrides the Table set on the Data Connection pane. |
Format | Data format. Overrides the Data format set on the Data Connection pane. |
IngestionMappingReference | Name of the existing ingestion mapping to be used. Overrides the Column mapping set on the Data Connection pane. |
Compression | Data compression, None (default), or GZip compression. |
Encoding | Data encoding, the default is UTF8. Can be any of .NET supported encodings. |
Tags | A list of tags to associate with the ingested data, formatted as a JSON array string. There are performance implications when using tags. |
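For illustration, here's a minimal sketch (not part of the original article) of building a GZip-compressed JSON event and declaring it through the Compression ingestion property. The EventData type and the WeatherMetrics table come from the routing example later in this article; the using directives are an assumption about the legacy Event Hubs SDK used there.
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
using Microsoft.ServiceBus.Messaging; // assumed legacy Event Hubs SDK; adjust to the client library you use
static EventData CreateCompressedEvent(string json)
{
    // Compress the JSON payload with GZip before wrapping it in an EventData.
    byte[] payload;
    using (var output = new MemoryStream())
    {
        using (var gzip = new GZipStream(output, CompressionMode.Compress))
        {
            var bytes = Encoding.UTF8.GetBytes(json);
            gzip.Write(bytes, 0, bytes.Length);
        }
        payload = output.ToArray();
    }
    var eventData = new EventData(payload);
    eventData.Properties.Add("Table", "WeatherMetrics");   // target table, as in the routing example
    eventData.Properties.Add("Format", "json");            // format of the uncompressed payload
    eventData.Properties.Add("Compression", "GZip");       // tells the connection the body is GZip-compressed
    return eventData;
}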
Note
Only events enqueued after you create the data connection are ingested.
Events routing
When you set up an Event Hubs connection to an Azure Synapse Data Explorer cluster, you specify target table properties (table name, data format, compression, and mapping). This default routing for your data is also referred to as static routing.
You can also specify target table properties for each event by using event properties. The connection dynamically routes the data as specified in EventData.Properties, overriding the static properties for that event.
In the following example, you set Event Hubs details and send weather metric data to the table WeatherMetrics. The data is in json format, and mapping1 is pre-defined on the table WeatherMetrics.
Warning
This example uses connection string authentication to connect to Event Hubs for simplicity. However, hard-coding a connection string into your script requires a high degree of trust in the application and carries security risks.
For long-term, secure solutions, use one of these options:
- Passwordless authentication
- Store your connection string in Azure Key Vault and retrieve it in your code (a sketch of this option follows the example below).
var eventHubNamespaceConnectionString = "<connection_string>";
var eventHubName = "<event_hub>";
// Create the data
var metric = new Metric { Timestamp = DateTime.UtcNow, MetricName = "Temperature", Value = 32 };
var data = JsonConvert.SerializeObject(metric);
// Create the event and add optional "dynamic routing" properties
var eventData = new EventData(Encoding.UTF8.GetBytes(data));
eventData.Properties.Add("Table", "WeatherMetrics");
eventData.Properties.Add("Format", "json");
eventData.Properties.Add("IngestionMappingReference", "mapping1");
eventData.Properties.Add("Tags", "['mydatatag']");
// Send events
var eventHubClient = EventHubClient.CreateFromConnectionString(eventHubNamespaceConnectionString, eventHubName);
eventHubClient.Send(eventData);
eventHubClient.Close();
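As noted in the warning above, a more secure option is to keep the connection string in Azure Key Vault. The following is a minimal sketch (not part of the original sample) assuming the Azure.Identity and Azure.Security.KeyVault.Secrets packages; the vault URI and secret name are hypothetical placeholders.
using System;
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;
// Authenticate with a passwordless credential (managed identity, Azure CLI sign-in, and so on).
var secretClient = new SecretClient(
    new Uri("https://<your-key-vault-name>.vault.azure.net/"),
    new DefaultAzureCredential());
// "EventHubConnectionString" is a hypothetical secret name.
KeyVaultSecret secret = secretClient.GetSecret("EventHubConnectionString");
var eventHubNamespaceConnectionString = secret.Value;
The retrieved value then replaces the hard-coded <connection_string> placeholder in the example above.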
Event system properties mapping
System properties store properties that are set by the Event Hubs service at the time the event is enqueued. The Azure Synapse Data Explorer Event Hubs connection embeds the selected properties into the data landing in your table.
Note
- System properties are supported for json and tabular formats (csv, tsv, and so on) and aren't supported on compressed data. When using a non-supported format, the data is still ingested, but the properties are ignored.
- For tabular data, system properties are supported only for single-record event messages.
- For JSON data, system properties are also supported for multiple-record event messages. In such cases, the system properties are added only to the first record of the event message (see the sketch after this note).
- For csv mapping, properties are added at the beginning of the record in the order listed in the System properties table.
- For json mapping, properties are added according to property names in the System properties table.
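For illustration, the following minimal sketch (not from the original article) shows a multiple-record JSON event message: two newline-separated JSON records packed into a single event body. Per the note above, any selected system properties are embedded only in the first of these records. The Metric type, serialization, and EventData construction mirror the routing example earlier in this article; the second record is hypothetical.
// Pack two newline-separated JSON records into one event body.
var record1 = JsonConvert.SerializeObject(new Metric { Timestamp = DateTime.UtcNow, MetricName = "Temperature", Value = 32 });
var record2 = JsonConvert.SerializeObject(new Metric { Timestamp = DateTime.UtcNow, MetricName = "Humidity", Value = 60 });
var body = record1 + "\n" + record2;
var multiRecordEvent = new EventData(Encoding.UTF8.GetBytes(body));
multiRecordEvent.Properties.Add("Table", "WeatherMetrics");
multiRecordEvent.Properties.Add("Format", "json");
// If event system properties are selected on the data connection, they are added only to the first record.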
System properties
Event Hubs exposes the following system properties:
Property | Data Type | Description |
---|---|---|
x-opt-enqueued-time | datetime | UTC time when the event was enqueued |
x-opt-sequence-number | long | The logical sequence number of the event within the partition stream of the event hub |
x-opt-offset | string | The offset of the event from the Event Hubs partition stream. The offset identifier is unique within a partition of the Event Hubs stream |
x-opt-publisher | string | The publisher name, if the message was sent to a publisher endpoint |
x-opt-partition-key | string | The partition key of the corresponding partition that stored the event |
If you selected Event system properties in the Data Source section of the table, you must include the properties in the table schema and mapping.
Schema mapping examples
Table schema mapping example
If your data includes three columns (Timespan, Metric, and Value) and the properties you include are x-opt-enqueued-time and x-opt-offset, create or alter the table schema by using this command:
.create-merge table TestTable (Timespan: datetime, Metric: string, Value: int, EventHubEnqueuedTime: datetime, EventHubOffset: string)
CSV mapping example
Run the following command to add the properties at the beginning of the record. Note the ordinal values.
.create table TestTable ingestion csv mapping "CsvMapping1"
'['
' { "column" : "Timespan", "Properties":{"Ordinal":"2"}},'
' { "column" : "Metric", "Properties":{"Ordinal":"3"}},'
' { "column" : "Value", "Properties":{"Ordinal":"4"}},'
' { "column" : "EventHubEnqueuedTime", "Properties":{"Ordinal":"0"}},'
' { "column" : "EventHubOffset", "Properties":{"Ordinal":"1"}}'
']'
JSON mapping example
Data is added by using the system properties mapping. Run these commands:
.create table TestTable ingestion json mapping "JsonMapping1"
'['
' { "column" : "Timespan", "Properties":{"Path":"$.timestamp"}},'
' { "column" : "Metric", "Properties":{"Path":"$.metric"}},'
' { "column" : "Value", "Properties":{"Path":"$.value"}},'
' { "column" : "EventHubEnqueuedTime", "Properties":{"Path":"$.x-opt-enqueued-time"}},'
' { "column" : "EventHubOffset", "Properties":{"Path":"$.x-opt-offset"}}'
']'
Event Hubs connection
Note
For best performance, create all resources in the same region as the Azure Synapse Data Explorer cluster.
Create an event hub
If you don't already have one, create an event hub. Connecting to Event Hubs can be managed through the Azure portal, programmatically with C# or Python, or with an Azure Resource Manager template.
Note
- The partition count isn't changeable, so you should consider long-term scale when setting partition count.
- The consumer group must be unique per consumer. Create a consumer group dedicated to the Azure Synapse Data Explorer connection.
Send events
See the sample app that generates data and sends it to an event hub.
For an example of how to generate sample data, see Ingest data from Event Hubs into Azure Synapse Data Explorer.
Set up Geo-disaster recovery solution
Event Hubs offers a Geo-disaster recovery solution.
Azure Synapse Data Explorer doesn't support Alias Event Hubs namespaces. To implement Geo-disaster recovery in your solution, create two Event Hubs data connections: one for the primary namespace and one for the secondary namespace. Azure Synapse Data Explorer listens to both Event Hubs connections.
Note
It's the user's responsibility to implement a failover from the primary namespace to the secondary namespace.
Next steps
- Ingest data from Event Hubs into Azure Synapse Data Explorer
- Create an Event Hubs data connection for Azure Synapse Data Explorer using C#
- Create an Event Hubs data connection for Azure Synapse Data Explorer using Python
- Create an Event Hubs data connection for Azure Synapse Data Explorer using Azure Resource Manager template