Solution ideas
This article describes a solution idea. Your cloud architect can use this guidance to help visualize the major components for a typical implementation of this architecture. Use this article as a starting point to design a well-architected solution that aligns with your workload's specific requirements.
It presents an elastic, flexible publish-subscribe model that data producers and consumers can use to create and consume validated, curated content or data.
Architecture
Download a Visio file of this architecture.
Dataflow
The Data Producer app publishes data to Azure Event Hubs, which sends the data to the Azure Functions Event Processing function.
The Data Producer app also sends the JSON schema that describes its data to an Azure Storage container for storage.
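As a rough illustration of these two producer steps, the following Python sketch publishes an event to Event Hubs and uploads its JSON schema to Blob Storage. The connection strings, the `telemetry` event hub, the `schemas` container, and the `sensor-reading` payload type are assumptions for the example, not part of the architecture.

```python
import json
from azure.eventhub import EventHubProducerClient, EventData
from azure.storage.blob import BlobServiceClient

EVENT_HUB_CONN = "<event-hubs-connection-string>"   # hypothetical
STORAGE_CONN = "<storage-connection-string>"        # hypothetical

# JSON schema that describes this producer's payload type.
sensor_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "deviceId": {"type": "string"},
        "temperature": {"type": "number"},
    },
    "required": ["deviceId", "temperature"],
}

# Upload the schema to an Azure Storage container so the
# Event Processing function can validate payloads against it.
blob_service = BlobServiceClient.from_connection_string(STORAGE_CONN)
schema_blob = blob_service.get_blob_client(container="schemas",
                                           blob="sensor-reading.schema.json")
schema_blob.upload_blob(json.dumps(sensor_schema), overwrite=True)

# Publish a data event to Event Hubs; the payload carries a type
# hint so the processor knows which schema to validate against.
producer = EventHubProducerClient.from_connection_string(
    EVENT_HUB_CONN, eventhub_name="telemetry")
event = {"type": "sensor-reading", "deviceId": "dev-001", "temperature": 21.4}
with producer:
    batch = producer.create_batch()
    batch.add(EventData(json.dumps(event)))
    producer.send_batch(batch)
```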
The Event Processing function retrieves the JSON schema from Azure Cache for Redis to reduce latency, and uses the schema to validate the data.
If the schema isn't cached yet, the Event Processing function retrieves it from the Azure Storage container and stores it in Azure Cache for Redis for future retrieval.
Note: Azure Schema Registry in Event Hubs can be a viable alternative to storing and caching JSON schemas. For more information, see Azure Schema Registry in Event Hubs (Preview).
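The schema lookup in the two steps above is a cache-aside pattern. A minimal sketch, assuming the same hypothetical `schemas` container and schema file naming as the producer sketch, plus the `redis` and `jsonschema` packages:

```python
import json
import redis
from azure.storage.blob import BlobServiceClient
from jsonschema import validate, ValidationError

REDIS_HOST = "<cache-name>.redis.cache.windows.net"   # hypothetical
cache = redis.Redis(host=REDIS_HOST, port=6380, ssl=True,
                    password="<access-key>")
blob_service = BlobServiceClient.from_connection_string(
    "<storage-connection-string>")                     # hypothetical

def get_schema(payload_type: str) -> dict:
    """Return the JSON schema for a payload type, using Redis cache-aside."""
    cached = cache.get(f"schema:{payload_type}")
    if cached:
        return json.loads(cached)
    # Cache miss: fall back to the Azure Storage container ...
    blob = blob_service.get_blob_client(container="schemas",
                                        blob=f"{payload_type}.schema.json")
    schema = json.loads(blob.download_blob().readall())
    # ... and store the schema in the cache for future retrieval.
    cache.set(f"schema:{payload_type}", json.dumps(schema), ex=3600)
    return schema

def is_valid(payload: dict) -> bool:
    """Validate a payload against the schema registered for its type."""
    try:
        validate(instance=payload, schema=get_schema(payload["type"]))
        return True
    except ValidationError:
        return False
```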
If a topic already exists and the data is valid, the Event Processing function publishes the data to the existing Valid Data Azure Service Bus topic, which delivers it to the subscribed Data Consumer app.
If a topic already exists and the data is invalid, the Event Processing function publishes the data to the existing Invalid Data Service Bus topic, which routes it back to the Data Producer app. The Data Producer app subscribes to its Invalid Data topics to get feedback about invalid data that it produced.
If a matching topic doesn't exist yet, the Event Processing function publishes the new data to a New Data Service Bus topic, which the Service Bus Topic Manager function processes.
If the new data is valid, the Event Processing function also inserts the data as a new Snapshot Data record in Azure Cosmos DB.
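A sketch of the routing logic described in the last four steps, assuming hypothetical `valid-<type>`, `invalid-<type>`, and `new-data` topic names and a `snapshots` Cosmos DB container. In the architecture this logic runs inside the Event Processing function, which an Event Hubs trigger would invoke; the `valid` flag comes from the validation sketch above.

```python
import json
from azure.servicebus import ServiceBusClient, ServiceBusMessage
from azure.servicebus.management import ServiceBusAdministrationClient
from azure.cosmos import CosmosClient

SB_CONN = "<service-bus-connection-string>"           # hypothetical
sb_client = ServiceBusClient.from_connection_string(SB_CONN)
sb_admin = ServiceBusAdministrationClient.from_connection_string(SB_CONN)
snapshots = (CosmosClient("<cosmos-account-uri>", credential="<cosmos-key>")
             .get_database_client("transit-hub")
             .get_container_client("snapshots"))

def topic_exists(name: str) -> bool:
    return any(t.name == name for t in sb_admin.list_topics())

def route_event(payload: dict, valid: bool) -> None:
    """Route one event; `valid` comes from the schema validation shown earlier."""
    payload_type = payload["type"]
    topic = f"valid-{payload_type}" if valid else f"invalid-{payload_type}"

    if topic_exists(topic):
        # Existing topic: valid data goes to consumers, invalid data goes back
        # to the producer through its Invalid Data feedback subscription.
        with sb_client.get_topic_sender(topic_name=topic) as sender:
            sender.send_messages(ServiceBusMessage(json.dumps(payload)))
        return

    # No matching topic yet: publish to the New Data topic so the
    # Service Bus Topic Manager can create the Valid or Invalid Data topic.
    with sb_client.get_topic_sender(topic_name="new-data") as sender:
        sender.send_messages(ServiceBusMessage(
            json.dumps({"valid": valid, "payload": payload})))

    if valid:
        # New valid data is also recorded as a Snapshot Data item in Cosmos DB.
        # The id reuses the deviceId from the producer sketch.
        snapshots.upsert_item({"id": f"{payload_type}-{payload['deviceId']}",
                               "type": payload_type, "data": payload})
```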
If the new data is valid, the Service Bus Topic Manager function creates a new Valid Data Service Bus topic, and sends the topic to Event Hubs.
If the new data is invalid, the Service Bus Topic Manager function creates a new Invalid Data Service Bus topic, and sends the topic back to the Data Producer app.
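The Service Bus Topic Manager's topic creation might look like the following sketch, again using the assumed `valid-<type>`/`invalid-<type>` naming. In the architecture this function is triggered by messages on the New Data topic; the `producer-feedback` subscription name is also an assumption.

```python
from azure.servicebus.management import ServiceBusAdministrationClient
from azure.core.exceptions import ResourceExistsError

sb_admin = ServiceBusAdministrationClient.from_connection_string(
    "<service-bus-connection-string>")                 # hypothetical

def handle_new_data(message: dict) -> str:
    """Create the Valid Data or Invalid Data topic for a new payload type."""
    prefix = "valid" if message["valid"] else "invalid"
    topic_name = f"{prefix}-{message['payload']['type']}"
    try:
        sb_admin.create_topic(topic_name)
        # Producers get a feedback subscription on their Invalid Data topic.
        if prefix == "invalid":
            sb_admin.create_subscription(topic_name, "producer-feedback")
    except ResourceExistsError:
        pass  # Another invocation already created the topic.
    return topic_name
```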
The Snapshot Data Flat File Processor in Azure Data Factory runs on a schedule to extract all snapshot data from the Snapshot Data Azure Cosmos DB database. The processor creates a flat file and publishes it to a Snapshot Data Flat File in Azure Storage for download.
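The scheduled export itself is an Azure Data Factory pipeline, typically defined in the portal or as pipeline JSON rather than in code. As a rough illustration of what that pipeline produces, this sketch reads the Snapshot Data items from Azure Cosmos DB and writes a CSV flat file to Azure Storage; the database, container, and file names are assumptions.

```python
import csv
import io
import json
from azure.cosmos import CosmosClient
from azure.storage.blob import BlobServiceClient

snapshots = (CosmosClient("<cosmos-account-uri>", credential="<cosmos-key>")
             .get_database_client("transit-hub")
             .get_container_client("snapshots"))
blob_service = BlobServiceClient.from_connection_string(
    "<storage-connection-string>")                     # hypothetical

# Extract all snapshot records.
items = snapshots.query_items(query="SELECT * FROM c",
                              enable_cross_partition_query=True)

# Flatten them into a CSV flat file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["id", "type", "data"])
for item in items:
    writer.writerow([item["id"], item["type"], json.dumps(item["data"])])

# Publish the flat file to a Storage container for download.
blob_service.get_blob_client(container="snapshot-exports",
                             blob="snapshot-data.csv") \
            .upload_blob(buffer.getvalue(), overwrite=True)
```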
The Data Consumer app retrieves a list of all the Service Bus topics that the Service Bus Topic Manager has available for subscription. The app registers with the Service Bus Topic Manager to subscribe to Service Bus topics.
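A consumer-side sketch of topic discovery and subscription. For simplicity it talks to the Service Bus management API directly, whereas in the architecture the registration goes through the Service Bus Topic Manager; the subscription name `consumer-app-1` is an assumption.

```python
import json
from azure.servicebus import ServiceBusClient
from azure.servicebus.management import ServiceBusAdministrationClient
from azure.core.exceptions import ResourceExistsError

SB_CONN = "<service-bus-connection-string>"           # hypothetical
sb_admin = ServiceBusAdministrationClient.from_connection_string(SB_CONN)
sb_client = ServiceBusClient.from_connection_string(SB_CONN)

# Discover the Valid Data topics that are available for subscription.
valid_topics = [t.name for t in sb_admin.list_topics()
                if t.name.startswith("valid-")]

# Register for a topic of interest.
topic = valid_topics[0]
try:
    sb_admin.create_subscription(topic, "consumer-app-1")
except ResourceExistsError:
    pass  # Already registered.

# Consume validated data from the subscription.
with sb_client.get_subscription_receiver(
        topic_name=topic, subscription_name="consumer-app-1") as receiver:
    for msg in receiver.receive_messages(max_message_count=10, max_wait_time=5):
        payload = json.loads(str(msg))
        print("received:", payload)
        receiver.complete_message(msg)
```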
Components
- Azure Event Hubs
- Azure Service Bus
- Azure Functions
- Azure Data Factory
- Azure Cosmos DB
- Azure Blob Storage
- Azure Cache for Redis
Scenario details
The Transit Hub is a dynamic publish-subscribe model for data producers and data consumers to create and consume validated, curated content or data. The model is elastic to allow for scale and performance. Data producers can quickly onboard and upload data to a service. The service validates the data against a schema that the data producer provides. The service then makes the validated data available for subscribers to consume the data they're interested in.
The service validating the data doesn't need to know about the payload, only whether it's valid against the schema that the producer provides. This flexibility means the service can accept new payload types without having to be redeployed. This solution also lets data consumers get historical data that was published before the consumer subscribed.
Potential use cases
This model is especially useful in the following scenarios:
- Messaging systems where user volume and status are unknown or vary unpredictably
- Publishing systems that potentially need to support new or unknown data sources
- Commerce or ticketing systems that need to continually update data and cache it for fast delivery
Next steps
- Azure Web PubSub service documentation
- Service Bus queues, topics, and subscriptions
- Tutorial: Create a serverless notification app with Azure Functions and Azure Web PubSub service