StreamInsight: More Than Just an API

Microsoft StreamInsight is more than just an API for writing streaming queries. Included with the StreamInsight installation are powerful tools that allow you to manage, monitor, tune, and troubleshoot your streaming data applications. These tools range from diagnostic methods and properties built into the API to a full graphical debugging application.

Server/Client Architecture

StreamInsight is built upon a client/server architecture: a client deploys queries to a server, where they are executed. In the simplest case, the server can be embedded within the client, keeping everything within a single application. More typically, the StreamInsight server runs as a separate application, either on the same machine as the client, on a separate machine, or in a server farm.

http://i.technet.microsoft.com/dynimg/IC617388.gif

For more information, see StreamInsight Server Deployment Models and Deploying StreamInsight Entities to a StreamInsight Server.

This architecture provides you with many options for managing performance by giving you choices on where to deploy computation-intensive queries. It also allows for an external Event Flow Debugger with which you can monitor, troubleshoot, and manage active queries within a running StreamInsight server.

Event Flow Debugger

StreamInsight includes a stand-alone Event Flow Debugger with a graphical user interface. The debugger enables you to inspect, debug, and reason about the flow of events through a StreamInsight query.

The debugger can be used for two primary purposes:

  1. Debugging an event flow trace - Using a trace of events within the server, you can perform a Replay to step through the event stream one event at a time, a Root Cause Analysis to view the sequence of operations that led up to the current state, or an Event Propagation Analysis to view the downstream effects of an event. The trace can be generated from a live recording of a specific operational query, or it can be generated using a command-line utility and then loaded into the debugger.
  2. Monitoring the StreamInsight server - You can use the debugger to connect to a live server and, using an object explorer, obtain operational diagnostics about system and application objects. You can also start and stop individual queries from within the debugger.
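
To make the two trace analyses concrete, the following is a minimal conceptual sketch in Python. It is not the debugger's implementation and does not use the StreamInsight API; the trace format, operator names, and event IDs are all hypothetical. It treats a trace as a lineage graph, so that Root Cause Analysis walks backward from an event to everything that contributed to it, and Event Propagation Analysis walks forward to everything it influenced.

```python
from collections import defaultdict

# Hypothetical trace records: (operator, produced event, contributing events).
TRACE = [
    ("import", "e1", []),
    ("import", "e2", []),
    ("filter", "f1", ["e1"]),
    ("join", "j1", ["f1", "e2"]),
    ("export", "o1", ["j1"]),
]

def build_lineage(trace):
    """Index the trace in both directions: event -> inputs, event -> outputs."""
    parents = {}
    children = defaultdict(list)
    for _operator, produced, inputs in trace:
        parents[produced] = list(inputs)
        for source in inputs:
            children[source].append(produced)
    return parents, children

def reachable(start, edges):
    """Return every event reachable from `start` by following `edges`."""
    seen = set()
    stack = list(edges.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, []))
    return seen

PARENTS, CHILDREN = build_lineage(TRACE)

def root_cause(event_id):
    """Events that led up to `event_id` (backward walk)."""
    return reachable(event_id, PARENTS)

def propagation(event_id):
    """Events affected downstream by `event_id` (forward walk)."""
    return reachable(event_id, CHILDREN)
```

In this sketch, `root_cause("o1")` collects the join, filter, and import events behind the output, while `propagation("e2")` collects only the join result and the exported event.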

The following is an example “query graph” view in the debugger:

http://i.technet.microsoft.com/dynimg/IC592298.gif

For more information on the Event Flow Debugger, including available options and examples, see Using the StreamInsight Event Flow Debugger.

Extended Development Features

StreamInsight builds extended features into the API that make it easier for you to develop, troubleshoot, and manage your application.

Resiliency

StreamInsight supports resiliency against system failures during data stream processing by providing a checkpointing feature that can periodically save the state of a query to disk. After an outage, the query can then be restored to its state as of the checkpoint.

Note that if an outage does occur, events that arrived after the checkpoint must be replayed to the query once it has been restored to its checkpoint state. Events that were received after the checkpoint but before the outage will be duplicates when they are replayed, and this must be accounted for.

Fortunately, StreamInsight takes these issues into account by providing three levels of resiliency. Selecting a level depends on your requirements and your ability to change existing applications, sources, and sinks.

  1. State retention - You can use checkpoints to save the state of queries without making any changes to sources or sinks. This level of resiliency does not guarantee that the resulting stream after recovery from an outage is equivalent to the stream if no outage had occurred, because events that occurred after the last checkpoint was captured and events that occurred during the failure have been lost. However, this may be acceptable in situations where equivalent results are not needed, and where approximately correct output can be achieved with partial input.
  2. Complete output - You can guarantee that no events will be missed by changing sources so that they can replay events. The output stream from a recovered query will be logically equivalent to a superset of the output stream from an uninterrupted query, and the additional events will be duplicates of events in the uninterrupted stream.
  3. Equivalent output - You can guarantee logically equivalent output by changing sources and also changing sinks to eliminate duplicate events.
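
The difference between the three levels can be illustrated with a small simulation. This is a conceptual sketch in Python, not StreamInsight code: the running-sum "query", the event values, and the checkpoint and crash positions are all assumptions chosen for illustration.

```python
def run_query(events, state=0):
    """Toy running-sum query. Returns (outputs, final_state)."""
    outputs = []
    for event_id, value in events:
        state += value
        outputs.append((event_id, state))
    return outputs, state

def dedupe(outputs):
    """Sink-side duplicate elimination that preserves first-seen order."""
    seen, unique = set(), []
    for out in outputs:
        if out not in seen:
            seen.add(out)
            unique.append(out)
    return unique

# Illustrative input: event ids 1..7 with values 10..70.
EVENTS = [(i, i * 10) for i in range(1, 8)]
CHECKPOINT_AT = 3        # state is saved after event 3
CRASH_AT = 5             # events 4 and 5 are processed, then the outage hits
LOST_DURING_OUTAGE = 1   # event 6 arrives while the server is down

# What an uninterrupted query would have produced.
baseline, _ = run_query(EVENTS)

# What the sink received before the crash.
before_ckpt, ckpt_state = run_query(EVENTS[:CHECKPOINT_AT])
after_ckpt, _ = run_query(EVENTS[CHECKPOINT_AT:CRASH_AT], ckpt_state)
delivered = before_ckpt + after_ckpt

# Level 1, state retention: restore the checkpoint, but the source cannot
# replay, so events 4 to 6 never reach the restored query.
resumed, _ = run_query(EVENTS[CRASH_AT + LOST_DURING_OUTAGE:], ckpt_state)
state_retention = delivered + resumed

# Level 2, complete output: the source replays everything after the
# checkpoint, so the outputs for events 4 and 5 reach the sink twice.
replayed, _ = run_query(EVENTS[CHECKPOINT_AT:], ckpt_state)
complete_output = delivered + replayed

# Level 3, equivalent output: the sink also removes the duplicates.
equivalent_output = dedupe(complete_output)
```

Under these assumptions, `state_retention` diverges from `baseline` because the sum after recovery is missing the lost events, `complete_output` contains every baseline result plus two duplicates, and `equivalent_output` matches `baseline` exactly.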

For more information, including examples of creating and configuring a resilient server and setting up checkpoints, see StreamInsight Resiliency.

Diagnostics

StreamInsight provides a diagnostic view API that allows you to monitor the server, queries, and entities in your StreamInsight application. You can find information such as the current state of a query (Initializing, Running, Suspended, Stopped, etc.), query latency, or entity properties. You can use Windows PowerShell to access the manageability information or manage metadata for a running instance.

The following are some examples of the kinds of issues you can troubleshoot using diagnostic views.

  • User-defined extension is slow
  • User-defined extension has raised an exception
  • Input adapter is not feeding data quickly
  • Input adapter is not generating CTIs
  • Output adapter is not keeping up

You can focus on specific areas of event flow, including events incoming through sources, consumed by queries, produced by queries, and outgoing through sinks. This allows you to measure things such as latency or memory consumption at different specific points in the event flow.

http://i.technet.microsoft.com/dynimg/IC586648.gif

For a definition of the DiagnosticView class, see Microsoft.ComplexEventProcessing.DiagnosticView. For more information on how to use diagnostic views, including examples, see Monitoring the StreamInsight Server and Queries.

Performance Counters and Events

StreamInsight automatically installs and configures performance counters for StreamInsight servers, queries, input adapters, and for resiliency. Server counters are turned on by default and cannot be turned off. Query counters are off by default but you can turn on counters for individual queries (and any associated adapters) using the SetDiagnosticSettings method.

The following performance counters are available.

Server counters

  • Events in input queues
  • Events in output queues
  • Memory
  • Running queries

Query counters

  • Average produced event latency
  • CTIs produced
  • Events in output queue
  • Events produced
  • Memory
  • Produced events/sec

Input Adapter counters

  • Adjusted events
  • CTIs input
  • Dropped events
  • Events in input queue
  • Incoming events/sec
  • Resumes/sec
  • Suspensions/sec
  • Total events enqueued

Here are some example scenarios in which performance counters can help you understand, monitor, and troubleshoot your StreamInsight applications:

Server scenarios

  1. Memory footprint - How much memory is the embedded instance of StreamInsight using, separately from the hosting application?
  2. Capacity planning - How many queries can I run on this server before performance degrades?
  3. Post-mortem analysis - How many queries were running when the server crashed?

Query scenarios

  • Throughput monitoring. Where is the bottleneck?
  • Latency monitoring.
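
As a rough illustration of the throughput scenario, the sketch below compares two snapshots of counter values taken a known interval apart. It is written in Python and does not read real performance counters: the snapshot dictionaries, the sample values, and the bottleneck heuristic are assumptions for illustration only. The counter names are taken from the lists above.

```python
def rates(earlier, later, interval_s):
    """Per-second change of each counter between two snapshots."""
    return {name: (later[name] - earlier[name]) / interval_s for name in later}

def likely_bottleneck(earlier, later, interval_s):
    """Heuristic (an assumption, not product guidance): a growing queue
    suggests that the stage draining it is not keeping up."""
    change = rates(earlier, later, interval_s)
    if change["Events in output queue"] > 0:
        return "output adapter"   # results pile up waiting for the sink
    if change["Events in input queue"] > 0:
        return "query"            # input arrives faster than it is consumed
    return "none apparent"

# Made-up sample snapshots taken 10 seconds apart.
SNAPSHOT_T0 = {
    "Total events enqueued": 1000,
    "Events produced": 900,
    "Events in input queue": 10,
    "Events in output queue": 5,
}
SNAPSHOT_T1 = {
    "Total events enqueued": 3000,
    "Events produced": 1400,
    "Events in input queue": 1500,
    "Events in output queue": 5,
}

sample_rates = rates(SNAPSHOT_T0, SNAPSHOT_T1, 10)
sample_verdict = likely_bottleneck(SNAPSHOT_T0, SNAPSHOT_T1, 10)
```

With these sample values, the input queue grows while the output queue stays flat, so the heuristic points at the query rather than the output adapter. A real investigation would confirm this against latency and memory counters before drawing conclusions.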

StreamInsight also logs events to the Windows Application event log from both the server and queries. These logs identify events such as the server being created or a query starting or being suspended. You can turn administrative logging on or off using the SetDiagnosticSettings method.

The following are events you may see in the event log.

  • QueryInitializing
  • QueryRunning
  • QueryCheckpointing
  • QueryStopping
  • QuerySuspended
  • QueryCompleted
  • QueryAborted
  • QueryStopped
  • QueryError
  • QueryRecoveryError
  • QueryCheckpointError
  • ServerCreated
  • ServerDisposed

For more information, including examples of initializing and configuring performance counters and administrative event logging, see Monitoring StreamInsight Performance Counters and Events.