Surge protection (preview)
Surge protection helps prevent overuse of your capacity by limiting the total compute that background jobs can consume. You configure surge protection for each capacity. Surge protection helps prevent throttling and rejections, but it isn't a substitute for capacity optimization, scaling up, or scaling out. When the capacity reaches its compute limit, it experiences interactive delays, interactive rejections, or all rejections, even when surge protection is enabled.
Prerequisites
You need to be an admin on the capacity.
Surge protection thresholds
Capacity admins set a background rejection threshold and a background recovery threshold when they enable surge protection.
- The Background Rejection threshold determines when surge protection becomes active. The threshold applies to the 24-hour background percentage for the capacity. When the threshold is reached or exceeded, surge protection becomes active. When surge protection is active, the capacity rejects new background operations. When surge protection isn't enabled, the 24-hour background percentage is allowed to reach 100% before the capacity rejects new background operations.
- The Background Recovery threshold determines when surge protection stops being active. Surge protection stops being active when the 24-hour background percentage drops below the background recovery threshold. The capacity starts to accept new background operations.
Note
Capacity admins can see the 24-hour background percentage in the Microsoft Fabric Capacity Metrics app. On the Compute page, under Throttling, open the Background Throttling chart.
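The interaction between the two thresholds can be sketched as a small state machine. This is an illustrative model only, not the service's implementation: the class name, the method name, and the 80% and 60% settings are assumptions chosen for the example, and the sketch assumes the recovery threshold is set lower than the rejection threshold.

```python
from dataclasses import dataclass


@dataclass
class SurgeProtection:
    """Illustrative model of the two thresholds (hypothetical names and values)."""
    rejection_threshold: float  # percent of 24-hour background usage
    recovery_threshold: float   # percent of 24-hour background usage
    active: bool = False

    def update(self, background_pct_24h: float) -> bool:
        """Return True if new background operations would be rejected."""
        if not self.active and background_pct_24h >= self.rejection_threshold:
            # Threshold reached or exceeded: surge protection becomes active.
            self.active = True
        elif self.active and background_pct_24h < self.recovery_threshold:
            # Dropped below the recovery threshold: accept background work again.
            self.active = False
        return self.active


sp = SurgeProtection(rejection_threshold=80.0, recovery_threshold=60.0)
readings = [50.0, 79.9, 80.0, 70.0, 60.0, 59.9, 65.0]
states = [sp.update(r) for r in readings]
print(states)  # [False, False, True, True, True, False, False]
```

In this sketch, the readings of 70% and 60% still result in rejections because the percentage hasn't dropped below the recovery threshold. The gap between the two thresholds keeps the capacity from switching rapidly between accepting and rejecting background operations.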
Enabling surge protection
To enable surge protection, follow these steps:
1. Open the Fabric Admin Portal.
2. Navigate to Capacity settings.
3. Select a capacity.
4. Expand Surge Protection.
5. Select Enable Surge Protection.
6. Set a Background Rejection threshold.
7. Set a Background Recovery threshold.
8. Select Apply.
How to monitor surge protection
1. Open the Microsoft Fabric Capacity Metrics app.
2. On the Compute page, select System events.

The System events table shows when surge protection became active and when the capacity returned to a not overloaded state.
System events for Surge Protection
When surge protection is active, capacity state events are generated. The System events table in the Microsoft Fabric Capacity Metrics app shows these events. The following table lists the state events relevant to surge protection. A complete list of capacity state events is available in Understanding the Microsoft Fabric Capacity Metrics app compute page.
Capacity State | Capacity state change reason | When shown |
---|---|---|
Active | NotOverloaded | Indicates the capacity is below all throttling and surge protection thresholds. |
Overloaded | SurgeProtectionActive | Indicates the capacity exceeded the configured surge protection threshold. The capacity is above the configured recovery threshold. Background operations are being rejected. |
Overloaded | InteractiveDelayAndSurgeProtectionActive | Indicates the capacity exceeded the interactive delay throttling limit and the configured surge protection threshold. The capacity is above the configured recovery threshold. Background operations are being rejected. Interactive operations are experiencing delays. |
Overloaded | InteractiveRejectedAndSurgeProtectionActive | Indicates the capacity exceeded the interactive rejection throttling limit and the configured surge protection threshold. The capacity is above the configured recovery threshold. Background and interactive operations are being rejected. |
Overloaded | AllRejected | Indicates the capacity exceeded the background rejection limit. Background and interactive operations are being rejected. |
Note
When the capacity reaches its compute limit, it experiences interactive delays, interactive rejections, or all rejections even when surge protection is enabled.
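If you review system events programmatically, the table above can be treated as a lookup from the state change reason to its effects. The following is a minimal sketch: the reason strings come from the table, while the `Effect` type, its field names, and the helper function are hypothetical. It records only the effects that the table states explicitly.

```python
from typing import NamedTuple


class Effect(NamedTuple):
    """Effects stated in the table for one state change reason (hypothetical type)."""
    background_rejected: bool
    interactive_delayed: bool
    interactive_rejected: bool


# Transcribed from the system events table; keys are capacity state change reasons.
EFFECTS = {
    "NotOverloaded": Effect(False, False, False),
    "SurgeProtectionActive": Effect(True, False, False),
    "InteractiveDelayAndSurgeProtectionActive": Effect(True, True, False),
    "InteractiveRejectedAndSurgeProtectionActive": Effect(True, False, True),
    "AllRejected": Effect(True, False, True),
}


def surge_protection_active(reason: str) -> bool:
    """True when the reason indicates that surge protection is active."""
    return "SurgeProtectionActive" in reason


for reason, effect in EFFECTS.items():
    print(f"{reason}: surge protection active={surge_protection_active(reason)}, {effect}")
```

Note that `AllRejected` isn't a surge protection event. It indicates that the capacity exceeded the background rejection limit, which can happen whether or not surge protection is enabled.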
Per operation status messages for surge protection
When surge protection is active, background requests are rejected. These requests appear with the status Rejected or RejectedSurgeProtection on the timepoint page of the Microsoft Fabric Capacity Metrics app. For more information, see Understand the metrics app timepoint page.
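As a rough illustration, the following sketch counts rejected background requests by status. It assumes you've copied the operation rows from the timepoint page into a list of dictionaries. The field names, operation names, and sample rows are hypothetical; only the two rejection statuses come from the text above.

```python
from collections import Counter

# Statuses shown for background requests rejected while surge protection is active.
REJECTED_STATUSES = {"Rejected", "RejectedSurgeProtection"}

# Hypothetical sample rows; field names and operation names are made up.
operations = [
    {"operation": "refresh-a", "status": "Success"},
    {"operation": "refresh-b", "status": "RejectedSurgeProtection"},
    {"operation": "pipeline-c", "status": "Rejected"},
    {"operation": "refresh-d", "status": "RejectedSurgeProtection"},
]

# Keep only the rejected requests, then count them per status.
rejected = [op for op in operations if op["status"] in REJECTED_STATUSES]
counts = Counter(op["status"] for op in rejected)

print(len(rejected))                      # 3
print(counts["RejectedSurgeProtection"])  # 2
```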
Considerations and limitations
When surge protection is active, the capacity rejects new background jobs, so there's still broad impact across your capacity even when surge protection is enabled. Surge protection tunes your capacity to stay within a specific range of usage, but the resulting rejections can impact performance. To fully protect critical solutions, we recommend isolating them in a designated capacity.
Surge protection doesn't guarantee that interactive requests aren't delayed or rejected. As a capacity admin, you need to use the capacity metrics app to review data in the throttling charts and then adjust the surge protection background rejection threshold as needed.
Some requests initiated from the Fabric UI are billed as background operations or depend on background operations to complete. These requests are rejected when surge protection is active.
Surge protection doesn't stop in-progress jobs.
The background rejection threshold isn't an upper limit on the 24-hour background percentage, because in-progress jobs continue to run and report additional usage.
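A small arithmetic sketch shows why the percentage can exceed the threshold. All numbers are hypothetical, and the model is simplified: it ignores older usage dropping out of the 24-hour window.

```python
REJECTION_THRESHOLD = 80.0  # percent; hypothetical setting


def peak_background_pct(pct_at_activation: float, in_progress_usage: list[float]) -> float:
    """Estimate the peak 24-hour background percentage after activation.

    New background operations are rejected once surge protection is active,
    but each in-progress job still reports its remaining usage, given here
    in percentage points.
    """
    return pct_at_activation + sum(in_progress_usage)


# Surge protection activates at 80%, with two jobs still running.
peak = peak_background_pct(80.0, [3.0, 4.5])
print(peak)                        # 87.5
print(peak > REJECTION_THRESHOLD)  # True
```

If overshoot like this matters for your workloads, one option is to set the rejection threshold somewhat lower than the usage level you want to stay under.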
If you pause a capacity while it's in an overloaded state, the System events table in the capacity metrics app might show an Active NotOverloaded event after the Suspended event. The capacity is still paused. The NotOverloaded event is generated because of a timing issue during the pause action.