Troubleshoot Azure Managed Redis (preview) server issues

Article
11/19/2024

This section discusses troubleshooting issues caused by conditions on an Azure Managed Redis (preview) server or any of the virtual machines hosting it.

High CPU
High memory usage
Long-running commands
Server-side bandwidth limitation

Note

Several of the troubleshooting steps in this guide include instructions to run Redis commands and monitor various performance metrics. For more information and instructions, see the articles in the Additional information section.

High CPU

High CPU means the Redis server is busy and unable to keep up with requests, leading to timeouts. To check the CPU metric for your cache, select Monitoring from the Resource menu on the left. The CPU graph appears in the working pane under Insights. Alternatively, add a metric set to CPU under Metrics.

Consider the following options when CPU usage is high.

Scale up or move to a higher performance tier

For higher performance, consider scaling up to a larger cache size with more CPU cores. For more information, see Performance tiers.

Rapid changes in number of client connections

For more information, see Avoid client connection spikes.

Long-running or expensive commands

For more information, see Long-running commands.

Scaling

Scaling operations are CPU and memory intensive because they can involve moving data between nodes and changing the cluster topology. For more information, see Scaling.

Server maintenance

If your Azure Managed Redis instance underwent a failover, all client connections from the node that went down are transferred to the node that is still running. The CPU could spike because of the increased connections. You can try restarting your client applications so that all client connections are recreated and redistributed between the two nodes.

High memory usage

Memory pressure on the server can lead to various performance problems that delay processing of requests. When memory pressure hits, the system pages data to disk, which causes the system to slow down significantly.

Here are some possible causes of memory pressure:

The cache is filled with data near its maximum capacity
The Redis server is seeing high memory fragmentation

Fragmentation is likely when a load pattern stores data with high variation in size. For example, fragmentation might happen when item sizes are spread between 1 KB and 1 MB. When a 1-KB key is deleted, a 1-MB key can't fit into the space it frees, which causes fragmentation. Similarly, if a 1-MB key is deleted and a 1.5-MB key is added, the new key can't fit into the reclaimed memory. The result is unused free memory and more fragmentation.
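The effect can be illustrated with a deliberately simplified model. The sketch below is not how the Redis allocator works internally. It assumes a hypothetical heap laid out as a flat sequence of blocks, and the names free_holes and can_fit are made up for this example. It only shows that free memory scattered across small holes can't hold a large value.

```python
KB = 1024
MB = 1024 * KB


def free_holes(layout, deleted):
    """Return the sizes of contiguous free holes, in address order.

    layout  -- block sizes in bytes, in address order (hypothetical heap)
    deleted -- set of indices into layout whose blocks were freed
    Adjacent freed blocks merge into one larger hole.
    """
    holes, current = [], 0
    for index, size in enumerate(layout):
        if index in deleted:
            current += size          # extend the current hole
        elif current:
            holes.append(current)    # a live block ends the hole
            current = 0
    if current:
        holes.append(current)
    return holes


def can_fit(holes, size):
    """True if any single hole is large enough for a block of `size` bytes."""
    return any(hole >= size for hole in holes)


# Alternate 1-KB and 1-MB values, then delete every 1-KB value.
layout = [KB, MB] * 2048
holes = free_holes(layout, deleted=set(range(0, len(layout), 2)))

total_free = sum(holes)    # 2 MB is free in total
fits = can_fit(holes, MB)  # False: the largest single hole is only 1 KB
```

In this toy layout, 2 MB is free in total, yet a new 1-MB value has no hole to land in, so the process must request more memory from the operating system.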

If the used_memory_rss value is higher than 1.5 times the used_memory metric, there's fragmentation in memory. The fragmentation can cause issues when:

Memory usage is close to the max memory limit for the cache, or
used_memory_rss is higher than the max memory limit, potentially resulting in page faults.

If a cache is fragmented and running under high memory pressure, the system performs a failover to try to recover Resident Set Size (RSS) memory.

Redis exposes two stats, used_memory and used_memory_rss, through the INFO command that can help you identify this issue. You can view these metrics using the portal.
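As a minimal sketch of the check described above, the following Python parses the text returned by the INFO command and compares used_memory_rss against used_memory. The helper names (parse_info, fragmentation_ratio, is_fragmented) and the sample values are assumptions made for illustration. Only the two field names and the 1.5 threshold come from this article.

```python
def parse_info(text: str) -> dict[str, str]:
    """Parse the 'key:value' lines of Redis INFO output into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers such as '# Memory'
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def fragmentation_ratio(info: dict[str, str]) -> float:
    """Return used_memory_rss divided by used_memory."""
    used = int(info["used_memory"])
    rss = int(info["used_memory_rss"])
    if used <= 0:
        raise ValueError("used_memory must be positive")
    return rss / used


def is_fragmented(info: dict[str, str], threshold: float = 1.5) -> bool:
    """True when RSS exceeds `threshold` times used_memory."""
    return fragmentation_ratio(info) > threshold


# Made-up sample values, shaped like the memory section of INFO output.
SAMPLE_INFO = """# Memory
used_memory:1000000
used_memory_rss:1600000
"""

info = parse_info(SAMPLE_INFO)
fragmented = is_fragmented(info)  # RSS is 1.6 times used_memory here
```

The parsing is kept separate from the comparison so that the same check can run on INFO text captured by any client or console.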

There are several possible changes you can make to help keep memory usage healthy:

Configure a memory policy and set expiration times on your keys. This policy might not be sufficient if you have fragmentation.
Create alerts on metrics like used memory to be notified early about potential impacts.
Scale to a larger cache size with more memory capacity. For more information, see Azure Managed Redis planning FAQs.
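As one way to apply the first suggestion, the sketch below writes a value with an expiration time. It assumes a client object whose set method accepts an ex argument in seconds, as the redis-py client does. The function name cache_with_ttl, the default of one hour, and the key and value are hypothetical.

```python
def cache_with_ttl(client, key: str, value: str, ttl_seconds: int = 3600):
    """Store `value` under `key` so that Redis expires it after `ttl_seconds`.

    `client` is assumed to expose set(name, value, ex=seconds), which maps to
    the Redis command SET key value EX seconds.
    """
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return client.set(key, value, ex=ttl_seconds)
```

With redis-py, a call might look like cache_with_ttl(redis_client, "session:123", "payload", ttl_seconds=900), where redis_client is an already connected client. The right expiration time depends on how long the data stays useful to your application.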

For recommendations on memory management, see Best practices for memory management.
