How to fix replication lag on Azure managed mysql db

Shorya Shrivastava 0 Reputation points
2025-02-11T13:06:53.3766667+00:00

I am facing Replication lag issue on one of my MySQL replication server while there is not much of CPU , ram or IOPS usage on replication server. Below are the details of the server and my production load.

Master db:-

  1. 16 core 128 GB ram business critical instance

Replication db:-

  1. 8 core 64 gb business critical instance.
  2. Max cpu usage while production is 25%
  3. max ram usage is 80%
  4. IOPS utilization is around 40%
  5. Our production environment hits multiple insert and update statements which are not time taking but are in large count

NOTE:- All queries are hit on writer db. Reader db is only for replication of data.

Can any one help here with some suggestions to improve replication lag performance. We get lag of around 2-3 hours on daily basis, which gradually increases and decreases.

Let me know if any more details are required for this. Thanks for the suggestions in advance.

Azure Database for MySQL
Azure Database for MySQL
An Azure managed MySQL database service for app development and deployment.
894 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Sina Salam 17,336 Reputation points
    2025-02-11T17:21:45.32+00:00

    Hello Shorya Shrivastava,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you are experiencing replication lag on Azure managed MySQL Database.

    You might need to take several steps, since this look like a generic or common issues, this will help in diagnosing the specific cause of your replication lag:

    • Run SHOW SLAVE STATUS\G on the replication server and look at the Seconds_Behind_Master, Slave_IO_Running, and Slave_SQL_Running fields, and check the Relay_Log_Space to see if the relay logs are piling up, indicating the SQL thread is lagging.
    • Enable parallel replication by setting slave_parallel_workers to a value greater than 0 in your MySQL configuration file.
    • Adjust slave_parallel_type to LOGICAL_CLOCK for better performance with InnoDB.
    • Ensure that the disk subsystem on the replication server is not a bottleneck. Use tools like iostat to Consider using faster storage solutions like SSDs if not already in use.
    • In case of network issue, use tools like ping and traceroute to check the network latency between the master and slave servers and ensure there are no network issues causing delays in data transfer.
    • Review the queries being executed on the master server and ensure they are optimized and use appropriate indexes.
    • Avoid long-running transactions that can cause replication delays.
    • Set innodb_flush_log_at_trx_commit to 2 on the replication server to reduce disk I/O.
    • Adjust sync_binlog to a higher value to reduce the frequency of binlog synchronization.
    • Continuously monitor the replication process using tools like pt-heartbeat to get real-time replication lag metrics.
    • Adjust configurations based on the monitoring results to optimize performance.

    One or two of the above steps will resolve the issue.

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.


    Please don't forget to close up the thread here by upvoting and accept it as an answer if it is helpful.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.