SQL Server: How to Prevent Database Downtime
Introduction
One of the major responsibilities of Database administrators (DBA) is to ensure smooth functioning of databases as well as to avoid database downtime in every case scenario. However, even after taking all precautions, DBAs may encounter situations that lead to database downtime due to multiple factors, which may be planned (voluntary) or unplanned (involuntary).
Root cause
Root cause of voluntary or planned downtime:
- System, application or database upgrade
- Network upgrade
- Routine system maintenance
- Application, hardware or database migration
Root cause of involuntary or unplanned downtime:
- Server or storage failure
- Power failure
- Natural disaster or weather events
- Human error
Performing a Root Cause Analysis (RCA) for a database downtime is necessary, as it helps in preventing further downtime. Here, we have illustrated the techniques to prevent MS SQL downtime but these can also be applied to other types of database such as My SQL, etc.
The following techniques help to prevent SQL database downtime;
Prevention techniques
1. Build an Active Standby Database
- Configure active standby/secondary database as backup at a different location. This technique helps in providing uninterrupted data availability during planned and unplanned downtime as well as ensures data protection.
- During planned downtime such as routine system maintenance, the primary database made unavailable. In that case, database failovers to active standby database, which enables database accessibility. With such arrangement, DBAs can prevent SQL database downtime.
- During unplanned downtime, primary database goes down without prior intimation. If active standby database is configured, the services, application and MS SQL database on primary server failover to active secondary server. When primary database server is reconfigured, the database failbacks to primary database.
- Configuring active standby database at a different location and replicating database on that site help in protecting MS SQL database. In case, unplanned downtime is caused due to natural calamity or power failure, the database backup at secondary site ensures data protection.
- If possible, also maintain a physical standby of the active standby database. Maintaining a secondary standby database ensures proper backup, if database downtime extends to business hours.
2. Install Database Monitoring tool
It is important to monitor the health, performance, real-time status, available storage space and other parameters of database server, application and services with the help of monitoring tools. These tools help database administrators to stay updated by monitoring the performance of servers, applications, and database. Regular monitoring of the database informs DBAs about the near-critical state, which further help them to take timely action and avoid the unplanned outage.
3. Review Database Applications and Processes during migration
Database migration and applications pushed to production without prior intimation to DBAs are the two major reasons that could lead to downtime. Perform the following steps to prevent downtime during such migrations and application updates:
- Build and deploy a new production database to migrate existing data.
- Add new Database Access Objects (DAO) to control the flow of writing to the old database and ensure that you are not writing to the new database.
- Continue writing to the old database to avoid any inconsistencies in new database.
- Once data has replicated, make both databases as ‘Read Only’ databases to maintain redundancy.
- Make the new database as Primary database and ensure that MS SQL users are not stuck in different database versions.
- Stop writing to old database and migrate all data from old to the new.
- Delete old DAO from code and leave only the new DAO that uses the new database.
Note: Perform step 3 and step 4 and test them in production atmosphere. If these are not performed and tested properly in the production environment, then DBAs may encounter a few issues such as:
- On the write path, holding large objects in memory may cause issues during peak traffic.
- Database replications may not be configured as expected.
- Monitoring may not be accurate on Primary or Secondary database.
- On the read path, there may be issues such as missing index. Also, an inefficient data model may cause poor performance of SQL database.
Such issues may lead to corrupt MS SQL database. If this happens, use a proper solution to resolve corruption issues.
4. Schedule Server and Application maintenance hours
Maintenance causes server downtime and database is not available during these hours. Always schedule maintenance in advance and fix this schedule during off-hours when regular operations are almost nil.
5. Manage all projects immaculately
A project needs constant scheduling, communication, resource management, and reporting on the progress. Even the well-executed upgrade and migration projects may not be successful, if not managed properly. So, managing all projects under the guidance of a skilled Project Manager ensures complete project management.
Conclusion
Along with prevention techniques to avoid SQL database downtime, organizations also define disaster recovery strategies (DRS). Such strategies help DBAs in reducing the overall cost of running the database and minimizing the risk of downtime. Also, a DRS includes methods to resolve SQL downtime issues.