

Disaster Recovery in Financial Services

Recently I was asked which technologies should be considered for Disaster Recovery and Business Continuity in banking. I thought to myself, "this is a long and complicated discussion." Primarily, this is not a technology discussion. It is more about process and portfolio management than anything else. The enabling technologies for a system's disaster recovery (DR) should be chosen only once you understand some core areas.

Disaster Recovery is a popular subject for banks these days, for many reasons. For one, the natural disasters around the globe have triggered thought on the subject. These catastrophes have affected businesses through customer retention, monetary losses, business SLAs with partners and much more.

Below is a summary of regulatory guidance and its stance on Business Continuity.

Federal Financial Institutions Examination Council (FFIEC) Handbook, 2003-2004 (Chapter 10)

Requirement: Specifies that directors and managers are accountable for organization-wide contingency planning and for "timely resumption of operations in the event of a disaster."

Scope and influence: This chapter, on an operational level, supplants many other BCP guidelines. It covers examination requirements for all companies regulated by the Federal Deposit Insurance Corp. (FDIC), Federal Reserve Bank (FRB), Treasury Department, U.S. Office of the Comptroller of the Currency (OCC), Office of Thrift Supervision (OTS) and National Credit Union Administration (NCUA).

Basel II, Basel Committee on Banking Supervision, Sound Practices for Management and Supervision, 2003

Requirement: Requires that banks put in place BC and DR plans to ensure continuous operation and to limit losses.

Scope and influence: After 2007, influence of Basel II will be limited to about 30 U.S. banks but will spread as a best practice via "audit creep."

Interagency Paper on Sound Practices to Strengthen the Resilience of the U.S. Financial System, 2003

Requirement: More focused on systemic risk than individual enterprise recovery. Requires BCPs to be upgraded and tested to incorporate risks discovered as a result of the World Trade Center disaster.

Scope and influence: Influences companies that are regulated by the Securities and Exchange Commission (SEC), OCC and Board of Governors of the Federal Reserve System (FRS). Authorizes the OCC to take action against banks that fail to comply with requirements for DR by the U.S. financial system.

Expedited Funds Availability (EFA) Act, 1989

Requirement: Requires federally chartered financial institutions to have a demonstrable BCP to ensure prompt availability of funds.

The first question to ask yourself is which goal you are trying to achieve: disaster recovery or availability. The two are treated very differently from both a process and a technology perspective. Availability, usually expressed through SLAs, covers localized application failures caused by application crashes, network connectivity problems, DoS attacks or someone tripping over the power cord. DR applies when an entire data center is down, either physically because of the environment (e.g., hurricane, earthquake) or virtually (e.g., a power outage in the data center).

Next, determine whether you are building redundancy from a DR perspective, an SLA perspective, or both. The two are not mutually exclusive; you should rate each system in each category. On the technology side, it really depends on the organizational goals of the BCP. Some banks go back to paper in the case of a disaster (not that I would recommend that). I have used many technologies in the past to achieve this, and they all worked well for what we wanted to accomplish. As a simplified example of one of these architectures, we used the published apps feature of Citrix, connected to a Win2k3 server farm in our DR location. The data was replicated with a multi-hop approach over dark fiber for near real-time replication. At the network layer we used Global DNS to redirect applications to the DR environment. RSA ClearTrust and RSA hard tokens were also used to authenticate users.
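To make the Global DNS redirection idea a little more concrete, here is a minimal sketch of the decision such a layer makes: hand out the primary site's address while it is healthy, and fall back to the DR site otherwise. This is not how any particular DNS product works. The `Site` class, the `resolve` function and the addresses (taken from the reserved documentation ranges) are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """One data center that can serve an application (hypothetical model)."""
    name: str
    address: str
    healthy: bool  # assumed to come from some external health check


def resolve(sites: list[Site]) -> str:
    """Return the address of the first healthy site, in priority order."""
    for site in sites:
        if site.healthy:
            return site.address
    raise RuntimeError("no healthy site available")


# The primary data center is down, so clients are sent to the DR site.
primary = Site("primary", "192.0.2.10", healthy=False)
dr = Site("dr", "198.51.100.10", healthy=True)
print(resolve([primary, dr]))
```

In a real deployment the health flag would be driven by monitoring, and DNS record TTLs would limit how quickly clients notice the switch.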

There are several considerations to keep in mind:

1. Determination of the RTO (recovery time objective) and RPO (recovery point objective)
2. What applications should be available?
    a. Meaning that there will need to be some rating system
3. Creation of the Disaster Recovery environment (whatever technology that may be)
4. Data Replication
    a. “Multi-hop” Replication
    b. Asynchronous Replication
    c. Host-based Replication (XRC)
5. Application replication and, most importantly, making sure the applications work with the replicated data.
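The first two considerations above can be sketched as a simple rating system. The tier thresholds (1 hour and 24 hours), the application names and their objectives are assumptions chosen for illustration, not values from any regulation or from a real bank; each organization would set its own.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Application:
    """An application and its recovery objectives (hypothetical model)."""
    name: str
    rto: timedelta  # recovery time objective: maximum tolerable downtime
    rpo: timedelta  # recovery point objective: maximum tolerable data loss


def recovery_tier(app: Application) -> int:
    """Rate an application by its RTO; the thresholds are assumptions."""
    if app.rto <= timedelta(hours=1):
        return 1
    if app.rto <= timedelta(hours=24):
        return 2
    return 3


def replication_meets_rpo(app: Application, replication_lag: timedelta) -> bool:
    """True if the observed replication lag stays within the app's RPO."""
    return replication_lag <= app.rpo


def recovery_order(apps: list[Application]) -> list[str]:
    """Names of the applications, most urgent first (tier, then RTO)."""
    ranked = sorted(apps, key=lambda a: (recovery_tier(a), a.rto))
    return [a.name for a in ranked]


# Made-up portfolio for illustration only.
portfolio = [
    Application("hr-portal", timedelta(hours=72), timedelta(hours=24)),
    Application("wire-transfers", timedelta(minutes=30), timedelta(minutes=1)),
    Application("online-banking", timedelta(hours=4), timedelta(minutes=15)),
]
print(recovery_order(portfolio))
```

A check like `replication_meets_rpo` is one way to decide whether asynchronous replication is good enough for a given application or whether something closer to real time is needed.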

I would take a look at the BCP inspection guide from the FFIEC examiners' handbook, linked below. It should give you more context on the business drivers that lead to the technology solutions.

https://www.ffiec.gov/ffiecinfobase/html_pages/bcp_book_frame.htm
