Before the Disaster
Why focus on disaster recovery, when effective business continuity management could keep recovery to a minimum?
On January 17, 1994, the ground surrounding Northridge, California shook for
15 seconds, leaving 51 dead and over $44 billion in damage. Cleaning up after
that disaster involved more than personal and physical disaster recovery, as
any business in the vicinity would confirm: In a mere 15 seconds, the business
continuity of untold numbers of establishments in Southern California had been
seriously disrupted and damaged.
Yet, earthquakes and acts of nature are not the only disasters lying in wait
for the IT folks who have to pull things back together. To put it simply, bad
things happen: You’ve organized a trip to a key vendor, and most of the
institution’s senior management are on a charter plane when severe icing
forces a desperate emergency landing. Or a hazardous substance leak forces
immediate evacuation of the data center, rendering your carefully planned,
graceful shutdown procedures useless. Or you inherit a mission-critical
financial system running on 20-year-old hardware for which repair parts are no
longer available and software is no longer supported.
Disaster Recovery vs. Business Continuity
“And God said to Noah ... make thee an ark of gopher wood ... of every sort
shalt thou bring into the ark, to keep them alive with thee.”
People are often confused about the difference between disaster recovery and
business continuity management (BCM). Yet, while disaster recovery is the act
of recovering from a disaster, BCM is a broader term that includes anticipating
and planning for bad things, as well as the actual disaster recovery process.
Let’s put it this way: After the flood, Noah was practicing disaster recovery;
before the flood, he was practicing business continuity management. Basically,
business continuity management attempts to answer two questions: a) What can
go wrong? And b) How can an institution reasonably prepare to minimize the impact?
Fortunately, higher education is an intrinsically resilient institution. The
success of our endeavor—education and research—is measured over
years, decades, or even longer. Fortunately, too, in the event of a disaster,
relatively few of our systems require remediation within hours or days, vastly
simplifying the task of business recovery. As a result, the elaborate (and expensive)
BCM methodologies developed for the corporate sector are not entirely applicable.
The model outlined here is a simplification of several traditional BCM methodologies
and is tailored to the needs of higher education.
BCM for Higher Ed
The model is an ongoing process: After the basic four steps are completed, they
are repeated, because the process must be continuously adjusted to adapt to
changes in the environment. The model also includes centralized management.
Finally, the complete process must be clearly and openly communicated to the
entire institutional community.
1—Initiate and Organize the Project
This first step is the most difficult and important. As it currently stands
in higher education business continuity, project initiation and organization
are usually handled in an ad hoc fashion, and segmented into departmental silos.
The IT unit worries about backup power supplies, and campus police administrators
worry about the locks on the doors. Regrettably, no one tries to answer the
basic questions: What can go wrong? How can our institution reasonably prepare
to minimize the impact, from an institution-wide perspective?
A campuswide team must be created to guide the project; no one unit has sufficient
knowledge of the institution’s operation to lead the project alone. Clear
lines of command and responsibility must be established, and the team (including
a project manager) must be given the training and clout to do the job. Finally,
senior management must believe that a comprehensive, institution-wide approach
to BCM is important.
2—Identify and Assess Risk and Vulnerability
The second step comprises the process of identifying events that can harm the
institution, the probability of the event occurring, and the event’s direct
impact on the institution. These events include things such as natural disasters,
loss of key personnel through death or illness, terrorist attacks, computer
hardware and software failures, and criminal damage. The resulting matrix forms
the basis for further planning.
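
One way to picture the resulting matrix is as a short table of events, each scored by probability and impact. The sketch below is illustrative only: the events are taken from the list above, but the 1-to-5 scales, the individual scores, and the `Risk` class are assumptions made for the example, not figures from any real assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """One row of a hypothetical risk matrix."""
    event: str
    probability: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int       # assumed scale: 1 (minor) to 5 (catastrophic)

    @property
    def score(self) -> int:
        # Simple probability-times-impact ranking; other weightings are possible.
        return self.probability * self.impact


# Illustrative scores only, not real assessment data.
risks = [
    Risk("Natural disaster", probability=1, impact=5),
    Risk("Loss of key personnel", probability=3, impact=3),
    Risk("Hardware or software failure", probability=4, impact=3),
    Risk("Criminal damage", probability=2, impact=2),
]

# Highest-scoring events first, as a starting point for further planning.
ranked = sorted(risks, key=lambda r: r.score, reverse=True)

for r in ranked:
    print(f"{r.event:<30} P={r.probability} I={r.impact} score={r.score}")
```

A single multiplied score is a deliberate simplification; a campus team may prefer to keep probability and impact as separate columns so that rare but catastrophic events are not buried in the ranking.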
3—Business Impact Analysis and Risk Reduction Strategies
This third step goes beyond identifying an event’s direct result and considers
how multiple events interact in a synergistic fashion to impact the institution.
For example, if the financial aid system goes down, what is the long-term impact
on the institution’s reputation and student recruitment? What systems
are critical to the institution’s operation, and in what time frame?
Everyone likes to believe that his function is essential, but in reality,
relatively few things are so essential that they cannot be handled within the
institution’s normal operational procedures. Common sense is required
in determining what is and isn’t time-critical. For instance, on a residential
campus, online course management systems (e.g., Blackboard) can be replaced
by whiteboards and class lectures. On the other hand, if most of the institution’s
students are distance learners, the system becomes more time-critical.
An integral part of this step is to devise and implement strategies to reduce
risk where it makes financial sense. For instance, emergency power and redundant
computer and network equipment may make sense for life-safety systems, but not
for classroom instructional systems. The key to these decisions is an honest
assessment of probability and net impact.
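
A rough way to make that assessment concrete is to compare what a safeguard costs per year with the loss it is expected to prevent. The sketch below borrows the common annualized-loss idea (yearly probability times cost per incident), which this article does not itself prescribe; the function names and every dollar figure are hypothetical.

```python
def expected_annual_loss(annual_probability: float, loss_per_event: float) -> float:
    """Expected yearly loss: chance of the event in a year times its cost."""
    return annual_probability * loss_per_event


def mitigation_pays_off(
    annual_probability: float,
    loss_per_event: float,
    residual_probability: float,
    annual_mitigation_cost: float,
) -> bool:
    """True if the loss avoided each year exceeds the yearly cost of the safeguard."""
    before = expected_annual_loss(annual_probability, loss_per_event)
    after = expected_annual_loss(residual_probability, loss_per_event)
    return (before - after) > annual_mitigation_cost


# Hypothetical figures: a safeguard protecting a high-impact system.
# Roughly 180,000 in avoided loss per year against a 50,000 yearly cost.
print(mitigation_pays_off(0.10, 2_000_000, 0.01, 50_000))

# Hypothetical figures: the same safeguard for a low-impact classroom system.
# Roughly 9,000 in avoided loss per year against the same 50,000 yearly cost.
print(mitigation_pays_off(0.10, 100_000, 0.01, 50_000))
```

The arithmetic only covers the financial side of the decision. Reputation, safety, and other costs that are hard to price still need the honest judgment described above.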
4—Business Recovery Planning and Testing
This step addresses what happens if, after reducing the risks, bad things still
happen. For example, if an explosion destroys the campus computing center, is
there an off-site facility that includes the hardware, software, and data to
continue providing critical IT functions?
If there is a natural disaster, how
will key employees get to work? Will they be secure in leaving their families?
Many of the plans and strategies developed in this phase will have multiple
applications. For example, plans for evacuation of portions of the campus (designed
for the event of a bomb threat) also could be used in a natural disaster. Finally,
all of the plans and strategies need to be regularly tested to the extent possible.
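
Regular testing is easier to sustain when someone tracks when each plan was last exercised. The sketch below is one minimal way to flag overdue plans; the one-year interval, the plan names, and the dates are all assumptions chosen for illustration.

```python
from datetime import date, timedelta

# Assumed policy for illustration: every plan is exercised at least once a year.
MAX_AGE = timedelta(days=365)

# Hypothetical plans and the dates they were last tested.
last_tested = {
    "Off-site IT recovery": date(2024, 3, 1),
    "Campus evacuation": date(2022, 9, 15),
}


def overdue_plans(plans: dict[str, date], today: date) -> list[str]:
    """Return the names of plans whose last test is older than MAX_AGE."""
    return sorted(name for name, tested in plans.items() if today - tested > MAX_AGE)


print(overdue_plans(last_tested, today=date(2024, 6, 1)))
```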
There’s no getting around it: Business continuity management is neither
easy nor cheap. But in an increasingly complex world driven by intertwined systems,
it is essential.
Doug Gale is president of Information Technology Associates, LLC (www.itassociates.org),
an IT consultancy specializing in higher education.
Could your campus business continue?
Take this quick quiz to find out if you’re “four-step” prepared.
Check all boxes applicable.
Step One: Project Initiation and Organization
Step Two: Risk and Vulnerability Identification
Step Three: Business Impact Analysis and Risk Reduction Strategies
Step Four: Business Recovery Planning and Testing
Don’t reinvent the wheel
BCM is complicated, and only the foolish take on the task without reviewing
the lessons learned by others. Head to these resources to learn more:
- Continuity Assurance International (www.continuityassurance.com)
- Practical Guide to Business Continuity Assurance, McCrackan, Andrew. Artech
House Publishers, 2004.
- Avoiding Disaster: How to Keep Your Business Going When Catastrophe Strikes,
Laye, John. Wiley, 2002.
- “Contingency and Recovery Planning: Checklist for Information Systems,”
- “CCS Begins Business Resumption Planning,” istpub.berkeley.edu:4201/bcc/Nov_Dec2002/infr.busresumption.html
(Defines business recovery parameters)
- “IS&T Critical Business Functions,” cns-pao.berkeley.edu/DOC/1999/bus-fcns.shtml
(Defines critical business functions in a higher education institution)
- “Campus-wide Planning for Business Continuity and Emergency Operations,”
(How the University of Michigan linked business continuity planning to emergency