Tuesday, February 7, 2012

SaaS or On-Premise?
What Happens in Case of a Disaster?

Let's look at what happens with the ITSM service when a disaster strikes. Most organizations are still hosting their IT service management application in their corporate data center. Their continuity manual typically dictates that the most critical business service gets recovered first when the data center is struck by a disaster. There may not even be a recovery plan for the ITSM service. All this made sense in the days before the cloud.

Back then a lot of effort went into making sure that the latest version of the disaster recovery plans was kept at several offsite locations, often in hard-copy format. This included all emergency contact information, the order in which the services were to be recovered, and the detailed steps that had to be taken to recover each of those services.

Since the ITSM service was being delivered from the same location as the other services, the assumption had to be made that the ITSM information would not be available when the business critical services were to be recovered.

This ITSM information, especially the configuration management database, would be able to streamline the disaster recovery effort. To accomplish this, however, the ITSM solution should not be hosted at the same location as the services that the business needs to make money. That allows the people performing the recovery to look up how the business services were configured, and which configuration items at other locations should be used to replace the hardware that has been lost due to the disaster.
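To make that lookup concrete, here is a minimal sketch of the kind of question a configuration management database can answer during a recovery. It is not how any particular ITSM tool works; the `ConfigItem` record, the `replacement_candidates` function, and the sample data are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConfigItem:
    """Hypothetical, heavily simplified CMDB record."""
    name: str
    ci_type: str                   # e.g. "db-server"
    location: str                  # site where the CI is installed
    service: Optional[str] = None  # business service it supports; None = spare


def replacement_candidates(cmdb, lost_location):
    """Map each lost, in-use CI to spare CIs of the same type at other sites."""
    spares = [
        ci for ci in cmdb
        if ci.location != lost_location and ci.service is None
    ]
    plan = {}
    for ci in cmdb:
        if ci.location == lost_location and ci.service is not None:
            plan[ci.name] = [s.name for s in spares if s.ci_type == ci.ci_type]
    return plan


# Illustrative data only.
CMDB = [
    ConfigItem("db-01", "db-server", "HQ", "Order Entry"),
    ConfigItem("web-01", "web-server", "HQ", "Order Entry"),
    ConfigItem("db-spare-01", "db-server", "Branch"),
    ConfigItem("web-07", "web-server", "Branch", "Intranet"),
]

print(replacement_candidates(CMDB, "HQ"))
```

In this sample, the database server lost at "HQ" has a spare waiting at "Branch", while the lost web server has none, because the only other web server is already in use. That is exactly the kind of gap you want to discover from a CMDB that is still reachable, not from memory.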

So when we consider the continuity management aspects, SaaS is clearly preferable to an on-premise ITSM solution. Using a cloud-based ITSM solution also means that it can be used to coordinate the recovery effort without the need for everyone to physically be in one location (getting together can be difficult, depending on the type and severity of the disaster). As long as people are able to get an internet connection, they can log on to the ITSM service to see what needs to be done next.

The odd thing about this post is that it was triggered by a statement I came across last week. It warned against moving your ITSM tool to the cloud. The author reasoned that, when everything is down, you need your ITSM service to manage this situation. For some reason this convinced him that the ITSM service was ‘too important’ to be moved to the cloud. In my experience, when everything is down, the on-premise ITSM application is included in the ‘everything’.

It is important to note, however, that getting ITSM as a service does not absolve you from ensuring that it gets recovered when your service provider's production facility is struck by a disaster. In our case, we have made sure that the infrastructure of the ITRP service is distributed over two separate physical locations. It is configured to fail over automatically in case one of the facilities goes down. And we did not stop there. In the extreme event that both facilities are unavailable, the most recent daily backup of the data and the hardened server images will be obtained from a third facility to recover the service there.
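The decision logic behind such a setup can be summarized in a few lines. The sketch below only illustrates the three situations described above; the facility labels "A", "B" and "C" and the `recovery_action` function are assumptions made for this example and say nothing about how the ITRP infrastructure is actually implemented.

```python
def recovery_action(facility_a_up: bool, facility_b_up: bool) -> str:
    """Return the action for a two-facility service with a third backup site.

    Facilities "A" and "B" are the two production locations; "C" is the
    hypothetical third site that holds the daily backup and server images.
    """
    if facility_a_up and facility_b_up:
        return "normal: both facilities available"
    if facility_a_up or facility_b_up:
        survivor = "A" if facility_a_up else "B"
        return f"failover: facility {survivor} carries the service"
    return "restore: rebuild at facility C from the latest daily backup"


for a_up, b_up in [(True, True), (False, True), (False, False)]:
    print(a_up, b_up, "->", recovery_action(a_up, b_up))
```

Note the difference between the last two cases: an automatic failover should lose no data, whereas a restore from a daily backup can lose up to a day of changes.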

From a continuity perspective, that may just be the best thing about ITRP; now you can feel good about not maintaining or testing a recovery plan for your ITSM service.