Cloud: organizations should ‘design for failure’ to prevent outages

Recent cloud outages highlight the need for organizations to start designing for failure on public cloud services, says Databarracks.

The recent Amazon Web Services (AWS) outage, which impacted a number of high-profile websites and service providers, has highlighted the importance of the specific skillsets needed to support public cloud services. This is according to Radek Dymacz, head of R&D at disaster recovery and AWS consulting partner Databarracks, who argues that organizations should adopt a ‘design for failure’ approach to prevent outages.

Gartner has forecast that the worldwide public cloud services market will grow by 18 percent in 2017, and additional research from the Cloud Industry Forum (CIF) shows that the overall cloud adoption rate in the UK now stands at 88 percent. For Radek, this growth has contributed to a shift in the cloud marketplace.

He explains: “The growth of hyperscale cloud services has led to an increase in managed services for these clouds. We have seen telecoms providers, data centre owners and managed service providers launch their own cloud services and, in many cases, later pull out of the market. Many of these businesses are now focusing their efforts on providing managed services for the hyperscale public clouds of AWS, Azure and Google. However, platforms like AWS need a different approach to traditional hosting.

“The ability to design for failure is essential to the value proposition of public cloud platforms, and yet organizations are still consuming AWS services as though they were building a traditional hosting environment. The great strength of platforms like AWS is that you can build in resilience in a way that scales with your budget. At the larger end of the spectrum, this might involve using object storage across multiple Availability Zones and even Regions to provide an extra layer of resilience. This is expensive but, for large organizations, so is downtime. We recommend that all organizations adopt a ‘design for failure’ approach: if any single element fails, the cause is easily identifiable and there is a known resolution.
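To make the multi-Region object storage idea concrete, here is a minimal sketch in Python with boto3. The bucket names, Regions and IAM role ARN are illustrative placeholders rather than anything Databarracks prescribes; it simply enables S3 cross-Region replication so that a copy of every object lives in a second Region:

```python
# Minimal sketch: replicate an S3 bucket to a second Region so object data
# survives the loss of a single Region. All names and ARNs are placeholders.
import boto3

s3 = boto3.client("s3", region_name="eu-west-1")

SOURCE_BUCKET = "example-primary-bucket"             # assumed, in eu-west-1
DEST_BUCKET_ARN = "arn:aws:s3:::example-dr-bucket"   # assumed, in eu-west-2
REPLICATION_ROLE_ARN = "arn:aws:iam::123456789012:role/s3-replication-role"

# Cross-Region replication requires versioning on the source bucket
# (and on the destination bucket, which must already exist).
s3.put_bucket_versioning(
    Bucket=SOURCE_BUCKET,
    VersioningConfiguration={"Status": "Enabled"},
)

# Replicate every object to the destination bucket in the second Region.
s3.put_bucket_replication(
    Bucket=SOURCE_BUCKET,
    ReplicationConfiguration={
        "Role": REPLICATION_ROLE_ARN,
        "Rules": [
            {
                "ID": "replicate-everything",
                "Status": "Enabled",
                "Prefix": "",  # empty prefix = all objects
                "Destination": {"Bucket": DEST_BUCKET_ARN},
            }
        ],
    },
)
```

The point of the sketch is that the redundancy is declared up front, as part of the design, rather than bolted on after an outage.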

“Customer demand reflects this trend: businesses are now more mature in their use of cloud services. They have moved beyond testing and are now seeking help to increase resilience, optimise cost and support their environments round-the-clock. When looking for that support, organizations must select a supplier with genuine expertise rather than a cowboy. One trick is to listen to the naming conventions a supplier uses, as this is a surprisingly effective way to identify people who have not changed their approach to infrastructure. For example, consultants with little experience of the AWS ecosystem will use terminology such as ‘server’ instead of ‘instance’.

“Also, don’t be fooled by brand champions: almost anyone can pay their way through certification with AWS or Azure. Always ask your provider how long they have been working with their chosen platform, and ask to see multiple, specific case studies. This should help you find an experienced public cloud provider, but if in doubt always opt for shorter contracts.

“Although launching services in AWS is simple, maintaining them requires a highly specialised skillset. Working with a demonstrably experienced AWS provider typically involves redesigning the way your applications work, specifically around decoupling services and resources. But there’s a huge grey area between infrastructure, resource provision, application functionality and service delivery. You should therefore choose a provider who has the developers and the support team to occupy this grey area, and who can work collaboratively with you to keep things running,” concludes Radek.
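As a rough illustration of the decoupling Radek mentions, the sketch below (again Python with boto3; the queue name and handler are hypothetical, and this is not a pattern Databarracks specifies) puts an SQS queue between a producer and a worker so that either side can fail or scale independently of the other:

```python
# Minimal sketch of decoupling two services with an SQS queue.
# Queue name and handler are hypothetical placeholders.
import json
import boto3

sqs = boto3.client("sqs", region_name="eu-west-1")

# create_queue returns the existing queue URL if the queue is already there.
queue_url = sqs.create_queue(QueueName="example-order-events")["QueueUrl"]

# Producer side: the front-end enqueues work and returns immediately,
# so a slow or failed worker fleet does not take the front-end down with it.
sqs.send_message(
    QueueUrl=queue_url,
    MessageBody=json.dumps({"order_id": "12345", "action": "provision"}),
)

def handle(event):
    # Hypothetical handler; a real worker would do the actual processing here.
    print("processing", event)

# Consumer side: a separate worker fleet polls at its own pace and only
# deletes a message once it has been processed successfully.
resp = sqs.receive_message(
    QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=10
)
for msg in resp.get("Messages", []):
    handle(json.loads(msg["Body"]))
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```

Because the queue buffers work, the failure of any single consumer instance does not lose requests; they simply wait to be picked up by another worker.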

www.databarracks.com

