Recently a major service we use (not naming names) went down for almost the entire day. We get that downtime can happen unexpectedly. However, that didn’t change the fact that we had no access to several core functions our team use daily. Thankfully, all issues were resolved within a day.
This does highlight why resilient hosting, with enough redundancy, is important though.
Here is part of the (modified to keep names private) email we received explaining the issue mentioned earlier:
What happened?
The application and website are securely hosted on AWS, which normally provides an exemplary level of service.
Unfortunately, AWS incorrectly believed there was an administrative issue with our account that temporarily suspended all access.
As AWS only provides ticket-based support with minimal escalation options, this took longer than expected to resolve.
Will this happen again?
Access has been reinstated, and we’re working with AWS to rectify their information.
Additionally, we now have a dedicated point of contact in place to ensure any issues in the future can be resolved more promptly.
We do not expect this particular issue to happen again.
We know that access to your processes and tasks is critical to how you manage your business.
You may be wondering, why mention any of this?
We’re not looking to name & shame anyone. We of all people know that things can go wrong unexpectedly.
Instead, we thought it would be good to highlight the importance of making sure the service providers you use have the appropriate measures in place for when things do go wrong. Here are a couple of key things we think you should have when downtime occurs:
Ideally, you want clear & prompt communication with your service provider. Often a lot of the frustration in these situations can come from the lack of information about why you can’t access the services you need – especially if the service in question impacts your customers.
Some transparency, in a timely manner, goes a long way in our experience.
We’ve found that providing regular updates on how the fix is coming along helps a lot with managing frustration and the feeling of waiting around.
If possible, getting an estimated time for when a fix might be in place can do wonders. This means you can let your customers know when they can expect normal service to resume.
However, from first-hand experience, these estimates can often change as work is done, so take them with a pinch of salt. Sometimes fixes are quicker than expected and sometimes things are more complicated than first thought.
As much as we’d like to avoid it, downtime does happen to us sometimes. When it does, this is what we like to do:
The reason we like to make use of the Status Page is that it gives all our customers one place they know they can go to. This means we can update it quickly, and get on with fixing the issues at hand.
You can find the Status Page by clicking here, or at any time from the top of the homepage of the HA website.