Amazon blame ‘human error’ for Netflix outage

Many users complained of an outage on Christmas Eve that stopped them watching their TV shows and movies via Netflix. Amazon have blamed ‘human error’ for the server downtime.

Amazon have said that a developer mistakenly deleted part of the ‘ELB state data’, which the Elastic Load Balancing (ELB) service uses to distribute traffic across multiple servers. When the issue happened, it took the company several hours to work out exactly what was going wrong.

Amazon said “The service disruption began at 12:24 PM PST on December 24th when a portion of the ELB state data was logically deleted. This data is used and maintained by the ELB control plane to manage the configuration of the ELB load balancers in the region (for example tracking all the backend hosts to which traffic should be routed by each load balancer). The data was deleted by a maintenance process that was inadvertently run against the production ELB state data. This process was run by one of a very small number of developers who have access to this production environment. Unfortunately, the developer did not realize the mistake at the time. After this data was deleted, the ELB control plane began experiencing high latency and error rates for API calls to manage ELB load balancers.”

Initial efforts to restore the ELB state data from a snapshot taken just before the accidental deletion, a process which took several hours, did not work. A second method worked better; however, it took some time to implement correctly.

Amazon's AWS team had to merge the new ELB state data with the old, a process which took three hours alone. They then had to spend five more hours gradually re-enabling all of the service workflows and APIs in a way that didn't cause problems for the correctly running processes. Amazon said the system was operating normally by 12:05 PM PST the following day.

The company said “Last, but certainly not least, we want to apologize. We know how critical our services are to our customers’ businesses, and we know this disruption came at an inopportune time for some of our customers. We will do everything we can to learn from this event and use it to drive further improvement in the ELB service”.

They have since implemented new policies to ensure this cannot happen again. The ELB state data is now harder to delete without specific approval. The team say “We are confident that we could recover ELB state data in a similar event significantly faster (if necessary) for any future operational event.”

KitGuru says: A lesson learned.
