On October 4th, 2021, Facebook, Inc. had a massive outage that took down all of its services, including Facebook, Instagram, and WhatsApp.
It was reported that despite all precautions, failsafes, and redundancies, Facebook, Inc. had to resort to finding someone with an angle grinder to cut the lock off a server cage in a company-owned datacenter on the east coast in order to restore services.
Facebook is not the first company to do this. It is like locking your keys in a car from 1985. (You know, before cars were electronic.)
If you ask most network techs how to work on edge devices without messing things up too badly, their answer is simply “reload in 10”.
I learned this at the Cisco academy in college. It’s a safeguard against my own poor decisions and typos. If I make a routing change that breaks everything, or creates a loop that floods the network and stops all packets in place, I know that in 10 minutes the firewall or router will reboot. Because the change was never saved, the reboot reverts it, and I can get back in and review my mistake. If the route change is good and everything works as intended, I can cancel the reload and all is well.
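To make the idea concrete, here is a minimal Python sketch of the "reload in 10" safety net. It is a toy model, not real device code: the `ToyRouter` class, its method names, and the route addresses are all made up for illustration. On actual Cisco IOS gear the equivalent steps are the `reload in 10` and `reload cancel` commands, followed by saving the running configuration once you are happy with it.

```python
class ToyRouter:
    """Hypothetical model of a router with a saved config and a live config."""

    def __init__(self, startup_config):
        self.startup = dict(startup_config)  # what the device boots with
        self.running = dict(startup_config)  # what the device is doing right now
        self.reload_at = None                # time (seconds) of the scheduled reboot

    def reload_in(self, minutes, now):
        # Arm the safety net: reboot at now + minutes unless it is cancelled.
        self.reload_at = now + minutes * 60

    def reload_cancel(self):
        # Disarm the safety net once the change is confirmed good.
        self.reload_at = None

    def configure(self, key, value):
        # Changes only touch the running config until they are saved.
        self.running[key] = value

    def save(self):
        # Make the running config permanent.
        self.startup = dict(self.running)

    def tick(self, now):
        # If the timer has expired, reboot: unsaved changes are thrown away.
        if self.reload_at is not None and now >= self.reload_at:
            self.running = dict(self.startup)
            self.reload_at = None
            return True
        return False


# Case 1: a bad route locks us out, so nobody is able to cancel the reload.
bad = ToyRouter({"default_route": "10.0.0.1"})
bad.reload_in(10, now=0)
bad.configure("default_route", "10.9.9.9")  # the typo that breaks everything
rebooted = bad.tick(now=600)                # ten minutes later the reboot fires

# Case 2: the change works, so we cancel the reload and save.
good = ToyRouter({"default_route": "10.0.0.1"})
good.reload_in(10, now=0)
good.configure("default_route", "10.0.0.254")
good.reload_cancel()
good.save()
still_up = not good.tick(now=600)           # no reboot, the change sticks
```

The order matters: the reload is scheduled before the risky change is made, and the change is saved only after the reload has been cancelled.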
In the world of remote work, almost 100% of network repair and configuration is done remotely over SSH or some other secure connection. This is a problem if you get locked out of a network or system. I have done it so many times in my home lab that I finally just ran a serial cable from my basement to my office. I could also just walk down to my basement network rack and poke a button.
So, what happens when there is no option to reload in 10? The solution is to break the glass. If you have a system or application that requires authentication to manage it, you need to create a break-glass account: an account that is used only in a worst-case scenario, similar to Facebook getting a guy with an angle grinder.
Planning is the key here. Build the process out in advance so that when that day comes, you and your employees know what to do. The application could even be Office 365.
With Office 365 you may have an admin account that can make changes to the tenant, but what happens when that admin account gets locked out or deleted? The solution is to create a break-glass account that is monitored for use. Someone must check out the account with a security manager, and the process is recorded. Once the account is checked back in, it is reset for the next event. Think of changing the password here as replacing the glass plate on a fire alarm after the fire is out.
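The check-out and check-in process described above can be sketched in a few lines of Python. This is only an illustration of the record keeping, under the assumption that the password lives in some kind of vault; it does not talk to Office 365 at all. The `BreakGlassAccount` class, the people's names, and the log fields are hypothetical.

```python
import secrets
from datetime import datetime, timezone


class BreakGlassAccount:
    """Hypothetical vault entry for one emergency account."""

    def __init__(self, name):
        self.name = name
        self._password = secrets.token_urlsafe(24)  # random password nobody memorizes
        self.checked_out_by = None
        self.audit_log = []                         # every use is recorded here

    def _record(self, event, who, detail):
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "who": who,
            "detail": detail,
        })

    def check_out(self, requester, approver, reason):
        # Only one person may hold the account at a time.
        if self.checked_out_by is not None:
            raise RuntimeError(f"already checked out by {self.checked_out_by}")
        self.checked_out_by = requester
        self._record("check_out", requester, f"approved by {approver}: {reason}")
        return self._password

    def check_in(self, requester):
        if self.checked_out_by != requester:
            raise RuntimeError("account is not checked out by this person")
        # Replace the glass: rotate the password so the old one is useless.
        self._password = secrets.token_urlsafe(24)
        self.checked_out_by = None
        self._record("check_in", requester, "password rotated")


account = BreakGlassAccount("break_glass")
old_password = account.check_out("alice", "security-manager", "admin account locked out")
account.check_in("alice")
new_password = account.check_out("bob", "security-manager", "quarterly drill")
```

In a real deployment the audit log would also need to raise an alert somewhere people will see it, since a break-glass account that is used quietly defeats the purpose of monitoring it.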
Until next time, stay safe and don’t get locked out of your own server room!
If your company does not have a break-glass account, please let us know. NuWave would love to discuss how we can help!