Witness Downtime June 18th 2018


This post is a quick update about my current witness downtime on June 18th 2018, as both my main server and my backup server are replaying the blockchain.

What happened?

I'm currently in Switzerland for a big blockchain / cryptocurrency event (more on that in another post). Before I left home for the airport, I wanted to make sure that the Ubuntu packages on both of my servers were up to date, so I ran sudo apt update and sudo apt upgrade.

The updates completed without any errors, and updates like that had never caused any problems in the past. Since my time was running short, I went on my way.

But just when I was on the train to the airport, I got a notification on my mobile phone that my witness had missed a block. Well, you can imagine how perplexed I was, but luckily I had prepared for such an event.

My Reaction

The first thing I did was to log in remotely to both servers (while still on the train), where I found out that both steemd instances had stopped working.

It seems that the Ubuntu package updates were so fundamental that the Docker image somehow stopped working, which caused my witness to miss a block.

Now, as soon as I realized that both servers weren't able to produce blocks, I disabled my witness, restarted both servers and started to replay the blockchain.

However, this process takes longer the older the blockchain gets.

A replay always starts from block 0, and a new block is added every 3 seconds. So the more blocks the blockchain has, the longer it takes to reach the current block number.
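To put rough numbers on this, here is a minimal Python sketch. The 3-second block interval is taken from the text above. The helper names and the replay rate of 1,000 blocks per second are made-up illustration values, not measurements from the servers in this post.

```python
BLOCK_INTERVAL_SECONDS = 3  # from the post: one new block every 3 seconds


def blocks_produced(days: float) -> int:
    """Upper bound on blocks added in `days` days (assumes no missed blocks)."""
    return int(days * 24 * 60 * 60 / BLOCK_INTERVAL_SECONDS)


def replay_hours(total_blocks: int, blocks_per_second: float) -> float:
    """Hours needed to replay `total_blocks` at a constant, hypothetical rate."""
    return total_blocks / blocks_per_second / 3600


if __name__ == "__main__":
    per_day = blocks_produced(1)
    per_year = blocks_produced(365)
    print(f"blocks per day:  {per_day}")
    print(f"blocks per year: {per_year}")
    # Assumed rate of 1,000 blocks/s, purely for illustration.
    print(f"replaying one year of blocks: {replay_hours(per_year, 1000):.1f} h")
```

Under these assumptions every extra day of chain history adds up to 28,800 blocks to each future replay. A real replay does not run at a constant rate, so treat the result as an order-of-magnitude figure only.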

Right now, the server that is furthest along has reached about 82% sync with the blockchain.

My estimate is that the blockchain should be fully synced within the next few hours, at which point I'll activate the witness again.
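An estimate like this can be sketched as a simple extrapolation. The Python snippet below assumes the replay continues at the same average rate it has had so far, which is a simplification. The 82% figure is from the text above, while the function name and the 6 hours of elapsed time are hypothetical example values.

```python
def remaining_replay_hours(progress: float, elapsed_hours: float) -> float:
    """Estimate hours left, assuming the replay keeps its average rate so far."""
    if not 0 < progress <= 1:
        raise ValueError("progress must be in the range (0, 1]")
    return elapsed_hours * (1 - progress) / progress


if __name__ == "__main__":
    # 0.82 is the sync progress quoted above; 6.0 elapsed hours is made up.
    print(f"{remaining_replay_hours(0.82, 6.0):.1f} hours left")
```

If later blocks take longer to process than earlier ones, this linear estimate will be too optimistic, so it is best read as a lower bound.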

What I've learned

Now, I'm a big believer in honesty, transparency and learning from everything I do.

Out of this experience, I'm taking away two core things:

1.) I should always double-check - even if the same action has gone 100 times without causing any problems.

2.) I should take my time for important things. Updating Ubuntu packages isn't a great thing to do when the train leaves in 30 minutes.

While these 2 lessons are important, I'm also proud of my reaction time. Even while on a train ride to the airport, I disabled the witness fast enough to miss no more than 1 block.

Steem is a huge priority in my life and I'm always giving my best!

I will give an update as soon as my witness is activated and producing blocks again (should be within the next few hours).


Update 1:

Both replays stopped. One with an error and one without. Changed some things up and restarted replay. Let's see how it goes this time.


Are you going to the Crypto Valley Conference 2018?

Good Job Mr. Holmes - that's where I'll be!


Thanks for explaining parts of the technical background. And it's almost the same...never commit and run ;)