My random thoughts as a witness...

in #witness-update, 7 years ago (edited)

From a technical point of view, preparing a consensus node is the simplest thing we have to do as witnesses. The setup can be done in a few hours; it's just a piece of software running on decent hardware. The software is open source and published on GitHub, reasonably stable hardware can be rented from a small hosting company for $40-50 per month, and all we need to do is install Linux and copy and paste a few commands. That's it! We are witnesses now.

Are we?

"It's not in production unless it's monitored"

This is the rule of thumb we should follow. Running a server without monitoring is not a good idea. Systems crash from time to time, and that is OK; we just need to know how to handle it. Murphy's law works all the time: anything that can go wrong will go wrong.

If we can afford the cost of running hardware for a witness node, we should also be able to buy a small VPS and run a simple monitoring system for our own purposes. We should act proactively and take action before it's too late. The monitoring system can send us a notification when our server is not happy and can also temporarily remove our witness from the block producer pool.

If you don't know where to start, please check Zabbix, Sensu, Nagios and conductor.

What to monitor?

We definitely want to monitor disk usage. It may sound stupid if we have storage with a lot of room to grow, but sometimes one broken script or application can fill up the disk almost instantly, and the system becomes unstable in just a few minutes. Isn't that a "stupid" issue?
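A disk check like the one described above can be sketched in a few lines of Python. This is only a minimal illustration, not a replacement for Zabbix or Nagios: the function names and the 80%/90% thresholds are my own assumptions, and the monitored path should be whichever filesystem holds the node's data.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the used share of the filesystem holding `path`, in percent."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a size of zero.
        return 0.0
    return 100.0 * usage.used / usage.total


def disk_alert(percent_used, warn_at=80.0, crit_at=90.0):
    """Map a usage percentage to a status. Thresholds are illustrative."""
    if percent_used >= crit_at:
        return "CRITICAL"
    if percent_used >= warn_at:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    pct = disk_usage_percent("/")
    print(f"/ is {pct:.1f}% full: {disk_alert(pct)}")
```

Run from cron every minute or so, a script like this is enough to catch a runaway log file before the disk is full.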

Another critical factor is memory. steemd loves RAM because of its architecture. The shared_memory.bin file grows constantly, getting bigger every day, and to keep our nodes in good shape we need to be sure that /dev/shm and swap are large enough to accommodate it and let us survive the next 24 hours ;-)

If memory usage is close to the limit, there is one cool thing we can use: zram. zram works as a kernel module and creates a compressed block device in RAM, so part of the shared_memory.bin file can be compressed before it is stored in RAM. Of course, compressing and decompressing the data costs additional CPU cycles, but the advantages seem to outweigh the disadvantages.
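To make "survive the next 24 hours" measurable, we can estimate how many days of growth are left. The sketch below is hypothetical: the function names are mine, the daily growth rate is something we have to measure ourselves, and treating the tmpfs size as the hard limit is a simplifying assumption that ignores swap.

```python
import os
import shutil

GIB = 1024 ** 3


def days_until_full(current_size, capacity, growth_per_day):
    """Estimate the days left before a growing file reaches `capacity`.

    All arguments use the same unit (bytes here). Returns infinity when
    the file is not growing.
    """
    if growth_per_day <= 0:
        return float("inf")
    return max(capacity - current_size, 0) / growth_per_day


def measure(shm_file, mount="/dev/shm"):
    """Return (file size, filesystem size) in bytes.

    Both paths are assumptions; adjust them to the node's actual layout.
    """
    return os.path.getsize(shm_file), shutil.disk_usage(mount).total


if __name__ == "__main__":
    # Example numbers only: a 60 GiB file, a 64 GiB tmpfs, 0.5 GiB/day growth.
    print(days_until_full(60 * GIB, 64 * GIB, 0.5 * GIB))
```

If the result drops below a couple of days, that is the signal to grow /dev/shm or swap before the node falls over.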

Error log

By default, ./steemd prints its output to stdout, and after some time we lose the history. If anything goes wrong with the node, we won't be able to check what happened. This is easily fixed by also writing the output to a file:

./steemd 2>&1 | tee -a witness.log

Don't be afraid to get your hands dirty with code

Probably many of us don't code, especially in languages like C or C++, but we can at least try to stay as close to the code as possible and start watching the steemd issue tracker on GitHub. There are many interesting discussions between users, contributors and the development team, and it's also a very valuable knowledge base about Steem.

I missed the block but it's not a big deal

Sure... another witness will handle the transactions and produce the block, but the real question is: why did we miss it? If our server is completely down and we don't know about it, there is no excuse for that.

If the situation is really bad and we don't want to miss any more blocks, it's good practice to temporarily remove the witness from the producer pool by setting STM1111111111111111111111111111111114T1Anm as the new signing key.

Steem is only as stable as our nodes are. People vote for us because they put their trust in us, and it is our responsibility to keep the platform reliable.
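The decision of when to switch to the null key can be automated. The sketch below covers only the decision logic and rests on several assumptions: that a monitoring job periodically records the witness's missed-block counter, that three new misses inside the sampled window is the trigger (an arbitrary number), and that broadcasting the actual witness update is done elsewhere. The function names are hypothetical.

```python
# Null signing key quoted in the post; setting it disables block production.
NULL_KEY = "STM1111111111111111111111111111111114T1Anm"


def should_disable(missed_history, max_new_misses=3):
    """Decide whether the witness should be taken out of the producer pool.

    `missed_history` holds samples of the missed-block counter, oldest
    first. The threshold is illustrative, not a recommendation.
    """
    if len(missed_history) < 2:
        return False
    return missed_history[-1] - missed_history[0] >= max_new_misses


def next_signing_key(missed_history, active_key, max_new_misses=3):
    """Return the key the witness should be signing with right now."""
    if should_disable(missed_history, max_new_misses):
        return NULL_KEY
    return active_key
```

Keeping the decision separate from the broadcast makes it easy to test the rule and to alert a human before anything is actually pushed to the chain.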

Price feed

As witnesses, we should update the price feed at least once per day, and there are many tools we can use for that. No matter whether we are at the top or the bottom of the witness list, it is something we are responsible for.
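The "at least once per day" rule is easy to turn into a monitoring check. This is a minimal sketch assuming we already know when our feed was last published; how that timestamp is obtained is not shown, and the function name is my own.

```python
from datetime import datetime, timedelta, timezone


def feed_is_stale(last_update, now=None, max_age=timedelta(hours=24)):
    """Return True if the last price feed update is older than `max_age`.

    Both datetimes must be timezone-aware; `now` defaults to the current
    UTC time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_update > max_age


if __name__ == "__main__":
    last = datetime.now(timezone.utc) - timedelta(hours=30)
    print("stale" if feed_is_stale(last) else "fresh")
```

Wired into the same alerting as the disk and memory checks, a stale feed shows up as just another notification.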


(https://steemian.info/witnesses, @drakos)

Witness node as a SaaS...?

We are lazy and love to automate everything. We can run a witness node as a Docker container in just one click, and this is a real time saver, but we should still know the basics: how to run steemd manually, where the configuration file is, what the arguments are, what a replay or a resync is, where the shared_memory.bin and block_log files live, and where the logs are, because we never know when we will have to go back to the roots... ;-)


If you think I can be a good witness, please vote for me.

Thank you!

Check out PRTG Network Monitor. Although it is unfortunately Windows-based, it's the best monitoring tool ever. Check out the sample here.

I am using it to monitor my seed nodes as well as to gather analytics from others. It has full SSH access, process monitoring and tons of features.

Really recommended. Light years better than the MRTG/Monit alternatives.

For graphs I'm using Grafana, and I haven't found anything more flexible so far ;-) For monitoring/alerting, Zabbix. ;-)

Very tight in competition. Anyway, great work! Nice to see in depth knowledge. Voted for your W :)

Thanks ;-)

Monitoring and logging are too often overlooked, nice work.

In your first paragraph:

The software is Open Source and published on GitHub

The github link is going to steemit/steemd rather than steemit/steem (without 'd') in case you wanna update.

Thanks, it’s fixed now. ;-)

That's a lot of technical issues there. Good job maintaining it!

Thanks, handling the issues is the best part of this game ;-)

Steem is only as stable as our nodes are. People vote for us because they put their trust in us, and it is our responsibility to keep the platform reliable.

Well, I vote for whoever has humor. I don't care if Steem will be alive or dead, in the end we'll be laughing.. :)

IMHO you need to be more active and play to fields. After I read this post I thought for a second and checked your profile to see why I followed you; it was because of the post about the bandwidth issue.

*edit: I'm not sure about that saying, play to the crowd? maybe play to the stands.. something like that.

Monitor or GTFO :)