Introduction
As some may have seen, the site has been going offline when the WebSocket server returns a 500 Internal Server Error. This is due to a Slowloris-type attack against the WebSocket connection.
It was originally believed that nginx was not vulnerable to this attack, but that is not true. Nginx does in fact have a connection limit, even if it is not set explicitly in the configuration. The scarce resource is the maximum number of simultaneous worker connections. This number can be calculated as (worker_connections * worker_processes) and equals 512 in the default nginx configuration.
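As a rough illustration of that arithmetic, here is a minimal Python sketch. The function names are hypothetical, and the default values (one worker process, 512 connections per worker) are assumed from the figure quoted above; packaged configurations often ship different values. The second function reflects that a proxied connection, such as a WebSocket passed to a backend, occupies one slot for the client side and one for the upstream side.

```python
# Assumed defaults, matching the 512 figure quoted in the post.
WORKER_PROCESSES = 1
WORKER_CONNECTIONS = 512


def max_simultaneous_connections(processes: int, connections_per_worker: int) -> int:
    """Upper bound on connections nginx can hold open at once."""
    return processes * connections_per_worker


def max_proxied_clients(processes: int, connections_per_worker: int) -> int:
    """Rough client capacity when every client is proxied to a backend.

    Each proxied client uses two connections (client side and upstream side),
    so the usable capacity is about half of the raw limit.
    """
    return max_simultaneous_connections(processes, connections_per_worker) // 2


limit = max_simultaneous_connections(WORKER_PROCESSES, WORKER_CONNECTIONS)
proxied = max_proxied_clients(WORKER_PROCESSES, WORKER_CONNECTIONS)
print(limit)    # 512
print(proxied)  # 256
```

With numbers this small, a single machine can plausibly fill every slot on its own.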
There are a few ways to prevent this attack; recommendations on how best to do so can be found at the bottom of this post.
Why this is important
This is an important issue to patch because it allows a single machine to take down the server. This is not a distributed denial of service that requires a botnet, and it is not expensive to run.
Since this is the only existing UI, an outage is likely to be misinterpreted as more damaging than it actually is. I'm actively working on a secondary UI so the service becomes federated and therefore more resilient. It will be released as open source so others can do the same.
Technical Explanation
Slowloris tries to keep many connections to the target web server open and hold them open as long as possible. It accomplishes this by opening connections to the target web server and sending a partial request.
Affected servers will keep these connections open, filling their maximum concurrent connection pool, eventually denying additional connection attempts from clients.
This technique currently works not only for slow POST and slow header attacks; it also works particularly well against this server's WebSocket connections.
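The effect described above can be modeled without any networking. The following is a toy sketch only: `ConnectionPool` is a hypothetical class, and the capacity of 512 is assumed from the default figure mentioned in the introduction. It shows how one client that opens connections and never finishes them leaves no slots for anyone else.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionPool:
    """Toy model of a server with a fixed number of connection slots."""
    capacity: int
    open_connections: list = field(default_factory=list)

    def accept(self, client: str) -> bool:
        """Take a slot for the client, or refuse if the pool is full."""
        if len(self.open_connections) >= self.capacity:
            return False
        self.open_connections.append(client)
        return True

    def close(self, client: str) -> None:
        """Free one slot held by the client."""
        self.open_connections.remove(client)


pool = ConnectionPool(capacity=512)  # assumed default limit

# A normal visitor connects, gets a response, and disconnects.
pool.accept("visitor")
pool.close("visitor")

# A slow client keeps opening connections and never completes or closes them.
held = sum(pool.accept("slow-client") for _ in range(600))

print(held)                    # 512
print(pool.accept("visitor"))  # False
```

The slow client sends almost no data, which is why the attack is cheap: the cost is in the slots held, not in bandwidth.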
Recommendations
- If the WebSocket connection fails, the site should fail gracefully. If necessary you can poll for reconnection.
- Increase the number of worker processes or worker connections; the choice should be based on the resources of the server.
- Limit WebSocket connections per IP, and limit the number of HTTPS connections per IP. The limit should not be 1, but it should also not be infinite.
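The per-IP limit in the last recommendation can be sketched as a simple counter. This is an illustration of the idea, not the site's implementation: `PerIPLimiter` and the cap of 10 are hypothetical, and in nginx itself this kind of cap would normally be configured (the `limit_conn` directive exists for this purpose) rather than coded by hand.

```python
from collections import defaultdict


class PerIPLimiter:
    """Caps the number of simultaneous connections from a single IP."""

    def __init__(self, max_per_ip: int) -> None:
        self.max_per_ip = max_per_ip
        self._counts = defaultdict(int)

    def try_acquire(self, ip: str) -> bool:
        """Register a new connection, or refuse if the IP is at its cap."""
        if self._counts[ip] >= self.max_per_ip:
            return False
        self._counts[ip] += 1
        return True

    def release(self, ip: str) -> None:
        """Call when a connection from this IP closes."""
        if self._counts[ip] > 0:
            self._counts[ip] -= 1
        if self._counts[ip] == 0:
            del self._counts[ip]  # drop idle IPs so the table stays small


limiter = PerIPLimiter(max_per_ip=10)  # hypothetical cap: more than 1, far from infinite

# One address tries to open 600 connections; only the first 10 are admitted.
admitted = sum(limiter.try_acquire("203.0.113.5") for _ in range(600))

print(admitted)                             # 10
print(limiter.try_acquire("198.51.100.7"))  # True
```

A cap above 1 matters because several legitimate users can share one address behind NAT, and a browser may open more than one connection at a time.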
Commentary
I was sad to see that my last security report got very few upvotes compared to self-referential posts about price or women taking pictures of themselves wearing little clothing. I was hoping the community would value input that makes the platform more stable and secure.
Thankfully it was still seen by the developers and server administrators, who were grateful, and I'm glad to see the issue was fixed quickly before it could be abused. But honestly, I had expected the community to use their weight to reward people who are actively improving security, so that I could be rewarded without putting more demand on the developers' funding.
I hope this one does better. Unfortunately I will not be answering questions because I have limited time and I will be focusing my remaining time today on improving the steem golang library.
I really have not had time to do comprehensive testing of both the code base and the server, so I expect to find more. Anything that is too dangerous to share publicly will be disclosed privately to the developers.
You should go talk about this on GitHub or on Slack!
Does anyone want to bet that this post won't get the attention it deserves? Communication is so broken on this project. Between the Slack, which is way too chatty, with important info getting lost in the noise and completely ignored by everyone, and Steemit, where devs and witnesses are more interested in optimizing payout by upvoting popular posts than in reading actual discussions about the project, Steem really has ADHD. There isn't even a forum where messages can, you know, just stay around until someone pays enough attention to read and reply. I already told that to @ned, but he couldn't care less and wants to eat his own dog food by using Steem, never mind the fact that the 24h focus makes it totally impractical to have any sort of constructive discussion over a longer period of time. I can already see this project racing into the wall at full speed. There is just no way to ring an alarm bell and be listened to, or to have any sort of structured discussion.