[Systems Geek Series] Re-Factor design for ingestron

in #beyondbitcoin, 8 years ago (edited)

Okay, so this post necessarily covers both the way I design cluster compute systems and my code design for ingestron. In ingestron 0.1 we learned that the limiting factor for dumping the chain is steemd: it couldn't keep up with the other pieces of the system. So I went back to the drawing board and tried to use goroutines and channels to multithread ingestron. That didn't work out so well, because I didn't keep it simple. Now I'm going to attempt the same thing again, but this time I will follow a pattern:

What my clusters look like when viewed through Weave Scope (https://weave.works):

Processes on a Single Node: Screenshotfrom2016-09-1705-19-417ddb5.png

My 4-Machine Cluster: Screenshotfrom2016-09-1705-22-567690a.png

Containers on a Cluster: Screenshotfrom2016-09-1705-23-13469d0.png

All the Processes on a Cluster: Screenshotfrom2016-09-1705-24-54e7650.png

Here's how ingestron needs to work to be fast (the target is ~3h for a full ingestion; it currently takes 24h):


Clusterpattern0564a.jpg

Clusterpattern11c19a.jpg

So please know that building ingestron involves a couple of distinct steps:

Making the basic reads and writes (done)
Cleanly reading all json
Making it concurrent in golang
Providing a config file
Making a single instance read from multiple servers
Designing a trigger system for backups
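To make the "concurrent" and "multiple servers" steps a bit more concrete, here's a minimal sketch of the shape I have in mind: one worker goroutine per steemd server, all pulling block numbers from a shared channel, so a slow server only holds up its own worker. The names here (`Block`, `FetchFunc`, `Ingest`, the server names) are made up for illustration and are not ingestron's real code, and the fetch function is injected so the sketch runs without a steemd to talk to.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// Block is a hypothetical stand-in for one block returned by steemd.
type Block struct {
	Num    int
	Server string
	Raw    string
}

// FetchFunc fetches one block from one server. In a real ingester this would
// be an RPC call to steemd; here it is injected so the sketch runs offline.
type FetchFunc func(server string, num int) (Block, error)

// Ingest fetches blocks first..last using one worker goroutine per server.
// It returns the blocks sorted by number, plus the first fetch error seen.
func Ingest(servers []string, first, last int, fetch FetchFunc) ([]Block, error) {
	if len(servers) == 0 {
		return nil, errors.New("no servers configured")
	}

	jobs := make(chan int)      // block numbers waiting to be fetched
	results := make(chan Block) // successfully fetched blocks

	var (
		wg       sync.WaitGroup
		errOnce  sync.Once
		firstErr error
	)
	for _, s := range servers {
		wg.Add(1)
		go func(server string) {
			defer wg.Done()
			for num := range jobs {
				b, err := fetch(server, num)
				if err != nil {
					// Remember the first error, skip the block, keep going.
					errOnce.Do(func() { firstErr = err })
					continue
				}
				results <- b
			}
		}(s)
	}

	// Producer: hand out block numbers, then signal that there is no more work.
	go func() {
		for n := first; n <= last; n++ {
			jobs <- n
		}
		close(jobs)
	}()

	// Close results once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	var blocks []Block
	for b := range results {
		blocks = append(blocks, b)
	}
	// Workers finish in arbitrary order, so restore block order at the end.
	sort.Slice(blocks, func(i, j int) bool { return blocks[i].Num < blocks[j].Num })
	return blocks, firstErr
}

func main() {
	// Fake fetcher standing in for a real steemd call.
	fake := func(server string, num int) (Block, error) {
		return Block{Num: num, Server: server, Raw: fmt.Sprintf(`{"block":%d}`, num)}, nil
	}
	blocks, err := Ingest([]string{"steemd-a", "steemd-b", "steemd-c"}, 1, 10, fake)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("fetched", len(blocks), "blocks")
}
```

The point of the pattern is to keep it simple: one channel in, one channel out, and the only shared state is the first error. Retrying failed blocks is left out on purpose.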

Here's my current dev plan:

Before, I wasn't really aware of the depth and breadth of Golang's library ecosystem. Now I am, so I will try to implement this using one of the following:

The first attempt will be this pair (it doesn't handle interacting with specialized DBs, but then again, with enough RAM anything is possible):

https://github.com/tidwall/gjson
https://github.com/tidwall/buntdb

I'm pretty darned sure that I can get that up and running.
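Roughly, the first attempt has this shape: pull a few fields out of each block's JSON and store them under a key. The sketch below uses only the standard library so it runs with no dependencies: `encoding/json` stands in for gjson, and a lock-guarded map stands in for buntdb (many readers, one writer at a time). The `Store` type, the key layout, and the sample block JSON are all assumptions for illustration, not the real gjson or buntdb APIs.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// Store is a stand-in for an embedded key/value store:
// any number of readers, but only one writer at a time.
type Store struct {
	mu   sync.RWMutex
	data map[string]string
}

func NewStore() *Store { return &Store{data: make(map[string]string)} }

// Set takes the exclusive write lock, so concurrent writers queue up here.
func (s *Store) Set(key, val string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = val
}

// Get takes the shared read lock, so reads can run in parallel.
func (s *Store) Get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

// blockHeader lists the fields this sketch extracts (assumed field names).
type blockHeader struct {
	Witness   string `json:"witness"`
	Timestamp string `json:"timestamp"`
}

// IngestBlock parses one block's raw JSON and stores the extracted fields
// under keys of the form "block:<num>:<field>".
func IngestBlock(s *Store, num int, raw string) error {
	var h blockHeader
	if err := json.Unmarshal([]byte(raw), &h); err != nil {
		return fmt.Errorf("block %d: %w", num, err)
	}
	s.Set(fmt.Sprintf("block:%d:witness", num), h.Witness)
	s.Set(fmt.Sprintf("block:%d:timestamp", num), h.Timestamp)
	return nil
}

func main() {
	s := NewStore()
	// Sample input made up for this sketch.
	raw := `{"witness":"alice","timestamp":"2016-09-17T05:19:41","transactions":[]}`
	if err := IngestBlock(s, 1, raw); err != nil {
		fmt.Println("error:", err)
		return
	}
	w, _ := s.Get("block:1:witness")
	fmt.Println("block 1 witness:", w)
}
```

The write lock in `Set` is the part to watch: however many goroutines are parsing blocks, they all take turns at that one lock.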

Then I'm going to worry about concurrency, which means I will need to drop buntdb, because it only allows a single writer at a time. The stack then becomes:

  • gjson to read the json
  • rethinkdb for real-time info
  • cockroachdb or crate.io for other info

And then, last but not least, the Cayley graph DB for graph analytics.

I'll post a report on how it went in about 24h.

Update #1, 10:53 AM: Six hours later, I'm very sure that the layout I used before won't work to do this right. I need to look into breaking the app into multiple folders and giving it structure.


If you liked this post, follow me, @faddat, and don't be shy: upvote it, too!

and if you've got questions, drop a line, [email protected]


This post and others like it made possible by @officialfuzzy's beyond bitcoin community


Very good presentation, excellent work is that you have presented. Thank you so much

Excellent content and empowering me to focus on achieving my goals on steemit

Nice idea mate