RE: An open-ended question to @ned and @dan

in #steemit · 7 years ago

It's pretty simple to understand really. Blockchains are designed first and foremost to be a tamper-proof, permanent record. Databases need a different structure than a log to be quick to search. To confirm a transaction, there also has to be a search of the history, to ensure that provenance and ownership are as claimed in the new transaction. It is not complex to implement a relatively fast search by having an index tied to the block log, but it's not an optimal search strategy.
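To make the "index tied to the block log" idea concrete, here is a minimal sketch in Python. It is not how any real chain implements it. The `BlockLog` class, its method names and the JSON record layout are all made up for illustration, and an in-memory buffer stands in for the log file on disk.

```python
import io
import json


class BlockLog:
    """Hypothetical append-only log with a side index (illustration only)."""

    def __init__(self):
        self._log = io.BytesIO()  # stands in for the on-disk block log
        self._index = {}          # transaction id -> (offset, length)

    def append(self, tx_id, record):
        """Write a record at the end of the log and remember where it went."""
        data = json.dumps(record).encode("utf-8")
        offset = self._log.seek(0, io.SEEK_END)
        self._log.write(data)
        self._index[tx_id] = (offset, len(data))
        return offset

    def get(self, tx_id):
        """One index lookup, then one direct read from the log."""
        offset, length = self._index[tx_id]
        self._log.seek(offset)
        return json.loads(self._log.read(length))


log = BlockLog()
log.append("tx1", {"from": "alice", "to": "bob", "amount": 5})
log.append("tx2", {"from": "bob", "to": "carol", "amount": 7})
print(log.get("tx1"))
```

The lookup by id is quick, but any question the index was not built for (say, "all transactions sent by bob") still means walking the whole log, which is the sense in which it's not an optimal search strategy.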

Then add a forum and votes to this structure, and you can see pretty easily how horrendously resource- and processing-intensive it becomes once it gets past a certain size.

Many forum applications actually use the underlying filesystem to store posts as files. Most filesystems have 1, 2 or 4 KB blocks, usually 4 KB these days, which is a good size for blog posts. It makes a lot more sense to store posts as individual files or, at least, to use a sparse 'shared memory' type file and create 4 KB blocks that store one post each. Most posts will fit in one block. The actual underlying block size varies from 512 bytes to 8 KB, so, depending on this, one can make it very efficient for retrieval. A usual strategy would be to partition the storage area into regions of several different block sizes. You could have 512-byte, 1 KB, 2 KB and 4 KB regions in your shared memory file. These then require an allocation table with the hashes and the addresses of each block/page, and retrieving them is very, very fast.
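A rough Python sketch of that partitioning, under some loud assumptions: `BlockStore` and its fields are hypothetical names, plain byte arrays stand in for the regions of a shared memory file, SHA-256 is an arbitrary choice of hash, and posts bigger than the largest block are simply rejected.

```python
import hashlib

SIZE_CLASSES = (512, 1024, 2048, 4096)  # block sizes in bytes, one region each


class BlockStore:
    """Hypothetical size-partitioned post store (illustration only)."""

    def __init__(self, blocks_per_region=64):
        self._capacity = blocks_per_region
        # One pre-sized region per block size.
        self._regions = {s: bytearray(s * blocks_per_region) for s in SIZE_CLASSES}
        self._next_free = {s: 0 for s in SIZE_CLASSES}
        # Allocation table: content hash -> (block size, slot number, byte length)
        self._table = {}

    def put(self, post):
        data = post.encode("utf-8")
        # Pick the smallest block size the post fits into.
        size = next((s for s in SIZE_CLASSES if len(data) <= s), None)
        if size is None:
            raise ValueError("post exceeds the largest block; not handled in this sketch")
        key = hashlib.sha256(data).hexdigest()
        if key in self._table:
            return key  # identical content is already stored
        slot = self._next_free[size]
        if slot >= self._capacity:
            raise MemoryError("region for %d-byte blocks is full" % size)
        start = slot * size
        self._regions[size][start:start + len(data)] = data
        self._next_free[size] = slot + 1
        self._table[key] = (size, slot, len(data))
        return key

    def get(self, key):
        # One table lookup, then a direct read at a computed address.
        size, slot, length = self._table[key]
        start = slot * size
        return bytes(self._regions[size][start:start + length]).decode("utf-8")


store = BlockStore()
key = store.put("A short post fits in the 512-byte region.")
print(store.get(key))
```

Because every block in a region has the same size, the address is just slot times block size, so there is nothing to search once the table gives you the slot.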

But a key/value or graph database format does not understand this. It should be coded so that the graph only keeps the Merkle roots of posts, and each post is coded as a string of indexes referring to the initial and subsequent edits, with the hash. Then you do two index searches and you have the data. But I'm pretty sure that dumb people just see a text format like JSON and don't even consider hardware limitations.

More than likely, this is exactly what the problem is. If there were Merkle trees for each post instead of the whole text in the records, it would stay fast until it gets enormous. But if all those posts are strung together willy-nilly alongside other, fixed-size data blocks, it's gonna have a pretty serious problem with indexing and exploding addressing requirements. The indexes are probably designed for graph data blocks, which probably are 128/256/512-byte page tables. Putting 1-4 KB blobs in there is gonna clog up the allocation table really fast, you see what I mean?
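Here is one way the "graph keeps only Merkle roots" idea could look, as a Python sketch. Everything in it is an assumption for illustration: the record layout, the `latest_text` helper, the use of SHA-256, and the rule of duplicating the last hash on odd levels (borrowed from Bitcoin-style trees). It says nothing about how Steem actually stores posts.

```python
import hashlib


def sha256(data):
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Merkle root over a list of byte strings (one leaf per revision)."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: duplicate the last hash on odd levels
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# The post text lives in a separate content store, keyed by hash.
revisions = [b"first draft of the post", b"first draft of the post, edited"]
content_store = {sha256(r).hex(): r for r in revisions}

# The graph record stays small: a root plus one fixed-size hash per edit.
graph_record = {
    "author": "alice",  # hypothetical field
    "root": merkle_root(revisions).hex(),
    "revisions": [sha256(r).hex() for r in revisions],
}


def latest_text(record, store):
    """Lookup 1: record -> hash of the newest edit. Lookup 2: hash -> text."""
    return store[record["revisions"][-1]]


print(latest_text(graph_record, content_store).decode("utf-8"))
```

The record grows by one 32-byte hash per edit no matter how long the post is, which is what keeps the graph's fixed-size pages from filling up with text blobs.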

I'm not a native English speaker, nor a professional in computer science in any way, so I don't understand all the details and terms you're using. This is something I'll try to fix with further education, and I can always come back to this answer to fully understand it at some point.

I did, however, understand (or did I?) that it's not wise to store every post as plain text data, and that it should rather be coded into a string that can be searched, processed and stored efficiently.

Thank you for the detailed answer!

Interesting. So the majority of people aren't informed enough to see this problem at all, and those who know aren't voicing it.

It's starting to feel like Steem was just hastily put together using the tools available, without much thinking for the long term (5+ years and beyond).

Doesn't really make me want to put money into EOS at all.

How familiar are you with Ethereum? I'd like to hear your view on it.

I remember reading about pruning and Ethereum in the same sentence before, so they must be working on it.

Only time will tell, but there's really no going back for Dan if he leaves EOS shortly after. At that point he can buy an island or two for himself, though, so he'll be fine.

And yes, Dan always mentions how he can spit these different projects out fast, and then he leaves it to the rest to fix them or keep them together. Bitshares seems to be holding up fine, though.

So do you think there's any blockchain that's doing "it" right now, or have we yet to see such a project? Will any of these projects be running in 10 years?

What you said in the last paragraph is where I think Dan is going with his "Blockchain OS" concept. You raise some great points and are obviously no spring chicken to coding and systems architecture.

Having a solid foundation upon which to build higher layers is an important architectural principle. As things change in the upper layers they impose new requirements on the lower layers that test how flexible and comprehensive the foundation layers were designed. Sometimes a redesign of the foundational layer is more cost effective or quicker to implement than trying to retrofit changes (refactoring) to meet new requirements.

I'm reminded of the way the systems of the Voyager space probe were designed, both hardware and software, and because that design was so good the probe had a much longer lifespan than was ever anticipated. This depth of engineering and innovation is what we need for systems whose aim is to make mainstream institutions obsolete. We need long-term thinking motivated by the value to humanity, not limited by greed or personal gain. An attitude of altruism, but marketed in a way that provides support.

I want to read more of your ideas on motivating coders via 'monetisation of code', but on the surface it doesn't address this altruistic / innovation aspect for the long haul. It does address personal rewards for the coders, but it's divorced from the goal and purpose of the coding effort. It's not easy to anticipate future needs and build flexible systems that are adaptable to those needs in a cost-effective way. How that type of innovation is incentivized and rewarded is the distinction I'm getting at.