Exploring Hive Blocks

in LeoFinance, 4 years ago

hivechain.png Image by @doze

This is not expert information or educational material on the Hive blockchain. It is just an observation by an ordinary participant of the Hive economy and an amateur coder trying to understand Hive blocks. I will be using Python and the Beem library by @holger80 for this exploration.

I start with the following Python code, which gets the latest block number, fetches that block, converts it into a dictionary, and prints its contents.

from beem import Hive
from beem.nodelist import NodeList
from beem.instance import set_shared_blockchain_instance
from beem.blockchain import Blockchain
from beem.block import Block
from pprint import pprint

# Fetch an up-to-date list of public Hive API nodes and connect to them.
nodelist = NodeList()
nodelist.update_nodes()
nodes = nodelist.get_hive_nodes()
hive = Hive(node=nodes)
set_shared_blockchain_instance(hive)

# Ask the chain for the number of its most recent block.
bc = Blockchain()
cbn = bc.get_current_block_num()

# Fetch that block, convert it to a plain dictionary, and print it.
block = Block(cbn)
block = dict(block)
pprint(block)

The code returns key/value pairs describing the block itself and the transactions stored within it. Running a for loop over the keys shows what kind of information each block stores. Here is the list of 11 items that each block stores:

  1. id
  2. block_id
  3. previous
  4. timestamp
  5. extensions
  6. witness
  7. witness_signature
  8. signing_key
  9. transaction_merkle_root
  10. transaction_ids
  11. transactions

The most interesting item in the above list is transactions. That is where all the transactions like asset transfers, posts, comments, votes, etc. are stored. We will explore transactions in more detail later. The rest of the items are information about the block itself. If we remove transactions and transaction_ids from the block, the result would look like this:

block.png
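
For anyone following along without a node connection, here is a minimal sketch of the key listing and the stripped-down view described above. The sample_block dictionary and the helper name header_only are made up for illustration. Only the 11 key names come from the real output.

```python
from pprint import pprint

# Hypothetical block: the keys match the list above, every value is invented.
sample_block = {
    "id": 50000000,
    "block_id": "02faf080" + "ab" * 16,
    "previous": "02faf07f" + "cd" * 16,
    "timestamp": "2021-01-01T00:00:03",
    "extensions": [],
    "witness": "some-witness",
    "witness_signature": "1f" + "00" * 64,
    "signing_key": "STM-hypothetical-public-key",
    "transaction_merkle_root": "ef" * 20,
    "transaction_ids": ["aa" * 20],
    "transactions": [{"operations": []}],
}


def header_only(block):
    """Return a copy of the block without its transaction payload."""
    skip = ("transactions", "transaction_ids")
    return {key: value for key, value in block.items() if key not in skip}


# Loop over the dictionary to list what a block stores.
for key in sample_block:
    print(key)

# Print only the information about the block itself.
pprint(header_only(sample_block))
```

The same helper should work on the dictionary produced by dict(Block(cbn)) in the first script, assuming the keys are the ones listed above.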

id represents the block number. block_id looks like a unique alphanumeric id for the block. previous is the block_id of the previous block. Is this how the chaining of blocks happens, by including a reference to the previous block? Not sure, but that would be my guess. Then we have witness-related information.
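
The guess about previous can be checked with a few lines of code. The sketch below uses two made-up blocks (the hashes are invented) and two hypothetical helpers, is_chained and block_num_from_id. The second helper relies on the assumption that on Graphene-based chains like Hive the first 4 bytes of block_id encode the block number, which is worth verifying against real blocks.

```python
def is_chained(parent, child):
    """True when the child's 'previous' field points at the parent's block_id."""
    return child["previous"] == parent["block_id"]


def block_num_from_id(block_id):
    """Read the first 4 bytes (8 hex characters) of a block_id as a big-endian number.

    Assumption: the block number is embedded at the start of the id.
    """
    return int(block_id[:8], 16)


# Hypothetical consecutive blocks; everything after the first 8 hex characters is filler.
parent = {
    "id": 49999999,
    "block_id": "02faf07f" + "11" * 16,
    "previous": "02faf07e" + "22" * 16,
}
child = {
    "id": 50000000,
    "block_id": "02faf080" + "33" * 16,
    "previous": "02faf07f" + "11" * 16,
}

print(is_chained(parent, child))
print(block_num_from_id(child["block_id"]) == child["id"])
```

If is_chained returns False for two consecutive real blocks, either the blocks are not neighbors or the guess about previous is wrong.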

witness provides the name of the witness who produced the block. witness_signature is how the block-producing witness signs the block. And signing_key is the witness's public key. It is interesting to note that average users who are not witnesses don't have this key. There are normally owner, active, posting, and memo keys. But for witnesses it looks like there is an additional key for signing blocks.

The two items I have no clue about are extensions and transaction_merkle_root. extensions appears in each block and looks like it is supposed to hold an array/list of data. But it always appears to be empty. This is what Google says about a merkle root:

A merkle root is created by hashing together pairs of TXIDs, which gives you a short yet unique fingerprint for all the transactions in a block. This merkle root is then used as a field in a block header, which means that every block header will have a short representation of every transaction inside the block.

That sort of explains the purpose of transaction_merkle_root.
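
To make the quoted description concrete, here is a generic sketch of pairwise hashing. This is not Hive's actual algorithm: the hash function (SHA-256), the handling of an odd number of ids (duplicating the last one), and the function name merkle_root are all assumptions chosen for illustration, so the result will not match a real transaction_merkle_root.

```python
import hashlib


def merkle_root(tx_ids):
    """Hash pairs of hex-encoded ids together until a single value remains.

    Illustrative only: SHA-256 and last-item duplication are assumptions,
    not the rules Hive uses.
    """
    if not tx_ids:
        return ""
    level = [bytes.fromhex(tx_id) for tx_id in tx_ids]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd count: pair the last item with itself.
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


# Hypothetical transaction ids (20 bytes each, written as hex).
ids = ["aa" * 20, "bb" * 20, "cc" * 20]
print(merkle_root(ids))

# Changing any single id changes the fingerprint.
print(merkle_root(ids) != merkle_root(["aa" * 20, "bb" * 20, "cd" * 20]))
```

The point of the structure is that one short value in the block header commits to every transaction in the block.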

My favorite one is timestamp, which provides the date and time the block was produced. Hive block timestamps use the UTC timezone. Blocks are normally produced 3 seconds apart. However, I noticed that sometimes there is a 6-second difference between block timestamps. My guess is that witnesses sometimes miss blocks, and when that happens the time difference between blocks is higher than 3 seconds.
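
The missed-block guess is easy to turn into arithmetic. Here is a minimal sketch, assuming timestamps arrive either as datetime objects or as strings like 2021-01-01T00:00:03. The helper names to_datetime and missed_slots are hypothetical.

```python
from datetime import datetime

BLOCK_INTERVAL_SECONDS = 3  # normal spacing between blocks, as observed above


def to_datetime(value):
    """Accept a datetime as-is, or parse a string such as '2021-01-01T00:00:03'."""
    if isinstance(value, datetime):
        return value
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")


def missed_slots(previous_timestamp, timestamp):
    """Number of 3-second slots skipped between two consecutive blocks."""
    gap = (to_datetime(timestamp) - to_datetime(previous_timestamp)).total_seconds()
    return int(gap // BLOCK_INTERVAL_SECONDS) - 1


print(missed_slots("2021-01-01T00:00:00", "2021-01-01T00:00:03"))  # normal spacing
print(missed_slots("2021-01-01T00:00:00", "2021-01-01T00:00:06"))  # one slot skipped
```

A 6-second gap gives one skipped slot, which matches the idea that a witness missed its turn.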

The reason timestamp is my favorite is that the blockchain uses its internal clock to stamp each block, which helps prevent unexpected actions like double spending, provides transparency and security, and keeps the integrity of the chain intact.

transaction_ids stores the list of unique transaction ids, one for each transaction stored in transactions. The following image shows an example of what a list of transaction_ids looks like.

transactionids.png

transactions is the most interesting compartment of each block. This is where all transactions are stored, and it takes up the most space in a block. It contains a list of transactions, and each transaction has the following data within it.

  1. ref_block_num
  2. ref_block_prefix
  3. expiration
  4. extensions
  5. signatures
  6. operations

transaction.png

The main content of the transaction is stored in operations, and the rest of the items seem to store information about the transaction itself. I really don't understand how ref_block_num and ref_block_prefix numbers are generated. One thing I noticed though is that ref_block_num never goes higher than 65536 and when it gets that high it starts again from 1. We have extensions again, which appears to be an empty array in each transaction. I would guess signatures holds the signature of the account executing the transaction. expiration stores date and time info like timestamp, but the time is in the future. So, my guess is that a transaction will expire if it is not included in a block within the provided time.
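
The expiration guess can be sketched as a simple time comparison. The helper names seconds_left and is_expired and the sample values are hypothetical, and the exact rule the chain applies at the boundary is not something this sketch claims to reproduce.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def seconds_left(expiration, now):
    """Seconds between 'now' and the expiration time (negative once it has passed)."""
    delta = datetime.strptime(expiration, TIME_FORMAT) - datetime.strptime(now, TIME_FORMAT)
    return delta.total_seconds()


def is_expired(expiration, now):
    """Assumption: a transaction counts as expired once 'now' is later than expiration."""
    return seconds_left(expiration, now) < 0


# Hypothetical transaction and the timestamp of the block that included it.
tx = {"expiration": "2021-01-01T00:01:00"}
block_timestamp = "2021-01-01T00:00:03"

print(seconds_left(tx["expiration"], block_timestamp))
print(is_expired(tx["expiration"], block_timestamp))
```

Comparing expiration with the timestamp of the containing block shows how much time a transaction still had when it was included.
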
Lastly, operations is where all the fun stuff is happening.

operations contains a list of operations. In theory, it seems we can store multiple operations here. However, it looks like there is only one operation stored here most of the time. I explored multiple blocks and have yet to see one where multiple operations were bundled together within operations.
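
Rather than eyeballing blocks, a small helper can flag transactions that bundle several operations. This is a sketch over a made-up block. The function name multi_op_transactions is hypothetical and the sample operations are placeholders.

```python
def multi_op_transactions(block):
    """Return (transaction index, operation count) for transactions with more than one operation."""
    found = []
    for index, tx in enumerate(block.get("transactions", [])):
        count = len(tx.get("operations", []))
        if count > 1:
            found.append((index, count))
    return found


# Hypothetical block: the second transaction bundles two operations.
sample_block = {
    "transactions": [
        {"operations": [{"type": "vote_operation", "value": {}}]},
        {
            "operations": [
                {"type": "comment_operation", "value": {}},
                {"type": "vote_operation", "value": {}},
            ]
        },
    ]
}

print(multi_op_transactions(sample_block))
```

Running this over a range of real blocks would show whether bundled operations are really as rare as they appear.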

Each operation has two items: type and value. The way the value data is stored depends on the type of the operation. I wrote a script to go through the last 100,000 blocks to get the list of unique operation types. The following is the list of operation types I came up with:

  1. custom_json_operation
  2. vote_operation
  3. transfer_operation
  4. comment_operation
  5. claim_reward_balance_operation
  6. feed_publish_operation
  7. limit_order_create_operation
  8. limit_order_cancel_operation
  9. witness_set_properties_operation
  10. claim_account_operation
  11. create_claimed_account_operation
  12. account_update_operation
  13. account_update2_operation
  14. account_witness_vote_operation
  15. transfer_to_vesting_operation
  16. delegate_vesting_shares_operation
  17. convert_operation
  18. transfer_to_savings_operation
  19. account_witness_proxy_operation
  20. delete_comment_operation
  21. withdraw_vesting_operation
  22. update_proposal_votes_operation
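
The script itself is not shown in the post, so the following is only a sketch of how such a scan could work, not the original code. It assumes each operation is either a dictionary with type and value keys, as described above, or a two-item list of name and value, which some nodes may return. The function names are hypothetical.

```python
from collections import Counter


def op_type(op):
    """Return the operation type for either a {'type', 'value'} dict or a [name, value] pair."""
    if isinstance(op, dict):
        return op["type"]
    return op[0]


def count_operation_types(blocks):
    """Count how often each operation type appears across an iterable of block dicts."""
    counts = Counter()
    for block in blocks:
        for tx in block.get("transactions", []):
            for op in tx.get("operations", []):
                counts[op_type(op)] += 1
    return counts


# Hypothetical blocks standing in for real chain data.
sample_blocks = [
    {"transactions": [{"operations": [{"type": "vote_operation", "value": {}}]}]},
    {"transactions": [{"operations": [{"type": "custom_json_operation", "value": {}}]},
                      {"operations": [{"type": "vote_operation", "value": {}}]}]},
]

counts = count_operation_types(sample_blocks)
for name, total in counts.most_common():
    print(name, total)
```

To scan real data, the blocks argument could be a generator that yields dict(Block(n)) for each block number n in the range of interest, assuming the Beem setup from the first script.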

There probably are more operations that can be done on Hive. Most of them are self-explanatory. The most common ones we use on a daily basis are the vote, comment, transfer, and claim_reward_balance operations. The most used operation, it seems, is custom_json_operation. If you play games like Splinterlands, you can see their actions stored in custom_json_operation. This is probably one of the most powerful features of the Hive blockchain: it gives developers of apps and games the ability to store and retrieve their data easily on the blockchain.
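
As a rough illustration of how app data sits inside custom_json_operation, the sketch below counts the app-chosen id of each custom JSON operation in a block and decodes one payload. It assumes the value dictionary carries an id string and a json string. The ids and payloads in the sample are invented, and the helper names are hypothetical.

```python
import json
from collections import Counter


def custom_json_ids(block):
    """Count custom_json operations in a block, grouped by their 'id' field."""
    counts = Counter()
    for tx in block.get("transactions", []):
        for op in tx.get("operations", []):
            if op["type"] == "custom_json_operation":
                counts[op["value"]["id"]] += 1
    return counts


def decode_payload(op):
    """Parse the JSON string an app stored inside a custom_json operation."""
    return json.loads(op["value"]["json"])


# Hypothetical operation: the id and the payload are made up.
game_op = {
    "type": "custom_json_operation",
    "value": {
        "required_auths": [],
        "required_posting_auths": ["alice"],
        "id": "example_game_move",
        "json": "{\"card\": 1, \"action\": \"play\"}",
    },
}
sample_block = {
    "transactions": [
        {"operations": [game_op]},
        {"operations": [{"type": "vote_operation", "value": {}}]},
        {"operations": [game_op]},
    ]
}

print(custom_json_ids(sample_block))
print(decode_payload(game_op))
```

Grouping by id is a simple way to see which apps and games are writing to the chain in a given block.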

Feel free to correct me if I made mistakes or wrong assumptions about certain parts of the blocks, and feel free to share things I have no clue about yet. Thanks!

Posted Using LeoFinance

I really don't understand how ref_block_num and ref_block_prefix numbers are generated.

Someone who knows what they are talking about:

https://leofinance.io/steem/@xeroc/steem-transaction-signing-in-a-nutshell

Let's discuss the ref_block_* parameters a little: The ref_block_num indicates a particular block in the past by referring to the block number which has this number as the last two bytes. The ref_block_prefix on the other hand is obtained from the block id of that particular reference block. It is one unsigned integer (4 bytes) of the block id, but not starting at the first position but with an offset of 4 bytes.

The purpose of these two parameters is to prevent replay attacks in the case of a fork. Once two chains have forked, the two parameters identify two different blocks. Applying a signed transaction of one chain at another chain will invalidate the signature.
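
The description above can be sketched in a few lines. The block number and block id below are made up, the function name ref_block_params is hypothetical, and reading the 4 bytes as little-endian is an assumption based on how client libraries commonly do it, since the quoted text does not state the byte order.

```python
import struct
from binascii import unhexlify


def ref_block_params(block_num, block_id):
    """Derive (ref_block_num, ref_block_prefix) from a reference block.

    ref_block_num: the last two bytes of the block number.
    ref_block_prefix: one unsigned 32-bit integer read from the block id
    at an offset of 4 bytes (little-endian is an assumption here).
    """
    ref_block_num = block_num & 0xFFFF
    ref_block_prefix = struct.unpack_from("<I", unhexlify(block_id), 4)[0]
    return ref_block_num, ref_block_prefix


# Hypothetical reference block: 20-byte id written as 40 hex characters.
block_num = 50000000
block_id = "02faf080" + "01020304" + "00" * 12

print(ref_block_params(block_num, block_id))
```

Because both values come from a specific block, a transaction signed against one fork refers to a block that does not exist on the other.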


One thing I noticed though is that ref_block_num never goes higher than 65536

This is to stop someone from using a very old block number. They don't want people to sign a transaction using an old block... because if there was a fork, a hacker would be able to rebroadcast the public transaction on the other chain.

Me:

https://peakd.com/hextech/@edicted/the-day-of-milestones

So not only do ref_block_num & ref_block_prefix stop users from repeating a transaction on the same chain, they also prevent users from repeating a transaction on a forked chain. This was obviously very intriguing for me, because Hive just forked away from Steem and I was totally confused about how this would be accomplished.

Essentially, as long as ref_block_num references a block AFTER a hardfork, it would be impossible to broadcast that same transaction onto another fork. This is because the signatures of the 2 blocks on the separate forks would be completely different. Pretty cool. They really thought of everything, eh?

Because ref_block_num is only the last two bytes of the block number in question, it is impossible to reference a block that is very old. 2 bytes is 16 bits, so this structure can hold 2^16 = 65,536 distinct values, from 0 to 65,535. Therefore, every 65,536 blocks the ref_block_num overflows back to zero and the process starts all over again. With 3 second blocks, this only ends up being about 54.6 hours per cycle.
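
The arithmetic behind the wraparound and the 54.6-hour figure, as a small sketch with hypothetical names:

```python
BLOCK_INTERVAL_SECONDS = 3
REF_BLOCK_NUM_VALUES = 2 ** 16  # two bytes can hold 65,536 distinct values


def wrapped_block_num(block_num):
    """Keep only the last two bytes of a block number."""
    return block_num % REF_BLOCK_NUM_VALUES


# Length of one full cycle before the two-byte value wraps around.
cycle_seconds = REF_BLOCK_NUM_VALUES * BLOCK_INTERVAL_SECONDS
cycle_hours = cycle_seconds / 3600

print(wrapped_block_num(65535))  # largest value before the wrap
print(wrapped_block_num(65536))  # wraps back to zero
print(round(cycle_hours, 1))
```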

Awesome! Thank you for sharing the posts on this and explaining why and how ref_block_num and ref_block_prefix work. Very helpful. I will study the posts more.

My python skillzz might become useful soon

Your skillz are impressive!

The extensions field used to store the running version of steemd that signed the block, as well as the upcoming hardforks that a witness has accepted (if any). But since HF23 this information is no longer present.

This is the first block that I signed on the chain which recorded v0.22.1, and the last block that I signed on Steem, which shows that I accepted HF23 at a particular timestamp.

There probably are more operations that can be done on Hive.

There are more than 50 different operations on Hive, which can be found here. Some of them were deprecated in previous hardforks, such as creating an account with delegation.

Thank you for sharing. These are very useful. Do you know what extensions inside of transactions are for?
That operations list is great. Thanks. It even has virtual operations, something I was going to look at next.

Anyone who’s interested in exploring Hive blocks may enjoy this visualization. It streams Hive blocks and visualizes their contents, including identifying the apps in custom JSON operations.

https://hiveuprss.github.io/hiveisbeautiful/

This is a pretty cool tool! Where can I see the code?

Nice! Thank you!