SteemSQL - Revamping the database and table design

in #steemsql6 years ago (edited)

The original goal of SteemSQL was to stick exactly to the content of the blockchain. But over time, it is clear that this option is often penalizing in terms of performance when executing complex queries.

Some values are stored in text format while they contain numeric data and should be separated from the asset symbol that comes with it. Some fields exist in tables whereas they do not contain any data.
Some value are stored in the database but never used by the blockchain, which leads to storage space waste.

Therefore, I have decided to slowly move away from a pure 1 to 1 match between the blockchain format and the SQL table design.

A first step has just been made today with few changes to some tables. Here they are:

Changes to the Accounts table

1. Columns removed

As there are now obsolete or have never been really used by the blockchain activity, the following have been removed from the Accounts table:

owner_challenged
active_challenged
comment_count
lifetime_vote_count
can_vote
average_bandwidth
lifetime_bandwidth
last_bandwidth_update
average_market_bandwidth
last_market_bandwidth_update
lifetime_market_bandwidth
transfer_history
market_history
post_history
vote_history
other_history

2. New columns

The reputation columns contains a big integer number represent the reputation acquired over time by a user. Unfortunately, this number is not very comprehensible and need some computation to be transformed to the human readable number we are used to see close to our name.

Many SteemSQL perform this number transformation either locally or, even better, include this computation in their queries. The problem is that such computation is done over and over again each time a query is issued, which is not very effective as it turns out that the result does not change so frequently.

Therefore, I have added a pre-computed columns reputation_ui that will contains the human readable reputation the accounts with up to two decimals digits.

Changes to the TxTransfers table

Following @themarkymark's request, I have now enabled Full Text Search on the memo column.

This will allow you to use the CONTAINS() and FREETEXT() predicate functions rather than the infamous LIKE, which is a performances killer on tables with millions of records.

Changes to the TxCustoms table

As more and more apps/dApps rely on custom_json transactions for their operation and as more queries are issued against this table to provide statistics on the apps, I have added an index on the tid column to speed up performances of such queries.

What’s next?

In the coming days, I plan to do more and more database revamping like this in the future to improve performances, resources consumptions and ease of use from the SteemSQL user’s perspective.

Thanks for reading!


footer created with steemitboard - click any award to see my board of honor

Support me and my work for the Steem community.

Vote for my witness

Sort:  

.

Thank you @crokkon, your comment is really appreciated!

Hi @arcange!

Your post was upvoted by @steem-ua, new Steem dApp, using UserAuthority for algorithmic post curation!
Your UA account score is currently 6.732 which ranks you at #117 across all Steem accounts.
Your rank has dropped 4 places in the last three days (old rank 113).

In our last Algorithmic Curation Round, consisting of 173 contributions, your post is ranked at #37.

Evaluation of your UA score:
  • You've built up a nice network.
  • The readers appreciate your great work!
  • You have already shown user engagement, try to improve it further.

Feel free to join our @steem-ua Discord server