Querying and Validating JSON in SQL
I was looking for a way to determine how many times I might have followed someone or if a follow was recorded more than once, or perhaps not at all in the blockchain. I was looking at this as corroboration on a bug that I had found.
I had looked at several potential methods at exploring the information but ended up going through MSSMS and into the SteemSQL database to explore.
Background
I found that the information that I was looking for is in the TXCustoms table. This is the table that contains the following fields:
tx_id, tid, json_metadata, timestamp.
The json_metadata is the field that we are interested in searching through. It appears as there is two type of JSON data in the json_metadata column, follow[Unfollow][Mute] and reblog.
- Follow -
["follow",{"follower":"jasonbu","following":"fabien","what":["blog"]}]
- Unfollow -
["follow",{"follower":"jasonbu","following":"fabien","what":[]}]
- Mute -
["follow",{"follower":"jasonbu","following":"geke","what":["ignore"]}]
and - Resteem -
["reblog",{"account":"argd88","author":"sancho.panza","permlink":"concurso-recital-de-poesia-sancho-panza"}]
Querying the Data
Querying the table is pretty straightforward, until you want to isolate your WHERE clause to data points within the JSON text. For instance, I wanted to find out how many times I had followed the user Gmuxx.
Here's what one of the strings from the JSON_metadata column looks like
["follow",{"follower":"jasonbu","following":"pocket-pi","what":["blog"]}]
That's a bit messy, but you can clearly see the elements of the JSON records. Let's have a look at it broken out. Notice the 1 and the 0. Those are key to getting at the data elements in the string.
Luckily SQL has a few built-in functions for dealing with JSON strings.
If you have JSON text that's stored in database tables, you can use built-in functions to read or modify values in the JSON text. +
- Use the JSON_VALUE function to extract a scalar value from a JSON string.
- Use JSON_QUERY to extract an object or an array from a JSON string.
- Use the ISJSON function to test whether a string contains valid JSON.
- Use the JSON_MODIFY function to change a value in a JSON string.
Source: Doc.Microsoft.com
We will use the ISJSON and JSON_VALUE functions to query the** json_metadata** field. Recall the 1 and the 0 from the image above.
We will use the $[0] to specify the first element in the string
and then the $[1] to reference the second element in the string. To get at the "follower" data element we would reference it using $[1].follower and if we want the "following" data with $[1].following.
The Query
My initial query checked that the field contained JSON data. It then checked for the "follow" type and then specifically for the "follower" being me and after the date that I joined Steemit.
** Remember to use NOLOCK
That returned all the records for when I followed someone on the platform. If you recall I wanted to know how many times I had voted for Gmuxx, specifically. I modified the query to include $[1].following and was rewarded with the following results.
As you can see I did actually follow Gmuxx twice. The data is confirmation of the bug.
Hope you found this helpful and let me know if you have any questions.
Posted on Utopian.io - Rewarding Open Source Contributors
Beep! Beep! This humvee will be patrolling by and assisting new veterans, retirees, and military members here on Steem. @shadow3scalpel will help by upvoting posts from a list of members maintained by @chairborne and responding to any questions replied to this comment.
This is an excellent tutorial for those wishing to look into the steemsql database. Well done. I did see you interact on discord with this query :-) happy to see that you got it to work, I know you put a lot of effort into it
Thanks, @paulag. Yes, lots of time tweaking. A good feeling when it clicks. Hoping to dive more into the queries and data... I might tackle your PowerBI generated script from your last JSON post. Let you know if I figure it out.
let me know if you get stuck
Excellent tutorial for those who want to go into the **json_metadata. **
Thanks, @eastmael. Appreciate the comment. It was a challenge and I hope that I am able to assist others if they run across json_metadata.
As i wrote in utopian discord it is not a bug, thats how steem blockchain works. You send transactions but that transaction does not operate on your assets so there is no need to check if you are trying to do it second time (like prevent double spend attack). Such checking would cause unnecessary load for the witnesses.
Thanks for the info @whd. It helps. It looks more so like an interface issue and not validating the state.
thanks @jasonbu
You're welcome. Hope it helps.
Your contribution can't be approved yet because your repository isn't right. Please edit your post and change your Github repository.
You can contact us on Discord.
[utopian-moderator]
Thank you for the contribution. It has been approved.
You can contact us on Discord.
[utopian-moderator]
Hey @jasonbu I am @utopian-io. I have just upvoted you!
Achievements
Suggestions
Get Noticed!
Community-Driven Witness!
I am the first and only Steem Community-Driven Witness. Participate on Discord. Lets GROW TOGETHER!
Up-vote this comment to grow my power and help Open Source contributions like this one. Want to chat? Join me on Discord https://discord.gg/Pc8HG9x