Analyzing SteemSQL NOLOCK Queries Required - Steemit Business Intelligence

in #bisteemit7 years ago

SteemSQL database is a fantastic resource.  It takes all the data from the STEEM blockchain and organised it into a structured SQL database.

The operator of this service is   @arcange  and there are many posts on his blog documenting the database, I recommend you check the out

You can find the latest post here

https://steemit.com/steemsql/@arcange/steemsql-update-11

(Also @acrange is a witness - consider giving him a witness vote to keep this service going)

At the moment I have to limit my data posts, because using Power BI to query the database is causing a problem.  The M language used to create the query does not convert to a (NOLOCK) query.  This results in the database hanging.

If you are using Power BI or Excels Power Query please note you should also restrict the queries you are running to SteemSQL

After searching the official Power BI forum, the only solution is to this is to use SQL and enter the query as an SQL query.  This can be added in the advanced section when initially connecting to the database 

What if I don’t know SQL?

I don’t know SQL, so if you don’t, well you not on your own.  But if we are going to analyse the SQL database then we really need to know how.

@carlgnash has some really good articles.  I bookmarked them as I knew they would come in handy.  This is a fantastic resource. Kudos to @carlgnash, please do keep these coming.

https://steemit.com/steemit/@carlgnash/curate-like-a-boss-a-beginner-s-guide-to-querying-steemsql-by-a-complete-sql-beginner

In addition to this, for a deeper dive on SQL let me recommend the SQL module form the Microsoft Data Science Professional Program.  You can complete this course free, however if you want a certificate then you will have to pay $99

https://www.edx.org/course/querying-data-transact-sql-microsoft-dat201x

Let’s Share Codes

Moving forward as I learn SQL and use it to query the Steemit SQL database I will share codes and tutorials using https://utopian.io/.

Utopian.io is an app on the Steemit Blockchain.  Created by @elear and can be found @utopian-io.  It allows anyone contribute to open source projects and be rewarded with STEEM

I am also going to do some posts detailing the contents on the tables in the SQL database as an easy reference resource for fellow Steemains

 I am part of a Steemit Business Intelligence community. We all post under the tag #BIsteemit. If you have an analysis you would like carried out on Steemit data, please do contact me or any of the #bisteemit team and we will do our best to help you... 

You can find #bisteemit  on discordhttps://discordapp.com/invite/JN7Yv7j  


Sort:  

Thanks for spreading the word @paulag! And I am really glad to hear about utopian.io, that sounds really cool! Heading over there to check that out now :) Keep up the good work! Cheers - Carl

Thank you @paulag for mentionning my work and the potential of SteemSQL

SQL has a bit of a learning curve, but it's well worth the effort for the more complex queries. The MS SQL Server can also query into the JSON data used often by Steem.

oh I am glad I saw this comment @eturnerx, I just downloaded PowerBI last night and was playing around with it and couldn't figure out why I couldn't see the JSON data. That is actually a pretty major limitation if PowerBI can't query into the JSON data - that is where the other 4 tags of a post are held! So PowerBI can only search by first tag if it can't see into the JSON metadata. Almost all of my queries use the JSON data in fact!

@carlgnash you can see into the json columns, and you can also phrase them into normals columns too with power bi

oh okay I obviously need to read some of your power BI tutorials :) When I was first looking into the Comments table using Power BI, in the data preview I was able to see results for all the other columns, but not the JSON metadata. Good to know you can access those with PowerBI, it would have been a deal breaker for me otherwise. Cheers - Carl

Real content about stemit

it is weird... why does one need to centralize steem blockchain?

SteemitSQL is not a means to centralize the steem blockchain. With SQL queries you have the power to get custom data beyond what the SteemAPI calls offer. It is also useful for getting steem related statistics data. e.g 'get all posts this week that has earned over 20sbd where the comments are less than 3 and where the author registered no earlier than last month with a reputation of not more than 50'

Great post. SQL databases, the nether realms of big data

Analyzing SteemSQL .........thank you useful information steemit.com

In addition to this, for a deeper dive on SQL let me recommend the SQL module form the Microsoft Data Science Professional Program. https://www.edx.org/course/querying-data-transact-sql-microsoft-dat201x

I've done the one on Coursera which Duke Univeristy provides... Is this one more in depth, for programmers maybe? I see that a lot of US univerities skip on the math and details to make it easy but by doing that they don't provide us solid knowledge framework without providing the right background information.