Using hiveSQL to find content

abh12345 (80)in STEMGeeks • 4 years ago (edited)

I was asked recently about using @hiveSQL as a tool to find 'good posts' to curate, and so at the risk of facing the wrath of @arcange (who may appear in the comments to tell me my query is killing the DataBase), here is a sample query with the code explanations.

_{middle right, lol}

First you will need some login details for hiveSQL. Send 0.01 HBD to @hiveSQL and wait for an encrypted memo. I use peakD and Hive Keychain and so there will be a padlock next to the memo to decrypt it.

_{Reaz wants dataaa, hiveSQL obliges}

_{Click the padlock to find out what's inside!}

Next you will need a tool to connect to the DataBase. I use Linqpad which is free for what we require: https://www.linqpad.net/Download.aspx

Once it is downloaded and installed, you can use the details sent to you via the encrypted memo to connect.

_{Check 'Remember this connection' at the bottom of this window to save some time}

Now you are connected it's time to pull some data. But first, a quick run through the buttons you'll need. From left to right:

Run your query, Stop your query, Rich Text results, and Tabular Results. Also, make sure your Language is set to SQL.

And a note before we start - it is not a good idea to select everything from a table. If your query is running for more than a minute or two, it might be an idea to stop it and have a rethink. It is also not good to use 'LIKE' statements to scan lots of text. I have done that here and but only on a small set of data - there is likely a better way!

Let's start with finding out content created in last 1 hour by author having reputation between 30 to 60 and post having minimum 500 words and reward less than 1$?

This is a fairly basic template, and so I've added some more filters in. The full query is below but I want to break it up to explain various parts.


SELECT
    author, A.reputation_ui,'https://hive.blog/'+ url, 
    left(body,500),
    pending_payout_value, 
    C.created, 
    body_length as CharacterCount

This is what will hopefully be returned by your query. The author, their reputation, the URL of the post (with your choice of condenser appended to the front, the first 500 characters of text (for an idea of the content), when it was created, and the number of characters in the content. A word count is not available and so dividing that total by 4 or 5 should give a rough idea.

FROM 
    Comments C
INNER JOIN Accounts A on C.author=A.name
WHERE

The Comments table is where 'posts' and comments/replies are kept. They are treated the same by the blockchain and SQL database, but are distinguishable in the DB using a couple of methods (one is discussed further below). For the author reputation, a join is made to the Accounts table. This is not so important for today but for those who care, the 'C' and the 'A' are aliases given to these tables when referencing them and the columns they hold.

'WHERE'.... stuff ='s other stuff.

    (
    NOT JSON_QUERY(C.json_metadata,'$.tags') LIKE('%"cn"%') 
    AND NOT JSON_QUERY(C.json_metadata,'$.tags') LIKE('%"ru"%')
    AND NOT JSON_QUERY(C.json_metadata,'$.tags') LIKE('%"kr"%') 
    )

This is used to filter out tags/topics. I don't know these languages and so wish to exclude them from the results. $.tags is a section of C.json_metadata and so wildcard searching, although painful, is less of an issue. There is likely a better way to do this, and I hope someone can chip in.

AND NOT category IN ('porn','tits','arse','fanny')

The category is the first tag and fortunately has its own column. You can use this to exclude tribes ('hive-XXXXX') or other topics you'd rather not see in the results.

AND depth = 0

This ensures that only top level comments (posts to you and I) are returned and it's use will save time and DataBase pain.

AND NOT author IN ('abh12345','add_spammer_here')

You know those authors who you are sick of seeing each day? You can exclude them here.

AND NOT active_votes LIKE('%"spaminator"%')

Another part of the query to make arcange cry, a wildcard search through the voters so far. In this case, if 'spaminator' has given the thumbs down to the content, it wont appear in the results.

AND datediff(minute, C.created,GEUTCTUTCDATE()) < 60*4

This compares the creation date of the post to this moment in time, converts that into seconds and compares against 60*4 - 4 hours. I ran the query with the suggested 1 hour window and 0 results were returned, damn!

AND A.reputation_ui BETWEEN 30 AND 60

Joining to the Accounts table and using 'A' as the alias, we can check the author reputation. In this case, the reputation will need to be between..... 30 and 60 to appear in the results.

AND pending_payout_value > 0.0100
AND pending_payout_value < 3.0000

The could also be written as a 'BETWEEN' statement and it is to check the current pending payout of the post.

AND body_length  > 4000

The post must be at least 4000 characters in length, which if we use 4.79 characters as the average word length, is about 835 words. Fairly chunky for a post these days.

ORDER BY created desc

And finally, the way to order the results. I have used the date with the newest post first.

And here is the query in full:

SELECT
    author, A.reputation_ui,'https://hive.blog/'+ url, 
    left(body,500),
    pending_payout_value, 
    C.created, 
    body_length as CharacterCount
FROM 
    Comments C
INNER JOIN Accounts A on C.author=A.name
WHERE
    (
    NOT JSON_QUERY(C.json_metadata,'$.tags') LIKE('%"cn"%') 
    AND NOT JSON_QUERY(C.json_metadata,'$.tags') LIKE('%"ru"%')
    AND NOT JSON_QUERY(C.json_metadata,'$.tags') LIKE('%"kr"%') 
    )
AND NOT category IN ('hive','porn')
AND depth = 0
AND NOT author IN ('abh12345','add_spammer_here')
AND NOT active_votes LIKE('%"spaminator"%')
AND datediff(minute, C.created, GETUTCUTCDATE()) < 60*4
AND A.reputation_ui BETWEEN 30 AND 60
AND pending_payout_value > 0.0100
AND pending_payout_value < 3.0000 
AND body_length > 4000
ORDER BY created desc

The query above has taken less than a second to return a set of results which has been between 8 and 16 items. To me, that doesn't sound like a lot of posts in the past 4 hours, but obviously it is due to the criteria chosen and this can of course be changed.

And change you will need to, unless you speak Spanish or Polish. Although I did find this one:

https://peakd.com/@gubbatv/posts

~~And I'm not convinced it is her so I sent a short message and will comment on the post if I find out more.~~

Edit: Confirmed to be the owner of https://gubbatv.com/

Anyway, I hope this helps @sanjeevm, and anyone else who is interested. If you plan to have a go with the above, check the post again later to see if arcange has commented with some nicer script to replace the wildcard 'LIKE' searches.

Cheers,

Ash

Posted with STEMGeeks

#sql #hivesql #search #palnet #archon #stem

4 years ago in STEMGeeks by abh12345 (80)

$29.21

Sort:

Trending

[-]

slobberchops (81) 4 years ago (edited)

Great! I used to use that @curie tool but it only works on STEEM, I could use a replacement.

EDIT: Got it working.. I will have a mess around with your code and do some custom stuff next!

$0.57

9 votes

[-]

geekgirl (79) 4 years ago

I have a replacement almost ready. Will deploy the App soon.

$0.34

3 votes

[-]

slobberchops (81) 4 years ago

I could get into this.., haven't done any coding for a while, but I'm a geek at heart!

$0.17

1 vote

[-]

sanjeevm (75) 4 years ago

Is @curie dead on hive ?

$0.03

1 vote

[-]

slobberchops (81) 4 years ago

No mate, very much alive but more subdued than before. The witness account is hanging around the top 30 and I curate daily for them (if I can find anything worthy!).

$0.00

[-]

sanjeevm (75) 4 years ago

Good to hear, it has witness vote from me. However, yesterday, when I went to the page, I saw it was referring to Steem. At least let's update that. Also the certificate is not valid, the browser pops up a warning.

Curie is a meritocratic, voluntary organization which supports the development and growth of new and existing authors on the Steem blockchain. We have multiple methods of growth, which include but not limited to: direct curation, supporting curation communities, and supporting community projects and development of applications which use the Steem blockchain.

$0.00

[-]

slobberchops (81) 4 years ago

I will ask the gods for you, it doesn't look good.. I have to admit. I haven't been on the website in a couple of years.

$0.00

[-]

abh12345.stem (48)I like rockets 4 years ago

Well this one is a fair base template I think - message me on discord if you have an queries.

$0.03

1 vote

[-]

slobberchops (81) 4 years ago

Its great, got it working.. and its fast.

$0.00

[-]

abh12345.stem (48)I like rockets 4 years ago

Nice!

Try and keep the criteria tight and with few free text 'wildcard' searches - arcange will kick my ass otherwise!

$0.00

1 vote

[-]

amr008.stem (26) 4 years ago

Congrats, you were upvoted from this account because you were in Top 25 engagers yesterday on STEMGeeks .
You made a total of 11 comments and talked to 8 different authors .
For more details about this project please read here - link to announcement post
You can also delegate and get weekly payouts.

$0.00

1 vote

[-]

slobberchops (81) 4 years ago

Haha, Only ran it twice.., @steevc might be doing more than me, he's also got it all working.

$0.00

[-]

steevc (79) 4 years ago

I'm exploring some of the tables. Lots of potential uses. I find enough to vote on anyway.

$0.00

[-]

meesterboom (80) 4 years ago

I am tempted myself. I used to write loads of queries on my old job and still occasionally do the odd one to keep my hand in.

Simpler times!!

That dB is gonna take a pounding if everyone has a go!! :0)

$0.48

8 votes

[-]

abh12345.stem (48)I like rockets 4 years ago

Yeah it will which is why I tried to make clear the faults and gave something nice and speedy!

$0.06

1 vote

[-]

meesterboom (80) 4 years ago

You did. It's a really good example to kick off from!!

$0.00

[-]

abh12345.stem (48)I like rockets 4 years ago

Cheers :)

And now you can find out cool stuff like how many times downvoted KC :D :O

$0.00

1 vote

[-]

meesterboom (80) 4 years ago

The downvote counts!! Actually, as soon as I get some time free at work I'm gonna go for it!

$0.00

1 vote

[-]

sanjeevm (75) 4 years ago

So am I responsible for the sin 😁

$0.04

1 vote

[-]

meesterboom (80) 4 years ago

Hehe, it looks like it!! Heads will roll 😃😃

$0.00

[-]

bluerobo (69)Rockets 4 years ago

I really have to get to it!First @geekgirl is making posts about using @hivesql, now you?

Happy to see what you can come up with in the mean time.

$0.17

7 votes

[-]

abh12345.stem (48)I like rockets 4 years ago

Thought I'd follow the trend, I've been using the thing for 3 years :)

$0.00

[-]

steevc (79) 4 years ago

I really ought to play with this. I have done plenty of SQL in the past, so I should be able to get into it.

$0.17

6 votes

[-]

abh12345.stem (48)I like rockets 4 years ago

Nice, then it's just a matter of getting used to the views and what data is where :)

$0.00

[-]

steevc (79) 4 years ago (edited)

I've logged into the database, but don't see the Hive stuff, just those that define the standard system. I am more used to Oracle than SQL Server, but the principles should be the same.

I'm using a thing called DBeaver on Linux.

Update: Aha! Needed to specify the schema

$0.00

1 vote

[-]

abh12345.stem (48)I like rockets 4 years ago

Glad you found it cause I wouldn't be able to help much i think!

$0.00

1 vote

[-]

steevc (79) 4 years ago

I've not used this tool before. For my Brit List I used Python and beem. SQL may be better for some other things. It's certainly quick. Need to think about what queries I can use. Could investigate some abusers. There's one I know of who is creating lots of accounts. Could check if any have rewards.

$0.10

3 votes

[-]

mellofello (68) 4 years ago

Awesome post! Thank you for sharing this information. It was easy to follow and I was able to query after about 15 minutes. My only comment was when I transfered 0.01 HBD to @hiveSQL, I received an error, so I needed to change the name to @hivesql.

I didn't realized the hiveSQL database is free. That's so cool. Do you know who is funding it to be free? Also, do know if I can query Splinterlands DEC transfers with this database? I am able to query this information through the Splinterlands API, however the transfers data is limited to a 30 day window. I do not know of another way to obtain transfers that are older than 30 days. Do you?

$0.03

2 votes

[-]

abh12345 (80) 4 years ago

Ah cool, thanks for checking it out and for that correction.

I didn't realized the hiveSQL database is free. That's so cool. Do you know who is funding it to be free?

It didn't used to be free, I paid 40 SBD/HBD a month for years, but:

https://peakd.com/me/proposals - see about 7 down.

Also, do know if I can query Splinterlands DEC transfers with this database?

This might be in TxCustoms, but I think it is on the sidechain. I don't deal with that one but perhaps @dalz or @gerber could be so kind to post a guide or a comment here :)

$0.14

2 votes

[-]

mellofello (68) 4 years ago

I appreciate the guidance and for reaching out for more help.

$0.00

[-]

dalz (80) 4 years ago (edited)


SELECT
timestamp,
json_metadata

FROM Txcustoms
WHERE 
CONVERT(DATE,timestamp) BETWEEN '2020-12-01' AND '2021-02-28'
AND [tid] in ('ssc-mainnet-hive')
AND [json_metadata] like ('%"DEC"%')
AND [json_metadata] like ('%"null"%')

I use this query for DEC transfers to null ... you need to parse the json data as well

$0.12

3 votes

[-]

abh12345 (80) 4 years ago

Ah nice, so you could do something like this then...

SELECT top 10 
timestamp,
json_value(json_metadata, '$.contractPayload.symbol') as symbol,
json_value(json_metadata, '$.contractPayload.quantity') as quantity,
json_metadata
FROM Txcustoms
WHERE 
CONVERT(DATE,timestamp) BETWEEN '2021-01-01' AND '2021-02-28'
AND [tid] in ('ssc-mainnet-hive')
AND json_value(json_metadata, '$.contractPayload.symbol') = 'DEC'
AND json_value(json_metadata, '$.contractPayload.to')  = 'null'

$0.13

2 votes

[-]

dalz (80) 4 years ago

That will do it :)
Testing it soon!

$0.00

2 votes

[-]

mellofello (68) 4 years ago

FYI - when we use [tid] = ('ssc-mainnet-hive'), we get transfers on the Hive engine.
When we used [tid] = ('sm_token_transfer'), we get in-game transfers.Thanks abh12345, you and @dalz really came through for me today. I like how you parsed some of the json string.

$0.00

[-]

abh12345 (80) 4 years ago

Good to know :)

I've not looked at this table much, mainly out of fear of its size and the json parsing, but I think I'll pay a bit more attention to it in future. Thanks for opening up the conversation, and cheers dalz!

$0.00

[-]

mellofello (68) 4 years ago

Thanks so much for your prompt reply. Your query sample was very helpful. I was able to update the where clause to obtain exactly what I needed. I had no clue all this data was available all this time. This is very cool.

$0.00

[-]

orkin420 (60) 4 years ago

heck yeah this is super neat, Always good to help people find stuff.

$0.02

2 votes

[-]

abh12345 (80) 4 years ago

Cheers!

$0.00

1 vote

[-]

bashadow (69) 4 years ago

Well I got as far as down loading the program, and then two updates for the program to work, then was actually able to connect to the database. Now all I need to do is figure out how to get stuff out, that will be another day. I need to work on my spreadsheet a little bit and figure out how to use the data base to populate it with just the stuff I want.

Such as from my account only rewards for the last seven days, how to get it for the entire month and how much since just the beginning of Hive.

these numbers kind of.png

Each month has a different number of days, so I really don't want just the 30 day results but the whole month results. So a sample query might help me if you can work one up for me, I have a little bit of HBD saved, (3.793) I haven't converted it lately, and some non powered up Hive, (40). Give me a price, I know I don't have much but I can scrimp and save.

There is a lot of information available from him.

$0.02

2 votes

[-]

sharkthelion (64) 4 years ago

That sounds very complicated. Haha. It would be better if Hive just worked the same way Leo does as far as curation goes.

$0.02

6 votes

[-]

abh12345.stem (48)I like rockets 4 years ago

It's not so bad, if I can do it :)

$0.00

[-]

aneukpineung78 (56) 4 years ago

You're teaching sorcery ... 🚶🚶🚶

$0.00

6 votes

[-]

abh12345 (80) 4 years ago

🤣😁

$0.00

[-]

sanjeevm (75) 4 years ago

Finally......👌I am going to explore more. I use Squirrel SQL Client which is also free - do you think a detailed post on how to use that would help ?

$0.00

6 votes

[-]

abh12345 (80) 4 years ago

The tool isn't so important, I only use the buttons I mentioned and haven't used Squirrel. I hope you find the example useful!

$0.00

[-]

geekgirl (79) 4 years ago

Nice! I was working on the exact same thing yesterday. I will deploy searching posts as an App today or tomorrow. It’s almost done. Just having issues displaying body content properly.

$0.00

6 votes

[-]

abh12345.stem (48)I like rockets 4 years ago

Sounds good :) I think people are looking for something simple to use as a search.

$0.00

[-]

gillianpearce (67) 4 years ago

I was all excited for a minute there and then you got to the example code. 😂

$0.00

5 votes

[-]

abh12345.stem (48)I like rockets 4 years ago

It's all there, line by line!

$0.00

[-]

gillianpearce (67) 4 years ago

Ah. Yes. But I was thinking I'd be able to find out other stuff. 😁

$0.00

[-]

abh12345 (80) 4 years ago

Discord me with a list 🙃

$0.00

[-]

nickyhavey (74) 4 years ago

I was following to the top 3rd but I think I concede defeat in SQL! The queries we (you) made for HMRC will be my limit, this is next level wizardry!

$0.00

5 votes

[-]

abh12345 (80) 4 years ago

Looking at different views/tables here with more filter criteria but if you know what the columns mean it's not so complicated. Glad you got started with a practical example though!

$0.00

[-]

nickyhavey (74) 4 years ago

Yes, I think it's just a case of knowing the names of columns that are used to select the data - but you can find that just by looking in to the views in that SQL software you shared. Pretty happy we went through what we needed to - now we just need HIVE designed fountain pens and we'd be sorted 😃

$0.00

1 vote

[-]

abh12345.stem (48)I like rockets 4 years ago

now we just need HIVE designed fountain pens and we'd be sorted

Don't look at me as the seller, I cannot be arsed accounting for another income stream!

$0.00

[-]

nickyhavey (74) 4 years ago

Hahaha! Stick to biro!

$0.00

1 vote

[-]

amr008 (67) 4 years ago

Amazing.
I always use python to connect to HiveSQL .

Just curious , if you want to retrieve all the comments for past 30 days, can I know how much time it takes?

$0.00

1 vote

[-]

amr008.stem (26) 4 years ago

Congrats, you were upvoted from this account because you were in Top 25 engagers yesterday on STEMGeeks .
You made a total of 3 comments and talked to 2 different authors .
Your rank is 3 .
For more details about this project please read here - link to announcement post
You can also delegate and get weekly payouts.

$0.00

[-]

steemstem (73)Notification 4 years ago

Thanks for your contribution to the STEMsocial community. Feel free to join us on discord to get to know the rest of us!

Please consider supporting our funding proposal, approving our witness (@stem.witness) or delegating to the @stemsocial account (for some ROI).

Please consider using the STEMsocial app

app and including @stemsocial as a beneficiary to get a stronger support.

$0.00

1 vote

[-]

abh12345 (80) 4 years ago

Thanks guys!

$0.00

[-]

themarkymark (81)STEMGeeks Team 4 years ago

This is used to filter out tags/topics. I don't know these languages and so wish to exclude them from the results. $.tags is a section of C.json_metadata and so wildcard searching, although painful, is less of an issue. There is likely a better way to do this, and I hope someone can chip in.

The way I do it is expand the json out and use it as a join and cross apply, kind of a pain in the ass but works good. Every query that uses this is a brain teaser.

$0.00

1 vote

[-]

abh12345 (80) 4 years ago

Yeah that sounds nasty, but probably better/quicker. Got an example? :D

$0.00

1 vote

[-]

themarkymark (81)STEMGeeks Team 4 years ago

SELECT Json_value(voters.voters, '$.voter')   AS voter,
               Json_value(voters.voters, '$.time')    AS time,
               Json_value(voters.voters, '$.rshares') AS rshares
        FROM   (SELECT votes.[value] AS voters,
                       c.author,
                       c.permlink
                FROM   comments c
                       CROSS apply Openjson(c.active_votes) AS votes
                WHERE  c.author = 'themarkymark'
                       AND c.permlink =
                           '315-pairs-of-socks-have-a-marketcap-of-usd28-million-dollars') AS voters

$0.00

1 vote

[-]

abh12345.stem (48)I like rockets 4 years ago

I like it, cheers!

$0.00

1 vote

[-]

amr008.stem (26) 4 years ago

Congrats, you were upvoted from this account because you were in Top 25 engagers yesterday on STEMGeeks .
You made a total of 3 comments and talked to 3 different authors .
Your rank is 5 .
For more details about this project please read here - link to announcement post
You can also delegate and get weekly payouts.

$0.00

[-]

hivebuzz (74) 4 years ago

Congratulations @abh12345! You have completed the following achievement on the Hive blockchain and have been rewarded with new badge(s) :

You received more than 66000 HP as payout for your posts and comments.
Your next payout target is 68000 HP.
_{The unit is Hive Power equivalent because your rewards can be split into HP and HBD}

_{You can view your badges on your board and compare yourself to others in the Ranking}
_{If you no longer want to receive notifications, reply to this comment with the word STOP}

Check out the last post from @hivebuzz:

	Hive Tour Update - Advanced posting
	Valentine's day challenge - Give a badge to your beloved!

$0.00

[-]

blockchainyouth (63) 4 years ago

Wicked work

$0.00

[-]

abh12345 (80) 4 years ago

Thanks!

$0.00

[-]

drniche (45) 4 years ago

Great post! Thank you for the illustration!Wondering if it is possible to check power down power up metrics using hive SQL..

$0.00

[-]

abh12345 (80) 4 years ago

Yes I think that is there.

Power ups

select * from txtransfers where [from] = 'drniche'
and type = 'transfer_to_vesting'
order by timestamp asc

Power downs are in Vests, and so the HIVE amount is only approximate (as the number VESTS per HIVE is constantly increasing).

select , vesting_shares525/10000000 as approx_HIVE from TxWithdraws
where account = 'abh12345' and vesting_shares <> '0.000000'
order by timestamp asc

From the Market Info tab on https://hivetasks.com/@drniche

$0.00

1 vote

[-]

drniche (45) 4 years ago

This is wonderful!! Thank you so much! I will try it out in the database tonight.

$0.00

[-]

magnacarta (67) 3 years ago

https://ecency.com/hive-168869/@arcange/re-1657869089260.)(This is for @arcange to continue from comment left at

@hivesql sent me the encrypted memo (along with a refund of 0.001 HBD).

Hive Keychain is required to view the encrypted memo, so for memo purposes I switch to Peak D before I launch Keychain.

Keychain doesn't know about my Memo Key, so I need to import it; however, when I go to import the Memo Key I get this error message from Keychain:

This is not a private WIFI.

I'm using a mobile hotspot, so I don't understand why I need to use WIFi of any kind. So at the moment I'm at a dead end. I don't know how to proceed from here.

$0.00

[-]

arcange (78) 3 years ago

I don't think it's a good idea to comment on an unrelated post.

Your problem is related to Keychain, not HiveSQL.
If you read the error message carefully, you will see that it says "This is not a private WIF (and not WIFI) which means that the encoded private key is not valid.

Support for Keychain is provided on their Discord
Support for HiveSQL is also provided on Discord

$0.00