Active users stats on LeoFinance - January Month - DATA and CHARTS + Python codes

in LeoFinance4 years ago (edited)

Good morning to everyone , I hope you are having a great day.
Today's post will be about how many posts were made using #Leofinance or 'hive-167922' tag.

How many users have posted over 100 comments , over 10 posts and much more .

For those who doesn't care about the code , just right to the last heading - DATA and CHARTS .
For nerds like me :) Please cross check the code and see if I can improve the code.

Basic codes

import json
import pandas as pd
import pyodbc 
conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=vip.hivesql.io;'
                      'Database=DBHive;'
                      'uid=Hive-amr008;'
                      'pwd=--hidden--;'
                      'Trusted_Connection=no;')

cursor = conn.cursor()

The above code is to just establish connection with HiveSQL .

Test_Query = pd.read_sql_query('''select * from TxComments where timestamp > GETDATE()-30 ORDER BY ID DESC ''',conn) 

I am getting all the posts + comments for past 30 days and storing it in DataFrame Test_Query .

Output of the Test_Query looks like this -

image.png

Posts/ Comments count with LeoFinance tag

posts_count=0
comments_count=0

complete_list=[]

for i in range(0,len(Test_Query)):
        json_tags=json.loads(Test_Query['json_metadata'][i])
        if 'tags' in json_tags:
            if('hive-167922' in json_tags['tags'] or 'leofinance' in json_tags['tags']):
                if(Test_Query['parent_author'][i]==''):
                    posts_count+=1
                    if 'app' in json_tags:
                        complete_list.append([Test_Query['author'][i],pd.to_datetime(Test_Query['timestamp'][i]).date(),'post',json_tags['app']])
                    else:
                        complete_list.append([Test_Query['author'][i],pd.to_datetime(Test_Query['timestamp'][i]).date(),'post','None'])
                        
                else:
                    comments_count+=1
                    if 'app' in json_tags:
                        complete_list.append([Test_Query['author'][i],pd.to_datetime(Test_Query['timestamp'][i]).date(),'comment',json_tags['app']])
                    else:
                        complete_list.append([Test_Query['author'][i],pd.to_datetime(Test_Query['timestamp'][i]).date(),'comment','None'])
                
   
print(posts_count,comments_count,posts_count+comments_count)

Steps -

  1. Go through each row ( post/comment)
  2. check if 'hive-167922' or 'leofinance' tag is used in that post/comment
  3. if yes , check if it is a post or comment
  4. if post increase the post count otherwise increase the comment count.
  5. Store all in 'complete_list' list.
df_data=pd.DataFrame(complete_list)
df_data.columns=['Author','Date','Type','Front-end']


df_posts= df_check[df_check['Type']=='post']
df_posts.columns=['Author','Date','posts','Front-end']

df_comments = df_check[df_check['Type']=='comment']
df_comments.columns=['Author','Date','comments','Front-end']

Steps-

  1. Take the whole list and convert to DataFrame
  2. Store only posts in df_posts
  3. Store only comments in df_comments
df_posts_count=df_posts.groupby('Author').count()
df_comments_count=df_comments.groupby('Author').count()

I have just grouped it by Author to get the count . This will give me how much a particular user has posted and commented .

Output-

image.png
image.png

count_1to10_posts=0
count_10to30_posts=0
count_30_posts=0

count_1to100_c=0
count_101to250_c=0
count_251_c=0


for i in range(0,len(df_posts_count)):
    if df_posts_count['posts'][i] >0 and df_posts_count['posts'][i] <=10:
        count_1to10_posts +=1      
    elif df_posts_count['posts'][i] >10 and df_posts_count['posts'][i] <=30:
        count_10to30_posts +=1
    else:
        count_30_posts +=1

for i in range(0,len(df_comments_count)):
    if df_comments_count['comments'][i] > 0 and df_comments_count['comments'][i] <=100:
        count_1to100_c +=1
    elif df_comments_count['comments'][i] > 100 and df_comments_count['comments'][i] <=250:
        count_101to250_c +=1
    else:
        count_251_c +=1

        
print('Number of users who have -')

print('Posted 1 - 10 posts:'+str(count_1to10_posts))
print('Posted 11 - 30 posts:'+str(count_10to30_posts))
print('Posted over 30 posts:'+str(count_30_posts))

print('\n')

print('Posted 1- 100 comments:'+str(count_1to100_c))
print('Posted 101 - 250 comments:'+str(count_101to250_c))
print('posted 251 or above comments:'+str(count_251_c))

Steps -

  1. I have just gone through each row and have segregated based on posts/ comments count .
  2. I am just increasing the count for posts/ comments if the condition holds true.

DATA and CHARTS

To all those who skipped the above code part , this is what you need -

Output -
Number of users who have -
Posted 1 - 10 posts:1103
Posted 11 - 30 posts:261
Posted over 30 posts:105

Posted 1- 100 comments:1745
Posted 101 - 250 comments:69
posted 251 or above comments:33

To get better picture , let me show the same data in charts -

Posts

image.png

So the users who have made just 1-10 posts are huge ( 1103 ) followed by 11-30 ( 261 ) .
I have taken 30 as the dividing point because 30 days = 30 posts = 1 post per day and its good to see 105 users here .

Comments

image.png

Those who have commented over 250 times with leofinance tag = just 33 :(

Engage more people , engage moree.

Date wise posts+comments with LeoFinance tag.

image.png

Average is 1960 posts + comments per day with leofinance tag . That is quite amazing :)

Top 20 Authors ( Posts count )

Top 20 Authors ( Comments count )


Let me know if you want your data in the comments, I will get it for you .

df_posts_count[df_posts_count['Author']=='amr008']
df_comments_count[df_comments_count['Author']=='amr008']

Output =
20 posts :)
537 comments .


image.png

Credits:@mariosfame


Follow me on noise.cash where I promote Leo/Hive posts - https://noise.cash/u/AMR
Follow me on twitter - https://twitter.com/NaveshSapad


Regards,
MR.

Posted Using LeoFinance Beta

Sort:  

What kind of Data?

With respect to this particular post - Your post count and comment count ( using leofinance tag) .

If you want any other data , let me know I will try to fetch it.

Posted Using LeoFinance Beta

Really an awesome statistics for the month of January. I hopefully would be included in the mentions when looking up February numbers.

Posted Using LeoFinance Beta

Definitely :) I would love to see you in top 20 too.
!WINE

Posted Using LeoFinance Beta


Cheers, @amr008 You Successfully Shared 0.100 WINE With @uyobong.
You Earned 0.100 WINE As Curation Reward.
You Utilized 1/3 Successful Calls.

wine-greeting


WINE Current Market Price : 9.899 HIVE

Congratulations @amr008! You have completed the following achievement on the Hive blockchain and have been rewarded with new badge(s) :

You received more than 15000 upvotes. Your next target is to reach 20000 upvotes.

You can view your badges on your board and compare yourself to others in the Ranking
If you no longer want to receive notifications, reply to this comment with the word STOP

Check out the last post from @hivebuzz:

Next Hive Power Up Day is February 1st 2021

Hi!
Great Article!
I would love to see my data!
Thank you!
@shortsegments

8A27FE2B548E47C588DD5991BE72F713.jpeg

Thank you @shortsegments I will send it to you as soon as I hop on to my computer. Hope that's okay.

Posted Using LeoFinance Beta

Hey here is your data :)

Posts and comments with leofinance tags =
posts = 56
comments = 351

Posted Using LeoFinance Beta