Introducing ReplyTally.py: A Python Script for Counting Comments on Hive Posts

in Python, 2 years ago (edited)

Coding things ahead, beware...

Here I'm introducing ReplyTally.py, a new Python script that counts the comments on specified Hive posts. The goal is to provide a simple way to measure how active each commenter is, so that the most active members can be rewarded. I wrote it specifically for my President quiz on Hive; for an example, see https://ecency.com/quiz/@gamer00/which-u-s-president-quiz-2a5f0bdd40abe.


AI generated rat, coding on his computer.

The script first prompts for usernames to exclude. These are stored in a file called "exclusion.list" and are used to leave the post's author and any bots out of the comment count. It then asks for the URL of the post whose comments should be counted; several URLs can be given, separated by commas.
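
The exclusion step can be tried out on its own, without prompts or network access. The following is a minimal sketch rather than the script's actual code: `merge_exclusions` and its parameters are hypothetical names, it assumes the file holds one username per line, and stripping a leading "@" is an extra convenience the script itself does not have.

```python
from pathlib import Path
import tempfile


def merge_exclusions(path, new_names):
    """Merge new usernames into the exclusion file at `path` and return the full set."""
    path = Path(path)
    names = set()
    if path.exists():
        # One username per line; blank lines are ignored.
        names = {line.strip() for line in path.read_text().splitlines() if line.strip()}
    # Assumption: strip whitespace and a leading "@" so "@bot" and "bot" match.
    names |= {n.strip().lstrip("@") for n in new_names if n.strip()}
    path.write_text("\n".join(sorted(names)))
    return names


# Usage with a temporary file, so nothing is written next to the script.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "exclusion.list"
    merge_exclusions(target, ["gamer00", " @somebot "])
    result = merge_exclusions(target, ["gamer00", "anotherbot"])
    print(sorted(result))
```

Because the names live in a set, entering the same username twice has no effect, and writing them back sorted keeps the file stable between runs.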

The script is a work in progress. It currently prints the per-post dictionaries in addition to the scores for each post, and then calculates the total score for each author. The end goal is to have the commenters sorted by score. Error handling and optimization have not been implemented yet.
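
The sorting that is still on the to-do list could look roughly like this. It is only a sketch under assumptions: `rank_commenters` is a hypothetical helper that is not part of the script, and the sample data is made up. It takes the same list of per-post dictionaries that the script builds and returns the commenters with the highest total first.

```python
from collections import Counter


def rank_commenters(author_counts_per_post):
    """Sum per-post comment counts and return (author, score) pairs, highest first."""
    totals = Counter()
    for author_counts in author_counts_per_post:
        # Counter.update() adds the counts to any existing totals.
        totals.update(author_counts)
    # Sort by descending score, then by name so ties come out in a stable order.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample data for illustration only.
sample = [{"alice": 3, "bob": 1}, {"bob": 4, "carol": 2}]
for author, score in rank_commenters(sample):
    print(f"{author}: {score}")
```

With the sample above, bob comes first with 5 comments, followed by alice with 3 and carol with 2.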

I hope you find this tool useful, and I welcome any feedback or suggestions for improvements.

'''
ReplyTally.py

This script counts comments in given Hive posts, and then prints out the totals for each commenter.

The idea is to provide a tool for calculating a 'score' for the most active commenters, so that
they can be rewarded. I originally created this script to help me with my President quiz on Hive, example:
https://ecency.com/quiz/@gamer00/which-u-s-president-quiz-2a5f0bdd40abe

It first asks for usernames to exclude and stores them in an "exclusion.list" file. This
is a nice way to keep the post's author and any bots out of the comment count.

It then asks for the URL of a post from which to count the replies.

The script is still a work in progress, so it has its quirks. It currently prints out dictionaries
in addition to the scores for each post, then calculates a total score per author.

The end goal is to have it sort the commenters based on their score. Error handling is
not implemented yet, and the script is also awfully slow.

Anyway, I hope you like it.
'''


# Import the essential libraries: lighthive to fetch the post and its replies, and defaultdict to tally the totals.
import lighthive
from lighthive.client import Client
from collections import defaultdict

# Initialize the connection to the blockchain via the LightHive RPC interface:
client = Client()

# Read excluded author names from the 'exclusion.list' file, and store any new ones there.
# The names are kept in a set(), so there are no duplicates and membership checks are fast.
def get_excluded_authors():
    exclusionlist = set()
    try:
        with open("exclusion.list", "r") as f:
            for line in f:
                author = line.strip()
                # Ignore blank lines in the file.
                if author:
                    exclusionlist.add(author)
    except FileNotFoundError:
        # No file yet: start with an empty list.
        pass

    # Ask the user for names to exclude:
    excluded_authors = input("Enter the excluded authors (comma-separated): ")
    if excluded_authors:
        for author in excluded_authors.split(","):
            author = author.strip()
            # Skip empty entries left by stray commas.
            if author:
                exclusionlist.add(author)
        with open("exclusion.list", "w") as f:
            f.write("\n".join(sorted(exclusionlist)))
    return exclusionlist

# Ask for post URLs and check that each one starts with "https://".
# Returns None if any of them does not.
def get_post_urls():
    post_urls = input("Enter post urls, separated by a comma: ").split(",")
    post_urls = [url.strip() for url in post_urls]
    if not all(url.startswith("https://") for url in post_urls):
        print("Invalid input. Please enter valid post urls starting with https:// and separated by a comma.")
        return None
    return post_urls

# Go through the urls one by one and call 'count_comments()' on each to count the comments.
def process_post_urls(post_urls, exclusionlist):
    author_counts_per_post = []
    for post_url in post_urls:
        author, permlink = extract_author_permlink(post_url)
        author_counts = count_comments(author, permlink, exclusionlist)
        author_counts_per_post.append(author_counts)
    print("Type of author_counts_per_post before passing to count_author_scores:", type(author_counts_per_post)) # DEBUG
    author_scores = count_author_scores(author_counts_per_post)
    print("Type of author_scores after returning from count_author_scores:", type(author_scores)) # DEBUG
    print(author_scores)
    return author_scores

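# Extract 'author' and 'permlink' parts from the URL string.
def extract_author_permlink(url):
    # Drop a trailing slash so the last two parts are always '@author' and 'permlink'.
    parts = url.rstrip("/").split("/")
    # Remove the leading '@' from the author part, if there is one.
    author = parts[-2].lstrip("@")
    permlink = parts[-1]
    print("author:", author, "permlink:", permlink) # DEBUG
    return author, permlink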

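# Use LightHive to recursively collect the post and all of its replies from the given URL.
# Each collected item is a dictionary with keys such as 'author' and 'permlink'.
# Note that the post itself is included, so its author gets counted unless excluded.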
def get_all_replies(author, permlink, replies):
    comment = client.get_content(author, permlink)
    replies.append(comment)
    comment_replies = client.get_content_replies(author, permlink)
    for reply in comment_replies:
        get_all_replies(reply['author'], reply['permlink'], replies)
    return replies

# Count the comments per author, skipping the excluded authors.
def count_comments(author, permlink, exclusionlist):
    replies = []
    all_replies = get_all_replies(author, permlink, replies)

    author_counts = {}
    print("Type of author_counts before the 'count_comments' for loop:", type(author_counts)) # DEBUG
    for reply in all_replies:
        author = reply['author']
        if author in exclusionlist:
            continue
        if author in author_counts:
            author_counts[author] += 1
        else:
            author_counts[author] = 1
    print("Type of author_counts after the 'count_comments' for loop:", type(author_counts)) # DEBUG
    print(author_counts)
    return author_counts

# Sum the per-post 'author_counts' dictionaries from 'count_comments()' into one total per author.
def count_author_scores(author_counts_per_post):
    if not isinstance(author_counts_per_post, list) or not all(isinstance(d, dict) for d in author_counts_per_post):
        raise ValueError("Expected a list of dictionaries, but got {}".format(author_counts_per_post))

    author_scores = defaultdict(int)
    for author_counts in author_counts_per_post:
        for author, count in author_counts.items():
            author_scores[author] += count
    return author_scores

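# Main entry point. This one runs all the other functions in order.
def main():
    exclusionlist = get_excluded_authors()
    post_urls = get_post_urls()
    # get_post_urls() returns None on invalid input, so stop here instead of crashing.
    if post_urls is None:
        return
    process_post_urls(post_urls, exclusionlist)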

if __name__ == '__main__':
    main()
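
As the docstring admits, the script is awfully slow. One likely contributor is that `get_all_replies` requests every reply a second time with `get_content`, even though `get_content_replies` should already have returned the same data. The sketch below shows one way to avoid that, under stated assumptions: `collect_replies` is a hypothetical replacement, the fetch function is passed in so the example runs against fake data, and each reply is assumed to carry a 'children' count. With LightHive, `client.get_content_replies` would be passed as `fetch_replies`. Unlike the script above, this variant does not include the root post itself in the result.

```python
def collect_replies(author, permlink, fetch_replies):
    """Return every reply below a post without refetching each one.

    `fetch_replies(author, permlink)` is assumed to return a list of dicts
    that each carry at least 'author' and 'permlink'.
    """
    collected = []
    stack = [(author, permlink)]
    while stack:
        parent_author, parent_permlink = stack.pop()
        for reply in fetch_replies(parent_author, parent_permlink):
            collected.append(reply)
            # Skip the extra request when the reply reports no descendants.
            # A missing 'children' field is treated as "unknown, go and look".
            if reply.get("children", 1) > 0:
                stack.append((reply["author"], reply["permlink"]))
    return collected


# Fake in-memory thread standing in for the blockchain (illustration only).
fake_thread = {
    ("gamer00", "quiz"): [
        {"author": "alice", "permlink": "a1", "children": 1},
        {"author": "bob", "permlink": "b1", "children": 0},
    ],
    ("alice", "a1"): [{"author": "carol", "permlink": "c1", "children": 0}],
}


def fake_fetch(author, permlink):
    return fake_thread.get((author, permlink), [])


replies = collect_replies("gamer00", "quiz", fake_fetch)
print([r["author"] for r in replies])
```

Using an explicit stack instead of recursion also avoids hitting Python's recursion limit on very deep threads.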

Thanks for checking it out!


