(Previous Post: Part 4)
Tags
When someone submits a story to Steemit, they can add up to five tags to the article (the first tag is the story's main category). On Steemit.com, tags (or topics) work like forums, or like Subreddits if you are familiar with Reddit. Steemit lets you browse a topic's recent, hot, trending, and promoted posts.
However, tags are not moderated, so consistency becomes a problem. Many tags exist that probably would have been consolidated if this were a moderated platform. For example, there is a #steem-dev and a #steemdev tag, which differ only by a single hyphen. Someone who wanted to post to both would need to use up two of their five possible tags.
Note: This is a problem that chainBB attempts to solve.
Retrieve Tags
Similar to the getFollowers() API in Part 2, there is an API function that lets you chunk through all of the tags, 1000 at a time, in order of popularity. getTrendingTags() takes a starting tag name and the number of results to return (max 1000). If the starting tag name is empty, it begins at the top.
We can fetch all of the tags (about 20,000 when this was written) into a single array by using a generator function that yields on each call to getTrendingTags():
// Assumes steem (steem-js) and P (Bluebird) are already in scope
// Tags most popular first. First argument is for chunk start
let getTagsList = P.coroutine(function* () {
  let start = ''
  let count = 0
  let allTags = []
  do {
    yield steem.api.getTrendingTagsAsync(start, 1000)
      .then(function (result) {
        if (result == null || result.length === 0) {
          count = 0
        } else {
          count = result.length
          // The last tag of this chunk becomes the start of the next one
          start = result[count - 1].name
          Array.prototype.push.apply(allTags, result)
        }
      })
      .catch(function (err) {
        // There seems to be an error in the data or node code that causes requests
        // at the bottom of the list to fail (i.e., data after tag "yunks"). For the
        // sake of this demo, we will just exit
        count = 0
      })
  } while (count === 1000)
  return allTags
})
getTagsList().then(console.log)
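The same chunking loop can also be written with async/await instead of a Bluebird coroutine. The sketch below is only an illustration: fetchAllTags is a hypothetical name, and fakeGetTrendingTags is a made-up stand-in so the loop can run offline. It assumes each chunk begins with the start tag itself, which is worth checking against a real node.

```javascript
// Made-up stand-in for steem.api.getTrendingTagsAsync so the sketch runs offline.
// ASSUMPTION: the start tag is returned as the first item of the chunk.
const fakeTags = Array.from({ length: 25 }, function (_, i) {
  return { name: 'tag' + i }
})
function fakeGetTrendingTags (start, limit) {
  const from = start === '' ? 0 : fakeTags.findIndex(function (t) {
    return t.name === start
  })
  return Promise.resolve(fakeTags.slice(from, from + limit))
}

// Hypothetical helper: page through every tag, `limit` items per request
async function fetchAllTags (fetchChunk, limit) {
  const allTags = []
  let start = ''
  while (true) {
    const chunk = await fetchChunk(start, limit)
    if (!chunk || chunk.length === 0) break
    // Skip the first item when it repeats the previous chunk's last tag
    const repeated = start !== '' && chunk[0].name === start
    const fresh = repeated ? chunk.slice(1) : chunk
    if (fresh.length === 0) break
    allTags.push(...fresh)
    // A short chunk means the end of the list was reached
    if (chunk.length < limit) break
    start = chunk[chunk.length - 1].name
  }
  return allTags
}

fetchAllTags(fakeGetTrendingTags, 10).then(function (tags) {
  console.log(tags.length) // 25
})
```

With the real client, the first argument might be a small wrapper such as (start, limit) => steem.api.getTrendingTagsAsync(start, limit), though that has not been tested here.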
Each item in the returned array will contain data resembling:
{
  comments: 48673,
  name: "steemit",
  net_votes: 478456,
  top_posts: 29484,
  total_payouts: "3946608.019 SBD",
  trending: "243461804"
}
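Note that total_payouts and trending arrive as strings, so they need converting before any numeric comparison. Here is a minimal sketch, assuming the payout string always has the "amount SBD" shape shown above. parsePayout and sortByPayout are hypothetical helper names, not part of steem-js.

```javascript
// Hypothetical helper: extract the numeric amount from "3946608.019 SBD".
// parseFloat stops reading at the first character that is not part of a number.
function parsePayout (payout) {
  return parseFloat(payout)
}

// Hypothetical helper: return a new array, highest total payout first
function sortByPayout (tags) {
  return tags.slice().sort(function (a, b) {
    return parsePayout(b.total_payouts) - parsePayout(a.total_payouts)
  })
}

// Sample items trimmed down from the data shown in this post
const sample = [
  { name: 'steem-dev', total_payouts: '18893.227 SBD', trending: '256195' },
  { name: 'steemit', total_payouts: '3946608.019 SBD', trending: '243461804' },
  { name: 'steemdev', total_payouts: '72089.717 SBD', trending: '959477' }
]

console.log(sortByPayout(sample).map(function (t) { return t.name }))
// [ 'steemit', 'steemdev', 'steem-dev' ]
```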
What Can We Do With This?
If you just want to create a tag cloud, then retrieving all 20,000 tags would be pretty wasteful - the important tags for a tag cloud would almost certainly be in the top 1000 or fewer. However, the entire list of tags could be used to suggest similar tags that are more popular.
For instance, suppose that someone creates a post with your web app and wants to tag it with "steem-dev". While this is a category with great content, the "steemdev" one is even more popular, so your app may want to suggest that as an alternative and let the author decide.
String-based filtering that strips non-alphanumeric characters or normalizes accented characters might look like this:
// Assumes lodash is in scope as _
function findSimilarTags (allTags, tag) {
  const input = tag.toLowerCase()
  return _.filter(allTags, function (t) {
    return t.name === input ||
      t.name === _.deburr(input) ||
      t.name === input.replace(/\W/g, '')
  })
}
// allTags only exists once getTagsList() resolves, so search inside then()
getTagsList().then(function (allTags) {
  console.log(findSimilarTags(allTags, "steem-dev"))
})
Results:
[{
  comments: 225,
  name: "steemdev",
  net_votes: 4971,
  top_posts: 82,
  total_payouts: "72089.717 SBD",
  trending: "959477"
},
{
  comments: 64,
  name: "steem-dev",
  net_votes: 365,
  top_posts: 18,
  total_payouts: "18893.227 SBD",
  trending: "256195"
}]
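To turn matches like these into a suggestion, the app could pick the most active match that differs from what the author typed. This is a rough sketch, assuming top_posts is a reasonable popularity measure (comments or net_votes would work the same way). suggestTag and normalizeTag are hypothetical names, and plain string methods stand in for lodash so the block runs on its own. Unlike the filter above, it normalizes both the input and each stored tag name.

```javascript
// Lowercase, strip accents (NFD separates letters from their combining marks),
// then drop every non-word character such as hyphens
function normalizeTag (tag) {
  return tag.toLowerCase()
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '')
    .replace(/\W/g, '')
}

// Hypothetical helper: returns the name of a more popular look-alike tag,
// or null when the typed tag is unknown or already the most popular match
function suggestTag (allTags, tag) {
  const input = tag.toLowerCase()
  const key = normalizeTag(tag)
  const matches = allTags.filter(function (t) {
    return normalizeTag(t.name) === key
  })
  if (matches.length === 0) return null
  const best = matches.reduce(function (a, b) {
    return b.top_posts > a.top_posts ? b : a
  })
  return best.name === input ? null : best.name
}

// Sample items trimmed down from the results shown in this post
const sampleTags = [
  { name: 'steemdev', top_posts: 82 },
  { name: 'steem-dev', top_posts: 18 },
  { name: 'steemit', top_posts: 29484 }
]

console.log(suggestTag(sampleTags, 'steem-dev')) // steemdev
console.log(suggestTag(sampleTags, 'steemdev'))  // null
```

Returning null rather than the typed tag keeps the decision with the author: the app only prompts when there is something better to offer.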
(Next Post: Part 6)
Thanks for writing this series. I really appreciate the detail and examples you put in these posts. I plan on going back through all of your posts on this topic when I'm a little bit better at programming.
Thanks for the post and explanation. This kind of detail helps me understand how Steem works under the hood. I've not been here long, and already I can see how devs can add more value by better serving different communities.
Well done, this is exactly the problem I've been considering while posting here, and exactly the solution I've been wishing had been implemented already - a tag analysis & suggestion script.
That really would help, a lot, especially if it suggested similar tags used in hot or trending posts. Now that would be useful.