Ever since I saw STEEM become the #35,637 most visited website online today according to Alexa, I just had to take a look at what exactly was going on. Alexa is an Amazon-owned company that provides commercial web traffic data and analytics, and let’s just say they must be really impressed with STEEM.
But I decided to go the extra mile. What exactly is going on INSIDE Steem?
So I wrote a little Python application which allowed me to extract a whole lot of data. For example, here are the top 10 tags sorted by the highest payouts.
Not surprisingly steem and introduceyourself are at the top, followed by steemit. But here is where we discover the first patterns - travel, life and photography are making the most money.
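For anyone wondering how a ranking like this could be put together, here is a minimal sketch. It is not the script behind this post: the `posts` list, its field names and the dollar amounts are made-up placeholders, and it simply assumes one tag and one payout per post.

```python
from collections import defaultdict

# Hypothetical sample records; field names and values are illustrative only.
posts = [
    {"tag": "steem", "payout": 120.0},
    {"tag": "travel", "payout": 80.5},
    {"tag": "steem", "payout": 30.0},
    {"tag": "photography", "payout": 95.25},
    {"tag": "travel", "payout": 10.0},
]

def top_tags_by_payout(posts, n=10):
    """Sum payouts per tag and return the n highest-earning tags."""
    totals = defaultdict(float)
    for post in posts:
        totals[post["tag"]] += post["payout"]
    # Sort by total payout, highest first, and keep the first n entries.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]

for tag, total in top_tags_by_payout(posts):
    print(f"{tag}: ${total:.2f}")
```

Swapping the summed payout for a reply count would give the "most replies" ranking in the same way.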
But you may be wondering, who is getting the most replies?
Interesting! Seems like the three forerunners are again at the top, followed by bitcoin, photography, and money. And it seems that photography and money have reached a healthy medium up until now.
But being the nervous ADHD brain that I am, I wasn’t satisfied by this cursory glance at the inner workings - so I decided to grab the Top 100 Trending Posts! Here is what I found out.
The average earnings for a Top 100 post are: $1,881.43
Woah. That’s a lot more than I thought! What about average votes and responses?
Votes: 141.97
Responses: 33.59
What did the most successful “trending” post earn? $15,951.16
But this kind of data isn’t THAT interesting. What I didn’t tell you is that I also grabbed all the text from those posts. So let’s take a look at the most commonly used words in a trending title.
That looks about right, but most of these are common words that don’t tell us anything. What we have to do is get rid of those pesky stopwords (I, me, is, this, that). And a close look at the graph will show you that it’s only 20 examples and not 25 as the title says. Oops! Let’s try again.
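Here is a rough sketch of what the stopword filtering can look like. This is an illustration rather than the exact code used for the graphs: the stopword list is a tiny hand-picked one, the sample titles are invented, and `top_title_words` is a hypothetical helper name.

```python
import re
from collections import Counter

# A tiny hand-picked stopword list for illustration; a real run would need a longer one.
STOPWORDS = {"i", "me", "is", "this", "that", "the", "a", "an",
             "to", "of", "and", "in", "on", "my"}

def top_title_words(titles, n=25):
    """Count the words across all titles, skipping stopwords, and return the n most common."""
    words = []
    for title in titles:
        # Lowercase the title and pull out runs of letters, digits and apostrophes.
        for word in re.findall(r"[a-z0-9']+", title.lower()):
            if word not in STOPWORDS:
                words.append(word)
    return Counter(words).most_common(n)

# Invented sample titles, not taken from the real data.
titles = [
    "This is my first Bitcoin post",
    "Bitcoin and the future of Steemit",
    "Photography on Steemit",
]

for word, count in top_title_words(titles, n=25):
    print(word, count)
```

Passing the same `n` to the function and to the chart title is one way to avoid the 20-versus-25 mismatch mentioned above.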
Ahh, much better. “Looks like today, new Bitcoin posts on Steemit come first and have the most community importance, especially if they have some nice photography... This could get big. Better check your Ethereum accounts, northerners.”
But what about the trending posts themselves? What are the most common words there? Hold on tight, this is a big one.
“I would like one steem, people, and I’d also like to know about those good passwords…”
What I want to know now is whether there is anyone I should be jealous of. Which users have more than one post in the trending top 100?
Looks like simba and thedashguy are kicking ass. Unfortunately, I’ve read somewhere that katecloud has been hacked and can’t get the account back. Sorry to hear that, and I hope it gets resolved.
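Finding the repeat authors boils down to counting names and keeping those that show up more than once. A minimal sketch, assuming the author of each trending post has already been collected into a list (the names below are placeholders, and `repeat_authors` is a hypothetical helper):

```python
from collections import Counter

def repeat_authors(authors):
    """Return a dict of authors who appear more than once, with their post counts."""
    counts = Counter(authors)
    return {author: count for author, count in counts.items() if count > 1}

# Placeholder names, not real trending data.
authors = ["alice", "bob", "alice", "carol", "bob", "alice"]

for author, count in repeat_authors(authors).items():
    print(f"{author}: {count} posts")
```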
Which posts are getting the most votes, the most dollars, and the most responses?
What about the lengths of posts?
The average length of a trending top 100 post is 496.36 words. The shortest is only 7 words long, and the longest is 4,472.
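The length figures can be computed with a simple whitespace split per post. The sketch below is only an illustration under that assumption: the sample texts are invented, `length_stats` is a hypothetical helper, and splitting on whitespace is a crude definition of a "word".

```python
from statistics import mean

def length_stats(bodies):
    """Return the average, shortest and longest word count for a list of post texts."""
    # Count words by splitting each text on whitespace.
    lengths = [len(body.split()) for body in bodies]
    return {
        "average": round(mean(lengths), 1),
        "shortest": min(lengths),
        "longest": max(lengths),
    }

# Invented sample texts, not real posts.
bodies = [
    "one two three",
    "just seven words in this short post",
    "a b",
]

print(length_stats(bodies))
```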
Thank you for taking a look. Please upvote and comment if you can. I will be uploading a lot more content in the future.
P.S. If you would like to learn how to do this yourself, I recommend starting with the book Learning Python or Learn Python the Hard Way, both of which teach you the basics of Python programming. Then you can look at doing things like HTTP requests and driving a browser with Selenium. It takes a while, but it's worth it.
I like stats reports, and this one is no exception. Good job. One thing I'd like to see is the time span for which these stats are relevant. I can recognize the titles of some articles, and thus conclude that they are recent, but the period analyzed should be mentioned.
Those are all great statistics. I think creating an APP showing these in addition to Steem & Steem Dollars prices would be fantastic. In the meantime looking forward to more stat treasures.
This is awesome @filip-martinka
Could you possibly share the python code?
Thank you. At the moment it's entirely spaghetti code, though, and a lot was done in the console, so it's not built-in yet. In the future I might release a better, more coherent version of the code if there is enough demand.
edit: I have a work in progress at http://steemread.com/ now, but it's a very rough draft. I will include information such as word counts, etc. soon. It's very basic at the moment.
Interesting post!
Check out new service i just released!
https://steemit.com/steemit/@cryptotony/steem-link-first-steemit-short-url-service-with-bunch-of-features
Awesome, upvoted. I have a question for you. How have you gathered the data? Did you just collect it off the stream using something like Piston, or have you used some other technique to crawl/scrape steemit?
Thanks. I used selenium and xpath selectors.
Using the steemd API instead of scraping HTML will make your code much more robust. We know people use the steemd API and we try not to introduce breaking changes to it, but the HTML structure of steemit.com pages obviously isn't an API and we feel free to change it as needed, when needed.
Thanks, that was the next step I had in mind, but I wanted to experiment.
Well...
This is a quality post!
At least 4 hours to make (both with the writing, editing and especially the app)
Thank you so much for taking the time to do this. It's really cool to see the numbers in action. It makes it possible for us to also see what categories can be contributed to more.
You even went as far as to provide the titles, categories and an account of the goings-on with a Steemit favorite, @katecloud. I just had to laugh at the words that English has dwindled down to, but it is useful, informative data nonetheless. Thanks so much @filip-martinka. This is so upvote-worthy!
Very interesting, well-researched post. Good to see some steemians are still taking time to release quality material. Kudos. webocel, it's kind of bad form to beg for votes and then use borderline emotional blackmail to try to secure them. If your post is worthy, you will have nothing to worry about, my friend. Trust in your post and have trust in our community.
Similar to SP, SMD tokens cannot be purchased directly on an external exchange. SMD are primarily earned through contributing but can be purchased by converting STEEM tokens to SMD tokens.
Actually, Steem Dollars can now be purchased on external exchanges!
https://poloniex.com/exchange#btc_sbd
https://bittrex.com/Market/Index?MarketName=BTC-SBD
PS Abbreviation of SBD = Steem Backed Dollars
or just SD = Steem Dollars (not SMD please edit)
Nice recommendation! Thanks, I'll keep it on my radar.
Have you heard of Learn Python The Hard Way? Most of the content is free in the form of an e-book but you can also buy it if you want and the video explanations. I don't really care for video lectures though.
I used the same author to 'Learn Ruby The Hard Way' as a start off point after my disappointing experience with CodeCademy. I definitely got my shit pushed in. It was satisfying.
That's a good book too. And after a few projects, reading documentation and examples becomes especially useful as well. In general I believe that it's best to learn by practical example.
I guess people want to talk about steem and steemit more than any other topic!
Right, and that's because that's what the whales are interested in collectively. Until there is a change in how things are done, and/or they wake up, it's going to continue and it's not helping the community either.
Nice quality post my friend!
Excellent Job 👍
bullionstackers
Never argue with a fool, they will lower you to their level, and then beat you with experience.
Will be very keen to see the code for this :) Thanks for opening me up to this possibility
Great job! Very interesting read.
Sometimes I feel like Python is magic
Actually, in many ways you are right. In ancient history, magic was not defined as a suspension of the natural order. Rather, it was the act of using written and spoken language to influence surroundings and the outcomes of history and wealth. By writing code into an interpreter or compiler, you are 'casting spells' which essentially summons thoughtforms and events to occur on their own.
Awesome post! Really insightful.. the best posts on steemit seem to be how to be successful on steemit posts..
find the best STEEMIT sites all in 1 Place
https://steemit.com/steem/@nioctib/best-steemit-links
Love me some metrics and nifty graphs. Well done.
Very useful! I am going to convert my rig to TEXT mining immediately. TEXT to the moon!
Good work, very useful for every member in the community, thank you
Good stuff. Shit payment. Where are @dan and @ned to upvote this?
very good points @filip-martinka awesome!
Very informative and useful. This post deserves to get more upvotes.
Awesome, #bookmarked.
Cool statistics! I also wanted to do such a thing, but now it is already done!
Wow! Interesting data can be found. So I need to use bitcoin, photography, travel and steem in the same post, and that should put me in the top, right? Lol, thanks for sharing.
I like your post! Thanks for your analysis!
well done bro, good job :)
Steemit.Com is now in the KiK, PaxFul, Primedice, The Merkle, & BTC-Jam SEO range of ~30-50k global Alexa ranking - hovering in at the upper edge of the limit beyond which lie only non-mainstream, less popular, 'second or third option' services (in crypto) ; Or fully niche/fringe websites (outside of crypto).
Nice work so far. Now to build up some buywall support.. ;p
Very detailed report. Awesome job. That's worth a hard day's work.
A lot of research
Please re-post if you get this information up on a webpage with interactive abilities. Nice job!
Will do.
Good analysis, cheers!
Interesting!
Can you explain this? I don't get how you reached this conclusion
"Not surprisingly steem and introduceyourself are at the top, followed by steemit. But here is where we discover the first patterns - travel, life and photography are making the most money."
Wow! That is a lot of information. Great work!
Fantastic data mining work. Thank you for compiling this and sharing it with the community. We can all learn a lot from it.
Here is an archive of cryptocurrency app building code on GitHub for anyone creating a Steemit app
https://steemit.com/steem/@marsresident/github-cryptocurrency-app-creation-archive
Thanks for the inside information, it is really much appreciated. Maybe it'll help someone make their first thousand-dollar post!
Wowww! We are definitely living In a brand new world of "word mining"! lol
That's why... https://steemit.com/steemit/@acec/most-voted-posts-in-steemit-are-about-steem
Watson analytics :)
https://steemit.com/steemit/@samether/steemit-winning-personality-insights-from-ibm-s-watson
Good work.
Nice! I made an infographic from some of this data:
https://steemit.com/money/@worstdevever/top-100-new-steem-posts-a-visual-infographic-to-show-a-quick-overview-of-steemit-feel-free-to-share-it
you did not HACK anything... Maybe people should spend more time on dictionary.com ???
if you think writing a program is hacking you have a long way to go...
I vote your post and I ask please that you see my post https://steemit.com/crowdfunding/@webocel/58kd3g-my-dream-needs-your-vote-crowdfunding with your vote I can put my business and thus have a job and be able to feed my children. With just a few clicks you can change my life.
(voting this comment you are also helping)
sorry for the spam but I'm desperate for funds to be self employed
Thanks in advance. Nicolas