[Steem Study] Yes, More Flows Out Than In

in #technology, 7 years ago (edited)

Pretty Pictures Tell a Thousand Tales

Yes, yes – I know. I'm supposed to be writing more about role-playing games and other aspects of ludology.

But here's the thing. Once upon a time, many moons ago, I used to be a professional programmer. People actually paid me to look at bits of code, figure out where they were broken, and come up with solutions – not necessarily ones that required modifying source code.

Crazy, I know.

Unfortunately for me, that was representative of a deeper appreciation of enjoying the chase, hunting the elusive bit of information, finding new ways to present it. I have acquired a certain set of tools in my life; tools that make me a nightmare for people like you.

You know, people who would rather read things that aren't in-depth analyses of the processes behind the steem blockchain.

Nevertheless, I persevere!

It occurred to me as I was tinkering about with yesterday's imagery that I had never seen anyone do a serious, data-heavy study of the relationships between the top end of steem blockchain accounts. This might be because the tools are a little abstruse and not exactly easy to wrap your head around. It might be because to do a real, serious study requires some serious computing platform power in order to actually look at the data. It might be because it's boring as crap.

Regardless, that's what I've been up to.

First Cut: No Removals

Yesterday (has it really been so long?), I posted a fairly significant analysis of the activity of the top 78 accounts, which together account for well over 50% of the SP in the steem blockchain.

There were some very, very pretty pictures generated as a result. Not surprisingly, they revealed some definite loci of activity, even when looking at nothing but transactions.

As nice as that was, I thought I could do better.

Firstly, let's expand our scope out to the top 200 accounts by SP. Why not? We've got plenty of horsepower sitting here next to my knee and it's not like the data is hard to cull.

Of course, if we're going to pull the activity of the top 200 accounts by SP, we're going to be looking at transactions. If we are going to be looking at transactions, we are also going to need to be looking at all of the accounts that have engaged in transactions with those top 200, which widens our scope just a little bit.

While we're at it, let's color each of those transactions depending on which direction they're going. If they're going between any of the top 200 accounts, let's make that transaction black. If they're going from an account in the top 200 to an account outside the top 200, that is, value which is moving away from the core, let's make that transaction green. If the transaction is going from one of those ancillary accounts into the top 200, let's make that transaction red.

I admit to this not being a completely arbitrary decision. I want to look at the flow of value in and out of the people that hold the most of it on this blockchain.
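For the curious, here is a minimal sketch of that coloring rule in Python. It is an illustration only, not the code behind the images; the function name `edge_color` and the sample account names are made up for the example.

```python
def edge_color(sender, receiver, top_accounts):
    """Pick an edge color from the direction of a transfer."""
    sender_in = sender in top_accounts
    receiver_in = receiver in top_accounts
    if sender_in and receiver_in:
        return "black"  # between two accounts of interest
    if sender_in:
        return "green"  # value moving away from the core
    if receiver_in:
        return "red"    # value moving from an ancillary account into the core
    return None         # neither side is of interest, so nothing to draw


# Hypothetical stand-in for the top 200 accounts by SP.
top = {"alice", "bob"}
transfers = [("alice", "bob"), ("alice", "carol"), ("dave", "bob")]
colors = [edge_color(s, r, top) for s, r in transfers]
```

The returned strings are ordinary color names, so they could be dropped straight into the `color` attribute of a Graphviz edge.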

So let's look at that naïve solution.

(Click image to go to scrollable, zoomable gigaimage.)

Well. That's enormous.

It also has certain problems. I mean, those might be representative of certain problems in transactions on the blockchain, but they're problems nonetheless.

Immediately, we can see two enormous nexuses of activity, one in the south (@ramta) and one to the east (@ipromote). Our southern node at least seems to do a fair amount of fund output compared to the amount going in. The east one is less obvious as to comparative volume, though clearly there is a lot of funding flowing around in that circle. Interestingly, that is a very tight-knit group, which explains why it seems to be so isolated off to the side.

There are a few other very obvious features which draw the eye. @curie very obviously has a fair amount of input from a narrow number of users, though not insignificant, but fans out rewards – and remember, we're talking about specifically fund transactions here, not votes.

@hr1 in the close northeast is another interesting node. Almost all of the transactions in the last month have been inbound to it. Now, as I said, we can't actually speak as to the value of any of these transactions, but in sheer quantity, this is a hotspot.

@acidyo represents another very interesting feature here, because his pattern of behavior doesn't really match that of any other active whale that doesn't appear to be some sort of curation group or a bot. Like the former, he hands out a lot of outgoing fund transfers to accounts that don't appear to have any other transfer interactions with other whales.

But this is kind of a mess. Can we do better?

Second Cut: A Good Pruning

After studying the previous chart for a fairly extended amount of time, I had to give serious consideration to cutting out a handful of accounts which we had previously considered as "of interest."

@blocktrades, @minnowbooster, @jerrybanfield, and @ipromote.

All four of them are simply overwhelming my ability to make readable graphs out of the content in the database. One of them is the biggest exchange on the blockchain. Two of them are essentially promotion systems. And the last of them is Jerry Banfield, who is a little bit of all of the above.

This is really where we made the jump from studying about 100 accounts to the full 200 top SP-rated accounts because I thought that if I was going to be trimming some of the big users anyway, I might as well add some more. Considering that cutting just those four accounts from our accounts of interest reduced the number of transactions that we were studying by a literal order of magnitude (that's a factor of 10 for you nonscientific folks), it seemed like a reasonable trade-off.
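A rough sketch of that pruning step, assuming transfers are held as simple (sender, receiver) pairs. The `prune` helper and the sample data are hypothetical; the one design assumption worth flagging is that a cut account can still show up as an ancillary node if it transacts with an account that remains of interest.

```python
# Accounts dropped from the "of interest" set in this cut.
EXCLUDED = {"blocktrades", "minnowbooster", "jerrybanfield", "ipromote"}


def prune(accounts_of_interest, transfers, excluded=EXCLUDED):
    """Drop excluded accounts, then keep only transfers that still
    touch at least one remaining account of interest."""
    kept_accounts = set(accounts_of_interest) - set(excluded)
    kept_transfers = [
        (sender, receiver)
        for sender, receiver in transfers
        if sender in kept_accounts or receiver in kept_accounts
    ]
    return kept_accounts, kept_transfers


# Made-up sample data: one excluded account and one ordinary whale.
accounts = {"blocktrades", "whale"}
sample = [
    ("minnow", "blocktrades"),  # only touches an excluded account: dropped
    ("minnow", "whale"),        # touches a remaining account: kept
    ("blocktrades", "whale"),   # kept, with blocktrades now ancillary
]
kept_accounts, kept_transfers = prune(accounts, sample)
```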

But that brought us to a little bit of a different place.

(Click image to go to the zoomable, pannable, gigapixel version.)

This is still a really busy place, but we again see certain patterns of behavior and islands of isolation.

In particular, @ramta is back in the southwest, and his particular isolated island of effect is even more pronounced in the absence of the other massive players. There are a lot of transactions flowing in and out of his account to others who are definitely not in the whale zone.

It's not surprising to see @sweetsssj at the locus of a lot of transactions. Especially in light of my previous discussion of the ongoing Whale Wars, her position is going to be at the focus of a lot of stuff. What I didn't expect to see is that she is just bombarded with incoming transactions with little to no outgoing transactions.

In fact, the relative absence of transactional energy flowing out of the central network gave me a certain amount of pause. There are an absolute shit ton of transfers going on here, but we don't really have enough context to make good decisions about what we are seeing.

Which leads to yet another round of deep cuts in the data supply.

Third Cut: Charming?

Alright. We have the top 200 accounts by SP marked as of interest, minus a few whose transactional load is largely unrelated to activity that we're interested in. We know what they are.

Most of those transactions are transfers pushing some sort of fund, whether it be STEEM or SBD, up into those rarefied reaches.

What if we make two further cuts?

Firstly, let's ignore any transaction of less than 0.5 of whatever is being transferred. That will drop out all of those 0.001 SBD memo spams that I understand whales receive a ton of and help clear out our research space.

Secondly, let's filter out any ancillary account valued at less than 100 vests. That won't actually cut the data feed very much, but it should help cull some of the bots that do nothing but send messages all day with delegated funds.
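Those two cuts can be sketched as a single predicate. This is a minimal illustration under some assumptions: each transfer is a dict with hypothetical `from`, `to`, and `amount` keys, vesting balances sit in a plain lookup table, and an ancillary account with no known balance is treated as holding zero vests.

```python
from decimal import Decimal

MIN_AMOUNT = Decimal("0.5")  # ignore transfers smaller than this
MIN_VESTS = Decimal("100")   # ignore ancillary accounts holding less than this


def keep_transfer(transfer, top_accounts, vests_by_account):
    """Return True if a transfer survives both cuts."""
    if Decimal(transfer["amount"]) < MIN_AMOUNT:
        return False
    for name in (transfer["from"], transfer["to"]):
        if name in top_accounts:
            continue  # the vests cut only applies to ancillary accounts
        # Assumption: an unknown balance counts as zero.
        if vests_by_account.get(name, Decimal(0)) < MIN_VESTS:
            return False
    return True


# Made-up sample data.
top = {"whale"}
vests = {"minnow": Decimal("5000"), "bot": Decimal("10")}
normal = {"from": "minnow", "to": "whale", "amount": "2.000"}
spam = {"from": "minnow", "to": "whale", "amount": "0.001"}
from_bot = {"from": "bot", "to": "whale", "amount": "2.000"}
```

Using `Decimal` rather than floats keeps the comparison against 0.5 exact for amounts that arrive as strings like "0.001".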

What's that look like?

(Click on the image to go to the somewhat smaller than the others gigapixel version.)

Right off the bat, there are some obvious differences.

This image is a lot smaller, for example. It's only 23 megapixels while the other two were pushing 150 megapixels each. This is a much tighter, more constrained space.

We can actually see that some of the whales haven't actually engaged in a qualifying transaction of transferred funds in the last month. Or at least nothing that qualifies to trip our particular breaker for noticing. This is awesome. We're getting somewhere.

You can actually make out some isolated transactional spaces. @vortac in the southeast may be the most obvious, receiving a fistful of notable transactions from non-whales and that's it. @gtg has a similar island nearby, though that one appears to just be giving funds on the outbound side to @gandalf.

Again, @ramta is off in a world of his own with what looks like a massive network of smaller accounts feeding transactions up to him. I haven't checked, but that could be evidence of some kind of bot activity.

I find the central position of @bitrex to be interesting because it tells me that they are extremely active as a recipient of transactions from quite a number of whales.

Locality on these graphs represents centrality in terms of distribution of transactions. Nodes that are close to one another tend to transact with one another.

Now having corrected for magnitude of transfer transaction, what do you see?

Even allowing for that, most of the transactions involving whales on the blockchain are red. That is, they represent people who are not whales transferring some amount of currency of at least 0.5 units to a whale.

At least as far as literal fund transfers, we can tell that the majority of the power players, the top 200, sorted by SP, are definitely not dispensing funds to other people on the blockchain. Funds are moving up to them, not down away from them.
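Eyeballing colors is one thing; the same claim can be checked by simply counting. A minimal sketch, assuming the filtered transfers are available as (sender, receiver) pairs. The function name, the labels, and the sample data are hypothetical, and this counts transfers, not the value they carry.

```python
from collections import Counter


def tally_directions(transfers, top_accounts):
    """Count transfers by direction relative to the accounts of interest."""
    counts = Counter()
    for sender, receiver in transfers:
        if sender in top_accounts and receiver in top_accounts:
            counts["internal"] += 1  # black edges
        elif receiver in top_accounts:
            counts["inbound"] += 1   # red edges
        elif sender in top_accounts:
            counts["outbound"] += 1  # green edges
    return counts


# Made-up sample data, not figures from the actual study.
top = {"whale_a", "whale_b"}
sample = [
    ("minnow1", "whale_a"),
    ("minnow2", "whale_a"),
    ("minnow3", "whale_b"),
    ("whale_a", "minnow1"),
    ("whale_a", "whale_b"),
]
counts = tally_directions(sample, top)
```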

That's definitely interesting.

Epilogue

Look, I know I'm no kind of #BusinessIntelligence guy. I leave the serious job of making sense of financial aspects of blockchain operations to other people who are more inclined to be awesome than I am.

Know your strengths.

But I think that I have a few tools to bring to bear on the question of how transactions are flowing across the public blockchain. That's probably a curiosity that's worthwhile to pursue.

The obvious next step is to do some sort of study of the way votes are traveling on the blockchain. That's going to be a much larger and more intractable problem because of the sheer number of votes which happen per minute.

We may be able to cut down on the amount that we abuse the backend database by deploying some kind of clever query. It should be possible to push the filtering work off onto the server by sending it a list of the accounts we are interested in in terms of vote transactions, but that will require some more research on my part.
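One way that server-side filtering could look is a parameterized query with an IN list. The sketch below uses an in-memory SQLite database purely as a stand-in so that it runs anywhere; the `votes` table and its `voter` and `author` columns are invented for the example and are not the real backend schema.

```python
import sqlite3


def votes_for_accounts(conn, accounts):
    """Fetch only the votes that involve one of the given accounts,
    letting the database do the filtering."""
    accounts = list(accounts)
    if not accounts:
        return []
    # One "?" placeholder per account keeps the query parameterized.
    placeholders = ", ".join("?" for _ in accounts)
    sql = (
        "SELECT voter, author FROM votes "
        f"WHERE voter IN ({placeholders}) OR author IN ({placeholders}) "
        "ORDER BY voter, author"
    )
    # The account list is bound twice, once for each IN clause.
    return conn.execute(sql, accounts + accounts).fetchall()


# Hypothetical schema and data for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE votes (voter TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO votes VALUES (?, ?)",
    [("whale", "minnow"), ("minnow", "whale"), ("minnow", "other")],
)
rows = votes_for_accounts(conn, ["whale"])
```

Only the rows that touch an account of interest ever leave the database, which is the whole point of pushing the filter to the server.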

If this sort of thing is interesting to you or if you learn something, let me know. We will probably all look askance at you as a result, but it's the sort of thing that keeps me face down in database coding for days at a time, and I think we can all agree that keeping me occupied and off the streets is the best for society in general.

Tools

I now have a water-cooled Ryzen 7 1800. Is there any way I could lend some horsepower if you ever need it for more extensive graphs?

Unfortunately, no.

I'm working on moving to some graphing solutions which are much more heavily parallelized, but few actually make use of multi-computer shared processing clustering, at least outside of some serious academic locations where they have "real hardware."

But that's cool. If it really comes to that point, I have enough hardware lying around the house to build a fairly nice distributed network.

But I'd rather write up some more stuff about RPGs and tabletop wargaming than sink the time necessary to turn my house into a vast, distributed cluster. [grin]

I appreciate the offer, however.

The addition of colored edges makes these graphs a lot easier to see.

I find the central position of @bitrex to be interesting because it tells me that they are extremely active as a recipient of transactions from quite a number of whales.

You cut out blocktrades, but not bitrex? Oh, I see, it's with 1 T. Is this just a highly lucrative typo squatter?

Edit: Upon reviewing their wallet, they look to send back all the typo transfers... or most?

Interestingly, @blocktrades is in the "top 200" accounts; big green square. @bitrex is just an ancillary account, one of the accounts merely connected to the top 200 via being in a transfer with one.

I'm not sure exactly what they're doing out there. That's a fine question.

If nothing else, this kind of analysis is good for giving the human brain, the best pattern matching and pattern noticing system known to mankind, the opportunity to look at something and say, "that looks a little weird." It's a way to direct the attention.

If nothing else, this kind of analysis is good for giving the human brain, the best pattern matching and pattern noticing system known to mankind, the opportunity to look at something and say, "that looks a little weird." It's a way to direct the attention.

Yeah, but look at all the random crap that will do the same thing... as awesome as we are at picking out patterns, we just can't cope when there isn't a pattern. Though I suppose philosophically one might argue there is always a pattern.

Philosophically, there is definitely not always a pattern. Pareidolia is a real cognitive failure. Unfortunately, it happens all the time.

Also unfortunately, there are patterns all over the place that really exist. How do we differentiate the two? Well, the first step is to try and recognize what patterns we can, even if some of those are going to be false positives.

You miss 100% of the patterns you don't spot.

Is it just me who can't get to the gigapixel version? I have signed up.

Very odd. How is it failing?

403 Forbidden
Access was denied to this resource

Tried from phone and desktop

Bbl... baby business

Very odd. That's after the account was confirmed?

Yes after.

I got nothin'.

This seems like one of those really good opportunities to email the host.

I've got the same issue, could you put up a different mirror?

Really interesting analysis. I have thought about doing something like this for some time. It's amazing how much of a story this tells. I'd love to see the votes version of this.

Very tedious and technical analysis. I really wonder how you searched it out and made a detailed post about it. Hats off to you buddy.

It really wasn't all that hard to search it out. Most of the code is in my previous post which I linked to near the beginning.

Honestly, the worst part was getting the graphviz settings just right so that it didn't blow up in the middle of some of the larger data manipulations. The actual underlying code dealing with database stuff is all pretty brute force. Nothing with a lot of finesse going on there.

Fascinating stuff and pretty pictures too! I have absolutely no idea how you worked all the information, but thanks for the layman's explanation.

As for keeping you off the streets being the best thing for society, I'd be interested to see the results of unleashing you on society! ;)

I just end up in places like your local Friendly Neighborhood Gaming Store, lurking around the sci-fi miniatures and trying to hustle people into playing indie role-playing games.

It's a bad scene, man. Bad.

Oddly enough I just left a comment for @paulag regarding her forensic data analysis regarding delegated voting behavior.

https://steemit.com/steemit/@paulag/steemit-inc-and-misterdelegation-distributing-power-and-delegatee-ranking

I think that you can definitely add some horsepower to the work she's doing (if you aren't already).

Transparency is a good thing.

She and I have passed a few words regarding some potential collaboration. It could happen.

Transparency is a good thing, but sometimes it can lead to analysis paralysis and simple informational overload. It's the latter that we have a real problem with on the steem blockchain. Information is available and it's not hard to dig up – but there's so much of it and it's all related in fairly complex ways that interpreting what you see is very difficult.

Turning data into information is a skill, and it's one that I haven't quite mastered yet. I'm giving it a go, however.

Turning data into information is a skill, and it's one that I haven't quite mastered yet.

I have to disagree, this post does a very professional job of transforming data into information.

But I tend to compartmentalize jobs and think interpreting that information is the next step in the process. (Actually, interpretation is probably more of a dark art than a science.)

Luckily, the blackest of sorceries have always fascinated me.

I suppose that if I specifically limited my data to things older than six months, stuff like this would probably qualify as necromancy. I'm willing to accept that designation.

It's a little bit complicated trying to determine what level of interpretation one should apply when you have information this dense. Step one is figuring out what you can throw away, and that alone can sometimes require more knowledge than you have available.

The ability to distinguish the wheat from the chaff is a scarce commodity these days.

And with the "Wild West" aspect of cryptocurrencies these days, there's a very strong possibility that things move in inexplicable ways. Seems like the people most knowledgeable about crypto are the ones who also have the ability to stay one step ahead of the crowd.

I figure if I can stay two or three steps behind the leaders, I can still be ahead of the herd.

Personally, I'm coming to think that because of the overlapping emergent processes involved in this particular type of market, there are no people who are "most knowledgeable."

There are people who are really good at lying about it, but no one who actually has actionable knowledge. Rules of thumb, approximations, some ideas, cultist tendencies, systems that won't work – everything you find in a population of gamblers.

But not predictive knowledge.

Would it make a difference to you if the state of affairs was accurately reflected by what I just said? Would you choose to act in a different manner?

there are no people who are "most knowledgeable."

I agree, 100%

There are people who are really good at lying about it, but no one who actually has actionable knowledge.

Humans like to follow and I don't think people even have to be good liars, all they have to do is sound confident in their viewpoint and you'll see a lot of followers latch on.

I simply try to process as much data as my little brain can handle, weed out as much fact from opinion as I can and try to make the best wild ass guess I can.

Excellent analysis.
I like the way you speak through your reasoning, filtering out noise to look for trends.
Really appreciate it.
Resteemed.