I've come on over from one of the other threads discussing dealing with spam and general trust issues, and being familiar with the original PageRank and other search engine gubbins, I'll say up front that this is an impressive application of those systems and I believe it very well would produce a useful tool.
I would love to see the results of running it on a snapshot of the current Steemit population as a practical test, to see if we can actually visually determine where the bots are. It's important to understand how it behaves on real-world data, which I'm sure you know. Since all that information is publicly available on the blockchain and a static study would be sufficient proof of concept, I think this is something you should definitely pursue.
However, and I accept that compared to the actual usefulness of this tool this objection is going to be weird and minor: it doesn't take into account individual interest or provide an individuated view of the rest of the population. In theory, it would be possible to build a bot swarm that actively upvoted content that you like. (We don't see it happening now because there's no benefit to providing things that people will like compared to gaming the system, but it's a theoretical possibility.) Because your method describes a top-down authoritarian imposition of rank, it eliminates the possibility of beneficial bot swarms.
I also wonder if it's not vulnerable to the same failure mode as the original PageRank, where the assumption is that the more tightly connected a set of pages is, as long as it's connected to the main continent of connections, as it were, the better those pages are assumed to be. This meant that isolated islands of content which weren't linked to above PageRank's threshold of notice were considered not valuable, independent of actual content. Applying that to groups of user-agents, my concern is that isolated groups of friends who have little interaction with the main body of Steemit, because their interests don't overlap with what is commonly thought of as "good content", will be seen as undependable and ranked lower. People who might want to be a member of such a group, and whose interests would coincide with it, will never discover it.
Other than that, brilliant idea. Let's see what it looks like and then we'll go from there.
Hi, thx for your in-depth comment!
You said:
That's -- optimistic, sort of hand-waving one of the biggest deals in AI and social networking of the decade to a SMoP (Simple Matter of Programming), but we'll let that stand for now. I'm not sure UA really is that isolated from discovery; after all, the idea is to really keep discovery from happening on user-agents who are of likely bot-nature. What they do is immaterial compared to what people see.
You might be surprised. Thanks to the fact that neophile adoption tends to radiate out from a few trend setters, while often you have people brought in by already-established folk, when the system is obscure you get folks who find it through more organic non-personal search, jump in, and invite friends. They tend to island for quite a while, comparatively. It's not a problem because they'll just as often use New or Trending or whatever the equivalent is to find new content -- but if the activity in the main body is sufficiently monocultural or low quality, they just don't hook up with any real adoption rate.
But is that useful connectivity information? Is that new account a bot or a person? We don't know and, in fact, we can't know until some more activity happens. A static snapshot can't determine the rank of a new agent. It's an open question how much activity it takes to tell them apart.
That sounds more like an undesirable failure-mode to me rather than a desirable property. They should have voting power. At least one other person cares a lot! Those are important votes. To them.
OK, let me rephrase: my article isn't a PhD thesis but a blog post I wrote in about an hour and updated afterwards. What I meant to say is that UA, as described in my article, was not intended per se for content discovery. I'm sure there are many more applications for it that weren't covered in my article, content discovery among them.
I might; we first have to implement a test environment and analyze the resulting data. This was just a first post.
Correct, that's why I included D, the damping factor, which holds a beginner amount of UA.
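For anyone following along, here is a minimal sketch of what a damping factor does for brand-new accounts. It assumes the textbook PageRank iteration rather than the actual UA formula, which isn't spelled out in this thread; the account names, the `pagerank` helper, and `d = 0.85` are all made up for illustration. The point is only that every account keeps a floor of `(1 - d) / n`, and that two fresh accounts with no connections get identical scores, so a static snapshot can't tell them apart.

```python
def pagerank(follows, d=0.85, iterations=100):
    """Textbook PageRank (an assumption, not the UA formula).

    `follows` maps each account to the list of accounts it follows.
    """
    nodes = sorted(set(follows) | {v for vs in follows.values() for v in vs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by accounts that follow nobody is spread evenly over everyone.
        dangling = sum(rank[u] for u in nodes if not follows.get(u))
        new = {u: (1 - d) / n + d * dangling / n for u in nodes}
        for u in nodes:
            targets = follows.get(u, [])
            for v in targets:
                # Each account splits its rank evenly among the accounts it follows.
                new[v] += d * rank[u] / len(targets)
        rank = new
    return rank


# Hypothetical snapshot: three established accounts and two fresh sign-ups.
snapshot = {
    "alice": ["bob"],
    "bob": ["alice"],
    "carol": ["alice"],
    "new_a": [],  # could be a person
    "new_b": [],  # could be a bot
}
d = 0.85
ranks = pagerank(snapshot, d=d)
floor = (1 - d) / len(ranks)

for name, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{name:6s} {score:.4f}")
```

In this toy snapshot both new accounts sit just above the floor with the same score, while `alice`, who is followed by two accounts, ranks highest.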
Failure-mode? I was merely referring to the monetary upvote value a new user has currently: $0.00
The rest I'm on board with but ...
This is absolutely true. I also think of it as a failure mode which really puts off newcomers to the platform who aren't bootstrapped by knowing people. It's hard to get in, hard to find traction, and hard to get rolling and stay rolling.
I'd very much like to see what the current connectivity map of all the users on Steemit looks like, though. Even if UA is not, ultimately, a particularly good method of determining whether a user agent is a bot or not (and there may be factors that neither one of us is aware of that make that a harder or easier task than expected), as a form of simple understanding of the community as it's architected, the value is definitely high.
Your "connectivity map" === my "follower graph". In my article, I began with the example graph to explain things visually. However, in a real-world application (Steem itself), the follower graph is derived from the follower matrix instead of the other way around.
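To make the matrix-to-graph direction concrete, here is a small sketch. The account names, the `matrix_to_edges` helper, and the orientation of the matrix (row follows column) are assumptions for illustration only, not taken from the article or from any Steem API.

```python
# Hypothetical follower matrix: follows[i][j] == 1 means
# accounts[i] follows accounts[j] (orientation is an assumption).
accounts = ["alice", "bob", "carol"]
follows = [
    [0, 1, 0],  # alice follows bob
    [1, 0, 0],  # bob follows alice
    [1, 1, 0],  # carol follows alice and bob
]


def matrix_to_edges(names, matrix):
    """Turn a 0/1 follower matrix into a list of (follower, followed) edges."""
    return [
        (names[i], names[j])
        for i, row in enumerate(matrix)
        for j, cell in enumerate(row)
        if cell
    ]


edges = matrix_to_edges(accounts, follows)
for follower, followed in edges:
    print(f"{follower} -> {followed}")
```

The resulting edge list is the "follower graph" in its simplest form and could be handed to any graph-drawing tool to get the connectivity map.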
PS: I've just published a follow-up article! You might like it!
UserAuthority (UA): explanations, applications and implications