I've also been seriously considering whether we ought to come up with a way to filter out "curation trains" of follow-on auto-voting, but I haven't found a really good way of detecting that sort of event. My gut says that some sort of clustered time-series analysis would be about the only way to detect a relationship between events like that, but it's hard to say for sure.
The system does not make it very easy to analyze votes by value and in aggregate, so this is always a lot of fun.
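For what it's worth, here is a minimal sketch of the time-clustering idea, in Python. It assumes the votes have already been pulled into (voter, post, unix_time) tuples. The name trailing_voters, the 60-second window, and the three-post minimum are made up for illustration and are not anything the chain provides. It only counts how often one account votes shortly after another on the same post, which is a crude stand-in for a proper clustered time-series analysis.

```python
from collections import defaultdict


def trailing_voters(votes, window=60.0, min_posts=3):
    """Find (leader, follower) pairs where the follower repeatedly votes
    within `window` seconds after the leader on the same post.

    votes: iterable of (voter, post, unix_time) tuples (assumed format).
    Returns {(leader, follower): [delay_seconds, ...]}.
    """
    # Group votes by post so we only compare votes on the same content.
    by_post = defaultdict(list)
    for voter, post, t in votes:
        by_post[post].append((t, voter))

    delays = defaultdict(list)
    for post_votes in by_post.values():
        post_votes.sort()  # chronological order within the post
        for i, (t_lead, lead) in enumerate(post_votes):
            for t_follow, follower in post_votes[i + 1:]:
                gap = t_follow - t_lead
                if gap > window:
                    break  # later votes are even further away
                if follower != lead:
                    delays[(lead, follower)].append(gap)

    # Keep only pairs that repeat on enough distinct posts to be suspicious.
    return {pair: d for pair, d in delays.items() if len(d) >= min_posts}


# Hypothetical sample data: "bot" always votes a few seconds after "alice".
sample_votes = [
    ("alice", "p1", 0), ("bot", "p1", 5), ("carol", "p1", 4000),
    ("alice", "p2", 100), ("bot", "p2", 106),
    ("alice", "p3", 200), ("bot", "p3", 204), ("carol", "p3", 9000),
]
suspects = trailing_voters(sample_votes)
```

A pair that turns up on many posts with short, nearly constant delays looks like follow voting, though a fast human fan would produce much the same pattern, so this can only suggest where to look.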
I've actually been trying to implement a filter that would simply bias presentation in favor of things that I have personally voted up in order to provide an additional form of weighting and presenting the content available on the blockchain, but that's been going slowly at best. I know that any kind of filtering is a big task.
In this case, because upvote bots are nearly impossible to detect if they're not on the big listings, I'm not sure that it's really useful to try to correct for them specifically, as opposed to individualizing the experience of a user who is already seeking out content that they like and signaling that to the system. After all, it's theoretically possible that someone might like the kind of content that is consistently voted up by a bot. I hate to make that kind of assumption up front. It's theoretically possible that, at some point, someone might create a bot which consistently votes up content that I'm interested in. Theoretically.
(You really have a bunch of friends who like to follow you around and flag everything you do, don't you? I don't think that I've ever seen a comment with reasonable content like that get stepped on so absolutely, as hard as feet could go. It's really quite impressive. No one cares enough that I'm a thorn in their side to follow me around quite so slavishly. Good job!)
I have been trying a type of market basket analysis in R, but getting it wrong so far. I'm trying something like: if A votes for Y all the time, who else also votes for Y all the time? I'm very new to R, this will take me months to master lol
R Programming for Data Science by Roger D. Peng might be handy :-)
He and Jeff Leek have a dozen or so courses on R, stats and data science on Coursera. I don't know how Coursera works right now (haven't used it in years), but a few years back I did enjoy two courses by them:
I suspect that you are going to really enjoy working on analysis in R once you get your feet wet. Thinking about these problems from a procedural point of view really throws certain aspects into sharp relief. I find that it really tests my assumptions and expectations about what I should be seeing versus what I am actually seeing, and why I'm seeing those things.
Though you have to be careful with the "Alice votes for Bob all the time, who else votes for Bob all the time?" form of inquiry, because it is perfectly reasonable for human beings to act like that. Especially on Steemit, where providers of anything other than talk about cryptocurrency in general and Steem in particular are rare, it is very easy for real communities of people to end up largely voting for each other if they are interested in the same niche subject.
But that's okay, because you would notice that very quickly once you started pulling those clusters out. This is how we learn.
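As a rough starting point for the "who else votes for Y all the time" question, here is a small sketch in Python rather than R. It is not a real market basket (association rule) analysis, just a simpler share-of-votes count. The input format of (voter, author) pairs, the name loyal_voters, and both thresholds are assumptions chosen for illustration.

```python
from collections import Counter


def loyal_voters(votes, author, min_share=0.5, min_votes=3):
    """Return {voter: share} for voters who cast at least `min_votes` votes
    overall and gave at least `min_share` of them to `author`.

    votes: iterable of (voter, author_voted_for) pairs (assumed format).
    """
    totals = Counter()     # all votes cast by each voter
    to_author = Counter()  # votes each voter cast for the target author
    for voter, voted_author in votes:
        totals[voter] += 1
        if voted_author == author:
            to_author[voter] += 1

    result = {}
    for voter, count in to_author.items():
        share = count / totals[voter]
        if totals[voter] >= min_votes and share >= min_share:
            result[voter] = share
    return result


# Hypothetical sample data.
sample_votes = (
    [("alice", "bob")] * 4 + [("alice", "dave")]
    + [("carol", "bob")] * 3
    + [("erin", "bob"), ("erin", "dave"), ("erin", "frank"), ("erin", "gina")]
)
fans_of_bob = loyal_voters(sample_votes, "bob")
```

Here alice and carol come back as heavy voters for bob while erin does not. As noted above, a high share by itself says nothing about whether the voter is a bot or just a member of a small niche community, so the output is a list of clusters to look at, not a verdict.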