I've read some comments and understand how the current financial incentives bias your training data. What if we bootstrapped the effort with manual data collection? For example, we could have some volunteers label communities into somewhat broad categories. Then when new users onboard, we let them check any of these categories they are interested in (similar to how other socials work). They are then presented with some of the communities from that category (maybe even based on metrics like general activity - although that might bring in hive farming bias). Then we collect metrics on which communities they participate in and refine our suggestions/matrix completion/whatever?
Maybe we could even flip the incentive problem around to an asset by providing some incentive for the manual work? Just spitballing. I hope this effort is successful!
Bootstrapping with manual tagging of posts to train the ai, so users can select topics they're interested in, is already in progress!
Need help?
I got sidetracked with trying to implement keychain login in PHP, everything else is prepared. I'll announce when it starts!