WEB3 needs an open source algo; bad

in LeoFinance11 months ago

algorithm-circuit-math.jpg

Weird that it's not a priority, eh?

But there's also a good reason for it as well. Creating a good algorithm is very difficult and all the WEB2 social media sites develop them behind close doors in secret. It's weird to think about but the ability to organize and disseminate content is very much intellectual property. After all, all these corporate entities are very much in competition with each other. Can't be giving up trade secrets willy nilly.


The monetization of data.

The goal of a good algorithm is two fold. The first and foremost reason to evolve this kind of tech is to make sure the platform can scale up without becoming bloated and unwieldly. If users are constantly served random content that doesn't capture their interest then the platform will lose business within the attention economy to another network that can captivate them.

The ultimate side-effect of a good algorithm is that it acts as a very important information feedback loop for the company that issues it. Every WEB2 service learns as much as they can about their users because that information ends up equating to actual dollars down the road. By monitoring what users like and respond to within this datamined-loop Big Tech can further refine what they know about the people that utilize their systems.


Volume vs overhead cost.

So why do WEB3 platforms like Hive lack this kind of fundamental development? Well, first of all it can be very difficult and expensive to create something that's useful and actually works. In fact, it can be expensive to create an algorithm that doesn't really work at all.

This is compounded by the fact that the algorithm organizes data within the database into useful modules of grouped information. In many cases an algorithm won't even work unless the userbase is big enough to accommodate it. No point in organizing data if there isn't enough data their to organize in the first place. This is something that projects like Hive and Leo suffer from quite a bit. We simply haven't hit that necessary threshold to begin a nice snowball effect. It takes a lot of data to make a little bit of profit. That's just the nature of the attention economy. Volume matters.

snowball.jpg

Who benefits?

Another big problem with creating a decentralized algorithm is that it would need to be open source. This makes it difficult, if not impossible, to monetize because none of the frontend participants can get an edge, so who's going to put the work in if the developers of the thing have nothing to gain from doing the work? This is a classic open source problem in terms of sustainability and scaling up. This is actually one of the main reasons why crypto is so impressive because it maintains an open source nature while still threading the needle of monetization; something that few codebases have succeeded in doing.

In theory we could fund something like this with DAO money but surprisingly enough I've never seen anyone even attempt such a thing. Again, it's quite possible that there simply isn't enough data on WEB3 systems yet for this to be worth it, yet. In fact we can think an algorithm as a kind of centralized curation, and we already have a basic system set up based on the reward pool. Perhaps that will have to until we hit that scaling wall.

Gaming the system.

Another reason why having an open source algorithm is a potentially self-defeating strategy is that it can be exploited by content creators looking to hit all the required metrics. There are many instances of this happening across all platforms. For a while there were 2-3 minute engagement farming videos about creating ridiculous food dishes.

Ultimate algo farming

Make a 2 minute video with an attractive girl doing something absolutely ridiculous while at the same time acting like it's a totally normal life hack. These videos tick all the boxes. They are short, but not too short. People who watch the video tend to watch them to the end, and they create tons of engagement because people are like, "What the fuck did I just watch." An ice cream scooper on Mexican food? What? Most people had no idea why this trend was happening, but later we found out that it was all about gaming the algorithm and getting as many views as possible.

image.png

TikTok has the best algo?

There seems to be some consensus around the idea that out of all the WEB2 algorithms, TikTok has the best one hands down by exponential margins. Many will attest that TikTok knew things about them before they even realized it about themselves. Pretty crazy stuff. How did they do it?

Well, first of all, TikTok is a Chinese entity and we all know how good China is at surveillance and prediction (although this might be an overly reductive take). Also there have been multiple lawsuits in which TikTok was sued for scraping every last bit of data from devices that they possibly could: including pictures and audio directly from the microphone. While they may not employ these tactics any longer they sure as hell didn't delete the data they had already collected.

TikTok was also in the right place at the right time during the COVID lockdowns. Many users were depressed and locked down within their own home and engaged in "doom-scrolling". Many TikTok users would just be on it all day trying to get those 1-2 minute dopamine fixes. I can only imagine the subtle damage that was done to us collectively during that time. I mean I handled it fine but others not so much. Everybody's different, and I guess that's the entire point of an algo in the first place.

Conclusion

Going back to the topic of volume, Tiktok still has a billion users worldwide. Imagine how much data that is. Compare that to WEB3 and it's clear we are still just a tiny drop in the bucket, and maybe not even that. Tack that onto all the other problems of WEB3, like the inability to monetize open source product (unless it's tokenized which can then turn into an unsustainable scam).

Regardless of all the hurdles in the way, this is still something we very much should be considering going forward. The point of social media is, in essence, content curation; being served the information we want to see. Sometimes this is direct and obvious functionality like text messaging between two friends online, but it gets much trickier when trying to organize massive clumps of public data, choosing who gets seen and who doesn't. We've all witnessed the resulting corruption of centralized systems.

Unfortunately decentralized system have their own problems to content with as well. It's unclear how we are going to resolve them going forward, but I think we can all agree that something has to give. WEB2 is cannibalizing itself and that will leave us to pick up the pieces.

Sort:  

Agreed, but two questions:

  • Do you think Hive has enough fresh content to require a curation algorithm?
  • Isn't the lack of algorithmic curation one of Hive's selling points? If I remember the Steem Whitepaper from back in the day, the main idea behind sharing some of the rewards for curators is to incentivize platform users to discover exciting content early, so they benefit if the content becomes popular.

Great questions

Hive might not have the most content compared to WEB2 services, but that doesn't make it statistically easier to find the content one is looking for. If 1 in 100 posts are really good, it will take 100 tries to find a good post no matter what the total number of posts is. An algorithm can act as a filter that makes it easier to find those gems. Good posts still get buried by shitposts here simply due to the sortation of content by payout.

The Steem whitepaper is a joke and an utter failure for a lot of reasons.
Could write an entire post on that and then some.
It is not a selling point that the network lacks basic tools.

In fact an algorithm isn't a selling point at all for any blockchain, because an algorithm is centralized in nature no matter what. It does not empower the blockchain, but rather the frontends connected to the blockchain, which are all a single server and single controlling entity. An algo on Peakd would not be the same as an algo on Leo or Ecency or Liketu or Hive.Blog. The entire point of the algo is to differentiate different front ends to fill a certain niche. The initial code for them could be open source and cloned, but they'd need to be tweaked a bit for each frontend to get the desired result.

Ultimately an algo for WEB3 would have completely different incentives than WEB2. WEB2 algos exist to hook the user and farm data, whereas a WEB3 algo could have much more pure intentions and generate value solely for the users.

The Steem whitepaper is a joke and an utter failure for a lot of reasons.
Could write an entire post on that and then some.
It is not a selling point that the network lacks basic tools.

True, it probably made sense back in 2016, when it was released, but latest after the hype of 2017, it was clear that some of the assumptions in there do not hold. It would have been good, if someone went back and proposed an updated version, but @dan was already gone and STINC was already lacking in vision and execution.


In fact an algorithm isn't a selling point at all for any blockchain, because an algorithm is centralized in nature no matter what. It does not empower the blockchain, but rather the frontends connected to the blockchain, which are all a single server and single controlling entity. An algo on Peakd would not be the same as an algo on Leo or Ecency or Liketu or Hive.Blog. The entire point of the algo is to differentiate different front ends to fill a certain niche. The initial code for them could be open source and cloned, but they'd need to be tweaked a bit for each frontend to get the desired result.

Centralized or decentralized is not an issue, as long as you have consensus on the data you're going to use (which is exactly what the blockchain does). The idea of different algorithms is quite interesting and you could even further push it, by giving each user an individual algorithm, running locally on the user's device. That would give a similar or even better experience than centralized social media platforms, but that is technically quite challenging. So different algorithms for different front-ends seem more likely.

Pioneering choice of algorithm would indeed be great indeed a great feature for Web3 social media. However, most users would not care and therefore it would only be interesting for a few users.

oÒ, greedy cannibals.
image.png

Most comments do not need to be stored on a Blockchain. Most of the posts do not need to be stored on a Blockchain. People have been naive about web345 for long enough.

Ironic considering that this comment saying comments don't need to be on chain has generated a reward. How much will that reward be worth in 10 years? It absolutely does need to be on chain if it's going to get rewarded. Seeing as the reward period is only 7 days we could be archiving a lot of data off chain once that period is up, in theory. Of course that just creates a new wave of problems in that it becomes much harder to replay a node from the genesis block. There are no one-size-fits all solutions unfortunately.

The whole "generating" rewards is just the most high level view of them all. If anything it might be interesting to keep a permanent track record of some conversation as part of a digital history.

There is an incentive structure in place and the ecosystem has been shrinking since the steam from the STEEM fork cooled down. Hence, the incentive structure is a failure. But that will never be accepted by the biggest whales, therefore the outcome is already predestined. Imagine the market turn around in 2025, $Hive drops another 95%, some people start to gobble up even more and other decide to fork away due to centralization. Hahaha.

Maybe, maybe not. Certain is, all the value of the Hive Blockchain can be view as a projection of energy through all wallets. I believe the Market Cap will at some point drop far below the value of the wallets and strong winds of change will arise. At that point, it is adapt or die for this fork.

that video was disturbing. I didnt understand what was happening to me.... clearly a content trend that I gladly missed out on

Yeah there are dozens if not hundreds of those, many of which were all created by the same guy.

If there were an option to follow tags or mute tags, it would already be very helpful.

Providing that Web3 lacks versatile algorithms for content curation, the challenges arise as follows: In contrast to the giant Web2 players such as TikTok, web3 platforms do not have user base enough for implementing powerful decentralized algorithms. It needs creative approaches and adaptations to overcome such obstacles. What are the ways in which we can generate sustainable responses to content curation as Web3 continues to develop?

It's amazing the pull that platforms like Tik Tok have. Even when it was Muscally or whatever it was, all the kids were using it. I know what you are talking about with the algorithm. I've found myself wondering what I just watched quite frequently these days.

"We simply haven't hit that necessary threshold to begin a nice snowball effect..."

LOLNO.

ActiveUsers.png
IMG source - Peakd.com/@arcange

Pesky users have been persuaded to depart for climes less intemperate through the magic of opinion flagging.

Thanks!

No, I would disagree: this huge jump in uses was during the last bull, and since those users are neither Authors nor Curators, I assume that they are mostly Splinterlands players and that the red graph basically tracks the number of active Splinterlands players.

I think you're right about the Splinterlands during the Covid lockdowns, but after 2017 Hive lost ~1M users that signed up and gave it a shot. Those were authors and curators, until they got flagged off.

Edit: Hive was still Steem back then.

A logarithmic y-axis would be good for this plot: authors and curators are barely visible. But you are right: in 2017, there were many more authors than in 2021 and later on.