How to Build a Winning Tool for Sentiment Analysis of Twitter Data


Today, I'll summarize my continued efforts to build a sentiment analysis tool. The tool uses the Twitter API to consume tweets from 97 different Twitter handles that provide crypto trading signals. The same method can easily be applied to traditional markets, capitalizing on new and potentially lucrative ways to trade stocks online. We'll also review the opportunities to add a neuroevolutionary AI to our bot. You can review the first iteration of this series on Twitter sentiment bots here on my blog.

When it comes to the classic problems facing sentiment analysis applications, it's important to consider how the algorithms work. If we're using a Natural Language Processing tool to score the individual words in a phrase as positive, neutral or negative, wouldn't a product review like 'All these wonderful new features! Although none of them work!' confuse the machine? The true sentiment is negative and is carried by the word 'none', but the overall feeling the program gets from the message comes from words like 'wonderful', 'new' and 'work'. This would yield a false positive in the ranking and scoring of phrases.
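To make the false positive concrete, here is a minimal sketch of word-level scoring. The lexicon and its weights are made up for illustration (they do not come from any real NLP library or from the bot itself), but the failure mode is the same for any scorer that ignores context:

```python
import re

# Hypothetical toy lexicon: word -> sentiment weight (illustrative values only).
LEXICON = {"wonderful": 1.0, "new": 0.5, "work": 0.5, "none": -1.0}

def naive_score(text):
    """Sum per-word sentiment weights, ignoring word order and context."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0.0) for word in words)

review = "All these wonderful new features! Although none of them work!"
score = naive_score(review)

# The positive words outweigh 'none', so a clearly negative review
# ends up labelled as positive.
label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
print(score, label)
```

Because 'none' flips the meaning of 'work' rather than simply adding a negative word to the pile, a per-word sum cannot capture it. That is the gap a context-aware model has to close.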

What we need is a way for the computer to see through the sarcasm in tweets. Certain AIs are capable of establishing context by ranking sentiment, emotion and personality, and we'll apply this when we analyze Twitter data. As Chief Liquidity Officer of Coindex Labs, I have access to a novel way to train on datasets and apply the resulting genomes to configurations - using an AI that's naturally resilient to over-fitting.

AI has other advantages when it comes to sentiment analysis, like telling genuinely unique opinions apart from ones that have been re-hashed, re-worded, and delivered in a separate 'original content' tweet. This doesn't apply as much to our trading bot, as we're limiting our input tweets to those 97 Twitter accounts, but with a larger dataset it becomes critical. To identify the true emotion of the Internet at large, we'd need a way to filter out accounts, run by people or even by computers, that are used to artificially manipulate other people's sentiment bots. It's 'a feint within a feint within a feint,' as Muad'Dib would have said!
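As a rough illustration of catching re-worded tweets, the sketch below uses Jaccard similarity on word sets. This is a simple stand-in technique, not the AI approach described above, and the function names, sample tweets and 0.6 threshold are all assumptions made for the example:

```python
import re

def tokens(text):
    """Lower-case a tweet and return its set of word-like tokens."""
    return set(re.findall(r"[a-z0-9$#@']+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two tweets' token sets (1.0 = same words)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def drop_near_duplicates(tweets, threshold=0.6):
    """Keep a tweet only if it is not too similar to one already kept."""
    kept = []
    for tweet in tweets:
        if all(jaccard(tweet, seen) < threshold for seen in kept):
            kept.append(tweet)
    return kept

# Made-up sample tweets, purely for illustration.
tweets = [
    "BTC is breaking out, buy now before it moons",
    "Buy BTC now, it is breaking out before it moons!",
    "ETH looks weak, taking profit here",
]
unique = drop_near_duplicates(tweets)
print(unique)
```

Word-set overlap only catches shallow re-wording. A paraphrase that swaps in synonyms would slip through, which is where a model that understands context earns its keep.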

Once we've created our new model and used it to weed out the bad apples, our analysis of sentiment will be far more effective. As far as sentiment scores go, this tweet that my Twitter bot picked up is ahead of the bell curve - but, on manual inspection, it looks like a joke rather than actual sentiment one way or the other.

Entries like these should be ignored in the final product. I've toyed with the idea of requiring human intervention before acting on a given signal, but the delays involved would mean that the average trader would miss 70% of the swing. Ideally, the bot would act on the 'good' signals within a matter of seconds or minutes - getting in ahead of the overall shift in the market as others pick up on the sentiment, and getting out before everyone else exits. This is how the goldmoney is made.

This is what a profitable tweet looks like. Our win rate is currently well above 50%, and combined with a creative trailing take-profit, that lets us establish an edge on the market.
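For readers unfamiliar with the idea, a trailing take-profit can be sketched as follows. This is a generic version under my own assumptions, not the bot's actual exit logic: the function name, the 5% activation level and the 2% trail are hypothetical, and the price series is made up.

```python
def trailing_take_profit(entry, prices, activate_pct=0.05, trail_pct=0.02):
    """Return (index, price) of the exit, or None if it never triggers.

    The exit is armed once price has gained activate_pct over entry, then
    fires when price drops trail_pct below the highest price seen so far.
    """
    peak = entry
    armed = False
    for i, price in enumerate(prices):
        peak = max(peak, price)
        if not armed and price >= entry * (1 + activate_pct):
            armed = True
        if armed and price <= peak * (1 - trail_pct):
            return i, price
    return None

# Made-up price path after a signal, entered at 100.
prices = [101, 104, 106, 110, 109, 107, 105]
exit_point = trailing_take_profit(100, prices)
print(exit_point)
```

The point of trailing rather than using a fixed target is that the exit level ratchets up with the peak, so a strong swing is allowed to run while a reversal still locks in most of the gain.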

The next step is to build out the AI's training set with loads of tweets to run analysis on. These will then serve as a benchmark to compare against forward-testing data from live incoming tweets from our followed list. Then, the evolutionary part of the neuroevolutionary AI will create competing sets of genomes with different input variables, and the best-performing genome will become the champion that runs the live trader. Each new generation of genomes will then (likely, by and large) perform better than the previous one, and a new champion might be selected.
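The champion-selection loop described above can be sketched roughly like this. It is a deliberately simplified evolutionary strategy over plain weight vectors, not the Coindex Labs AI or a full neuroevolution library; the fitness function, mutation scheme and synthetic training data are all assumptions for illustration.

```python
import random

def fitness(genome, samples):
    """Hypothetical fitness: share of samples whose direction is predicted."""
    correct = 0
    for features, outcome in samples:
        signal = sum(w * x for w, x in zip(genome, features))
        if (signal > 0) == (outcome > 0):
            correct += 1
    return correct / len(samples)

def mutate(genome, rng, sigma=0.3):
    """Return a copy of the genome with Gaussian noise added to each weight."""
    return [w + rng.gauss(0, sigma) for w in genome]

def evolve(samples, n_inputs, generations=20, pop_size=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(pop_size)]
    champion = max(population, key=lambda g: fitness(g, samples))
    history = [fitness(champion, samples)]
    for _ in range(generations):
        # The champion survives unchanged and competes against its mutants,
        # so the best fitness can never get worse between generations.
        population = [champion] + [mutate(champion, rng)
                                   for _ in range(pop_size - 1)]
        champion = max(population, key=lambda g: fitness(g, samples))
        history.append(fitness(champion, samples))
    return champion, history

# Synthetic stand-in for (tweet features, later price move) pairs.
data_rng = random.Random(1)
samples = []
for _ in range(50):
    features = [data_rng.uniform(-1, 1) for _ in range(3)]
    outcome = 0.8 * features[0] - 0.2 * features[2]
    samples.append((features, outcome))

champion, history = evolve(samples, n_inputs=3)
print(history[0], history[-1])
```

Scoring on the training samples alone would reward over-fitting, which is why the champion should be judged on the forward-testing tweets before it is allowed to run the live trader.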

Are you more interested in how to automagically grow your Twitter following? Check out this article I wrote on consuming the Twitter API to hack your account's growth! All in all, we have a clear plan for how our analysis of Twitter for fun & profit will continue to become more sophisticated, and we expect the win rate to increase significantly as it does.