From Tweets to Tickers: Trading Tesla Using Musk’s Sentiments

Introduction

Natural language processing (NLP) has recently become a powerful tool in finance. Companies such as JPMorgan, Deutsche Bank, and State Street are using NLP software to extract alternative data such as sentiment from social media platforms, distil key insights from research reports, and even flag possible executive deception by analysing earnings call transcripts.

In this paper, we apply NLP methods to sentiment analysis, classifying the tweets of Elon Musk into three categories: positive, negative and neutral. The goal of this research experiment is to test, via Granger causality, whether the sentiment of Musk’s past tweets has any predictive power for Tesla’s daily stock price, and then to build a strategy on top of three signals derived from the tweets’ sentiment.

Past literature

Numerous methodologies and conclusions have been put forward in the field of behavioral finance, with particular emphasis on the influence of social media sentiment on stock market dynamics. One of the earliest studies [1] concentrated on Yahoo! message boards and how they affected stock performance: sentiment had only a small impact on individual stocks, but showed a strong association with the Morgan Stanley Technology Index. Another study [2] aimed to predict the direction of the stock market using dictionary-based sentiment analysis of Twitter data; its results suggest that certain moods extracted from Twitter could Granger-cause movements in the Dow Jones, indicating a strong link between public sentiment and stock market values. A third report [3] identified three channels through which social media affects the stock market: discussion volume, message content richness, and user popularity and competence, arguing that increased engagement on online platforms is positively correlated with greater stock trading activity.

Our study differs from these past approaches in that our model uses a transformer-based architecture known as RoBERTa to classify the sentiment expressed in a tweet. This methodology provides a more accurate and detailed evaluation than previous, less complex methods that relied on dictionaries, such as the Harvard IV-4 Psychological Dictionary for sentiment analysis, or on OpinionFinder and the Google Profile of Mood States to evaluate public mood. Moreover, we focus specifically on tweets related to Tesla, a company we selected because of its widespread retail following, its market volatility, and a very active insider on Twitter: its CEO, Elon Musk. Finally, we use Granger causality tests to investigate whether past sentiment can forecast movements in the current stock price, going beyond the simple correlation analysis seen in past works.
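To illustrate the idea behind the Granger causality test: the bivariate version is an F-test comparing a regression of the target series on its own lags (restricted model) against one that also includes lagged values of the candidate predictor (unrestricted model). Below is a minimal NumPy sketch of that F-statistic; it is an illustration of the test's logic, not the exact implementation used in our study (in practice a library routine such as statsmodels' `grangercausalitytests` would typically be used):

```python
import numpy as np

def granger_f_stat(x, y, lags=1):
    """F-statistic testing whether past values of x help predict y
    beyond y's own past (a bivariate Granger-causality test)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    Y = y[lags:]
    # Lag matrices: column k holds the series shifted back by k+1 steps
    y_lags = np.column_stack([y[lags - k:-k] for k in range(1, lags + 1)])
    x_lags = np.column_stack([x[lags - k:-k] for k in range(1, lags + 1)])
    ones = np.ones((len(Y), 1))
    X_r = np.hstack([ones, y_lags])           # restricted: y's own lags only
    X_u = np.hstack([ones, y_lags, x_lags])   # unrestricted: adds x's lags
    rss_r = np.sum((Y - X_r @ np.linalg.lstsq(X_r, Y, rcond=None)[0]) ** 2)
    rss_u = np.sum((Y - X_u @ np.linalg.lstsq(X_u, Y, rcond=None)[0]) ** 2)
    df_u = len(Y) - X_u.shape[1]              # residual degrees of freedom
    return ((rss_r - rss_u) / lags) / (rss_u / df_u)
```

A large F-statistic (relative to the appropriate F-distribution critical value) indicates that lagged sentiment carries information about future returns beyond what returns' own history provides.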

Data Collection

Our research relies on two datasets. We first obtained an existing dataset of Elon Musk’s tweets and the likes they received between 2010 and 2022, and then collected Tesla’s adjusted closing stock price at daily frequency over the same period. To guarantee that we only included the most relevant tweets, we scraped Tesla’s Wikipedia page to identify the words most frequently associated with Tesla, such as “tesla”: 396, “model”: 115, “company”: 90, “vehicles”: 71, “musk”: 56, “battery”: 48. With this in mind, we filtered on the following words: “Tesla”, “Battery”, “Model S”, “Model X”, “Model 3”, “Cyber Truck”, “Roadster”, and “Semi Truck”. After this operation, we found that of Elon’s 17k tweets between 2010 and 2022 only 3k were relevant for our Tesla sentiment analysis; these 3k tweets were then classified as negative, positive or neutral.
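The keyword filter described above can be sketched in a few lines of pandas. The column name `tweet` and the exact matching rule (a case-insensitive substring match) are assumptions for illustration; the actual filtering logic may have differed:

```python
import pandas as pd

# Hypothetical keyword list distilled from the Wikipedia word counts
TESLA_KEYWORDS = ["tesla", "battery", "model s", "model x", "model 3",
                  "cyber truck", "roadster", "semi truck"]

def is_tesla_relevant(text: str) -> bool:
    """Flag a tweet as Tesla-relevant if it mentions any keyword
    (case-insensitive substring match)."""
    text = text.lower()
    return any(kw in text for kw in TESLA_KEYWORDS)

# Toy usage on a DataFrame with an assumed 'tweet' column
tweets = pd.DataFrame({"tweet": [
    "Model 3 production is ramping up",
    "Just had a great lunch",
    "Tesla battery day soon",
]})
tweets["tesla_relevant"] = tweets["tweet"].apply(is_tesla_relevant)
```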

However, after removing the neutral tweets from the dataset and plotting the daily totals of positively and negatively labeled tweets (the gray series in Charts 1, 2 & 3), we realized that sentiment data was most frequent between 2018 and 2022, so we focused exclusively on this period.

Sentiment Classification

To classify the tweets with their corresponding sentiment we used the RoBERTa (Robustly Optimized BERT Pretraining Approach) model. This model is an advanced version of the BERT architecture, designed to deliver superior and more nuanced performance in tasks such as sentiment classification.

BERT

First introduced by Google in 2018, BERT (Bidirectional Encoder Representations from Transformers) represents a significant advancement in natural language processing (NLP) thanks to its bidirectional reading capability, which contrasts with the unidirectional approach of traditional models such as recurrent neural networks (RNNs) and long short-term memory (LSTM) networks. Unlike its predecessors, BERT can resolve ambiguous language by considering left-to-right and right-to-left context simultaneously, thanks to its transformer architecture. While BERT was pre-trained on Wikipedia texts, it can be fine-tuned with task-specific data such as question-and-answer datasets. BERT is pre-trained on two tasks: masked language modelling (MLM), which predicts masked words from their context, and next sentence prediction (NSP), which discerns whether two given sentences logically follow each other. BERT can then be fine-tuned for downstream tasks such as sentiment classification.

RoBERTa

RoBERTa (Robustly optimised BERT approach) surpasses BERT through multiple advancements. It leverages a more extensive range of pre-training data, including datasets from BookCorpus, CC-News, Common Crawl, and Wikipedia, leading to richer linguistic representations. Additionally, RoBERTa extends training durations and batch sizes, capturing subtler language patterns and relationships. Unlike BERT’s static masking, RoBERTa employs dynamic masking, enhancing model robustness by masking different token subsets in each epoch. By omitting the next sentence prediction (NSP) task during pre-training and employing larger training steps, RoBERTa optimises its training process for better performance on downstream tasks. These collective improvements enable RoBERTa to outperform BERT across various natural language processing (NLP) tasks.
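The difference between static and dynamic masking can be illustrated with a toy sketch. This is only an illustration of the idea, not RoBERTa's actual tokeniser or masking code: a fresh masking pattern is drawn each time the function is called (once per epoch), whereas static masking would fix the pattern once at preprocessing time:

```python
import random

MASK = "<mask>"

def dynamic_mask(tokens, p=0.15, seed=None):
    """Return a copy of `tokens` with roughly a fraction p of positions
    replaced by <mask>. Calling this once per epoch gives each epoch a
    different masking pattern (RoBERTa-style dynamic masking); BERT-style
    static masking would apply it once and reuse the result."""
    rng = random.Random(seed)
    return [MASK if rng.random() < p else t for t in tokens]

tokens = "the quick brown fox jumps over the lazy dog".split()
epoch1 = dynamic_mask(tokens, p=0.5, seed=1)  # masking pattern for epoch 1
epoch2 = dynamic_mask(tokens, p=0.5, seed=2)  # a different pattern for epoch 2
```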

Applying RoBERTa

To classify the tweets we leveraged Hugging Face’s Transformers library, using a base RoBERTa model pre-trained on 124 million tweets to label each tweet with its inferred sentiment: positive, negative or neutral. We apply the RoBERTa model only to Musk’s 3k Tesla-relevant tweets and use a “yield”-based generator with batch processing so that we can process the dataset without running out of RAM. A glimpse of our RoBERTa-classified data can be seen in Table 1’s “sentiment” column. As shown by the “Tesla relevant” column, the three rows displayed belong to the 3k Musk tweets that focus on Tesla Inc.
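The generator-based batching pattern can be sketched as follows. Here `classify_batch` is a stand-in for the real model call (in practice, something like a Hugging Face text-classification pipeline over the pre-trained Twitter RoBERTa model); we replace it with a trivial keyword rule so the sketch runs standalone without downloading weights:

```python
def batches(items, batch_size=32):
    """Yield successive slices of the dataset so that tweets are tokenised
    and scored a batch at a time, keeping peak memory usage bounded."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def classify_batch(texts):
    """Stand-in for the RoBERTa inference call: a trivial keyword rule
    returning one of 'positive' / 'negative' / 'neutral' per tweet."""
    out = []
    for t in texts:
        t = t.lower()
        if "great" in t or "love" in t:
            out.append("positive")
        elif "bad" in t or "issue" in t:
            out.append("negative")
        else:
            out.append("neutral")
    return out

tweets = ["Love the new Model 3", "Production issue at the factory", "Battery day"]
sentiments = [label for batch in batches(tweets, batch_size=2)
              for label in classify_batch(batch)]
```

Because `batches` is a generator, only one slice of tweets is materialised and scored at a time, which is what keeps the 3k-tweet run within memory limits.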

Table 1

Source: BSIC, Kaggle

Signal Development

Our signals are based on the daily differential between the tweets we classified as positive and those classified as negative, neutral tweets having been excluded from the score. For our first signal (Chart 1) we calculate the daily sentiment frequency differential using the formula:

The second signal (Chart 2) modifies the previous formula by weighting the sentiment frequency by the total number of likes received by positive and negative tweets, according to the formula:

Lastly, the third signal (Chart 3) takes the daily difference between the total likes of positive tweets and the total likes of negative tweets, according to the formula:
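As the formula images are not reproduced here, the sketch below shows one plausible pandas implementation of the three daily signals. The column names (`date`, `sentiment`, `likes`) and the exact normalisations are our assumptions based on the verbal descriptions above, and may differ from the formulas used in the charts:

```python
import pandas as pd

# Toy classified dataset; the real one holds ~3k tweets with these assumed columns
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-04", "2021-01-04", "2021-01-05"]),
    "sentiment": ["positive", "negative", "positive"],
    "likes": [100, 50, 30],
})
df = df[df["sentiment"] != "neutral"]  # neutral tweets are excluded from the score

# Per-day tweet counts and total likes, split by sentiment
grp = (df.groupby(["date", "sentiment"])["likes"]
         .agg(["count", "sum"])
         .unstack(fill_value=0))
n_pos, n_neg = grp[("count", "positive")], grp[("count", "negative")]
l_pos, l_neg = grp[("sum", "positive")], grp[("sum", "negative")]

signal1 = n_pos - n_neg                      # daily sentiment frequency differential
signal2 = (l_pos - l_neg) / (l_pos + l_neg)  # differential weighted by likes (one reading)
signal3 = l_pos - l_neg                      # difference of total likes per day
```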

In the three charts below we plot each resulting sentiment signal together with the stock price. We also plot the daily totals of positive and negative tweets, which justify our decision to focus on the data from 2018 to 2022, a timeframe with fewer gaps in the sentiment data needing to be filled by interpolation.
