This repository uses a Kaggle competition dataset as an example of sentiment analysis.
The competition provides Twitter data: tweets that have already been labeled by sentiment as positive, neutral, or negative.
However, the main goal of the competition is not to classify tweets by sentiment. Instead, we want to extract the words that have the highest impact on determining each tweet's sentiment.
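To make the task concrete, here is a minimal sketch of one naive way to pick "high impact" words. It is not the method used in this repository: the function names (`word_weights`, `top_words`), the frequency-ratio heuristic, and the toy tweets are all assumptions made up for illustration.

```python
from collections import Counter


def word_weights(tweets, labels):
    """For each sentiment, map a word to the fraction of the tweets
    containing that word that carry this sentiment (hypothetical heuristic)."""
    per_label = {}      # sentiment -> Counter of tweets containing each word
    totals = Counter()  # word -> number of tweets containing it
    for text, label in zip(tweets, labels):
        words = set(text.lower().split())  # count each word once per tweet
        per_label.setdefault(label, Counter()).update(words)
        totals.update(words)
    return {
        label: {word: count / totals[word] for word, count in counter.items()}
        for label, counter in per_label.items()
    }


def top_words(text, sentiment, weights, k=2):
    """Return up to k words of `text` with the highest weight for `sentiment`."""
    label_weights = weights.get(sentiment, {})
    scored = [(label_weights.get(word.lower(), 0.0), word) for word in text.split()]
    # sorted() is stable, so equally weighted words keep their order in the tweet
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [word for score, word in ranked[:k] if score > 0]


# Toy data, invented for this example (not taken from the Kaggle dataset)
tweets = ["I love this sunny day", "I hate this rainy day", "this is a day"]
labels = ["positive", "negative", "neutral"]
weights = word_weights(tweets, labels)

print(top_words("I love this sunny day", "positive", weights))  # ['love', 'sunny']
```

Words that appear in tweets of every sentiment ("this", "day") get a low weight, while words seen only in positive tweets ("love", "sunny") get the highest one. A real solution would need something far more robust than this.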
The repository consists of a set of files, each containing a different analysis.
No data is included because we use Kaggle's data directly.
EDA
A quick analysis of all the data, useful for getting some initial ideas.
First Approach
Gives a first idea of how we can achieve our result.
This part only explores the data using a few functions; it never tries to solve the competition's problem. It is just a first approach.
roBERTa
We use a neural network (NN) to try to solve the competition's problem.