unnamed-catalyst/Fine-Tuning-Transformers


Fine-Tuning Transformer Models for Sentiment Analysis on Twitter Data

Code for a comparative analysis of the performance of fine-tuned transformer models on climate change data. The transformer models used were BERT, DistilBERT and RoBERTa.

System Specifications

The experiments were conducted on the following system:

  • Processor: Intel Core i5-10500H CPU @ 2.50GHz
  • Graphics Card: NVIDIA GTX 1650 Max-Q
  • RAM: 16GB DDR4
  • Storage: 512GB SSD
  • Operating System: Windows 10
  • Python Version: 3.11.5
  • Frameworks: PyTorch 2.2.0, Transformers 4.33.2

Dataset

The dataset used is a Twitter dataset focused on climate change, containing 43,943 tweets, each annotated with one of four labels: "Pro", "Anti", "Neutral", and "News". The dataset can be found on Kaggle at this link.

The dataset was initially cleaned using the NeatText Python package before being used to fine-tune the transformer models; the data-cleaning process is documented in Cleaning Data.ipynb. The NeatText package can be found at this link.
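The cleaning notebook itself is not shown here. As a rough illustration of the kind of preprocessing NeatText performs on tweets (URL, handle, and special-character removal), here is a minimal standard-library sketch; the exact steps in Cleaning Data.ipynb may differ:

```python
import re

def clean_tweet(text: str) -> str:
    """Approximate tweet cleaning: strip URLs, @handles, '#' marks,
    special characters and extra whitespace, then lowercase."""
    text = re.sub(r"https?://\S+", "", text)      # remove URLs
    text = re.sub(r"@\w+", "", text)              # remove user handles
    text = re.sub(r"#", "", text)                 # keep hashtag word, drop '#'
    text = re.sub(r"[^A-Za-z0-9\s']", " ", text)  # drop punctuation/special chars
    return re.sub(r"\s+", " ", text).strip().lower()

print(clean_tweet("RT @user: Climate change is REAL! https://t.co/xyz #ClimateAction"))
# → "rt climate change is real climateaction"
```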

Hyperparameters Used


| Parameter | Description | Value |
|-----------|-------------|-------|
| Epochs | As transformer models converge quickly, a low number of epochs was used | 2 |
| Batch Size | Due to memory limitations, a smaller batch size was used | 8 |
| Gradient Accumulation | Gradient accumulation over 2 steps was used to simulate a larger batch size | 2 |
| Learning Rate | As transformer models are sensitive to the learning rate, a small value was used | 2e-5 |
| Weight Decay | Regularization term used to prevent overfitting | 0.01 |
| Evaluation Strategy | Performance was evaluated after a set number of steps instead of after each epoch | steps |
| Evaluation Steps | The number of steps before the performance was evaluated with the validation set | 805 |
| Warmup Steps | Initial training steps where the learning rate gradually increases to the defined value | 805 |

Table 1: Hyperparameters Used for the Transformer Models
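With the Transformers library (version 4.33.2, as listed above), these hyperparameters map directly onto Hugging Face `TrainingArguments`. A sketch with the values from Table 1; the `output_dir` is illustrative and other arguments are left at their defaults:

```python
from transformers import TrainingArguments

# Values taken from Table 1; output_dir is an illustrative placeholder.
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=2,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,   # effective batch size: 8 * 2 = 16
    learning_rate=2e-5,
    weight_decay=0.01,
    evaluation_strategy="steps",     # evaluate every eval_steps, not per epoch
    eval_steps=805,
    warmup_steps=805,                # learning rate ramps up over the first 805 steps
)
```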

Results

Key findings from the project:

  • The BERT model outperformed the other transformer models on the given dataset with an accuracy of 90%.
  • The DistilBERT and RoBERTa models achieved accuracies of 88% and 87% respectively.
  • The ensemble model outperformed every individual model, reaching an accuracy of 93.37% on the given dataset.

| Category | BERT | DistilBERT | RoBERTa | Ensemble |
|----------|------|------------|---------|----------|
| Pro | 0.88 | 0.85 | 0.82 | 0.97 |
| News | 0.94 | 0.93 | 0.93 | 0.95 |
| Anti | 0.97 | 0.95 | 0.95 | 0.96 |
| Neutral | 0.81 | 0.78 | 0.77 | 0.85 |
| Overall | 0.90 | 0.88 | 0.87 | 0.93 |

Table 2: Accuracies of the Various Transformer Models and the Ensemble Approach
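The README does not state how the three models are combined into the ensemble. One common approach is soft voting, i.e. averaging the per-class probabilities from each model and taking the argmax. A standard-library sketch under that assumption, with hypothetical model outputs:

```python
LABELS = ["Pro", "News", "Anti", "Neutral"]

def soft_vote(prob_lists):
    """Average per-class probabilities across models and return the argmax label.

    prob_lists: one probability list per model, ordered like LABELS.
    """
    n = len(prob_lists)
    avg = [sum(p[i] for p in prob_lists) / n for i in range(len(LABELS))]
    return LABELS[avg.index(max(avg))]

# Hypothetical per-class probabilities from BERT, DistilBERT and RoBERTa for one tweet:
bert       = [0.70, 0.10, 0.10, 0.10]
distilbert = [0.55, 0.25, 0.10, 0.10]
roberta    = [0.40, 0.45, 0.05, 0.10]
print(soft_vote([bert, distilbert, roberta]))  # → "Pro" (averaged probs favour it)
```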

Transformer Model Architecture

Figure 1: Ensemble Model Confusion Matrix
