This section describes the data format required to train and evaluate models with our approach. For all subtasks, the field names and datatypes must match exactly as shown. The raw dataset must contain the following fields:
- `raw_text`: The review text (str).
- `aspectTerms`: The aspect terms and their polarities, given as a list of dictionaries. Each dictionary has at least two keys. The first key is `term`, whose value is an aspect in the corresponding sentence. The second key is `polarity`, whose value is the polarity of that aspect (`[{'term':'aspect1', 'polarity':'polarity1'}, ...]`).
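The field requirements above can be checked before training with a small helper. This is a minimal sketch, not part of the package: the function name `validate_record` is hypothetical, and it only checks the two field names and the `term`/`polarity` keys described above.

```python
def validate_record(record):
    """Return a list of problems found in one dataset row (empty if none).

    Hypothetical helper for illustration; it is not provided by the package.
    """
    problems = []
    if not isinstance(record.get("raw_text"), str):
        problems.append("raw_text must be a str")

    aspect_terms = record.get("aspectTerms")
    if not isinstance(aspect_terms, list):
        problems.append("aspectTerms must be a list of dictionaries")
        return problems

    for i, entry in enumerate(aspect_terms):
        if not isinstance(entry, dict):
            problems.append(f"aspectTerms[{i}] must be a dict")
            continue
        # Each dictionary needs at least the 'term' and 'polarity' keys.
        for key in ("term", "polarity"):
            if key not in entry:
                problems.append(f"aspectTerms[{i}] is missing the '{key}' key")
    return problems
```

An empty list means the row matches the expected layout; extra keys in a dictionary are allowed, since only "at least" two keys are required.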
Warning: When a dataset created this way is saved in `.xlsx`/`.csv` format, the aspectTerms column is converted to string/text format. The package handles this when loading the dataset file.
An example dataset is shown below and also in the Datasets folder.
raw_text | aspectTerms |
---|---|
The cab ride was amazing but the service was pricey | [{'term':'cab ride', 'polarity':'positive'}, {'term':'service', 'polarity':'negative'}] |
I ordered the Barbeque Pizza | [{'term':'noaspectterm', 'polarity':'none'}] |
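To illustrate the string conversion that happens on saving, here is a minimal sketch using only the Python standard library. It writes the example rows to an in-memory CSV and parses the aspectTerms column back with `ast.literal_eval`. This is an assumption about one way to do the conversion; the package performs its own handling when it loads a dataset file, so this step is not needed in normal use.

```python
import ast
import csv
import io

rows = [
    {"raw_text": "The cab ride was amazing but the service was pricey",
     "aspectTerms": [{"term": "cab ride", "polarity": "positive"},
                     {"term": "service", "polarity": "negative"}]},
    {"raw_text": "I ordered the Barbeque Pizza",
     "aspectTerms": [{"term": "noaspectterm", "polarity": "none"}]},
]

# Write to an in-memory CSV; the csv module stringifies the list with str().
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["raw_text", "aspectTerms"])
writer.writeheader()
writer.writerows(rows)

# Read it back: every cell, including aspectTerms, is now a plain string.
buffer.seek(0)
loaded = list(csv.DictReader(buffer))

# Turn the string back into a list of dictionaries.
parsed = [
    {**row, "aspectTerms": ast.literal_eval(row["aspectTerms"])}
    for row in loaded
]
```

`ast.literal_eval` is used here instead of `eval` because it only accepts Python literals, which is safer for data read from a file.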
All the model weights can be found here. The best performing models for each ABSA subtask based on our experiments are presented in the table below:
Task | Model Name | Dataset Trained | Model Type | Instruction Configuration |
---|---|---|---|---|
ATE | ./ate_tk-instruct-base-def-pos-neg-neut-combined | SemEval 2014 Laptops + Restaurants | InstructABSA-2 | Definition + 2 pos + 2 neg + 2 neut examples |
ATSC | ./atsc_tk-instruct-base-def-pos-combined | SemEval 2014 Laptops + Restaurants | InstructABSA-1 | Definition + 2 pos examples |
Joint Task | ./joint_tk-instruct-base-def-pos-neg-neut-combined | SemEval 2014 Laptops + Restaurants | InstructABSA-2 | Definition + 2 pos + 2 neg + 2 neut examples |
A sample inference notebook is found here.
The ATE models can be trained from scratch or used directly to run inference on your own datasets. This can be done through the CLI (check the Scripts folder) or by adapting your own code along the lines of run_model.py. An example shell command to run inference on individual samples is shown below.
To evaluate the ATE subtask on a single input using the CLI, run the following:
python run_model.py -mode cli -task ate \
-model_checkpoint ./ate_tk-instruct-base-def-pos-neg-neut-combined \
-test_input 'The cab ride was amazing but the service was pricey'
The ATSC models can be trained from scratch or used directly to run inference on your own datasets. This can be done through the CLI (check the Scripts folder) or by adapting your own code along the lines of run_model.py. An example shell command to run inference on individual samples is shown below.
To evaluate the ATSC subtask on a single input using the CLI, run the following:
python run_model.py -mode cli -task atsc \
-model_checkpoint ./atsc_tk-instruct-base-def-pos-neg-neut-combined \
-test_input 'The ambience was amazing but the waiter was rude|ambience'
Note the `|` delimiter, which is used to pass the aspect term whose polarity is to be extracted.
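The input convention can be sketched as a small parsing helper. This is only an illustration: `split_atsc_input` is a hypothetical name, and it assumes the aspect term follows the last `|` in the string. The actual parsing inside run_model.py may differ.

```python
def split_atsc_input(test_input):
    """Split '<sentence>|<aspect term>' into its two parts.

    Hypothetical helper; assumes the aspect term comes after the last '|'.
    """
    sentence, separator, aspect = test_input.rpartition("|")
    if not separator:
        raise ValueError("expected input of the form '<sentence>|<aspect term>'")
    return sentence.strip(), aspect.strip()
```

For the example command above, this would separate the sentence from the aspect term `ambience`.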
The Joint Task models can be trained from scratch or used directly to run inference on your own datasets. This can be done through the CLI (check the Scripts folder) or by adapting your own code along the lines of run_model.py. An example shell command to run inference on individual samples is shown below.
To evaluate the Joint Task on a single input using the CLI, run the following:
python run_model.py -mode cli -task joint \
-model_checkpoint ./joint_tk-instruct-base-def-pos-neg-neut-combined \
-test_input 'The cab ride was amazing but the service was pricey'
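The three CLI commands differ only in the task name, the checkpoint, and the input text, so they can be assembled from Python when scripting inference over several inputs. The sketch below is an assumption about how one might wrap the CLI. `build_inference_command` is a hypothetical helper, and it uses only the flags shown in the commands above (`-mode`, `-task`, `-model_checkpoint`, `-test_input`).

```python
import sys

# Task names taken from the CLI examples above.
KNOWN_TASKS = {"ate", "atsc", "joint"}


def build_inference_command(task, model_checkpoint, test_input):
    """Build the argument list for a single-input CLI inference run.

    Hypothetical helper; it only assembles the command and does not run it.
    """
    if task not in KNOWN_TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return [
        sys.executable, "run_model.py",
        "-mode", "cli",
        "-task", task,
        "-model_checkpoint", model_checkpoint,
        "-test_input", test_input,
    ]

# To execute the command, pass the list to subprocess.run(cmd, check=True)
# from the directory that contains run_model.py.
```

Passing the arguments as a list avoids shell quoting problems when the input text contains quotes or the `|` delimiter.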