OxVf

The code base for empathetic dialogue generation.

dataloader

Without pretrained embedding

The dataloader can automatically read the Facebook empathetic dialogues data.

Assume the empathetic data is located one directory level up:

../empatheticdialogues_data/train.csv
../empatheticdialogues_data/valid.csv
../empatheticdialogues_data/test.csv

Then, anywhere in a Python file, call:

textdata = Dataloader('fb')

The data is then loaded into the instance textdata.
There are 5 data structures in textdata:

1, The train, valid, and test data are stored in textdata.dataset

textdata.dataset['train'] is a list; each element is a 5-element tuple:

(context_id, sentence_id, emotion_id, raw_context, raw_sentence)

context_id is a list of integers, each of which is a word index. Context here means the previous k sentences in the dialogue.
sentence_id is a list of word indices representing the current utterance.
emotion_id is the id of the emotion.
raw_context and raw_sentence are the original words of the context and the sentence.
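
The tuple layout described above can be illustrated with a small sketch. It does not call Dataloader: the toy vocabulary, the indices, and the sample itself are hypothetical stand-ins that only mirror the documented 5-element structure, and the raw fields are assumed here to be lists of tokens.

```python
# Hypothetical toy mappings; the real ones come from textdata.index2word
# and textdata.index2emotion.
index2word = {0: "i", 1: "lost", 2: "my", 3: "job",
              4: "sorry", 5: "to", 6: "hear", 7: "that"}
index2emotion = {0: "sad"}

# One sample shaped like an element of textdata.dataset['train'].
sample = (
    [0, 1, 2, 3],                     # context_id: word indices of the context
    [4, 5, 6, 7],                     # sentence_id: word indices of the utterance
    0,                                # emotion_id
    ["i", "lost", "my", "job"],       # raw_context (assumed to be a token list)
    ["sorry", "to", "hear", "that"],  # raw_sentence (assumed to be a token list)
)

# Unpack the tuple in the documented field order.
context_id, sentence_id, emotion_id, raw_context, raw_sentence = sample

# Map indices back to words and the emotion id back to its label.
decoded_context = [index2word[i] for i in context_id]
decoded_sentence = [index2word[i] for i in sentence_id]
emotion = index2emotion[emotion_id]

print(emotion, decoded_context, "->", decoded_sentence)
```

With the real loader, the same unpacking would presumably apply to each element of textdata.dataset['train'].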

2, textdata.word2index is a mapping from vocabulary words to indices.

Example:

index = textdata.word2index['hello']

3, textdata.index2word is the reverse mapping, from index to word.

Example:

word = textdata.index2word[23]

4, textdata.emotion2index

5, textdata.index2emotion

These two work like word2index and index2word, but map between emotion labels and their ids.
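
As a rough illustration of how such paired mappings relate to each other, here is a minimal sketch. The helper build_index is hypothetical and is not taken from dataloader.py; the real loader may order entries differently or reserve indices for special tokens.

```python
def build_index(items):
    """Assign each distinct item an integer index in first-seen order.

    Hypothetical helper for illustration only.
    """
    item2index = {}
    for item in items:
        if item not in item2index:
            item2index[item] = len(item2index)
    # The reverse mapping is obtained by inverting the forward one.
    index2item = {idx: item for item, idx in item2index.items()}
    return item2index, index2item


# Toy inputs standing in for the corpus vocabulary and emotion labels.
word2index, index2word = build_index("hello how are you hello".split())
emotion2index, index2emotion = build_index(["sad", "proud", "sad"])

# Looking a word up and mapping the index back returns the same word.
index = word2index["you"]
word = index2word[index]
print(index, word)
```

The useful property is the round trip: for any known word w, index2word[word2index[w]] gives back w, and likewise for emotions.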

With pretrained embedding

For example, if the embedding file is in GloVe format, each line looks like:

word d1 d2 d3 ... d100

Fields are separated by a single space ' '.

Then use:

textdata = Dataloader('fb', embfile = '../glove.6B.50d.txt')

embfile is the path to the embedding file.
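
To make the expected file format concrete, the following sketch parses GloVe-style lines into a dictionary. The function read_glove is a hypothetical name and is not the parsing code used by Dataloader; an in-memory buffer with made-up 3-dimensional vectors stands in for a real embedding file.

```python
import io


def read_glove(lines):
    """Parse 'word d1 d2 ... dn' lines into a {word: [floats]} dict.

    Hypothetical helper for illustration only.
    """
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(' ')
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = parts[0]
        embeddings[word] = [float(x) for x in parts[1:]]
    return embeddings


# Made-up sample data in the same space-separated layout.
sample_file = io.StringIO("hello 0.1 0.2 0.3\nworld -0.4 0.5 0.6\n")
vectors = read_glove(sample_file)
print(len(vectors), len(vectors["hello"]))
```

For a file on disk, the same function could be given an open file handle, for example one opened with encoding='utf-8'.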
