This repository implements both the LSTM Encoder-Decoder model and the LSTM Encoder-Decoder model with attention. The models are trained on the Multi30k en-de dataset. Their implementations are based on the following papers: the LSTM Encoder-Decoder model is based on this paper, and the LSTM Encoder-Decoder model with attention is based on this paper.
Install the required packages: datasets, torch, matplotlib, nltk
```
pip install -r requirements.txt
```

Additionally, you can use your GPU for training via CUDA. Note that the CUDA version must be compatible with the installed version of torch.
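As a quick sanity check (this snippet is not part of the repository), you can confirm that torch sees your GPU before starting a training run:

```python
import torch

# Print the installed torch version and the CUDA version it was built against.
print("torch:", torch.__version__)
print("built with CUDA:", torch.version.cuda)

# Select the GPU if CUDA is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("using device:", device)
```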
This repository supports two datasets: WMT14 fr-en and Multi30k en-de. Data preprocessing is handled by the data.py script. You can adjust the settings by modifying the variables dataset, src_lang, tgt_lang, and reverse_flag. The dataset variable accepts either "wmt14" or "multi30k". For src_lang and tgt_lang, the WMT14 dataset supports "en" and "fr" (or vice versa), while the Multi30k dataset supports "en" and "de" (or vice versa). When reverse_flag is set to True, the input sequence is reversed during preprocessing. Additionally, because the WMT14 dataset is large, you can pass a percent parameter (a value between 0.01 and 1) as the second argument to the load_data function to download only a fraction of it. A sketch of a typical configuration is shown after the table below.
| | WMT14 | WMT14 (1%) | Multi30k |
|---|---|---|---|
| # of sentences | 36M | 326693 | 29000 |
| Vocab size (en) | 380M | 108130 | 10475 |
| Vocab size (fr/de) | 400M | 141797 | 19338 |
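As an illustration, a minimal data.py configuration might look like the following sketch (the variable names come from the description above; the exact load_data signature is an assumption, so check the script before calling it):

```python
# Example preprocessing settings (illustrative; adjust to match data.py).
dataset = "multi30k"   # "wmt14" or "multi30k"
src_lang = "en"        # source language
tgt_lang = "de"        # target language
reverse_flag = True    # reverse the input sequence during preprocessing

# For WMT14, a percent value between 0.01 and 1 can be passed as the second
# argument to load_data to download only a fraction of the dataset.
# The call below assumes a load_data(dataset, percent) signature:
# train_data = load_data("wmt14", 0.01)
```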

The Encoder-Decoder model is implemented based on the paper. You can train the model using the train.py script. The training configuration can be adjusted by modifying the following variables:

- Dataset: pkl_file, src_lang, and tgt_lang
- Model parameters: max_target_length, hidden_size, num_layers, and batch_size
- Training parameters: learning_rate, teacher_forcing_ratio, num_iters, and clip_value
- Debugging parameters: print_every and num_samples

At the end of training, a random sample of size num_samples is drawn from the validation dataset to evaluate translation performance. The BLEU score is also computed, and the training results are visualized with a loss curve and TensorBoard.
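As an illustration of how teacher_forcing_ratio and clip_value typically enter a seq2seq training loop, here is a minimal sketch (this is not the repository's actual training code; the encoder/decoder interfaces and tensor shapes are assumptions):

```python
import random
import torch
import torch.nn as nn

def train_step(encoder, decoder, src, tgt, optimizer, criterion,
               teacher_forcing_ratio=0.5, clip_value=1.0):
    """One illustrative seq2seq training step (interfaces and shapes are assumptions)."""
    optimizer.zero_grad()
    encoder_outputs, hidden = encoder(src)      # encode the source sentence

    loss = 0.0
    decoder_input = tgt[:, 0]                   # assume position 0 holds the <sos> token
    for t in range(1, tgt.size(1)):
        logits, hidden = decoder(decoder_input, hidden)
        loss += criterion(logits, tgt[:, t])
        # With probability teacher_forcing_ratio, feed the ground-truth token
        # as the next decoder input; otherwise feed the model's own prediction.
        if random.random() < teacher_forcing_ratio:
            decoder_input = tgt[:, t]
        else:
            decoder_input = logits.argmax(dim=-1)

    loss.backward()
    # Clip gradient norms to clip_value to avoid exploding gradients in the LSTM.
    nn.utils.clip_grad_norm_(
        list(encoder.parameters()) + list(decoder.parameters()), clip_value)
    optimizer.step()
    return loss.item() / (tgt.size(1) - 1)
```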

| BLEU Score | 23.97 |
|---|---|
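Since nltk is among the required packages, the BLEU check is presumably computed along these lines (a minimal sketch with made-up token lists, not the repository's exact evaluation code):

```python
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# references: one list of reference token lists per sample;
# hypotheses: one generated token list per sample (both are toy examples here).
references = [[["ein", "mann", "fährt", "fahrrad"]]]
hypotheses = [["ein", "mann", "fährt", "ein", "fahrrad"]]

smooth = SmoothingFunction().method1
bleu = corpus_bleu(references, hypotheses, smoothing_function=smooth)
print(f"BLEU: {bleu * 100:.2f}")
```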

The Encoder-Decoder model with attention is implemented based on the paper. You can train the model using the train_attention_model.py script. The training configuration is similar to that of the Encoder-Decoder model above, with the same variables available; in addition, you can adjust the dropout rate via the dropout_p variable. The process after training is the same.
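As a rough illustration of how an attention score over the encoder outputs can be computed, here is a minimal additive-attention sketch (module name, layer names, and shapes are assumptions, not the repository's actual code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdditiveAttention(nn.Module):
    """Minimal additive attention sketch (names and shapes are assumptions)."""

    def __init__(self, hidden_size):
        super().__init__()
        self.W_dec = nn.Linear(hidden_size, hidden_size)  # projects the decoder state
        self.W_enc = nn.Linear(hidden_size, hidden_size)  # projects the encoder outputs
        self.v = nn.Linear(hidden_size, 1)                # scores each source position

    def forward(self, decoder_hidden, encoder_outputs):
        # decoder_hidden: (batch, hidden), encoder_outputs: (batch, src_len, hidden)
        scores = self.v(torch.tanh(
            self.W_dec(decoder_hidden).unsqueeze(1) + self.W_enc(encoder_outputs)
        )).squeeze(-1)                                    # (batch, src_len)
        weights = F.softmax(scores, dim=-1)               # attention distribution
        context = torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)
        return context, weights                           # (batch, hidden), (batch, src_len)
```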

| BLEU Score | 24.88 |
|---|---|