Implementation of "Distilling Knowledge for Fast Retrieval-based Chat-bots" (SIGIR 2020) using deep matching transformer networks and knowledge distillation for response retrieval in information-seeking conversational systems.
Preprocessing a dataset:
cd ./Datasets/ubuntu_data/
ipython ./ubuntu_preprocess.py
The Ubuntu dataset can be found here.
Training a Bi-Encoder using cross entropy loss:
ipython ./main_bert_bi.py -- --dataset ubuntu_data --batch 16 --nmodel BiEncoderDot2 --epochs 1
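A bi-encoder scores each candidate by encoding the context and the response independently and taking the dot product of the two vectors. Below is a minimal illustrative sketch of that idea, not the repository's actual BiEncoderDot2 class; the Hugging Face transformers dependency, model name, and CLS pooling are assumptions.

```python
# Minimal bi-encoder sketch (illustrative; not the repo's BiEncoderDot2 implementation).
# Assumes the Hugging Face `transformers` library and [CLS]-token pooling.
import torch
import torch.nn as nn
from transformers import BertModel

class BiEncoder(nn.Module):
    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.context_encoder = BertModel.from_pretrained(model_name)
        self.response_encoder = BertModel.from_pretrained(model_name)

    def forward(self, ctx_ids, ctx_mask, rsp_ids, rsp_mask):
        # Encode contexts and responses independently, pool the [CLS] vector.
        ctx_vec = self.context_encoder(ctx_ids, attention_mask=ctx_mask).last_hidden_state[:, 0]
        rsp_vec = self.response_encoder(rsp_ids, attention_mask=rsp_mask).last_hidden_state[:, 0]
        # Dot-product scores for every (context, response) pair in the batch;
        # with in-batch negatives this (batch x batch) matrix feeds cross-entropy.
        return ctx_vec @ rsp_vec.T
```

Because candidate responses are encoded independently of the context, their vectors can be pre-computed and indexed, which is what makes the bi-encoder fast at retrieval time.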
Training an enhanced Cross-Encoder (BECA) using cross entropy loss:
ipython ./main_bert_bi.py -- --dataset ubuntu_data --batch 8 --nmodel MolyEncoderAggP2 --epochs 1
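A cross-encoder instead feeds the context and candidate response through the transformer together, so self-attention can compare them token by token; this is more accurate but much slower, since nothing can be pre-computed per response. The sketch below only illustrates this general pattern; the repository's MolyEncoderAggP2 (BECA) architecture differs in its aggregation details, and the names and pooling here are assumptions.

```python
# Rough cross-encoder sketch (illustrative only; not the repo's MolyEncoderAggP2/BECA model).
import torch
import torch.nn as nn
from transformers import BertModel

class CrossEncoder(nn.Module):
    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(model_name)
        self.scorer = nn.Linear(self.encoder.config.hidden_size, 1)

    def forward(self, pair_ids, pair_mask, token_type_ids):
        # pair_ids holds "[CLS] context [SEP] response [SEP]" for each candidate,
        # so attention runs jointly over both sequences.
        cls_vec = self.encoder(pair_ids, attention_mask=pair_mask,
                               token_type_ids=token_type_ids).last_hidden_state[:, 0]
        return self.scorer(cls_vec).squeeze(-1)  # one relevance score per pair
```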
Training a Bi-Encoder using cross-entropy loss and knowledge distillation:
ipython ./main_bi_kd.py -- --dataset ubuntu_data --batch 8 --student_batch 16 --alpha 0.5 --epochs 1 --load 1 --nmodel MolyEncoderAggP2 --nstudentmodel BiEncoderDot2
This uses the saved outputs of a previously trained enhanced cross-encoder (BECA) as the teacher when training the Bi-Encoder student. Be careful to match the hyper-parameters between the two runs; otherwise you'll get file-not-found errors. A sketch of the combined loss follows below.
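The distillation objective mixes the usual cross-entropy on the gold responses with a term that pushes the student's scores toward the teacher's. Below is a minimal sketch of such a combined loss, assuming `alpha` corresponds to the `--alpha` flag and `teacher_logits` are the pre-computed BECA scores; the exact formulation in main_bi_kd.py may differ.

```python
# Sketch of a combined distillation loss (illustrative; the exact loss used in
# main_bi_kd.py may differ). `alpha` corresponds to the --alpha flag above.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, alpha=0.5, temperature=1.0):
    # Hard-label term: standard cross-entropy against the gold responses.
    ce = F.cross_entropy(student_logits, labels)
    # Soft-label term: KL divergence between the student's and the
    # pre-computed teacher (cross-encoder) score distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    return alpha * ce + (1.0 - alpha) * soft
```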
You can cite the paper as:
Amir Vakili Tahami, Kamyar Ghajar, Azadeh Shakery. Distilling Knowledge for Fast Retrieval-based Chat-bots. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2020).
BibTeX
@article{tahami2020distilling,
  title={Distilling Knowledge for Fast Retrieval-based Chat-bots},
  author={Amir Vakili Tahami and Kamyar Ghajar and Azadeh Shakery},
  year={2020},
  eprint={2004.11045},
  archivePrefix={arXiv},
}