mahdialibi/DSTC_clustering

Deep Short Text Clustering

DSTC is a method for clustering short text messages using CLIP and a deep autoencoder.

If you use it, please cite it.

Prerequisites

pip install -r requirements.txt

Reproduce results

Start the CLIP server:

python -m clip_server

Then run:

python DSTC.py --maxiter 1500 --pretrain_epochs 200 --ae_weights results/snippets/ae_weights.h5 --save_dir results/snippets/ --dataset search_snippets
python DSTC.py --maxiter 1500 --pretrain_epochs 200 --ae_weights results/biomedical/ae_weights.h5 --save_dir results/biomedical/ --dataset biomedical
python DSTC.py --maxiter 1500 --pretrain_epochs 200 --ae_weights results/stackoverflow/ae_weights.h5 --save_dir results/stackoverflow/
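The three commands above differ only in the results directory and the dataset name, so they can be generated from a small table. The following is a minimal sketch, not part of the repository: the helper `build_command` and the `RUNS` table are hypothetical names, and only the flags shown above are used. It prints the commands as a dry run. To execute them, each argument list could be passed to `subprocess.run(cmd, check=True)` while the CLIP server is running.

```python
import shlex
import sys

# One entry per run listed above: (results sub-directory, value for --dataset).
# The stackoverflow command above passes no --dataset flag, so None is used
# here and the flag is omitted rather than guessed.
RUNS = [
    ("snippets", "search_snippets"),
    ("biomedical", "biomedical"),
    ("stackoverflow", None),
]


def build_command(subdir, dataset, maxiter=1500, pretrain_epochs=200):
    """Return the argument list for one DSTC.py run (hypothetical helper)."""
    save_dir = f"results/{subdir}/"
    cmd = [
        sys.executable, "DSTC.py",
        "--maxiter", str(maxiter),
        "--pretrain_epochs", str(pretrain_epochs),
        "--ae_weights", f"{save_dir}ae_weights.h5",
        "--save_dir", save_dir,
    ]
    if dataset is not None:
        cmd += ["--dataset", dataset]
    return cmd


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    for subdir, dataset in RUNS:
        print(shlex.join(build_command(subdir, dataset)))
```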

Acknowledgements

This code is based on the repo from here.
