This repository contains the PyTorch implementation of the paper "Label-Attention Transformer with Geometrically Coherent Objects for Image Captioning".
If you find our paper or the provided code helpful in your research, please do not forget to cite our paper. Thank you!
The following architecture diagram shows our proposed LATGeO model for image captioning.
- python 3.8.8
- pysimplegui 4.47.0
- pytorch 1.8.1
- torchvision 0.9.1
- numpy 1.18.5
- h5py 2.10.0
- cython 0.29.23
- cudatoolkit 11.1.74
- pillow 8.2.0
- protobuf 3.17.3
- scipy 1.4.1
- tensorboard 2.4.0
- tensorflow-gpu 2.3.0
- spacy 3.0.6
- requests 2.24.0
- tqdm 4.60.0
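As a quick sanity check of the environment (a minimal sketch, not part of the repository), you can print the installed PyTorch/torchvision versions and confirm that CUDA is visible:

```python
# Environment sanity check (illustrative only; versions shown are the ones listed above).
import torch
import torchvision

print("PyTorch:", torch.__version__)            # expected: 1.8.1
print("torchvision:", torchvision.__version__)  # expected: 0.9.1
print("CUDA available:", torch.cuda.is_available())
```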
Detection using RCNN: follow the installation instructions provided by Bottom-Up.
The script testing.py is provided for running detection on a given image.
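For illustration only, the snippet below runs a COCO-pretrained torchvision Faster R-CNN on an image; it is a rough stand-in for testing.py and not the Bottom-Up Visual Genome detector used in the paper.

```python
# Illustrative sketch: torchvision's COCO-pretrained Faster R-CNN, not the
# Bottom-Up detector or the repository's testing.py.
import torch
import torchvision
import torchvision.transforms as T
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True).eval()

image = Image.open("example.jpg").convert("RGB")  # hypothetical input image
tensor = T.ToTensor()(image)

with torch.no_grad():
    output = model([tensor])[0]

# Keep detections above a confidence threshold; boxes are (x1, y1, x2, y2).
keep = output["scores"] > 0.5
print(output["boxes"][keep], output["labels"][keep], output["scores"][keep])
```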
We also provide Jupyter Notebooks for better visualization of the predicted captions. Two notebooks are included (an illustrative detection sketch follows the list below):
- test_DETR-LATGeO.ipynb
- test_RCNN-LATGeO.ipynb
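As an illustrative sketch of the DETR detection stage only (loading DETR through torch.hub rather than the notebook's own setup), one could obtain boxes and class scores as follows; the example image path is hypothetical, and caption decoding with LATGeO is done in the notebook itself.

```python
# Illustrative DETR detection sketch via torch.hub; not the notebook's exact pipeline.
import torch
import torchvision.transforms as T
from PIL import Image

detr = torch.hub.load("facebookresearch/detr", "detr_resnet50", pretrained=True).eval()

transform = T.Compose([
    T.Resize(800),
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

image = Image.open("example.jpg").convert("RGB")  # hypothetical input image
with torch.no_grad():
    outputs = detr(transform(image).unsqueeze(0))

# Class probabilities (last index is the "no object" class) and normalized cxcywh boxes.
probs = outputs["pred_logits"].softmax(-1)[0, :, :-1]
keep = probs.max(-1).values > 0.7
print(outputs["pred_boxes"][0, keep])
```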
We also provide the code for a GUI demo of our method.
The following are example results from running the GUI demo file GUI_Demo_LATGeO_RCNN.py.
You can also use the provided GUI demo code in your own applications (a minimal sketch follows).
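For reference, here is a minimal PySimpleGUI sketch of such a demo window; it is not the repository's GUI_Demo_LATGeO_RCNN.py, and the caption_image() helper is a hypothetical stand-in for the full detection-plus-captioning pipeline.

```python
# Minimal PySimpleGUI sketch of an image-picker + caption display window.
import PySimpleGUI as sg

def caption_image(path):
    # Hypothetical placeholder: the real demo runs detection + LATGeO decoding here.
    return f"caption for {path}"

layout = [
    [sg.Text("Image:"), sg.Input(key="-FILE-"), sg.FileBrowse()],
    [sg.Button("Caption"), sg.Button("Exit")],
    [sg.Text("", key="-OUT-", size=(60, 2))],
]

window = sg.Window("LATGeO Demo (sketch)", layout)
while True:
    event, values = window.read()
    if event in (sg.WIN_CLOSED, "Exit"):
        break
    if event == "Caption" and values["-FILE-"]:
        window["-OUT-"].update(caption_image(values["-FILE-"]))
window.close()
```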
Model | BLEU-1 | BLEU-4 | METEOR | ROUGE-L | SPICE | CIDEr-D |
---|---|---|---|---|---|---|
LATGeO | 76.5 | 36.4 | 27.8 | 56.7 | - | 115.8 |
LATGeO + RL | 81.0 | 38.8 | 29.2 | 58.7 | 22.9 | 131.7 |
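The scores above are the standard COCO caption metrics. As a rough illustration of how such metrics are computed, here is a minimal sketch assuming the pycocoevalcap package (not necessarily the evaluation code used for this table); the image id and captions are toy examples.

```python
# Sketch of computing BLEU / CIDEr with pycocoevalcap (an assumption; the
# repository may ship its own evaluation scripts).
from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.cider.cider import Cider

# Toy example: dicts mapping image id -> list of tokenized captions.
gts = {"391895": ["a man riding a bike down a dirt road",
                  "a person on a bicycle on a country road"]}
res = {"391895": ["a man rides a bicycle on a dirt road"]}

bleu_scores, _ = Bleu(4).compute_score(gts, res)  # BLEU-1 .. BLEU-4
cider_score, _ = Cider().compute_score(gts, res)
print("BLEU-1:", bleu_scores[0], "BLEU-4:", bleu_scores[3], "CIDEr:", cider_score)
```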
Please cite the following BibTeX entry:
    @misc{dubey2021labelattention,
          title={Label-Attention Transformer with Geometrically Coherent Objects for Image Captioning},
          author={Shikha Dubey and Farrukh Olimov and Muhammad Aasim Rafique and Joonmo Kim and Moongu Jeon},
          year={2021},
          eprint={2109.07799},
          archivePrefix={arXiv},
          primaryClass={cs.CV}
    }
If you find the paper and this repository helpful, please consider citing our paper LATGeO. Thank you!
This project is licensed under the Machine Learning & Vision Laboratory (MLV Lab), GIST.
We would like to thank the AImageLab, peteanderson80, and facebookresearch teams.