Dataset Card for MSCOCO

Dataset Description

Homepage: https://cocodataset.org/#home
Repository: https://github.com/shunk031/huggingface-datasets_MSCOCO
Paper (Preprint): https://arxiv.org/abs/1405.0312
Paper (ECCV2014): https://link.springer.com/chapter/10.1007/978-3-319-10602-1_48
Leaderboard (Detection): https://cocodataset.org/#detection-leaderboard
Leaderboard (Keypoint): https://cocodataset.org/#keypoints-leaderboard
Leaderboard (Stuff): https://cocodataset.org/#stuff-leaderboard
Leaderboard (Panoptic): https://cocodataset.org/#panoptic-leaderboard
Leaderboard (Captioning): https://cocodataset.org/#captions-leaderboard
Point of Contact: [email protected]

Dataset Summary

COCO is a large-scale object detection, segmentation, and captioning dataset. COCO has several features:

Object segmentation

Recognition in context

Superpixel stuff segmentation

330K images (>200K labeled)

1.5 million object instances

80 object categories

91 stuff categories

5 captions per image

250,000 people with keypoints

Supported Tasks and Leaderboards

[More Information Needed]

Languages

[More Information Needed]

Dataset Structure

Data Instances

2014

captioning dataset

import datasets as ds

dataset = ds.load_dataset(
    "shunk031/MSCOCO",
    year=2014,
    coco_task="captions",
)

instances dataset

import datasets as ds

dataset = ds.load_dataset(
    "shunk031/MSCOCO",
    year=2014,
    coco_task="instances",
    decode_rle=True,  # set to True to decode run-length encoded (RLE) masks into binary masks
)
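To make the `decode_rle` option more concrete, here is a minimal sketch of what decoding a run-length encoded mask involves. It is not this loader's code (the function name `decode_uncompressed_rle` is hypothetical), and it assumes COCO's uncompressed RLE layout: `counts` alternates run lengths of 0s and 1s, starting with 0s, over the pixels in column-major order.

```python
def decode_uncompressed_rle(counts, height, width):
    """Decode uncompressed COCO-style RLE counts into a binary mask.

    Hypothetical helper for illustration only. `counts` is assumed to
    alternate run lengths of 0s and 1s, starting with a run of 0s, over
    the pixels in column-major (column-by-column) order.
    """
    if sum(counts) != height * width:
        raise ValueError("run lengths must cover exactly height * width pixels")

    # Expand the runs into a flat, column-major sequence of 0s and 1s.
    flat = []
    value = 0
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value

    # Pixel (row, col) sits at index col * height + row in column-major order.
    return [
        [flat[col * height + row] for col in range(width)]
        for row in range(height)
    ]


# A 2x3 mask: two 0s, then three 1s, then one 0, read column by column.
mask = decode_uncompressed_rle([2, 3, 1], height=2, width=3)
for row in mask:
    print(row)
```

With `decode_rle=False`, the annotations would presumably stay in their encoded form, which is more compact but has to be decoded before it can be used as a pixel mask.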

person keypoints dataset

import datasets as ds

dataset = ds.load_dataset(
    "shunk031/MSCOCO",
    year=2014,
    coco_task="person_keypoints",
    decode_rle=True,  # set to True to decode run-length encoded (RLE) masks into binary masks
)

2017

captioning dataset

import datasets as ds

dataset = ds.load_dataset(
    "shunk031/MSCOCO",
    year=2017,
    coco_task="captions",
)

instances dataset

import datasets as ds

dataset = ds.load_dataset(
    "shunk031/MSCOCO",
    year=2017,
    coco_task="instances",
    decode_rle=True,  # set to True to decode run-length encoded (RLE) masks into binary masks
)

person keypoints dataset

import datasets as ds

dataset = ds.load_dataset(
    "shunk031/MSCOCO",
    year=2017,
    coco_task="person_keypoints",
    decode_rle=True,  # set to True to decode run-length encoded (RLE) masks into binary masks
)
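The six calls above differ only in `year` and `coco_task`, and `decode_rle` appears only for the `instances` and `person_keypoints` tasks. The sketch below collects that pattern in one place. The helper `build_load_kwargs` and the constants `YEARS` and `TASKS` are hypothetical names introduced for illustration, and omitting `decode_rle` for captions simply mirrors the examples above rather than any documented requirement.

```python
import itertools

# Values taken from the examples above; the constant names are illustrative.
YEARS = (2014, 2017)
TASKS = ("captions", "instances", "person_keypoints")


def build_load_kwargs(year, coco_task, decode_rle=True):
    """Build keyword arguments for `datasets.load_dataset` (hypothetical helper)."""
    if year not in YEARS:
        raise ValueError(f"unsupported year: {year}")
    if coco_task not in TASKS:
        raise ValueError(f"unsupported task: {coco_task}")

    kwargs = {"path": "shunk031/MSCOCO", "year": year, "coco_task": coco_task}
    # The examples above pass decode_rle only for the mask-bearing tasks.
    if coco_task != "captions":
        kwargs["decode_rle"] = decode_rle
    return kwargs


all_configs = [
    build_load_kwargs(year, task)
    for year, task in itertools.product(YEARS, TASKS)
]

# Loading needs the `datasets` package and network access, so it is left
# commented out here:
# import datasets as ds
# dataset = ds.load_dataset(**all_configs[0])
```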

Data Fields

[More Information Needed]

Data Splits

[More Information Needed]

Dataset Creation

Curation Rationale

[More Information Needed]

Source Data

[More Information Needed]

Initial Data Collection and Normalization

[More Information Needed]

Who are the source language producers?

[More Information Needed]

Annotations

[More Information Needed]

Annotation process

[More Information Needed]

Who are the annotators?

[More Information Needed]

Personal and Sensitive Information

[More Information Needed]

Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

[More Information Needed]

Additional Information

Dataset Curators

[More Information Needed]

Licensing Information

The annotations in this dataset along with this website belong to the COCO Consortium and are licensed under a Creative Commons Attribution 4.0 License.

Images

The COCO Consortium does not own the copyright of the images. Use of the images must abide by the Flickr Terms of Use. The users of the images accept full responsibility for the use of the dataset, including but not limited to the use of any copies of copyrighted images that they may create from the dataset.

Software

Copyright (c) 2015, COCO Consortium. All rights reserved. Redistribution and use software in source and binary form, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

Neither the name of the COCO Consortium nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE AND ANNOTATIONS ARE PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Citation Information

@inproceedings{lin2014microsoft,
  title={Microsoft coco: Common objects in context},
  author={Lin, Tsung-Yi and Maire, Michael and Belongie, Serge and Hays, James and Perona, Pietro and Ramanan, Deva and Doll{\'a}r, Piotr and Zitnick, C Lawrence},
  booktitle={Computer Vision--ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13},
  pages={740--755},
  year={2014},
  organization={Springer}
}

Contributions

Thanks to the COCO Consortium for creating this dataset.
