This is an image segmentation project that detects cigarette instances in images and extracts their masks using Mask R-CNN. As described in the Mask R-CNN paper, the model generates a bounding box and a segmentation mask for each object instance in the image.
The dataset contains 731 images of cigarettes, scraped only from Google Images; 193 of them are labeled pixel by pixel. Further scraping from stock-photo websites is planned.
Because the dataset is stored in my personal Google Drive account and is larger than 100 MB, I used a script from the repository of the user @circulosmeos, which downloads large files directly from Google Drive sharing links.
Download the gdown.pl script
wget -O gdown.pl "https://drive.google.com/uc?export=download&id=1gc-D4ydSBY7jqhlM-Oq2xca5oJx9217B"
Make the script executable
chmod +x ./gdown.pl
Download the data using the script
./gdown.pl 'https://drive.google.com/file/d/1-6QvZMUB13r-bN1eoiF3qlEplS_4imiK/view?usp=sharing' 'data-cigar-731.zip'
Unzip it
unzip ./data-cigar-731.zip -d /path/to/extract
./gdown.pl 'https://drive.google.com/file/d/1nHKQMUbhNBAyn307K5jEJmX2PBK7rlfz/view?usp=sharing' 'cigarette.zip'
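After extraction, a quick sanity check can confirm that the archive unpacked as expected. The snippet below is a minimal sketch that counts image files under the extraction directory; the function name `count_images` and the list of extensions are assumptions for illustration, not part of this repository.

```python
from pathlib import Path

# Hypothetical set of extensions; adjust to whatever the archive actually holds.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images(root: str) -> int:
    """Count image files under `root`, searching subdirectories too."""
    return sum(
        1
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )
```

Calling `count_images("/path/to/extract")` with the same path used in the unzip step returns the number of image files found, which can be compared against the dataset size stated above.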
Because images must be 1024x1024 before they are fed to the Mask R-CNN model:
- images smaller than the target size were padded using the np.pad() function of the NumPy module
- images larger than the target size were resized using the resize() function of the OpenCV module
Clone the Mask R-CNN GitHub repository and adjust its configuration files and code to match your workspace and data. Instructions are given in that repository.
Despite the very small dataset, transfer learning lets the model reach some baseline results.
Because the model was trained on only ~200 images, it still gives incorrect results in some cases.