Object detection: the project I worked on detects Coca-Cola products. Here is the article I wrote on Medium; for business confidentiality, it only uses a sampled dataset.
- AlexNet: https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks
- ZFNet: https://arxiv.org/abs/1311.2901
- VGG16: https://arxiv.org/abs/1409.1556
- GoogLeNet: https://arxiv.org/abs/1409.4842
- ResNet: https://arxiv.org/abs/1512.03385
- Inception: https://arxiv.org/abs/1512.00567
- Xception: https://arxiv.org/abs/1610.02357
- MobileNet: https://arxiv.org/abs/1704.04861
- RCNN: https://arxiv.org/abs/1311.2524
- Fast-RCNN: https://arxiv.org/abs/1504.08083
- Faster-RCNN: https://arxiv.org/abs/1506.01497 (useful one)
- SSD: https://arxiv.org/abs/1512.02325 (useful one)
- YOLO: https://arxiv.org/abs/1506.02640 (useful one)
- YOLO9000: https://arxiv.org/abs/1612.08242
- FCN: https://arxiv.org/abs/1411.4038 (useful one)
- SegNet: https://arxiv.org/abs/1511.00561
- UNet: https://arxiv.org/abs/1505.04597 (useful one)
- PSPNet: https://arxiv.org/abs/1612.01105
- DeepLab: https://arxiv.org/abs/1606.00915 (useful one)
- ICNet: https://arxiv.org/abs/1704.08545
- ENet: https://arxiv.org/abs/1606.02147
I will first write up what I plan to cover for the machine learning use case with Sklearn, Keras and TF, the high-level, production-ready frameworks.
Resources: "Guide to Semantic Segmentation with Deep Learning" and "A Review on Deep Learning Techniques Applied to Semantic Segmentation"
- Log in to AWS and select the EC2 pane
- Create an ssh key, then scroll the left menu to find ‘Key pairs’, click the ‘Import Key Pair’ button and upload the ssh public key. Or, first copy the public key to a folder you can upload it from:
$ cp .ssh/id_rsa.pub /mnt/c/Temp/
- Search for ‘deep learning’ and select the first option: Deep Learning AMI (Ubuntu), latest version
- Scroll down until you find ‘p3.xlarge’ and select it
- In the drop-down menu, select your SSH key and then press ‘Launch’
Connect to the instance, forwarding port 8888 for Jupyter notebook:
$ ssh -i ~/.ssh/<your_private_key_pair> -L localhost:8888:localhost:8888 ubuntu@<your instance IP>
- Git clone the TensorFlow Models repo (https://github.com/tensorflow/models)
$ git clone https://github.com/tensorflow/models.git
Then initialize the TensorFlow environment, and make sure the matching kernel is chosen in Jupyter notebook:
$ source activate tensorflow_p36
From the research folder, compile the protos and set PYTHONPATH:
$ cd ~/models/research
$ protoc object_detection/protos/*.proto --python_out=.
$ export PYTHONPATH=$PYTHONPATH:/home/ubuntu/YOUR/FOLDER/PATH/models/research:/home/ubuntu/YOUR/FOLDER/PATH/models/research/slim
- TFRecord
- LabelImg
$ git clone https://github.com/tzutalin/labelImg.git
$ cd labelImg
$ sudo apt-get install pyqt5-dev-tools
$ sudo pip3 install lxml
$ make qt5py3
- Start the image labeling program
$ python labelImg.py
- Model selection
- Model training
- Model inference: export_inference_graph.py
- Model evaluation: mAP (mean Average Precision), AP, recall and precision
- 1. Image collection and image dataset preprocessing (using the Keras library)
- 2. Model architecture configuration (using TF with transfer learning)
- 3. Model training (comparing GPU and CPU, the GPU is about 4.5 times faster; on GPU training normally takes 2 hours)
- 4. Model inference (with postprocessing; object detection needs an algorithm like soft NMS). For this part, first export the trained model graph, then run the pre-made notebook object_detection_tutorial.ipynb for a simple test on sample images. In my case I rewrote that code, used the boto library to connect to AWS S3 and read/download the files automatically, and wrote the output in CSV format. This step also needs an image tagging tool to get the ground-truth classes, so that model performance analysis can feed the retraining pipeline loop. The three metrics we used were the confusion matrix, the ROC AUC curve, and recall and precision.
- 5. Deploy to SageMaker as an endpoint and use the API to connect directly to the database for large-scale image processing
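Step 4 mentions soft NMS as a postprocessing algorithm. As a rough illustration only, here is a minimal pure-Python sketch of the Gaussian variant of Soft-NMS. The function names, the `(x1, y1, x2, y2)` box layout, and the default `sigma` and `score_threshold` values are assumptions made for this example; this is not the code used in the project and not the TensorFlow Object Detection API's implementation.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of deleting them.

    Returns a list of (box_index, rescored_value) in selection order.
    """
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        # Select the highest-scoring box still in play.
        best = max(remaining, key=lambda item: item[1])
        remaining.remove(best)
        kept.append(best)
        rescored = []
        for idx, score in remaining:
            overlap = iou(boxes[best[0]], boxes[idx])
            # The more a box overlaps the selected one, the more its score decays.
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_threshold:
                rescored.append((idx, score))
        remaining = rescored
    return kept


if __name__ == "__main__":
    # Hypothetical boxes: two overlapping detections and one far away.
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    demo_scores = [0.9, 0.8, 0.7]
    for index, score in soft_nms(demo_boxes, demo_scores):
        print(index, round(score, 3))
```

Unlike hard NMS, the overlapping box is kept with a lower score instead of being dropped, which can help when products sit close together on a shelf.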
Format the training dataset from XML -> CSV -> TFRecord. Three files are needed: xml_to_csv.py, split labels.ipynb and generate_tfrecord.py (you can find them in the first level of the directory, or use them as a basis to build your own). Put all the generated train/validation.TFRecord data into YOUR/FOLDER/PATH/object_detection/data/
Configure 3 more files before training, to set the model architecture path and the train/validation paths:
- faster_rcnn_resnet101_coco.config (or any other baseline model you prefer to use) in the training folder
- the .pbtxt file in object_detection/data/: list all your training classes here, in its JSON-like format
- pipeline.config from object_detection/legacy/models/train/YOUR_MODEL_NAME (needs to be downloaded from the TensorFlow model zoo site)
- first in the folder
- Start training for 20k steps, set up TensorBoard, and check the graph trends for model performance during training
* Be aware of the speed-vs-accuracy tradeoff
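The first conversion in the XML -> CSV -> TFRecord chain described above can be sketched with the Python standard library alone. This is a hedged illustration, not the project's xml_to_csv.py: it assumes LabelImg saved the annotations in Pascal VOC XML, and the column names, function names, and sample annotation are hypothetical choices made for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed column layout; adapt it to whatever your TFRecord script expects.
COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def voc_xml_to_rows(xml_text):
    """Turn one Pascal VOC annotation (as text) into one row per labeled box."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        rows.append([
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ])
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical annotation, for illustration only.
SAMPLE_XML = """<annotation>
  <filename>shelf_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>coca_cola</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""

if __name__ == "__main__":
    print(rows_to_csv(voc_xml_to_rows(SAMPLE_XML)))
```

In practice you would loop over every .xml file in the annotation folder, collect the rows, and write one CSV each for the training and validation splits before generating the TFRecord files.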
From YOUR/FOLDER/PATH/models/research:
$ protoc object_detection/protos/*.proto --python_out=.
$ export PYTHONPATH=$PYTHONPATH:YOUR/FOLDER/PATH/models/research:YOUR/FOLDER/PATH/models/research/slim
Configure the training script in the folder YOUR/FOLDER/PATH/models/research/object_detection/legacy. Example script:
$ python train.py --train_dir=YOUR/FOLDER/PATH/models/research/object_detection/legacy/models/train --pipeline_config_path=YOUR/FOLDER/PATH/training/faster_rcnn_resnet101_coco.config
Training normally takes 2 hours on GPU but takes 10 more hours on CPU. Check TensorBoard first to see the model performance on both the training and validation datasets, and use the graphs for early stopping to avoid overfitting, which is one of the regularization techniques.
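The early-stopping decision read off the TensorBoard graphs can also be written as a small rule. The sketch below is an assumption-laden illustration: `early_stop_step`, `patience`, and `min_delta` are hypothetical names, and the validation losses are taken to be a plain list you collected yourself, one value per evaluation. It is not part of the TensorFlow Object Detection API.

```python
def early_stop_step(val_losses, patience=3, min_delta=0.0):
    """Return the index of the best evaluation if training should stop, else None.

    val_losses: validation losses in evaluation order.
    patience: how many evaluations without improvement to tolerate.
    min_delta: minimum decrease that counts as an improvement.
    """
    best_idx = 0
    for i in range(1, len(val_losses)):
        if val_losses[i] < val_losses[best_idx] - min_delta:
            # Validation loss improved, so remember this evaluation.
            best_idx = i
        elif i - best_idx >= patience:
            # No improvement for `patience` evaluations: stop and keep the best one.
            return best_idx
    return None


if __name__ == "__main__":
    # Hypothetical losses: validation loss bottoms out, then rises (overfitting).
    print(early_stop_step([1.0, 0.8, 0.7, 0.72, 0.75, 0.9], patience=3))
```

The returned index points at the evaluation, and therefore the checkpoint, to keep for export.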
Model validation
In YOUR/FOLDER/PATH/models/research/object_detection/legacy, example script:
$ python eval.py --checkpoint_dir=YOUR/FOLDER/PATH/models/research/object_detection/legacy/models/train --eval_dir='eval' --pipeline_config_path=YOUR/FOLDER/PATH/training/faster_rcnn_resnet101_coco.config
Model inference
Use the recreated file in the object detection folder. First export the trained model. In YOUR/FOLDER/PATH/models/research/object_detection/, example script:
$ python export_inference_graph.py --input_type image_tensor --pipeline_config_path YOUR/FOLDER/PATH/training/faster_rcnn_resnet101_coco.config --trained_checkpoint_prefix legacy/models/train/model.ckpt-YOUR-TRAINING-STEPS --output_directory legacy/models/train
Then just set the saved_model path and the class.pbtxt path in the inference code. For mAP and recall, the higher the value, the better the model performance; normally an mAP around 50 is a good one.
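To make the metrics above concrete, here is a small sketch of how precision, recall, AP and mAP relate. It assumes each detection has already been matched against the ground truth and flagged as a true or false positive, and it uses all-point interpolation for AP, which is only one of several conventions (COCO-style mAP, for instance, also averages over IoU thresholds). All function names are hypothetical, and this is not the evaluation code behind eval.py.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true positive, false positive, false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(detections, num_ground_truth):
    """AP for one class.

    detections: list of (score, is_true_positive) pairs.
    num_ground_truth: number of ground-truth boxes for this class.
    """
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    tp = fp = 0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Interpolate: each precision becomes the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(ap_per_class):
    """mAP: the mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


if __name__ == "__main__":
    # Hypothetical detections for one class with two ground-truth boxes.
    print(average_precision([(0.9, True), (0.8, False), (0.7, True)], 2))
```

These functions return values on a 0 to 1 scale; multiply by 100 if you want to compare against a figure quoted as a percentage.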