Tuning SSD MobileNet for better performance
I'm using TensorFlow's SSD MobileNet V2 object detection code and have so far been disappointed by the results. I'm hoping somebody can take a look at what I've done and suggest how I might improve them:
Dataset
I'm training on two classes from Open Images V5 (OIV5): 2352 instances of "Lemon" and 2009 instances of "Cheese". I have read in several places that "state of the art" results can be achieved with a few thousand instances.
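For anyone checking class balance the same way, here's a sketch of counting per-class box instances from an Open Images-style annotation CSV. The inline sample is made up, and real OIV5 box files (e.g. train-annotations-bbox.csv) use machine IDs in the LabelName column rather than display names, so treat this as an illustration of the counting step only:

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt of an OIV5-style bbox annotation CSV; the real
# files have one row per box instance, with MIDs in LabelName.
sample = """ImageID,LabelName,XMin,XMax,YMin,YMax
img1,Lemon,0.1,0.4,0.2,0.5
img1,Cheese,0.5,0.9,0.1,0.6
img2,Lemon,0.0,0.3,0.3,0.7
"""

# Count box instances per class label.
counts = Counter(row["LabelName"] for row in csv.DictReader(io.StringIO(sample)))
print(counts)  # one entry per class, value = number of box instances
```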
Train / validation parameters
Next up I'll list my config file, which is basically the default. The only changes I made were a) num_classes and b) increasing the l2_regularizer weight, because during training the model was overfitting and validation loss started to increase after only ~25,000 iterations.
model {
  ssd {
    num_classes: 2
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.05
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00007
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v2'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00007
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 3
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}
train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}
train_input_reader: {
  tf_record_input_reader {
    input_path: "gs://MY_DIR/data/train.record-?????-of-00010"
  }
  label_map_path: "gs://MY_DIR/data/label_map.pbtxt"
}
eval_config: {
  num_examples: 870
}
eval_input_reader: {
  tf_record_input_reader {
    input_path: "gs://MY_DIR/data/val.record-?????-of-00010"
  }
  label_map_path: "gs://MY_DIR/data/label_map.pbtxt"
  shuffle: true
  num_readers: 1
}
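For reference, the ssd_anchor_generator above spreads anchor scales linearly between min_scale and max_scale across the six feature-map layers, following the rule from the SSD paper. A quick sketch of the resulting scales (an approximation: some versions of the API shrink the boxes in the lowest layer by default, so the first layer can differ in practice):

```python
# Per-layer anchor scales: linear interpolation between min_scale and
# max_scale across the feature-map layers, as in the SSD paper.
def anchor_scales(min_scale=0.05, max_scale=0.95, num_layers=6):
    step = (max_scale - min_scale) / (num_layers - 1)
    return [round(min_scale + step * k, 2) for k in range(num_layers)]

print(anchor_scales())  # smallest anchors cover ~5% of the 300x300 input
```

With min_scale 0.05 this gives scales 0.05, 0.23, 0.41, 0.59, 0.77, 0.95, so the smallest anchors are around 15 px on a 300x300 input; whether that matches the typical object size in the dataset is worth checking.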
Cluster setup
I didn't touch the cloud config file, but I'll include it for completeness.
trainingInput:
  runtimeVersion: "1.12"
  scaleTier: CUSTOM
  masterType: standard_gpu
  workerCount: 5
  workerType: standard_gpu
  parameterServerCount: 3
  parameterServerType: standard
Problem / Question
With this setup, I'm achieving a mAP that generally doesn't go above 25%. The best is mAP@0.5, which briefly touches 30% before falling again at 25k iterations.
As mentioned before, validation loss falls from 12 to 7 (arbitrary units), but then starts increasing again around 25k iterations as well.
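Since the main symptom is validation loss turning upward, locating the turning point from the logged eval losses tells you which checkpoint to keep. A minimal sketch; the step/loss pairs here are made-up illustrative numbers, not values from my run:

```python
# Find the step where validation loss bottoms out; past that point,
# continued training is overfitting. (Loss values are hypothetical.)
eval_log = [(5000, 12.1), (10000, 9.4), (15000, 8.0),
            (20000, 7.2), (25000, 7.0), (30000, 7.6), (35000, 8.3)]

best_step, best_loss = min(eval_log, key=lambda sl: sl[1])
print(f"checkpoint to keep: step {best_step} (val loss {best_loss})")
```

In practice this is just what picking the best checkpoint off TensorBoard's eval curve amounts to; exporting the model from that checkpoint sidesteps the post-25k degradation.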
Although I'm not super familiar with what results I should expect, these numbers seem wrong. I'm not even sure whether I should be looking to improve my dataset or my training hyperparameters. I'll accept any answer that helps put me on the right track. Please let me know if I've forgotten to include any pertinent information.
Topic: object-detection, tensorflow
Category: Data Science