In this project, I explore different models to identify plant and animal species from photos. I've trained small convnets and compared fine-tuning existing architectures (Inception V3, ResNet, Inception ResNet V2, Xception) with different freeze layers. I also analyzed performance on higher-level categories (e.g. mammals versus insects versus birds).

This is my log of progress as I try different approaches.

Fine-tune standard CNNs on small data

3-10 epochs of pre-training & more data helps

Notes Fri Jan 18

Got basic code running: load InceptionV3, add a fully-connected layer of configurable size on top, then an output layer of size num_classes. Pretrain for pe epochs, currently using RMSprop. Freeze everything up to layer fl as untrainable. Train the remaining layers for e epochs, currently using SGD with momentum and a lower learning rate. These defaults come directly from the Keras tutorial on fine-tuning InceptionV3 on a new set of classes.
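A minimal sketch of this two-phase setup in Keras (the function names, fc_size, and the default fl=249 are illustrative, not from the original code; the recipe itself follows the Keras fine-tuning tutorial):

```python
# Sketch of the two-phase fine-tuning setup described above.
# build_finetune_model, compile_phase1/2, fc_size, and fl=249 are
# illustrative names/defaults, not from the original code.
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD

def build_finetune_model(num_classes, fc_size=1024, weights="imagenet"):
    # Load InceptionV3 without its classification head; global average
    # pooling gives a flat feature vector to build the new head on.
    base = InceptionV3(weights=weights, include_top=False, pooling="avg")
    x = Dense(fc_size, activation="relu")(base.output)   # new FC layer
    out = Dense(num_classes, activation="softmax")(x)    # new output layer
    return Model(base.input, out), base

def compile_phase1(model, base):
    # Phase 1: freeze the whole backbone and pretrain only the new head
    # (for pe epochs) with RMSprop.
    for layer in base.layers:
        layer.trainable = False
    model.compile(optimizer="rmsprop", loss="categorical_crossentropy")

def compile_phase2(model, base, fl=249):
    # Phase 2: keep layers below index fl frozen, unfreeze the rest, and
    # train (for e epochs) with SGD + momentum at a lower learning rate.
    for layer in base.layers[:fl]:
        layer.trainable = False
    for layer in base.layers[fl:]:
        layer.trainable = True
    model.compile(optimizer=SGD(learning_rate=1e-4, momentum=0.9),
                  loss="categorical_crossentropy")
```

Recompiling after toggling `trainable` is required for the change to take effect in training.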

Experiment 1


Section 2

Finetuning Inception V3

Freeze fewer layers for higher accuracy

Notes Thurs Jan 24

per-class accuracy TBD: the run crashed, probably because we're asking for way too many validation batches--consider dividing the actual size of the validation set by the batch size in the Keras callback
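One likely fix, as a sketch (the callback itself isn't shown here): derive the number of validation batches from the validation set size rather than hard-coding it.

```python
import math

def validation_steps(num_val_examples, batch_size):
    # Request exactly enough batches to cover the validation set once;
    # asking a generator for more batches than this can stall or crash
    # evaluation.
    return math.ceil(num_val_examples / batch_size)
```

For example, a 5,000-example validation set with batch size 32 needs 157 steps, not some larger hard-coded number.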

Section 3

Per-class precision with InceptionV3

More data for kingdoms

How well do we perform on each of the 10 classes? Look at per-class precision for different models (only 5K examples, hence the noisy/jagged plots).
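For reference, per-class precision can be computed directly from true and predicted labels (a plain-Python sketch, independent of whatever plotting code produced the charts):

```python
from collections import Counter

def per_class_precision(y_true, y_pred, classes):
    # Precision for class c = correct predictions of c / all predictions of c.
    predicted = Counter(y_pred)
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    return {c: (correct[c] / predicted[c]) if predicted[c] else 0.0
            for c in classes}
```

E.g. if "mammal" is predicted twice but correct only once, its precision is 0.5, regardless of how many mammals are in the ground truth.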

Notes Mon Jan 28



Running commands

Section 4

Varying base models

Use InceptionV3

Fine-tuning a well-known, high-accuracy convnet (pretrained on ImageNet) is a great strategy for vision tasks, especially for nature photos, which are very similar to ImageNet data. Which base network should we choose?
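Swapping base networks is straightforward in Keras, since the built-in applications share a constructor signature. A sketch (the dict and helper are mine; I use ResNet50 as the concrete ResNet variant):

```python
from tensorflow.keras import applications

# The four backbones compared in this project, all available in
# tf.keras.applications with the same constructor signature.
BASE_MODELS = {
    "inception_v3": applications.InceptionV3,
    "resnet50": applications.ResNet50,
    "inception_resnet_v2": applications.InceptionResNetV2,
    "xception": applications.Xception,
}

def load_base(name, weights="imagenet"):
    # Return the chosen backbone without its ImageNet classification head,
    # with global average pooling so the output is a flat feature vector.
    return BASE_MODELS[name](weights=weights, include_top=False, pooling="avg")
```

The rest of the pipeline (new head, freezing, two-phase training) stays the same for any choice of backbone.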


Section 5

Vary freeze layer and base model

Long runs; not much difference

This hyperparameter is theoretically interesting, but sweeping it across base models would take a lot of compute to fine-tune--stick with InceptionV3 for now.