Imdt Project Report
Although bird classification can be done manually by domain experts, with growing amounts of data this rapidly becomes a tedious and time-consuming process. With this model, we can identify bird species accurately and in less time. In our application, the user inputs an image of a bird, and our model predicts its species.

Dataset: We used the Birds 450 Species Image Classification dataset. It covers 450 bird species, with 70,626 training images, 2,250 test images (5 images per species), and 2,250 validation images (5 images per species). This is a high-quality dataset in which there is only one bird in each image and the bird typically takes up at least 50% of the pixels in the image. As a result, even a moderately complex model will achieve training and test accuracies in the mid-90% range. All images are 224 x 224 x 3 color images in JPG format. The dataset includes a train set, a test set, and a validation set.

II. APPROACH
We chose MobileNet over other models such as VGG16, AlexNet, and GoogLeNet. We selected the last 20 of its 28 layers. We then augmented the data using the Keras ImageDataGenerator. We used the Adam optimizer with categorical crossentropy as the loss and accuracy as the performance metric. After setting the number of epochs to 64, the batch size to 64, and the steps per epoch to len(train_data), we were able to reduce the training time of the model from 12-15 hours to 3-5 hours at the cost of 2% accuracy. We later tested our model on several different bird species, and it returned the correct result every time.
We found that it is important to remove non-linearities in the narrow layers in order to maintain representational power. We demonstrate that this improves performance and provide an intuition that led to this design.

III. RESULTS

B. Machine Learning
Machine learning is the most popular technique for predicting or classifying information to help people make decisions. Given the examples again and again, it is able to identify patterns in order to make decisions more accurately.

C. Computation saving
The ReLU function accelerates the training of deep neural networks compared with traditional activation functions because the derivative of ReLU is 1 for a positive input (and 0 for a negative input). Because this derivative is a constant, deep neural networks do not need to spend additional time computing error terms during the training phase.

D. Depthwise Separable Convolutional Neural Networks
Convolution is a very important mathematical operation in artificial neural networks (ANNs). Convolutional neural networks (CNNs) can be used to learn features as well as classify data with the help of image frames. There are many types of CNNs. One class of CNNs is depthwise separable convolutional neural networks.
These types of CNNs are widely used for the following two reasons:
1. They have fewer parameters to adjust than standard CNNs, which reduces overfitting.
2. They are computationally cheaper because they perform fewer computations, which makes them suitable for mobile vision applications.

E. Categorical Crossentropy
Categorical crossentropy is a loss function used in multi-class classification tasks. These are tasks in which an example can belong to only one of many possible categories, and the model must decide which one.
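As an illustration of the categorical crossentropy loss, the sketch below computes L = -sum_i y_i * log(p_i) for a one-hot label y and predicted class probabilities p. It is a minimal pure-Python example and not the code used in this project; the function names, the three-class logits, and the eps clipping value are assumptions chosen for the illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Loss for one example: -sum_i y_true[i] * log(y_pred[i]).

    y_true is a one-hot list; y_pred is a list of class probabilities.
    eps (an assumed value) avoids taking log(0).
    """
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# Hypothetical scores for a 3-class problem; class 0 has the highest score.
probs = softmax([2.0, 1.0, 0.1])
loss_correct = categorical_crossentropy([1, 0, 0], probs)  # true class is 0
loss_wrong = categorical_crossentropy([0, 0, 1], probs)    # true class is 2
print(loss_correct, loss_wrong)
```

The loss is small when the model assigns high probability to the true class and grows as that probability shrinks, which is why the second call returns a larger value than the first.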
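The parameter saving of depthwise separable convolutions can be checked with simple arithmetic. The sketch below compares the weight count of a standard convolution (k * k * C_in * C_out) with that of a depthwise convolution followed by a 1 x 1 pointwise convolution (k * k * C_in + C_in * C_out), ignoring biases. The layer sizes are assumed values chosen for illustration, not measurements from our model.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes the channels
    return depthwise + pointwise


# Assumed example layer: 3 x 3 kernel, 32 input channels, 64 output channels.
k, c_in, c_out = 3, 32, 64
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
print(std, sep, round(sep / std, 4))  # prints: 18432 2336 0.1267
```

For this layer the separable form needs about one eighth of the weights; the ratio equals 1/C_out + 1/k^2. The same ratio applies to the number of multiply-add operations, since both counts scale with the size of the output feature map.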
IV. UI
V. REFERENCES