Fruit Recognition System and Its Nutrition Level

Report
Project Title :
Fruit Recognition System and its Nutrition measurement
Introduction :
The aim of this project is to detect a particular fruit in images drawn from a new
dataset of popular fruits. The dataset used is Fruits-360.
Currently the set contains 90483 images of 131 fruits and vegetables. The reader is
encouraged to access the latest version of the dataset from the dataset's published addresses.
Having a high-quality dataset is essential for obtaining a good classifier. Most of the existing
datasets with images contain both the object and the noisy background. This could lead to
cases where changing the background will lead to the incorrect classification of the object.
In our proposed system, we use colour, size, shape and texture features to obtain accurate
results. We apply the K-means clustering technique to improve the accuracy of food image
recognition, and we combine segmentation and classification into a hybrid technique: an
enhanced image is first segmented, and the segments are then classified with the help of
clustering. Combining the two steps is intended to give an appropriate result with high
accuracy.
We developed a system that measures the nutrition level of the fruit we detect and helps
patients and dietitians to control obesity. This report also reviews systems that use food
images to measure the calories and nutritional level of a food sample. A system that measures
the amount of calories consumed in a meal would be of great help not only to patients and
dietitians in the treatment of obesity, but also to calorie-conscious people.
In the following we give a brief overview of the fruit detection and recognition
method. Colour and texture are fundamental characteristics of natural images and play an
important role in visual perception. Colour has been used to identify objects for many
years. In the colour classification process, information about the spectral properties of
object surfaces is extracted first.
In the classification (training) phase, features such as colour and texture are extracted
from each training picture and stored in a feature library. In the recognition phase, each
captured image is compared with the training set using a minimum-distance measure, and the
training picture with the minimum distance is selected as the best match.
Finally, based on the detection result and the size of the fruit, we calculate its calories
from the model.
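The minimum-distance matching described above can be sketched as follows. This is a minimal
illustration, not the project's code: the Euclidean distance, the three-value colour features
and the contents of feature_library are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_match(query, library):
    """Return the label of the library entry closest to the query vector."""
    label, _ = min(library, key=lambda entry: euclidean(query, entry[1]))
    return label

# Hypothetical feature library: (label, [mean R, mean G, mean B]) per training image.
feature_library = [
    ("apple", [200.0, 30.0, 40.0]),
    ("banana", [230.0, 220.0, 60.0]),
    ("orange", [240.0, 140.0, 20.0]),
]

# Features of a newly captured image (made-up values).
match = best_match([235.0, 150.0, 30.0], feature_library)
print(match)
```

With a real feature library the vectors would also hold texture, size and shape values, and
the features would usually be scaled so that no single one dominates the distance.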
Fruit Recognition System Algorithm :

Our system has two general steps, segmentation and classification. Accordingly, the
methodology consists of two main parts: the first is segmentation and feature extraction
using K-means and HOG, and the second is classification with SVM and Random Forest.
[Figure: Methods Block Diagram. Block labels: Add Fruit Image, Feature Extraction, Kmeans
Segmentation, Segmentation, Classification, Nutrition Recognition and Measuring]
Segmentation :
Segmentation is the process of partitioning a digital image into several segments. The
goal of segmentation is to simplify an image into something that is more meaningful and
easier to analyse. Image segmentation is typically used to locate objects and boundaries
(lines, curves, etc.) in images. The pixels in the same region have similar characteristics,
such as colour or texture. In this application, food with similar ingredients is placed in
the same segment, so dividing the fruit image into separate parts helps the classifier to
find the correct result. More generally, segmentation clusters pixels into salient image
regions, i.e., regions corresponding to individual surfaces, objects, or natural parts of
objects. A segmentation can be used for object recognition, occlusion boundary estimation
within motion or stereo systems, image compression, image editing, or image database look-up.
The quality of the segmentation depends on the image: smoothly shaded surfaces with clear
grey-level steps between different surfaces are ideal for segmentation algorithms.
Because the quality depends on the image, the assessment of segmentation algorithms needs
to be done on standardized datasets. If an image has been pre-processed appropriately to
remove noise and artifacts, segmentation is often the key step in interpreting the image. It
identifies regions or features that share similar characteristics and groups them together.
Image segmentation may use statistical classification, thresholding, edge detection,
region detection, or any combination of these techniques. The output of the segmentation step
is usually a set of classified elements. Most segmentation techniques are either region-based
or edge-based.
Clustering of numerical data forms the basis of many classification and system
modelling algorithms. The purpose of clustering is to identify natural groupings of data in
a large data set to produce a concise representation of a system's behaviour. K-means is a
data clustering technique in which a dataset is grouped into K clusters, with every data
point assigned to the cluster whose centre is nearest. In the fuzzy variant of the technique
(fuzzy c-means), every data point instead belongs to every cluster to a certain degree: a
data point that lies close to the centre of a cluster has a high degree of membership in that
cluster, and a data point that lies far from the centre has a low degree of membership. The
algorithm iterates to minimize an objective function that represents the distance from each
data point to a cluster centre, weighted in the fuzzy variant by that data point's membership
grade. K-means is one of the most popular clustering techniques for segmentation and is
widely used, since it is easy to implement and each iteration runs in time linear in the
number of data points. However, the K-means algorithm suffers from several drawbacks. Its
objective function is not convex, and hence it may contain many local minima, so the result
depends on the initial centres.
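As a reference for the objective function mentioned above, the two forms can be written as
follows. The notation is an assumption introduced here: x_i are the N data points (for
example pixel colours), c_k are the K cluster centres, u_ik is the membership grade of point
i in cluster k, and m > 1 is the fuzzifier of the membership-weighted variant.

```latex
% Hard-assignment K-means: each point counts only towards its nearest centre
J = \sum_{i=1}^{N} \min_{1 \le k \le K} \lVert x_i - c_k \rVert^{2}

% Membership-weighted (fuzzy c-means) form: each point contributes to every centre
J_m = \sum_{i=1}^{N} \sum_{k=1}^{K} u_{ik}^{m} \, \lVert x_i - c_k \rVert^{2},
\qquad \sum_{k=1}^{K} u_{ik} = 1 \quad \text{for every } i
```

Both objectives are non-convex in the centres, which is why different initial centres can
lead the iteration to different local minima.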
Feature Extraction :
Once the image is ready, segmentation is applied to extract the different portions of
food present in the dish. For each portion, a set of characteristics is obtained
based on the average colour, texture, size and shape of the food portion. All these
characteristics are fed to the classification procedure, which makes use of the SVM method
and the nutrient database.
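A minimal sketch of how some of these per-portion characteristics could be computed is given
below. It assumes that a segment is available as a binary mask over an RGB image stored as
nested lists; the function name region_features and the choice of features (mean colour,
pixel area, bounding-box aspect ratio) are illustrative, and texture is left out.

```python
def region_features(image, mask):
    """Compute mean colour, area and bounding-box aspect ratio of a masked region.

    image: rows of (R, G, B) tuples; mask: rows of 0/1 flags with the same shape.
    """
    pixels = [(r, c) for r, row in enumerate(mask) for c, flag in enumerate(row) if flag]
    if not pixels:
        raise ValueError("mask selects no pixels")
    area = len(pixels)  # size of the portion in pixels
    mean_colour = tuple(
        sum(image[r][c][channel] for r, c in pixels) / area for channel in range(3)
    )
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return {"mean_colour": mean_colour, "area": area, "aspect_ratio": width / height}

# Toy 2 x 3 image: a dark background column and a reddish 2 x 2 portion.
image = [
    [(10, 10, 10), (200, 20, 30), (220, 40, 30)],
    [(10, 10, 10), (210, 30, 30), (210, 30, 30)],
]
mask = [
    [0, 1, 1],
    [0, 1, 1],
]
features = region_features(image, mask)
print(features)
```

The resulting values would be concatenated into one feature vector per portion before being
passed to the classifier.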
Nutrition Measurement :
After the fruit is recognized, its nutrients can be looked up easily. First, the size of
the recognized fruit in centimetres is obtained from its size in pixels, using the thumb in
the image as a scale reference. Then, using the nutrition table, the appropriate calorie
value of each food is reported.
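The conversion from pixels to centimetres and the table look-up can be sketched as below.
Everything numeric here is an assumption for illustration: the thumb width, the pixel
measurements, the density, the spherical-fruit approximation, and the energy values, which
are approximate figures per 100 g rather than the project's nutrition table.

```python
import math

# Approximate energy values in kcal per 100 g, for illustration only;
# a real system would read these from its nutrition table.
KCAL_PER_100G = {"apple": 52.0, "banana": 89.0, "orange": 47.0}

def pixels_to_cm(length_px, thumb_px, thumb_cm):
    """Convert a pixel length to centimetres using the thumb as a scale reference."""
    return length_px * thumb_cm / thumb_px

def estimate_kcal(fruit, diameter_cm, density_g_per_cm3=1.0):
    """Estimate energy by treating the fruit as a sphere (a strong simplification)."""
    volume_cm3 = (4.0 / 3.0) * math.pi * (diameter_cm / 2.0) ** 3
    mass_g = volume_cm3 * density_g_per_cm3
    return mass_g * KCAL_PER_100G[fruit] / 100.0

# Hypothetical measurements: fruit 300 px wide, thumb 80 px wide and 2.0 cm wide.
diameter_cm = pixels_to_cm(length_px=300, thumb_px=80, thumb_cm=2.0)
kcal = estimate_kcal("apple", diameter_cm)
print(f"diameter = {diameter_cm:.1f} cm, energy = {kcal:.0f} kcal")
```

The spherical approximation is reasonable for round fruits such as apples and oranges, but an
elongated fruit such as a banana would need a different shape model.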
Classification Methods :
1. SVM (Support Vector Machine) :
SVM is widely used for many pattern recognition problems, including face recognition,
3-D object recognition, and so on. SVM is based on statistical learning theory: it uses
training data as input to generate a decision function that classifies unknown data. Support
vector machines cover both the linearly separable case and the non-linearly separable case.
For binary linearly separable classification, the basic idea of SVM is to find the optimal
hyperplane between the two sets of examples, the one that separates them with the largest
possible distance. For example, in Figure 1, solid dots and empty circles stand for the first
and the second class of training examples, respectively. H is the optimal hyperplane, and H1
and H2 are parallel to H. The points on H1 are the samples of the first class whose distance
to H is the shortest, and the points on H2 are the samples of the second class whose distance
to H is the shortest. The points on H1 and H2 lie on the edge of the separation belt; these
examples are called support vectors, and they determine the separation belt. Suppose there
are N training data Xi ∈ R^n, i = 1, 2, ..., N, where each datum belongs to one of two
classes labelled Yi ∈ {1, -1}. As a decision function to classify input data, SVM finds the
hyperplane that separates the two classes with the maximum distance from the hyperplane to
the support vectors. In the linearly separable case, the difficulty consists in finding this
linear function. An important benefit of SVM is that it can also deal with the non-linearly
separable case.
SVM chooses the extreme points (vectors) that help in creating the hyperplane. These
extreme cases are called support vectors, and hence the algorithm is termed Support Vector
Machine. In the accompanying diagram, two different categories are classified using a
decision boundary, or hyperplane.
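The maximum-margin idea described in this section is usually written as the optimisation
problem below. The symbols w (normal vector of the hyperplane) and b (offset) are notation
introduced here; x_i and y_i are the training data and labels defined in the text.

```latex
% Decision function for an unknown sample x
f(x) = \operatorname{sign}\left( w^{\top} x + b \right)

% The hyperplane H and the two parallel hyperplanes through the support vectors
H:\; w^{\top} x + b = 0, \qquad
H_1:\; w^{\top} x + b = +1, \qquad
H_2:\; w^{\top} x + b = -1

% Maximising the width 2 / \lVert w \rVert of the separation belt
\min_{w,\, b} \; \tfrac{1}{2} \lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \left( w^{\top} x_i + b \right) \ge 1, \qquad i = 1, \dots, N
```

The training points for which the constraint holds with equality lie on H1 or H2; these are
the support vectors.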
2. Random Forest Classifier :
The random forest is a classification algorithm consisting of many decision trees. It
uses bagging and feature randomness when building each individual tree to try to create an
uncorrelated forest of trees whose prediction by committee is more accurate than that of any
individual tree. Random forest is a supervised learning algorithm. The "forest" it builds is
an ensemble of decision trees, usually trained with the "bagging" method. The general idea of
the bagging method is that a combination of learning models improves the overall result.
Random forest has nearly the same hyperparameters as a decision tree or a bagging classifier.
Fortunately, there is no need to combine a decision tree with a bagging classifier, because
the random forest classifier class can be used directly. Random forest can also deal
with regression tasks by using the algorithm's regressor.
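The three ingredients named above (bagging, feature randomness and prediction by committee)
can be illustrated with a deliberately small sketch. It is not the classifier used in the
project: each "tree" is a one-split stump, and the function names, the two-value feature
vectors and the toy data are assumptions made for the example.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y, features):
    """Best one-split tree over the given feature subset (fewest training errors)."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            left_label, right_label = majority(left), majority(right)
            errors = sum(lab != left_label for lab in left)
            errors += sum(lab != right_label for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    if best is None:  # no split possible: always predict the majority class
        label = majority(y)
        return (0, 0.0, label, label)
    return best[1:]

def fit_forest(X, y, n_trees=15, n_features=1, seed=0):
    """Train stumps on bootstrap samples, each restricted to a random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]           # bagging: sample with replacement
        feats = rng.sample(range(len(X[0])), n_features)  # feature randomness
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Prediction by committee: majority vote over all trees."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)

# Toy data: two hypothetical features per fruit, two well-separated classes.
X = [[1.0, 1.5], [1.2, 1.1], [1.8, 2.0], [2.0, 1.7],
     [8.0, 8.4], [8.5, 9.0], [9.0, 8.1], [8.8, 8.7]]
y = ["apple"] * 4 + ["orange"] * 4
forest = fit_forest(X, y)
print(predict(forest, [0.5, 0.5]), predict(forest, [9.5, 9.5]))
```

A full random forest grows deeper trees and usually chooses splits by an impurity measure
rather than by the raw error count used here.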
Project Flowchart :
[Flowchart: Start -> Data Acquisition -> Pre-processing -> Feature Extraction (implementing
K-means, HOG, Colour Quantization) -> For each tree: choose training data subset; if the stop
condition does not hold at a node, build the next node; calculate prediction error -> End]
Nutrition Recognition Algorithm :
[Flowchart: Start -> Read Image -> Pre-processing -> Image Segmentation -> Fruit size and
weight -> Is properly labelled? -> Nutritious / Non Nutritious]
We propose a measurement method that estimates the amount of calories and nutrition from
a food image. The food portions are isolated with a skull-stripping step and measured to
obtain the amount of calories and nutrition in the food. If either of the two parameters,
calorie or nutrition, is high for the image, the system reports a high-energy food; otherwise
it reports a low-energy food. It also shows the amount of calories in the given fruit.
Features Extraction :
The parameters below are those of OpenCV's cv2.kmeans function.
Input parameters
1. samples : It should be of np.float32 data type, and each feature should be put in a
single column.
2. nclusters (K) : Number of clusters required at the end.
3. criteria : The iteration termination criteria. When these criteria are satisfied, the
algorithm stops iterating. It should be a tuple of 3 parameters, (type, max_iter, epsilon):
3.a - type of termination criteria : It has 3 flags, as below:
cv2.TERM_CRITERIA_EPS - stop the algorithm iteration if the specified accuracy, epsilon,
is reached.
cv2.TERM_CRITERIA_MAX_ITER - stop the algorithm after the specified number of iterations,
max_iter.
cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER - stop the iteration when any of the
above conditions is met.
3.b - max_iter : An integer specifying the maximum number of iterations.
3.c - epsilon : Required accuracy.
4. attempts : The number of times the algorithm is executed using different initial
labellings. The algorithm returns the labels that yield the best compactness. This
compactness is returned as output.
5. flags : Specifies how the initial centers are taken. Normally two flags are used for
this: cv2.KMEANS_PP_CENTERS and cv2.KMEANS_RANDOM_CENTERS.
Output parameters
compactness : The sum of squared distances from each point to its corresponding center.
labels : The label array, where each element is marked '0', '1', ...
centers : The array of cluster centers.
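To make the roles of these parameters concrete, the following sketch re-implements plain
K-means in pure Python with the same inputs and outputs. It is an illustration only and not
OpenCV's implementation: the function name kmeans, the seed argument and the random choice
of initial centres are assumptions, and the pixel values are made up.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(samples, k, max_iter=10, epsilon=1.0, attempts=10, seed=0):
    """Plain K-means; returns (compactness, labels, centers) of the best attempt."""
    rng = random.Random(seed)
    best = None
    for _ in range(attempts):
        centers = [list(p) for p in rng.sample(samples, k)]  # random initial centres
        for _ in range(max_iter):
            labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in samples]
            new_centers = []
            for j in range(k):
                members = [p for p, lab in zip(samples, labels) if lab == j]
                if members:
                    new_centers.append([sum(col) / len(members) for col in zip(*members)])
                else:
                    new_centers.append(centers[j])  # keep the centre of an empty cluster
            shift = max(sq_dist(a, b) ** 0.5 for a, b in zip(centers, new_centers))
            centers = new_centers
            if shift <= epsilon:  # every centre moved by at most epsilon: stop early
                break
        # Final assignment and compactness (sum of squared distances to the centres).
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in samples]
        compactness = sum(sq_dist(p, centers[lab]) for p, lab in zip(samples, labels))
        if best is None or compactness < best[0]:
            best = (compactness, labels, centers)
    return best

# Six hypothetical RGB pixels: three reddish, three yellowish.
pixels = [(250, 10, 10), (240, 20, 15), (245, 15, 5),
          (250, 240, 20), (240, 235, 30), (245, 250, 25)]
compactness, labels, centers = kmeans(pixels, k=2)
print(labels, round(compactness, 1))
```

Running several attempts and keeping the most compact result is how the dependence on the
initial centres is reduced in practice.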
Histogram of Oriented Gradient (HOG) :
The Histogram of Oriented Gradient (HOG) feature descriptor is popular for object
detection.
We compute the HOG descriptor of an image and display a visualisation of it.
A Histogram of Oriented Gradients (HOG) is computed by:
1. (optional) global image normalisation
2. computing the gradient image in x and y
3. computing gradient histograms
4. normalising across blocks
5. flattening into a feature vector
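Steps 2 to 5 above can be sketched in a few lines. This is a simplified illustration under
stated assumptions, not a full HOG: the image is a grey-level nested list, gradients are
taken only inside each cell, and all cells are normalised together as a single block. In
practice a library routine such as skimage.feature.hog would normally be used; the function
names below are hypothetical.

```python
import math

def hog_cell_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell."""
    height, width = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]  # gradient in x (central difference)
            gy = cell[r + 1][c] - cell[r - 1][c]  # gradient in y (central difference)
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned orientation
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist

def l2_normalise(vector, eps=1e-6):
    """Scale a vector to unit length; eps avoids division by zero."""
    norm = math.sqrt(sum(v * v for v in vector) + eps ** 2)
    return [v / norm for v in vector]

def hog_descriptor(image, cell_size=4, n_bins=9):
    """Histogram per cell, normalised as one block and flattened into one vector."""
    features = []
    for top in range(0, len(image) - cell_size + 1, cell_size):
        for left in range(0, len(image[0]) - cell_size + 1, cell_size):
            cell = [row[left:left + cell_size] for row in image[top:top + cell_size]]
            features.extend(hog_cell_histogram(cell, n_bins))
    return l2_normalise(features)

# A 4 x 4 cell with a vertical edge: the gradient points along x (orientation 0 degrees).
vertical_edge = [[0, 0, 255, 255] for _ in range(4)]
descriptor = hog_descriptor(vertical_edge)
print([round(v, 3) for v in descriptor])
```

For the vertical edge all of the gradient energy falls into the first orientation bin, which
is what makes the descriptor sensitive to the direction of object outlines.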
Methodology of the project :
1. Our project uses a dataset that comes with separate training and testing
sets.
2. We acquire a raw image and then pre-process it with filtering techniques.
3. To extract better features, we implement K-means, HOG and Colour Quantization.
4. After extracting the features, we split the training dataset into two parts, one for
training and one for testing.
5. For better classification we implement three classifiers: Random Forest, KNN and
SVM. We will select the one that provides the best accuracy. So far, our
implementation has achieved an accuracy in the range of 60 to 70 percent.
6. In the next stage we will implement the TensorFlow-Keras engine for better
classification with a bigger dataset.
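Steps 4 and 5 can be sketched as a small evaluation harness: split the extracted features,
score each candidate classifier on the held-out part, and keep the best one. The code below
is an illustration only; the two k-NN variants stand in for the project's Random Forest, KNN
and SVM, and the function names and toy feature vectors are assumptions.

```python
import random
from collections import Counter

def train_test_split(samples, labels, test_fraction=0.25, seed=0):
    """Shuffle the data and split it into training and testing parts."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train_X = [samples[i] for i in train_idx]
    test_X = [samples[i] for i in test_idx]
    train_y = [labels[i] for i in train_idx]
    test_y = [labels[i] for i in test_idx]
    return train_X, test_X, train_y, test_y

def knn_predict(train_X, train_y, query, k):
    """Label chosen by majority vote among the k nearest training samples."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

def accuracy(predict, test_X, test_y):
    """Fraction of test samples that the classifier labels correctly."""
    hits = sum(predict(x) == y for x, y in zip(test_X, test_y))
    return hits / len(test_y)

# Toy feature vectors for two classes (made-up values).
samples = [[1.0 + 0.1 * i, 2.0] for i in range(6)] + [[9.0 + 0.1 * i, 2.0] for i in range(6)]
labels = ["apple"] * 6 + ["orange"] * 6
train_X, test_X, train_y, test_y = train_test_split(samples, labels)

scores = {}
for name, k in [("1-NN", 1), ("3-NN", 3)]:
    scores[name] = accuracy(lambda x, k=k: knn_predict(train_X, train_y, x, k), test_X, test_y)
best = max(scores, key=scores.get)
print(scores, best)
```

Because the classifiers are passed in as plain functions, any other model with a predict
step can be compared in the same loop.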
Improving classifier accuracy
There are multiple techniques for improving the accuracy of a classifier. Some of them
are listed below.
1. Gather more data for each class (500-1000 images per class).
2. Use data augmentation to generate more images per class.
3. Global features could be used along with local features such as SIFT, SURF or DENSE,
together with the Bag of Visual Words (BOVW) technique.
4. Local features alone could be tested with the BOVW technique.
We are going to implement these techniques next to improve accuracy.
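As an illustration of the data augmentation mentioned in item 2, the sketch below generates
a flipped and a rotated copy of an image stored as a nested list. The function names are
hypothetical, and a real pipeline would typically add further variants such as small
rotations, crops and brightness changes.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]

def augment(image):
    """Return the original image plus simple flipped and rotated variants."""
    return [image, flip_horizontal(image), rotate_90_clockwise(image)]

# Tiny 2 x 3 "image" whose values make the transformations easy to follow.
image = [[1, 2, 3],
         [4, 5, 6]]
variants = augment(image)
for variant in variants:
    print(variant)
```

Each variant keeps the label of the original image, so one labelled photograph yields
several training samples.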

Interface of the system :
Results :
List of Fruit Images in Database :
Fruit Image    Total No of Images    Training Images    Testing Images
Apple          300                   200                100
Banana         200                   100                100
Orange         150                   100                50
Accuracy Measurement:
Fruit Image    Classifier       Overall Accuracy %
Apple          Random Forest    75%
Apple          SVM              78%
Banana         Random Forest    68%
Banana         SVM              78%
Orange         Random Forest    72%
Orange         SVM              77%
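The per-fruit figures can be combined into one figure per classifier, as sketched below. The
weighting is an assumption: it supposes that each accuracy was measured on the testing
images listed for that fruit in the first table, which the report does not state.

```python
# (fruit, number of testing images, Random Forest accuracy %, SVM accuracy %)
results = [
    ("Apple", 100, 75, 78),
    ("Banana", 100, 68, 78),
    ("Orange", 50, 72, 77),
]

def weighted_accuracy(rows, column):
    """Average the accuracy in the given column, weighted by testing images."""
    total_images = sum(row[1] for row in rows)
    return sum(row[1] * row[column] for row in rows) / total_images

random_forest = weighted_accuracy(results, 2)
svm = weighted_accuracy(results, 3)
print(f"Random Forest: {random_forest:.1f}%, SVM: {svm:.1f}%")
```

Under this assumption SVM is ahead of Random Forest for every fruit as well as overall.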
