Emotion Based Music Player
By
Tan Siew Ching (12873)
SEPTEMBER 2012
Universiti Teknologi PETRONAS
Bandar Seri Iskandar,
31750 Tronoh
Perak Darul Ridzuan
CERTIFICATION OF APPROVAL
Emotion Based Music Player
by
Tan Siew Ching (12873)
Approved by,
_________________________________
(Assoc. Prof. Dr. Baharum B Baharudin)
CERTIFICATION OF ORIGINALITY
This is to certify that I am responsible for the work submitted in this project, that the
original work is my own except as specified in the reference and acknowledgements,
and that the original work contained herein has not been undertaken or done by
unspecified sources or persons.
__________________
(TAN SIEW CHING)
ABSTRACT
This work describes the development of the Emotion Based Music Player, a computer application meant for all types of users, specifically music lovers. Because of the effort involved in song selection, most people choose to play the songs in their playlist at random. As a result, some of the selected songs do not match the user's current emotion. Moreover, no commonly used music player is able to play songs based on the user's emotion. The proposed model is able to extract the user's facial expression and thus detect the user's emotion. The music player in the proposed model then plays songs according to the category of emotion detected. It is aimed at providing music lovers with greater enjoyment in music listening. The scope of emotions in the proposed model covers normal, sad, surprise and happy. The system relies mainly on image processing and facial detection technologies. The input for this proposed model is still images in .jpeg format which are available online. The performance of the model is evaluated by loading forty still images (ten for each emotion category) into the proposed model to test its accuracy in detecting the emotions. Based on the testing results, the proposed model achieves a Recognition Rate of 85%.
ACKNOWLEDGEMENT
First and foremost, the writer would like to take this opportunity to express her deepest gratitude and appreciation to the project supervisor, Assoc. Prof. Dr. Baharum B Baharudin, who continuously monitored the progress of the project throughout the semesters. The constructive comments, advice and suggestions given have led to the successful outcome of the project.
This gratitude is also dedicated to the individuals who participated in the survey. The author highly appreciates the feedback and cooperation given by the participants, which greatly facilitated the development, improvement and implementation of the system prototype.
Last but not least, the author would like to express her acknowledgement to family members and friends for their support and constructive opinions throughout the seven months.
TABLE OF CONTENT
CERTIFICATION OF APPROVAL . . . . . i
CERTIFICATION OF ORIGINALITY . . . . . ii
ABSTRACT . . . . . . . . . iii
ACKNOWLEDGEMENT . . . . . . . iv
TABLE OF CONTENT . . . . . . . v
LIST OF FIGURES . . . . . . . . vii
LIST OF TABLES . . . . . . . . viii
LIST OF ABBREVIATIONS . . . . . . viii
CHAPTER 1: INTRODUCTION . . . . . 1
1.1 Background of Study . . . . . 1
1.2 Problem Statement . . . . . 3
1.3 Project Objective . . . . . 3
1.4 Scope of Study . . . . . 4
1.5 Feasibility Study . . . . . 5
CHAPTER 2: LITERATURE REVIEW . . . . . 6
2.1 Introduction . . . . . 6
2.2 Facial Expression Detection Research . . . 8
2.3 Relationship between Music and Emotion Research . 11
2.4 Emotion Based Music Retrieval Research . . . 12
CHAPTER 3: METHODOLOGY . . . . . . 13
3.1 Research Methodology . . . . 13
3.2 Project Activities . . . . . . 15
3.3 Key Milestones . . . . . . 16
3.4 Gantt Chart . . . . . . . 17
3.5 Development Tools . . . . . . 19
CHAPTER 4: RESULTS AND DISCUSSION . . . . 20
4.1 Data Gathering and Analysis . . . . . 20
4.2 System Design . . . . . . 22
4.3 System Testing Results . . . . . 32
4.4 Emotion Accuracy Testing. . . . . . 33
4.5 Discussion . . . . . . . 39
CHAPTER 5: CONCLUSION AND RECOMMENDATION . . 42
5.1 Conclusion . . . . . . 42
5.2 Recommendation . . . . . 43
REFERENCES . . . . . . . . 44
APPENDIX . . . . . . . . . 47
LIST OF FIGURES
Figure 4.5 : The first interface once the proposed model is launched 28
Figure 4.9 : Default auto load emotion detection once "Start" 30
Figure 4.10 : The step-by-step function of the proposed model 31
LIST OF TABLES
LIST OF ABBREVIATIONS
CHAPTER 1
INTRODUCTION
1.1 BACKGROUND OF STUDY
A facial expression is produced by one or more motions, movements or positions of the muscles of the face. These movements convey the emotional state of an individual. A facial expression can be a voluntary action, as an individual can control his facial expression and show it according to his will. For example, a person can draw his eyebrows together and frown to show through his facial expression that he is angry. On the other hand, an individual may try to relax his facial muscles to indicate that he is not affected by the current situation. However, since facial expression is closely associated with emotion, it is mostly an involuntary action. It is nearly impossible for an individual to keep himself from expressing his emotions. An individual may have a strong desire not to express his current feelings, but it is hard to do so. An individual may show the expression for the first fraction of a second before resuming a neutral expression.
Since the work of Darwin in 1872, behavioral scientists have been actively involved in the research and analysis of facial expression detection. In 1978, Suwa et al. presented an early attempt at automatic facial expression analysis by tracking the motion of twenty identified spots in an image sequence. Since Suwa's attempt, much progress has been made in developing computer systems that help humans recognize and read an individual's facial expression, which is a useful and natural medium of communication.
(Source: Ying-Li Tian, Takeo Kanade, and Jeffrey F. Cohn, 2003)
The "Emotion Based Music Player" is developed with the aim of detecting the emotion of an individual and playing a list of music accordingly. First, the individual reflects his emotion through his facial expression. The application then detects the condition of the facial expression, analyzes it and interprets the emotion. After determining the emotion of the individual, the music player plays songs which suit the individual's current emotion. The application focuses on the analysis of the facial expression only and does not consider head or face movement.
1.2 PROBLEM STATEMENT
Most people face difficulty in song selection, especially in selecting songs that match their current emotions. Looking at a long list of unsorted music, individuals feel demotivated to look for the songs they want to listen to. Most users simply pick songs at random from their song folder and play them with a music player. Most of the time, the songs played do not match the user's current emotion. For example, when a person is sad, he may want to listen to some heavy rock music to release his sadness, yet it is impractical for him to search his long playlist for all the heavy rock music. The individual would rather choose songs randomly or just "play all" for every song he has.
Besides, people get bored with this traditional way of searching and selecting songs, a method that has remained unchanged for years.
1.3 PROJECT OBJECTIVE
The main objective of this project is to develop the "Emotion Based Music Player" for all kinds of music lovers, aimed at serving as a platform that assists individuals in playing and listening to songs according to their emotions. It is aimed at providing music lovers with better entertainment. The objectives are:
i. To propose a facial expression detection model to detect and analyze the emotion
of an individual.
ii. To accurately detect the four basic emotions, namely normal, happy, sad and
surprise.
iii. To integrate the music player into the proposed model to play the music based on
the emotions detected.
1.4 SCOPE OF STUDY
ii. Get information on the tools appropriate for facial expression detection in order to build the proposed model for this project. Different tools (software and hardware) are studied for their feasibility, functionality and user-friendliness in order to identify the most suitable and applicable tools for development.
1.5 FEASIBILITY STUDY
As stated above, the focus of this project is entirely on the detection of facial expression and its integration with the music player. As a prototype, the proposed model detects only basic emotions such as happy, sad and normal.
The Final Year Project (FYP) course is divided into two parts, FYP1 and FYP2. As given in the syllabus, FYP1 focuses on brainstorming the FYP title, proposal writing, data gathering and research, as well as report writing. On the other hand, the development, implementation and testing of the proposed model are done in FYP2.
As the phases are divided evenly between the two semesters, equivalent to eight months, the project can be completed on time with proper time management.
CHAPTER 2
LITERATURE REVIEW
2.1 INTRODUCTION
In 2009, Barbara Raskauskas published an article stating that music is a widely accepted culture and language that can be embraced by any type of person. She mentioned that "music does fill the silence and can hide the noise.
Music can convey cultural upbringing. Music is pleasurable and speaks to us, whether or
not the song has words. I've never met a person who didn't like some form of music.
Even a deaf friend of mine said she liked music; she could feel the vibration caused by
music. Finding enjoyment in music is universal."
Emily Sohn (2011) stated that "People love music for much the same reason they're drawn to sex, drugs, gambling and delicious food, according to new research". Judging from the actions and activities of the people around us, this statement is widely accepted by the public. Studies have shown that the human brain releases dopamine, a chemical involved in addiction and motivation, when a person listens to a harmony or melody that touches him.
The facial expression of an individual can be detected by comparison with similar expressions. In 2005, Mary Duenwald published an article summarizing several studies which showed that facial expressions across the globe fall roughly into seven categories:
i. Sadness: The eyelids droop while the inner corners of the brows rise. In extreme sadness, the brows push closer together. As for the lips, both corners pull down and the lower lip may push up in a pout.
ii. Surprise: Both the upper eyelids and the brows rise, and the jaw drops open.
iii. Anger: Both the lower and upper eyelids squeeze in as the brows move down and draw together. The jaw pushes forward, and the upper and lower lips press against each other as the lower lip pushes up a little.
iv. Contempt: The expression appears on one side of the face: one half of the upper lip tightens upward.
v. Disgust: The nose wrinkles and the upper lip rises while the lower lip protrudes.
vi. Fear: The eyes widen and the upper lids rise. The brows draw together while the lips extend horizontally.
vii. Happiness: The corners of the lips lift into a smile, the eyelids tighten, the cheeks rise and the outer corners of the brows pull down. [5]
2.2 FACIAL EXPRESSION DETECTION RESEARCH
Detection of face features such as the mouth and eyes has always been one of the key issues in facial image processing, as it is involved in wide and varied areas such as emotion recognition and face identification. Joseph C. Hager (2003) stated that face feature detection is used as one of the inputs to other image processing functions such as face and emotion detection. Different researchers have studied different approaches to facial expression detection, and each approach can be applied effectively in different situations.
In 2004, W.K. Teo, Liyanage C De Silva and Prahlad Vadakkepat proposed a method that combines feature detection and extraction with facial expression recognition into an integrated system, which improves the recognition output in terms of the recognition process. With this system, the recognition process is not influenced by subjective aspects and the bounds of the areas are invariant during the sequence. W.K. Teo, Liyanage C De Silva and Prahlad Vadakkepat (2004): "We propose a method for facial expression recognition that uses integral projection, statistical computation, a neural network and kalman filtering. The face feature detection method uses multi-stage integral projection. Optical flow computation will be used on the detected feature namely, the eyebrows and the lip to extract movement."
(Source: W.K. Teo, Liyanage C De Silva and Prahlad Vadakkepat, 2004)
Figure 2.1: Block diagram of the proposed facial expression recognition system
Integral projection enables the detection and location of small and precise features of the face, such as the eyebrows and lips. When the statistical approach is applied to the optical flow field, the overall movement of the features can be detected without the need to pinpoint their exact location. This approach does not require predefined settings such as the location of the head or the eyes; the predefined settings are preset using normalized coefficients obtained from a facial image database.
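To illustrate the general idea of integral projection mentioned above, the sketch below sums pixel intensities along the rows of a grayscale face region; pronounced dips in the row sums typically correspond to dark horizontal structures such as the eyebrows and the lip line. It is a minimal illustration only (compiled with unsafe code enabled, as the appendix code is); the method names and the assumption that the image is already grayscale and cropped to the face are not taken from the cited system.

using System;
using System.Drawing;
using System.Drawing.Imaging;

public static class IntegralProjection
{
    // Computes the horizontal integral projection of a grayscale bitmap:
    // the sum of pixel intensities in every row. Rows containing dark
    // features (eyebrows, lips) appear as local minima in this profile.
    public static double[] RowProjection(Bitmap gray)
    {
        double[] profile = new double[gray.Height];
        BitmapData data = gray.LockBits(
            new Rectangle(0, 0, gray.Width, gray.Height),
            ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb);
        unsafe
        {
            byte* row = (byte*)(void*)data.Scan0;
            for (int y = 0; y < gray.Height; ++y)
            {
                double sum = 0;
                byte* p = row;
                for (int x = 0; x < gray.Width; ++x)
                {
                    sum += p[0];          // image is grayscale, so B = G = R
                    p += 3;
                }
                profile[y] = sum;
                row += data.Stride;
            }
        }
        gray.UnlockBits(data);
        return profile;
    }

    // Returns the index of the darkest row, e.g. a rough lip-line estimate
    // when the projection is taken over the lower half of the face.
    public static int DarkestRow(double[] profile)
    {
        int best = 0;
        for (int y = 1; y < profile.Length; ++y)
            if (profile[y] < profile[best]) best = y;
        return best;
    }
}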
Apart from the approach introduced by W.K. Teo et al., in 2010 Jagdish Lal Raheja and Umesh Kumar introduced a Back Propagation Neural Network technique for human facial expression detection from a captured image. The approach is based on the use of an AdaBoost classifier and on finding and matching tokens when detecting the facial expression with a neural network. For face detection, the method proposed is the Viola and Jones method. It is a better implementation compared to other techniques as it is feature based, and it is able to perform the analysis relatively faster. Edge detection, thinning and token detection are carried out during the image processing stage. Edge detection aims to identify the points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities. Thinning is applied in order to reduce the width of an edge from multiple lines to a single line. The tokens generated after the thinning process divide the data set into the smallest units of information needed for the following processes. After these three procedures, the recognition is performed. "It is a tedious task to decide the best threshold value to generate the tokens. So as a next process or the future work is to determine the best threshold value, so that without the interaction of user the system can generate the tokens."
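For reference, the sketch below shows a generic Sobel edge-magnitude computation, which is one common way of finding the points where brightness changes sharply; it is a hedged illustration of that general idea and not the specific pipeline of Raheja and Kumar. The threshold parameter is an assumption introduced here for the illustration.

using System;

public static class EdgeDetection
{
    // A minimal Sobel edge-magnitude sketch over a grayscale image stored as
    // gray[y, x] with values 0..255.
    public static byte[,] Sobel(byte[,] gray, byte threshold)
    {
        int h = gray.GetLength(0), w = gray.GetLength(1);
        var edges = new byte[h, w];
        for (int y = 1; y < h - 1; ++y)
        {
            for (int x = 1; x < w - 1; ++x)
            {
                // Horizontal and vertical Sobel kernels.
                int gx = -gray[y - 1, x - 1] - 2 * gray[y, x - 1] - gray[y + 1, x - 1]
                         + gray[y - 1, x + 1] + 2 * gray[y, x + 1] + gray[y + 1, x + 1];
                int gy = -gray[y - 1, x - 1] - 2 * gray[y - 1, x] - gray[y - 1, x + 1]
                         + gray[y + 1, x - 1] + 2 * gray[y + 1, x] + gray[y + 1, x + 1];
                int magnitude = Math.Min(255, Math.Abs(gx) + Math.Abs(gy));
                // Keep only strong edges; choosing this threshold well is
                // exactly the difficulty the cited authors mention.
                edges[y, x] = magnitude >= threshold ? (byte)255 : (byte)0;
            }
        }
        return edges;
    }
}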
In addition, Zhengyou Zhang (1998) reported an investigation into feature-based facial expression recognition within an architecture based on a two-layer perceptron. Two types of features are derived from the face images during the investigation: the geometric positions of a set of fiducial points on the face, and a set of multi-scale and multi-orientation Gabor wavelet coefficients extracted at these points. Zhengyou Zhang (1998): "The recognition performance with different types of features has been compared, which shows that Gabor wavelet coefficients are much more powerful than geometric positions." Secondly, the sensitivity of individual fiducial points to facial expression detection was examined. Through the sensitivity analysis, the author found that the points on the cheeks and forehead carry little useful information. Lastly, the author studied the importance of image scale in the facial detection process. The experiments show that facial expression recognition is mainly a low-frequency process, and a spatial resolution of 64 x 64 pixels is probably enough to carry out the process.
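For reference, the Gabor wavelet coefficients mentioned above are typically obtained by convolving the image around each fiducial point with a family of Gabor kernels; a commonly used form of the kernel (the exact parameter choices in the cited work may differ) is

\psi_{\mathbf{k}}(\mathbf{z}) \;=\; \frac{\lVert\mathbf{k}\rVert^{2}}{\sigma^{2}}\,
\exp\!\left(-\frac{\lVert\mathbf{k}\rVert^{2}\lVert\mathbf{z}\rVert^{2}}{2\sigma^{2}}\right)
\left[\exp\!\bigl(i\,\mathbf{k}\cdot\mathbf{z}\bigr)-\exp\!\left(-\frac{\sigma^{2}}{2}\right)\right]

where z is the image position relative to the fiducial point, the wave vector k sets the scale and orientation of the kernel, and sigma controls the width of the Gaussian envelope.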
Apart from that, in 2007, Eva Cerezo, Isabelle Hupont, Cristina Manresa, Javier Varona, Sandra Baldassarri, Francisco J. Perales and Francisco J. Seron presented their work on an automated real-time system for facial expression recognition that works by tracking the facial features and applying a simple emotional classification method. The automatic feature extraction function enables the introduction of dynamic information into the classification system, making it possible to study the time evolution of the evaluated parameters as well as to categorize the user's emotions from live video. The developed system was embedded in the Maxine system, an engine developed by the group for managing 3D virtual scenarios and characters to enrich user interaction in different application domains, in order to test its usefulness and real-time operation.
2.3 RELATIONSHIP BETWEEN MUSIC AND EMOTION RESEARCH
Many researchers have studied whether music can actually influence the emotions of individuals. Over the years, the results of these studies have shown that different music styles can influence individuals in different ways. For example, in 1994, Antoinette L. Bouhuys, Gerda M. Bloem and Ton G.G. Groothuis carried out a study on individuals' facial expressions after listening to depressing music. The results showed that depressing music brings about a major increase in depressed mood and a significant decline in delighted mood. The study proved that music can actually influence an individual's emotions.
2.4 EMOTION BASED MUSIC RETRIEVAL RESEARCH
Besides the studies mentioned above, which show that emotions can be influenced by music, Wai Ling Cheung and Guojun Lu (2008) presented automatic music emotion annotation as an important requirement for research into music retrieval by emotion. Music emotion annotation is the task of attaching emotional terms to musical items. Their research proposed a solution which automates a traditionally manual annotation task using a number of techniques from various disciplines, and is highly original. The authors pointed out that through this research, automatic music emotion annotation is shown to be possible and workable using hybrid sampling, a data-driven detection threshold and synonymous relationships between emotional terms. "Our empirical result shows that training data size requirement is within reach for a workable annotation system. As music emotion description becomes readily available through automatic annotation, the development of a music research repository will be more attainable."
CHAPTER 3
METHODOLOGY
3.1 RESEARCH METHODOLOGY
(Waterfall model: Planning → Analysis → Design → Implementation → Prototype)
The methodology used to complete this project is a structured design. The waterfall model is a sequential design process in which progress flows steadily downwards through the planning, analysis, design, implementation and maintenance phases. The project proceeds from one stage to the next, from the first to the last.
First of all, during the planning stage, brainstorming of ideas is carried out in order to find a suitable field to focus on for the FYP project. A few pieces of research are carried out to determine the need for innovation in the specific field. After the field is specified, some critical thinking is done to identify the problems arising in the field. Discussions are held with lecturers to determine the most suitable and executable project title.
Once the analysis part is done, the project moves to the design phase, in which the analysis models as well as the interface design for the system are determined. The development process starts with the design of the framework and interface of the system. The major focus in the design phase is writing the program that detects the facial expression. After the facial expression detection system is developed, it is integrated with the music player.
Finally, when the design of the project is done, the proposed model is tested to find out whether there are any bugs and to test its functionality and accuracy in detecting various emotions from facial expressions.
3.2 PROJECT ACTIVITIES
The project mainly consists of four stages. It starts with planning and a critical review of similar products and available technologies. In this stage, several image processing and facial expression detection technologies are studied and analyzed.
In Phase 2, the activity focuses on the research analysis. This includes the data gathering process, the analysis of the data as well as the development of the analysis models for the system.
In Phase 3, the project focuses on the development of the system. The different functions of the system are developed and tested accordingly. This is to ensure the system is able to detect and analyze the basic facial expressions accurately.
The project ends with the last phase, Phase 4, which is the review and evaluation of the system developed. The proposed model is tested on different types of images to measure its accuracy in detecting facial expressions.
3.3 KEY MILESTONES
Some of the key milestones for this project are as shown in the table below:
3.4 GANTT CHART
3.4.1 FYP1
The FYP1 Gantt chart (Weeks 1-12) covers the following activities:
Project title selection/proposal
Proposal submission to research cluster
Research on the current technology
Extended proposal submission
Data collection: images of different expressions from different individuals
Data collection: collection of different genres of music/songs
VIVA: proposal defence and progress evaluation
Preparation of the Interim Report
3.4.2 FYP2
The FYP2 Gantt chart (Weeks 1-14) covers the following activities:
System Development
Dissertation
VIVA Presentation
Final Dissertation
3.5 DEVELOPMENT TOOLS
Software
Hardware
o 4 GB RAM
Programming Language
i. C#
Operating System
CHAPTER 4
RESULTS AND DISCUSSION
In this chapter, the results are presented and discussed briefly. These results are not yet final and complete; essentially, they are the results of the study obtained in order to see whether the system is feasible.
4.1 DATA GATHERING AND ANALYSIS
Below are the results of the questionnaire:
Question                                                                      Yes    No
1. Do you love music?                                                         95%    5%
2. Do you listen to music/songs when your emotion is distracted?              82%    18%
4. Do you find it interesting if there is a system which is able to detect
   your current mood and play the songs according to your mood/emotion?       95%    5%
4.2 SYSTEM DESIGN
The design phase is considered one of the most challenging parts of the software development life cycle as it involves the technical aspects of the project. The objective of this phase is to ensure the proper functioning of the system and proper interaction between the system and the user.
4.2.1 Proposed Model
From the developer side, as for this FYP, the proposed model focuses on two main functions: first, the expression detection, and second, the list of songs played for each category of emotion. For the expression detection, the system is designed mainly to detect the four major expressions, namely happy, sad, normal and surprise. On the other hand, songs are readily available in each category. After the emotion of the individual is detected, the system plays the songs in that category through the music player.
Besides, sets of still images showing the four different expressions are available in the facial expression detection database. They are used for comparison purposes. After the image of the user is loaded, the features (lips and eyes) of the user are extracted by the system. The system then analyzes the condition of the features and compares them with the sets of emotions in the database. The system identifies the processed image as, for example, happy when the condition of the features is nearest to the "happy" emotion in the database.
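As a rough illustration of the comparison step described above, the sketch below assumes that the extracted eye and lip conditions are summarized as a small numeric feature vector and that the stored dataset holds one template vector per emotion; the emotion whose template is nearest (smallest Euclidean distance) to the loaded image is reported. The feature layout and class names are hypothetical and only indicate the general idea, not the exact implementation of the proposed model.

using System;
using System.Collections.Generic;

public enum Emotion { Normal, Sad, Surprise, Happy }

public class EmotionClassifier
{
    // One stored template vector per emotion, e.g. { eyeHeight, eyeWidth,
    // lipWidth, lipCornerHeight } measured from the saved dataset images.
    private readonly Dictionary<Emotion, double[]> templates =
        new Dictionary<Emotion, double[]>();

    public void SaveEmotionData(Emotion emotion, double[] features)
    {
        templates[emotion] = (double[])features.Clone();
    }

    // Returns the emotion whose stored template is closest to the
    // features extracted from the newly loaded image.
    public Emotion Classify(double[] features)
    {
        Emotion best = Emotion.Normal;
        double bestDistance = double.MaxValue;
        foreach (var entry in templates)
        {
            double d = 0;
            for (int i = 0; i < features.Length; ++i)
            {
                double diff = features[i] - entry.Value[i];
                d += diff * diff;
            }
            if (d < bestDistance)
            {
                bestDistance = d;
                best = entry.Key;
            }
        }
        return best;
    }
}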
As for the user side, the user is able to customize the songs in each category according to their taste. Some might prefer sentimental music when they are sad, while others might prefer country music. There is no limit to the number of songs that can be stored in each category. The user has to launch the system in order to start the proposed model. Once the system has started, the user can choose either to select songs or to directly process the current emotion. A list of songs is played automatically after the system has finished the interpretation. The user can choose to change the current emotion after the list of songs has been played by repeating the image loading or capturing procedure.
4.2.2 Flow Chart
Load/Capture the image → Detect the eyes and lips of the image → Pass through the system for comparison with the images in the database → Emotion detected? If no, return to image loading/capturing; if yes, play the songs for the detected emotion.
4.2.3 As-Is-System and To-Be System
(As-Is system flow chart: song selection on the computer)
To-Be System (User Process):
i. Launch the Emotion Based Music Player.
ii. If the user wishes to customize the songs in the folders, open the music selection section, customize the songs in the different categories of song folders, and save the changes made.
iii. Load or capture an image.
iv. Wait a couple of minutes for the system to detect the face features.
v. Listen to the songs played by the music player.
vi. If the user wishes to change the emotion, repeat from the image loading/capturing step.
4.2.4 Interface Design (User Interface) and System Function
The user interface refers to the interface users see when running this proposed model. It basically either captures an image or loads a still image and then processes the image in order to detect and identify the user's emotion.
Once the proposed model is run, the interface below appears to the user. The functions of each button in the first interface are as follows:
i. "Browse" button: Enables the user to select an available image from the local disk.
ii. Webcam "Start" button: Automatically connects the user to the image capturing device to capture the user's current emotion.
iii. "Start" button: Starts the image analysis process once the image is loaded.
iv. "Restart" button: Restarts the proposed model from the image selection step.
v. "Setting" button: Saves the set of images into the database for comparison purposes.
vi. "Music Library" button: Enables the user to customize the song list for each emotion.
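As a rough sketch of how a button such as "Browse" could be wired up in a C# Windows Forms application like this one, the helper below asks the user for a .jpeg still image from the local disk and returns it for processing. The class and method names are illustrative assumptions, not the actual source of the proposed model.

using System.Drawing;
using System.Windows.Forms;

public static class ImageBrowser
{
    // Illustrative "Browse" behaviour: ask the user for a .jpeg still image
    // on the local disk and return it as a Bitmap (or null if cancelled).
    public static Bitmap BrowseForImage(IWin32Window owner)
    {
        using (var dialog = new OpenFileDialog())
        {
            dialog.Filter = "JPEG images (*.jpg;*.jpeg)|*.jpg;*.jpeg";
            dialog.Title = "Select an image for emotion detection";
            if (dialog.ShowDialog(owner) == DialogResult.OK)
                return new Bitmap(dialog.FileName);
            return null;
        }
    }
}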
Figure 4.5: The first interface once the proposed model is launched
The proposed model detects the user's emotion using a comparison method. Thus a set of emotion images (normal, sad, surprise and happy) is saved in the database before the application is used.
First of all, in order to save the set of images of the different emotions in the proposed model, the "Browse" button is pressed. The user has to choose an image which can represent the emotion. (The image selected in the figure below represents the "Normal" emotion.)
Next, the "Start" button is pressed to indicate that the selected image is confirmed and the application can proceed with image processing.
Then, the "Setting" button is pressed in order to set the loaded image as the dataset for comparison. After the proposed model is done with the processing, the user can save the emotion type by clicking the "Save My Emotion Data" button. A message box pops up once the emotion is successfully saved in the database.
After all four emotions (normal, sad, surprise and happy) are successfully saved in the database, the application is ready to detect the user's emotions. The proposed model recognizes the user's emotion according to the following steps.
In order to give a clearer picture of how the proposed model works, the figure below shows the condition of the interface at each step, from the loading of the image until the song is executed. (The emotion used in the sample is the "Normal" emotion.)
Figure 4.10: The step-by-step function of the proposed model
4.3 SYSTEM TESTING RESULTS
System testing was carried out upon completion of the system development. The purpose of this testing is to check the functionality of the system, i.e. whether it is usable and well functioning. The results of the functional testing can be seen in the table below.

Component          Expected Function                                      Testing Result
                                                                          Positive   Negative
"Browse" button    Direct the user to the local disk. Enable the user
                   to select an image from the local disk.
4.4 EMOTION ACCURACY TESTING RESULTS
A set of images for each emotion (normal, sad, surprise and happy) is saved in the proposed model for comparison purposes. Newly loaded images are compared with the saved dataset in order to detect the emotion of the user. The table below shows the set of images saved in the proposed model.
(Table: one reference image saved in the proposed model for each of the four emotions — Normal, Sad, Surprise and Happy.)
The proposed model is tested with sets of images of similar emotions to test its accuracy in detecting the emotion. Ten images are tested for each category of emotion and the results are shown in the tables below.
Table 4.5: The testing result for “Sad” Expression
Table 4.6: The testing result for “Surprise” Expression
Table 4.7: The testing result for “Happy” Expression
A summary of the results is shown in the table below:

Emotion     No. of Samples    No. of Recognized Samples    RR
Happy       10                10                           100%
Normal      10                9                            90%
Sad         10                7                            70%
Surprise    10                8                            80%
Total       40                34                           85%
RR = 34/40 * 100
RR = 85%
Based on the results above, the proposed model has a recognition rate (RR) of 85%.
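For clarity, the recognition rate used above (and in the per-emotion rows of the table) follows the usual definition:

\mathrm{RR} \;=\; \frac{\text{number of correctly recognized images}}{\text{total number of test images}} \times 100\%

so for this test set, RR = (34 / 40) x 100% = 85%.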
4.5 DISCUSSION
Based on observations during the prototype evaluation period, there are several limitations which prevent the proposed model from performing accurately.
1. The posture of the subject in the image.
The proposed model performs accurately if the subject of the image is in an upright posture and the individual's face is clearly exposed; a minor slant of the individual's head is acceptable and detectable. The images below are examples of images which fail to be detected by the proposed model.
2. The quality of the image, whether it is a still image loaded from disk or captured by webcam.
The proposed model detects the emotion more accurately when the images loaded or captured have better resolution, brightness and contrast. A warning message is prompted when the loaded image is too fine or has too much noise. The best picture is one showing the upper part of the body against a white background and in high resolution.
The same limitation applies to on-the-spot image capturing through either the built-in or an external webcam. Figure 4.16 shows that once the "Start" button is clicked, an error message is prompted indicating that the image is too fine to be processed.
Figure 4.15: Image captured through webcam
CHAPTER 5
CONCLUSION AND RECOMMENDATION
5.1 CONCLUSION
The significance of this project is the emotion detection of the images loaded into the proposed model; its main purpose lies in the emotion detection functionality. Through the integration of emotion detection technology and a music player, the proposed model aims to provide better entertainment for the individual. The proposed model is able to detect the four emotions, i.e. normal, happy, sad and surprise, from the images loaded into it. Once the proposed model has compared and detected the emotion of the user, the music player plays the song(s) accordingly.
As for usability and accuracy, both system testing and emotion accuracy testing have been carried out on the proposed model and returned satisfying results. The proposed model was able to recognize 34 out of 40 images loaded into it, which gives a Recognition Rate of 85%. Besides, the proposed model is a computer application which works well on all kinds of Windows computers.
Thus, with this Emotion Based Music Player, users have an alternative way of selecting songs which is more interactive and simpler. Music lovers no longer have to search through a long list of songs for the songs to be played, but can instead match their emotion in the song selection.
5.2 RECOMMENDATION
The future model can be further enhanced by removing or minimizing the noise in the loaded or captured images. In a future expansion, noise reduction functions can be embedded in the model so that the noise in either still or captured images can be removed, thus increasing the accuracy of emotion detection.
Apart from the above, the proposed model can be improved by having automatic adjustment of the resolution, brightness and contrast of the images. The accuracy of emotion detection in the current application is greatly influenced by the quality of the images loaded. Hence, with automatic adjustment, the user could load images of any quality or capture images with any kind of webcam, and the future model would adjust the quality of the images so that they can be detected and processed.
In addition, for better interaction between the user and the application, a real-time emotion detection technique can be applied to the model. The future model would detect and extract the facial features once the application is launched, and the emotion could be detected in real time.
REFERENCES
[2] Ying-Li Tian, Takeo Kanade, and Jeffrey F. Cohn. (2003): Facial Expression
Analysis. Retrieved from
www.ri.cmu.edu/pub_files/pub4/tian.../tian_ying_li_2003_1.pdf.
[4] Emily Sohn. (2011): Why Music Makes You Happy. Retrieved on October 9, 2012
from http://news.discovery.com/human/music-dopamine-happiness-brain-
110110.html.
[6] W.K. Teo, Liyanage C De Silva and Prahlad Vadakkepat. (2004): Facial
Expression Detection And Recognition System: Journal of The Institution of
Engineers, Singapore. Vol 44.
[7] Jagdish Lal Raheja, Umesh Kumar. (2010): Human Facial Expression Detection
From Detected In Captured Image Using Back Propagation Neural Network:
International Journal of Computer Science & Information Technology (IJCSIT).
Vol.2(1).
[8] Zhengyou Zhang. (1998): Feature-Based Facial Expression Recognition:
Sensitivity Analysis and Experiments With a Multi-Layer Perceptron: International
Journal of Pattern Recognition and Artificial Intelligence. Vol. 13(6): 893-911.
[9] Eva Cerezo, Isabelle Hupont, Cristina Manresa, Javier Varona, Sandra
Baldassarri, Francisco J. Perales, and Francisco J. Seron. (2007): Real-Time
Facial Expression Recognition for Natural Interaction: In J. Martí et al. (Eds.),:
IbPRIA 2007, Part II, LNCS 4478, (pp. 40–47). Springer-Verlag Berlin
Heidelberg.
[12] Frijda, N.H. (1986): The emotions. New York: Cambridge University Press.
[13] Matthew Montague Lavy (2001). Emotion and the Experience of Listening to
Music A Framework for Empirical Research. (Unpublished master's thesis).
Jesus College, Cambridge.
[14] Wai Ling Cheung and Guojun Lu. (2008): Music Emotion Annotation by
Machine Learning. doi: 10.1109/MMSP.2008.4665144.
[15] Maria M. Ruxanda, Bee Yong Chua, Alexandros Nanopoulos, Christian S.
Jensen. (2007): Emotion-Based Music Retrieval On A Well-Reduced Audio
Feature Space: In Proceedings of IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP): 181-184.
[16] Laughing On The Other Side Of One's Face [Image]. Retrieved on October 9
2012 from http://blog.inkyfool.com.
[22] Joseph C. Hager (2003). Introduction To The DataFace Site: Facial Expressions,
Emotion Expressions, Nonverbal Communication, Physiognomy. Retrieved 10
October 2012, from http://face-and-emotion.com/dataface/general/homepage.jsp
APPENDIX
i. Questionnaire
1. Do you love music?
Yes
No
2. Do you listen to music/songs when your emotion distracted? (When you are
sad/angry/happy)
Yes
No
3. Do you find it troublesome to search for songs that you want to listen from your
music bank?
Yes
No
4. Do you find it interesting if there is a system which is able to detect your current
mood and play the songs according to your mood/emotion?
Yes
No
Powered by SurveyMonkey
ii. Part of the code in Filter.cs
// Tail of the Invert filter: every byte of the 24-bpp bitmap is replaced by
// its complement (255 - value). Scan0, nWidth (bytes per row) and nOffset
// (stride padding) are set up earlier in the function from the locked BitmapData.
unsafe
{
    byte* p = (byte*)(void*)Scan0;
    for (int y = 0; y < b.Height; ++y)
    {
        for (int x = 0; x < nWidth; ++x)
        {
            p[0] = (byte)(255 - p[0]);
            ++p;
        }
        p += nOffset;   // skip the padding at the end of each row
    }
}
b.UnlockBits(bmData);
return true;
}
public static bool GrayScale(Bitmap b)
{
    // GDI+ still lies to us - the return format is BGR, NOT RGB.
    BitmapData bmData = b.LockBits(new Rectangle(0, 0, b.Width, b.Height),
        ImageLockMode.ReadWrite, PixelFormat.Format24bppRgb);
    int stride = bmData.Stride;
    System.IntPtr Scan0 = bmData.Scan0;
    int nOffset = stride - b.Width * 3;   // padding bytes at the end of each row
    byte red, green, blue;
    unsafe
    {
        byte* p = (byte*)(void*)Scan0;
        for (int y = 0; y < b.Height; ++y)
        {
            for (int x = 0; x < b.Width; ++x)
            {
                blue = p[0];
                green = p[1];
                red = p[2];
                // Luminance-weighted grayscale written back to all three
                // channels (this write was missing from the excerpt).
                p[0] = p[1] = p[2] = (byte)(0.299 * red + 0.587 * green + 0.114 * blue);
                p += 3;
            }
            p += nOffset;
        }
    }
    b.UnlockBits(bmData);
    return true;
}
// Body of the Brightness filter: nBrightness (-255..255) is added to every
// byte of the image, and the result is clamped to the valid 0..255 range.
int nVal = 0;
unsafe
{
    byte* p = (byte*)(void*)Scan0;
    for (int y = 0; y < b.Height; ++y)
    {
        for (int x = 0; x < nWidth; ++x)
        {
            nVal = (int)(p[0] + nBrightness);
            if (nVal < 0) nVal = 0;
            if (nVal > 255) nVal = 255;
            p[0] = (byte)nVal;
            ++p;
        }
        p += nOffset;
    }
}
b.UnlockBits(bmData);
return true;
}
// Body of the Contrast filter: each channel is normalised to 0..1, pushed away
// from the mid-point 0.5 by the squared contrast factor, rescaled to 0..255 and clamped.
contrast *= contrast;
unsafe
{
    byte* p = (byte*)(void*)Scan0;
    for (int y = 0; y < b.Height; ++y)
    {
        for (int x = 0; x < b.Width; ++x)
        {
            blue = p[0];
            green = p[1];
            red = p[2];

            pixel = red / 255.0;
            pixel -= 0.5;
            pixel *= contrast;
            pixel += 0.5;
            pixel *= 255;
            if (pixel < 0) pixel = 0;
            if (pixel > 255) pixel = 255;
            p[2] = (byte)pixel;

            pixel = green / 255.0;
            pixel -= 0.5;
            pixel *= contrast;
            pixel += 0.5;
            pixel *= 255;
            if (pixel < 0) pixel = 0;
            if (pixel > 255) pixel = 255;
            p[1] = (byte)pixel;

            pixel = blue / 255.0;
            pixel -= 0.5;
            pixel *= contrast;
            pixel += 0.5;
            pixel *= 255;
            if (pixel < 0) pixel = 0;
            if (pixel > 255) pixel = 255;
            p[0] = (byte)pixel;

            p += 3;
        }
        p += nOffset;
    }
}
b.UnlockBits(bmData);
return true;
}
// Gamma filter: valid gamma values for each channel are 0.2 to 5. The per-channel
// lookup tables redGamma / greenGamma / blueGamma are built earlier in the
// function from the corresponding gamma values.
public static bool Gamma(Bitmap b, double red, double green, double blue)
{
    if (red < .2 || red > 5) return false;
    if (green < .2 || green > 5) return false;
    if (blue < .2 || blue > 5) return false;
    unsafe
    {
        byte* p = (byte*)(void*)Scan0;
        for (int y = 0; y < b.Height; ++y)
        {
            for (int x = 0; x < b.Width; ++x)
            {
                p[2] = redGamma[p[2]];
                p[1] = greenGamma[p[1]];
                p[0] = blueGamma[p[0]];
                p += 3;
            }
            p += nOffset;
        }
    }
    b.UnlockBits(bmData);
    return true;
}
// Color filter: adds a signed offset (-255..255) to each of the red, green and
// blue channels, clamping the result to 0..255. nPixel and the pointer setup
// come from earlier in the function.
public static bool Color(Bitmap b, int red, int green, int blue)
{
    if (red < -255 || red > 255) return false;
    if (green < -255 || green > 255) return false;
    if (blue < -255 || blue > 255) return false;
    unsafe
    {
        byte* p = (byte*)(void*)Scan0;
        for (int y = 0; y < b.Height; ++y)
        {
            for (int x = 0; x < b.Width; ++x)
            {
                nPixel = p[2] + red;
                nPixel = Math.Max(nPixel, 0);
                p[2] = (byte)Math.Min(255, nPixel);

                nPixel = p[1] + green;   // green channel
                nPixel = Math.Max(nPixel, 0);
                p[1] = (byte)Math.Min(255, nPixel);

                nPixel = p[0] + blue;
                nPixel = Math.Max(nPixel, 0);
                p[0] = (byte)Math.Min(255, nPixel);

                p += 3;
            }
            p += nOffset;
        }
    }
    b.UnlockBits(bmData);
    return true;
}
// The remaining excerpts belong to the 3x3 convolution filter (Conv3x3) and the
// Smooth filter that drives it. Conv3x3 walks the source image with a 3x3 weight
// matrix m and writes the weighted sum of each neighbourhood, divided by m.Factor
// and shifted by m.Offset, into the destination image.
unsafe
{
    byte* p = (byte*)(void*)Scan0;
    byte* pSrc = (byte*)(void*)SrcScan0;
    int nPixel;
    // ... weighted sum over the 3x3 neighbourhood, e.g. for the bottom row:
    // (pSrc[2 + stride2] * m.BottomLeft) + (pSrc[5 + stride2] * m.BottomMid) +
    // (pSrc[8 + stride2] * m.BottomRight)) / m.Factor) + m.Offset);
}

// Smooth filter: fills the convolution matrix with 1s, puts the weight on the
// centre pixel, puts 2 on the edge-adjacent cells and normalises by the total.
m.SetAll(1);
m.Pixel = nWeight;
m.TopMid = m.MidLeft = m.MidRight = m.BottomMid = 2;
m.Factor = nWeight + 12;
iii. Screenshot of database for Facial Expression Table
Emotion Based Music Player
Tan Siew Ching, Assoc. Prof. Dr. Baharum B Baharudin
Department of Computer and Information Sciences, Universiti Teknologi PETRONAS
Bandar Seri Iskandar, Tronoh, Perak, Malaysia
[email protected], [email protected]
Abstract — This work describes the development of the Emotion Based Music Player, a computer application meant for all types of users, specifically music lovers. Because of the effort involved in song selection, most people choose to play the songs in their playlist at random. As a result, some of the selected songs do not match the user's current emotion. The proposed model is able to extract the user's facial expression and thus detect the user's emotion. The music player in the proposed model then plays songs according to the category of emotion detected. It is aimed at providing music lovers with greater enjoyment in music listening. The scope of emotions in the proposed model covers normal, sad, surprise and happy. The system relies mainly on image processing and facial detection technologies. The input for this proposed model is still images in .jpeg format which are available online. The performance of the model is evaluated by loading forty still images (ten for each emotion category) into the proposed model to test its accuracy in detecting the emotions. Based on the testing result, the proposed model achieves a Recognition Rate of 85%.

The "Emotion Based Music Player" is developed with the aim of detecting the emotion of an individual and playing a list of music accordingly. First, the individual reflects his emotion through his facial expression. The application then detects the condition of the facial expression, analyzes it and determines the emotion of the individual. The music player plays songs which suit the individual's current emotion. The application focuses on the analysis of the facial expression only and does not consider head or face movement.

What music lovers face now is the difficulty of song selection, especially selecting songs that match their current emotions. Looking at a long list of unsorted music, individuals feel demotivated to look for the songs they want to listen to. For example, when a person is sad, he may want to listen to some heavy rock music to release his sadness. The individual would rather choose songs randomly or just "play all" for every song he has. Besides, people get bored with this traditional way of searching and selecting songs, a method that has remained unchanged for years.
(Reference images saved for the four emotions: Normal, Sad, Surprise, Happy.)

RR = 34/40 * 100
RR = 85%

Based on the result above, the proposed model has a recognition rate (RR) of 85%.
E. DISCUSSION
Based on observations during the prototype evaluation, there are several limitations which prevent the proposed model from performing perfectly.
1. Posture of the subject in the image.
The proposed model performs accurately if the subject of the image is in an upright posture and the individual's face is clearly exposed. A minor slant of the individual's head is acceptable and detectable.
Figure 4: Image which is accurately detected (a) [17]

V. CONCLUSION AND FUTURE WORK
A. CONCLUSION
The significance of this project is the emotion detection of the images loaded into the proposed model; its main purpose lies in the emotion detection functionality. Through the integration of emotion detection technology and a music player, the proposed model aims to provide better entertainment for the individual. The proposed model is able to detect the four emotions, i.e. normal, happy, sad and surprise. Once the proposed model has detected the emotion, the music player plays the song(s) accordingly. As for usability and accuracy, both system testing and emotion accuracy testing have been carried out on the proposed model and returned satisfying results. The proposed model was able to recognize 34 out of 40 images loaded into it, which gives a Recognition Rate of 85%. Besides, the proposed model is a computer application which works well on all kinds of Windows computers.