LPC and LPCC Method of Feature Extraction in Speech Recognition System
Abstract— Automatic speech recognition (ASR) has been under the scrutiny of researchers for many years. A speech recognition system is able to listen to what we speak, interpret it, and perform actions according to the spoken information. Despite many detailed studies on the optimization of ASR and on various feature extraction techniques, the accuracy of such systems is still a big challenge. The selection of a feature extraction technique depends entirely on the area of study. In this paper, the theory of the LPC and LPCC feature extraction techniques is examined in detail, with the goal of presenting a comparative analysis of the two.

General Terms — Performance; Speech; Technology; Accuracy

Keywords — Speech Recognition; Automatic Speech Recognition; Feature Extraction; LPC; LPCC

I. INTRODUCTION
The concept of human interaction with technology has led to sustained interest and research in speech recognition. With the rapid development of computer software and hardware, speech recognition technology is now used in a broad range of applications, such as mobile phone applications, video games, medical uses like deaf telephony, weather forecasting, agriculture, communication technology, robotics, national defense, education and automatic translation. The aim of researchers is to devise a computational technique that lets the system differentiate among the words spoken by different speakers, each with a unique accent, under different environmental conditions.

Usually a speech recognition system delivers two types of important information: voice recognition and speech content [10][34][15]. Voice recognition refers to identifying the speaker rather than what is being said, and can be used to verify the identity of a speaker for security purposes. Speech recognizers, in contrast, focus on extracting the useful information that is independent of speaker identity.

This paper is structured as follows. Section 2 describes the speech recognition system. Section 3 introduces the concept of feature extraction. Section 4 presents the detailed theory of LPC and LPCC. Section 5 gives the comparative analysis of LPC and LPCC. Section 6 draws conclusions from the comparison between LPC and LPCC and outlines the future scope.

II. SPEECH RECOGNITION SYSTEM
Automatic speech recognition analyzes and processes speech signals and converts them into a sequence of words by means of a method implemented as a computer program. In simple words, it is the process of translating spoken words into text. The components of a speech recognition system are Feature Extraction, Decoder, Pronunciation Models, Acoustic Models, and Speech Input.

a. Voice Input: Audio as input to the system is recorded by [...]
b. Decoder: The decoding process searches for the most likely sequence of words, and the best match is obtained using the knowledge base. The decoder makes the final decision on the recognized utterance by combining and optimizing the information conveyed by the language and acoustic models.
c. Acoustic Models: After many years of study, the accuracy of speech recognition systems still remains the most significant topic. Accuracy is determined by many known factors, and the acoustic model plays a very significant role in improving it. The two types of acoustic model are phoneme models and word models, which are implemented with methods such as HMM, support vector machines (SVM), dynamic Bayesian networks (DBN) and ANNs.
d. Language Models: Speech recognition is an application of natural language processing that attempts to capture the concepts of language modeling. These models help to determine the correct word sequence by predicting the probability of the nth word from the (n-1) preceding words.
e. Pronunciation Models: In a speech recognition system, the term pronunciation refers to how the speech engine expects words to sound, that is, their phonetic representation. Multiple pronunciations can be associated with a single word, which makes such words difficult for the system to recognize well.

The basic system of speech recognition is summarized in Figure 1.

1991-2000: The technology that evolved was statistical knowledge of language and acoustic models. Vocal Tract Length Normalization and the HMM Toolkit were developed.
2001-2010: Concatenative Synthesis, Machine Learning, Mixed-initiative Dialog.
2010-present: After years of evolution, speech recognition technology has established a landmark in the marketplace by helping users in a number of ways.
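
The language-model idea in item d, predicting the nth word from the (n-1) preceding words, can be illustrated with its simplest case, a bigram model (n = 2). The following is only a minimal sketch: the function names `train_bigram` and `next_word_probability`, the `<s>` start marker, and the toy corpus are assumptions made for illustration and are not taken from the paper. A practical model would also need smoothing for unseen word pairs.

```python
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # "<s>" is an assumed sentence-start marker.
        words = ["<s>"] + sentence.lower().split()
        for prev, word in zip(words, words[1:]):
            counts[prev][word] += 1
    return counts


def next_word_probability(counts, prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 if prev was never seen."""
    following = counts.get(prev, {})
    total = sum(following.values())
    return following.get(word, 0) / total if total else 0.0


# Toy corpus, invented for illustration only.
corpus = ["recognize speech", "recognize speech now", "recognize speakers"]
model = train_bigram(corpus)
print(next_word_probability(model, "recognize", "speech"))
```

In a recognizer, the decoder would combine such probabilities with acoustic-model scores to rank candidate word sequences.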
c. Windowing: In this step, each resulting frame is multiplied by a Hamming window in order to minimize edge effects at the frame boundaries.
d. Computing the LPC: In the final step, the autocorrelation method is applied to each windowed frame of speech samples. The highest autocorrelation lag used equals the order of the LPC analysis.

b. LPCC features are calculated by introducing the cepstrum coefficients (CCs) in the LPC parameters. Once the predictor coefficient vector has been found, the cepstrum coefficients are easy to calculate: the LPC parameters are converted into CCs by means of a recursion.
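
The windowing, autocorrelation and LPC-to-cepstrum steps can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the paper names only the autocorrelation method, so solving the resulting equations with the Levinson-Durbin recursion is an assumption here, as are the function names, the sign convention x[n] ≈ Σ a[k]·x[n-k], the LPC order of 4, and the synthetic test frame. The zeroth cepstral coefficient (the gain term) is omitted.

```python
import math


def hamming(n_samples):
    """Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    if n_samples == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag]; max_lag equals the LPC order."""
    return [sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return predictor coefficients a[0..order] (a[0] unused) and residual energy."""
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= 1.0 - k * k
    return a, error


def lpc_to_cepstrum(a, n_ceps):
    """Recursion c[n] = a[n] + sum_{k<n} (k/n) * c[k] * a[n-k]; returns c[1..n_ceps]."""
    order = len(a) - 1
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n] if n <= order else 0.0
        for k in range(max(1, n - order), n):
            acc += (k / n) * c[k] * a[n - k]
        c[n] = acc
    return c[1:]


# Synthetic frame (invented for illustration, not real speech): two sinusoids
# plus a small deterministic broadband component.
frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) + 0.05 * ((n * 37) % 11 - 5)
         for n in range(240)]
windowed = [s * w for s, w in zip(frame, hamming(len(frame)))]
order = 4  # assumed LPC order
r = autocorrelation(windowed, order)
lpc, residual = levinson_durbin(r, order)
lpcc = lpc_to_cepstrum(lpc, n_ceps=6)
print("LPC :", [round(v, 4) for v in lpc[1:]])
print("LPCC:", [round(v, 4) for v in lpcc])
```

The recursion in `lpc_to_cepstrum` needs only the predictor coefficients, which is why LPCC extraction adds just one step on top of LPC analysis.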
Technique: LPC
Advantages: Accurate calculation of speech parameters can be obtained with the help of LPC.
Disadvantages: Not good for representing speech, as it assumes the signal is stationary within the given frame.

Technique: LPCC
Advantages: Well known for good performance and relative [...]. LPCC is much more robust and reliable than LPC.
Disadvantages: LPCC is highly sensitive to quantization noise. If an insufficient order is used, the performance of LPCC is degraded [16].

[...] like LPC and LPCC. The most prominent finding of this study is that LPC is good at estimating speech parameters precisely, while LPCC offers good reliability and robustness. The process of extracting these features is the same, except that LPCC includes an extra step in which the LPC coefficients are converted into cepstral coefficients. The performance of both techniques degrades in noisy environments. Today, hybrid approaches are a direction for future implementations aimed at improving the efficiency of ASR. In future, speech recognition may evolve into speech understanding.
