Isolated Word Recognition Using LPC & Vector Quantization
M. K. Linga Murthy, G.L.N. Murthy
Asst. Professor, ECE, Lakireddy Balireddy College of Engineering, Andhra Pradesh, India, [email protected], [email protected]
Abstract
Speech recognition has long been regarded as a fascinating field in human-computer interaction and is one of the fundamental steps towards understanding human recognition and behaviour. This paper explicates the theory and implementation of speech recognition through a speaker-dependent, real-time isolated word recognizer. The approach is to first obtain feature vectors using LPC analysis, followed by vector quantization; the quantized vectors are then recognized by measuring the minimum average distortion. All speech recognition systems contain two main phases, a training phase and a testing phase. In the training phase the features of the words are extracted and the resulting templates are stored in a database; during the recognition phase the features extracted from the test utterance are compared with the templates in the database. The features of the words are extracted using LPC analysis, vector quantization is used to generate the codebooks, and the recognition decision is made on the basis of the matching score. MATLAB is used to implement this concept and to gain further understanding.
Index Terms: Speech Recognition, LPC, Vector Quantization, Code Book

1. INTRODUCTION
Speech is a natural mode of communication for people. We learn all the relevant skills during early childhood, without instruction, and we continue to rely on speech communication throughout our lives. It comes so naturally to us that we do not realize how complex a phenomenon speech is. Speech recognition, more commonly known as automatic speech recognition (ASR), is the process of interpreting human speech in a computer. A more technical definition is given by Jurafsky, who defines ASR as the building of systems for mapping acoustic signals to a string of words; he defines automatic speech understanding (ASU) as extending this goal to producing some sort of understanding of the sentence.

Speech recognition tasks vary along several dimensions. One dimension is how fluent, natural or conversational the speech is: isolated word recognition, in which each word is surrounded by some sort of pause, is much easier than recognizing continuous speech. Another dimension is channel and noise: commercial dictation systems, and much laboratory research in speech recognition, use high quality, head-mounted microphones. A final dimension is accent or speaker-class characteristics.

The objective of this paper is to recognize isolated words spoken by a speaker, and the results are useful for implementing recognition systems. The words or utterances are recorded through a microphone, stored in the workspace, and then processed using the MATLAB Signal Processing Toolbox. The processing involves pre-emphasis, frame blocking, autocorrelation analysis, LPC analysis and vector quantization.
1.1 Challenges
The general problem of automatic transcription of speech by any speaker in any environment is still far from solved, but recent years have seen ASR technology mature to the point where it is viable in certain limited domains.
1.2 Difficulties
One dimension of variation in speech recognition tasks is the vocabulary size.
Here we want to extract spectral features of an entire utterance or conversation, but the spectrum changes very quickly. Technically, we say that speech is a non-stationary signal, meaning that its statistical properties are not constant across time. Instead, we extract spectral features from a small window of speech that characterizes a particular subphone and for which we can assume the signal is stationary. This is done by using a window which is non-zero inside some region and zero elsewhere, running this window across the speech signal, and extracting the waveform inside it. A common window used in feature extraction is the Hamming window, which shrinks the values of the signal toward zero at the window boundaries, avoiding discontinuities.

Fig 2.2: Blocking of speech into overlapping frames. Typical values for N and M are 256 and 128 when the sampling rate of the speech is 6.67 kHz; these correspond to roughly 38-msec frames separated by about 19 msec, i.e. a frame rate of about 52 Hz.
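A minimal MATLAB sketch of the pre-emphasis, frame blocking and Hamming windowing steps is given below, using the N = 256, M = 128 and 6.67-kHz values quoted above; the placeholder signal, the 0.95 pre-emphasis coefficient and the variable names are illustrative assumptions rather than details taken from the paper.

% Pre-emphasis, frame blocking and Hamming windowing (illustrative sketch).
fs = 6670;                         % sampling rate from the paper (6.67 kHz)
N  = 256;                          % frame length in samples
M  = 128;                          % frame shift in samples
s  = randn(1, 2*fs);               % placeholder utterance; replace with a recorded word

s  = filter([1 -0.95], 1, s);      % pre-emphasis, H(z) = 1 - 0.95*z^-1 (assumed coefficient)
w  = hamming(N)';                  % Hamming window as a row vector
numFrames = floor((length(s) - N) / M) + 1;
frames = zeros(numFrames, N);      % one windowed frame per row
for l = 1:numFrames
    frames(l, :) = s((l-1)*M + (1:N)) .* w;   % extract and window frame l
end

Each row of frames then feeds the autocorrelation and LPC analysis described next.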
Each windowed frame is then autocorrelated, r(m) = Σ x(n) x(n+m), the sum running over n = 0 to N-1-m, for m = 0, 1, ..., p, where the highest autocorrelation lag, p, is the order of the LPC analysis. Typically, values of p from 8 to 16 have been used, with p = 10 being the value used for this system. A side benefit of the autocorrelation analysis is that the zeroth autocorrelation, r(0), is the energy of the frame.
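Continuing the sketch above, the autocorrelation and LPC analysis of each windowed frame could be carried out with the Signal Processing Toolbox functions xcorr and levinson as shown below; the order p = 10 comes from the paper, while the surrounding code and variable names are assumptions made for illustration.

% Autocorrelation analysis and LPC analysis of each windowed frame
% (illustrative sketch, continuing from the framing code above).
p = 10;                            % LPC order used for this system
A = zeros(numFrames, p + 1);       % LPC polynomial [1 -a1 ... -ap] per frame
E = zeros(numFrames, 1);           % frame energies
for l = 1:numFrames
    r = xcorr(frames(l, :), p);    % autocorrelation at lags -p .. p
    r = r(p + 1:end);              % keep lags 0 .. p
    E(l) = r(1);                   % zeroth autocorrelation = frame energy
    A(l, :) = levinson(r, p);      % Levinson-Durbin recursion
end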
2.6 LPC Parameter Conversion to Cepstral Coefficients:
A very important LPC parameter set, which can be derived directly from the LPC coefficient set, is the set of LPC cepstral coefficients, c(m), obtained from the LPC coefficients by a simple recursion. The cepstral coefficients, which are the coefficients of the Fourier transform representation of the log magnitude spectrum, have been shown to be a more robust and reliable feature set for speech recognition than the LPC coefficients, the PARCOR coefficients, or the log area ratio coefficients.
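A minimal sketch of the LPC-to-cepstrum recursion is shown below; the function name lpc2cep, the choice of Q and the handling of the levinson sign convention are assumptions made for illustration, not details given in the paper.

function c = lpc2cep(A, Q)
% LPC2CEP  Convert one LPC polynomial A = [1 A(2) ... A(p+1)], as returned
% by levinson, into Q LPC cepstral coefficients using the recursion
%   c(m) = a(m) + sum_{k=1}^{m-1} (k/m)*c(k)*a(m-k),    1 <= m <= p
%   c(m) =        sum_{k=m-p}^{m-1} (k/m)*c(k)*a(m-k),  m > p
% where a(1..p) are the predictor coefficients of H(z) = G/(1 - sum a_k z^-k).
p = length(A) - 1;
a = -A(2:end);                 % predictor coefficients a_1 .. a_p
c = zeros(1, Q);
for m = 1:Q
    acc = 0;
    for k = max(1, m - p):m - 1
        acc = acc + (k / m) * c(k) * a(m - k);
    end
    if m <= p
        c(m) = a(m) + acc;
    else
        c(m) = acc;
    end
end
end

Q is commonly chosen somewhat larger than p (for example around 12 to 15 when p = 10), so that the cepstral vector represents the spectral envelope smoothly.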
4. SYSTEM IMPLEMENTATION:
The training set for the vector quantizer was obtained by recording utterances of a set of isolated words. The words were recorded for two different speakers, and the recognition vocabulary consisted of the command words Forward, Back, Left, Right and Stop. Each word was spoken 10 times. The results obtained for the two speakers are compared in Fig 4.1.

Fig 4.1: Comparison of the isolated word recognition system for two speakers
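The codebook generation and the minimum-average-distortion decision described above could be sketched in MATLAB as follows; the LBG-style binary-split training routine, the codebook size L, the split factor and all function names are illustrative assumptions rather than the authors' exact implementation, and in practice each function would live in its own .m file.

function C = train_codebook(X, L, epsSplit)
% Train an L-codeword codebook (L assumed to be a power of two) from the
% T-by-d matrix X of cepstral training vectors of one word, using an
% LBG-style binary split followed by k-means refinement.
if nargin < 3, epsSplit = 0.01; end
C = mean(X, 1);                                   % start from the global centroid
while size(C, 1) < L
    C = [C * (1 + epsSplit); C * (1 - epsSplit)]; % split every codeword
    for iter = 1:20                               % k-means style refinement passes
        [~, idx] = min(sq_dist(X, C), [], 2);     % nearest codeword for each vector
        for j = 1:size(C, 1)
            if any(idx == j)
                C(j, :) = mean(X(idx == j, :), 1);
            end
        end
    end
end
end

function m = recognize(X, codebooks)
% Classify the test vectors X by the codebook giving minimum average distortion.
D = zeros(1, numel(codebooks));
for i = 1:numel(codebooks)
    D(i) = mean(min(sq_dist(X, codebooks{i}), [], 2));  % average distortion
end
[~, m] = min(D);                                  % minimum average distortion rule
end

function D = sq_dist(X, C)
% Squared Euclidean distance between every row of X and every row of C.
D = bsxfun(@plus, sum(X.^2, 2), sum(C.^2, 2)') - 2 * (X * C');
end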
CONCLUSION:
Isolated word recognition using linear predictive coding and vector quantization provides a basic framework for implementing speech recognition for isolated words, and the vector-quantization based method is very simple: LPC analysis extracts the features of the given words, and vector quantization is used for feature matching. Viewed as a pattern-recognition approach to speech recognition, the M codebooks are analogous to M sets of reference patterns (or templates), the dissimilarity measure is the average distortion of the test vectors against each codebook, and no explicit time alignment is required. In the implementation the results were found to be satisfactory considering the small amount of training data. The accuracy of the real-time system can be increased significantly by using an improved speech detection/noise elimination algorithm.
REFERENCES:
[1]. L. R. Rabiner and B. H. Juang, Fundamentals of Speech Recognition, Prentice Hall (Signal Processing Series), 1993.
[2]. Richard O. Duda, Peter E. Hart and David G. Stork, Pattern Classification, John Wiley & Sons (Asia) Pte Ltd.
[3]. Y. Linde, A. Buzo and R. M. Gray, "An algorithm for vector quantizer design," IEEE Trans. Communications, COM-28, January 1980.
[4]. Mayukh Bhaowal and Kunal Chawla, "Isolated word recognition for English language using LPC, VQ & HMM," IIT, Allahabad, India.
[5]. Poonam Bansal, Amita Dev and Shail Bala Jain, "Automatic speaker identification using VQ," Medwell Journals, 6(9): 938-942, 2007.
[6]. Lawrence R. Rabiner, "Applications of speech recognition in the area of telecommunications," AT&T Labs, Florham Park, New Jersey, IEEE, 1997.
[7]. L. R. Rabiner, "Applications of Voice Processing to Telecommunications," Proc. IEEE, Vol. 82, No. 4, pp. 199-228, Feb. 1994.
BIOGRAPHIES:
Mr. M. K. Linga Murthy is currently working as Assistant Professor in ECE at LBRCE, Mylavaram. He completed his B.Tech in 2001 at SJCET, Yemmiganur, and his M.Tech in 2008 at MITS, Madanapalle. He has over six years of teaching experience and his research area is signal processing.
Mr. G.L.N. Murthy is currently working as Associate Professor in ECE at LBRCE, Mylavaram. He completed his B.Tech and M.Tech at JNTU, Anantapur, and is pursuing his PhD at SVU, Tirupathi. He has over 12 years of teaching experience and his research area is signal processing.