Vision-Based Gesture Recognition in Embedded Devices


COMPUTER VISION &

GESTURE RECOGNITION

-- Mukesh Kumar
OUTLINE

 Introduction
 Input devices
 How System Works
 Computer Vision Frameworks
 OpenCV
 Challenges
 Few App Ideas
 Demo
 Conclusions
1. INTRODUCTION
 Computer vision is the science of enabling a machine to extract
the information from an image that is needed to solve some task.
It enables computers to understand visual input.

The image data can be acquired in many forms and from many kinds
of sources, such as video sequences, views from multiple cameras,
multi-dimensional data from a medical scanner, normal 2D cameras,
infrared cameras, radar, or specialized sensors.
GESTURE RECOGNITION

 Gesture recognition is a language technology with the goal of
interpreting human gestures via mathematical algorithms.

 Gesture recognition can be carried out with techniques from
computer vision and image processing.
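As a toy illustration of "interpreting gestures via mathematical algorithms", the sketch below (a hypothetical example using NumPy; `classify_swipe` is an invented name, not part of any library) labels a tracked hand trajectory as a left or right swipe from its net horizontal displacement:

```python
import numpy as np

def classify_swipe(xs):
    """Toy gesture classifier: label a 1-D sequence of hand
    x-positions as a left or right swipe by its net displacement.
    (Illustrative only; real systems use far richer models.)"""
    displacement = xs[-1] - xs[0]
    return "swipe_right" if displacement > 0 else "swipe_left"

# A hand moving rightwards across ten frames:
track = np.linspace(0.1, 0.9, 10)
print(classify_swipe(track))        # swipe_right
print(classify_swipe(track[::-1]))  # swipe_left
```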
INPUT DEVICES
The ability to track a person's movements and determine what gestures they may be
performing can be achieved through various tools.

 Depth-aware cameras, e.g. time-of-flight cameras
 Stereo cameras – using two cameras; the second can be a normal
camera, an IR camera, or an ultrasonic sensor
 Single camera
 Controller-based gestures – wearables, or devices that act as an
extension of the body to capture motion in 3D space, e.g. the Wii
Remote or sensor-equipped gloves
 Other sensors and instruments, such as radar, X-ray, or surgical
sensors for 3D depth mapping
 Two common technologies for hand gesture recognition:
 Glove-based method
 Uses a special glove-based device to extract the hand posture
 Vision-based method
 3D depth-data model
 Normal color camera / appearance model
Data Glove Based Gesture Recognition
VISION-BASED GESTURE
RECOGNITION
HOW IT WORKS
 Recognition + gesture recognition

 The classical problem in computer vision is determining whether or not the image data
contains some specific object, feature, activity, gesture, optical character, motion, etc.
This can be achieved using the following steps:
 Image Acquisition:
Image acquisition generates a 2D image, 3D depth data, or an image sequence from
live video feeds or other sources, using specific hardware.
 Pre-processing of Images:
Noise reduction, contrast enhancement, grayscale conversion, image-matrix
creation, histogram comparison, image color inversion.
Some of these steps can also be performed during the acquisition phase, inside the camera.
 Feature Extraction:
Line, edge, and corner detection
Blob and shape detection and extraction
Color recognition, e.g. skin color
 High-Level Processing:
High-level processing operates on the features extracted from the image for gesture, pose, and motion detection:
Pixel classification
Image correlation
Facial recognition
Hand detection
Feature tracking

In this phase, neural networks and other artificial-intelligence techniques are used to teach the
system the gestures, kinds of motion, and faces to recognize. Adaptive algorithms let the system
work under different scenarios and environments; it learns from the various kinds of input and
becomes more capable over time.
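The acquisition, pre-processing, and feature-extraction steps above can be sketched end to end. The snippet below is a minimal, illustrative stand-in written with plain NumPy rather than OpenCV; the synthetic 8×8 frame, the 0.5 threshold, and the helper names are all assumptions chosen only so the example is self-contained:

```python
import numpy as np

def to_gray(rgb):
    # Pre-processing: luminance grayscale conversion (ITU-R BT.601 weights).
    return rgb @ np.array([0.299, 0.587, 0.114])

def largest_blob_bbox(mask):
    # Feature extraction: bounding box of all foreground pixels
    # (a crude stand-in for real blob/contour analysis).
    ys, xs = np.nonzero(mask)
    return ys.min(), xs.min(), ys.max(), xs.max()

# "Acquisition": a synthetic 8x8 RGB frame with a bright 3x3 patch.
frame = np.zeros((8, 8, 3))
frame[2:5, 3:6] = [1.0, 0.8, 0.7]   # skin-ish color patch

gray = to_gray(frame)
mask = gray > 0.5                    # crude segmentation by brightness
print(largest_blob_bbox(mask))       # (2, 3, 4, 5)
```

A real pipeline would replace the synthetic frame with a camera feed and the threshold with learned segmentation, but the shape of the computation is the same.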
Training samples

 Negative samples: images that must not contain the
object. We collected 500 random images as negative samples.
 Positive samples: hand-posture images collected
from human hands, or generated with a 3D hand model.
For each posture, we collected around 450 positive
samples. For the initial tests, we used a white wall
as the background.
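One hypothetical sketch of how such positive/negative samples might be used: a nearest-centroid classifier over feature vectors. The Gaussian clusters standing in for extracted features, and the name `is_hand`, are invented for illustration; a real detector would use richer features and a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for extracted feature vectors (e.g. shape descriptors):
# positives cluster around one point, negatives around another.
positives = rng.normal(loc=1.0, scale=0.1, size=(450, 4))
negatives = rng.normal(loc=0.0, scale=0.1, size=(500, 4))

pos_centroid = positives.mean(axis=0)
neg_centroid = negatives.mean(axis=0)

def is_hand(feature):
    # Nearest-centroid decision: closer to the positive cluster -> hand.
    return (np.linalg.norm(feature - pos_centroid)
            < np.linalg.norm(feature - neg_centroid))

print(is_hand(np.full(4, 0.9)))   # True
print(is_hand(np.zeros(4)))       # False
```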
AVAILABLE FRAMEWORKS
A few of the available computer vision frameworks:
 OpenNI – works with the Microsoft Kinect and other 3D sensors that generate 3D depth
images
 OpenCV – (started by Intel) has a C/C++ interface for complex image processing
and for applying neural-network algorithms; works with 2D or 3D image samples or live
video feeds
 AForge.NET – similar to OpenCV, with a .NET interface; provides complex image-processing,
neural-network, genetic-algorithm, and machine-learning libraries for
applications; a subset of the project includes a glyph-recognition system
 NokiaCV library
 EmbedCV – an embeddable computer vision library
 OpenSURF – a computer vision library
OPENCV
INTEL® OPEN SOURCE COMPUTER VISION LIBRARY

 OpenCV is a library of programming functions aimed mainly at real-time
computer vision, originally developed by Intel.
 It is free for use under the open-source BSD license.
 The library is cross-platform.
FEATURES
 Basic image processing
 filtering, edge detection, corner detection, sampling and interpolation,
color conversion, morphological operations, histograms, image pyramids
 Structural analysis
 connected components, contour processing, distance transform, various
moments, template matching, Hough transform, polygonal
approximation, line fitting, ellipse fitting, Delaunay triangulation
 Camera calibration
 finding and tracking calibration patterns, calibration, fundamental matrix
estimation, homography estimation, stereo correspondence
 Motion analysis
 optical flow, motion segmentation, tracking
 Object recognition
 eigen-methods, HMM
 Basic GUI
 display image/video, keyboard and mouse handling, scroll-bars
 Image labeling, robotics, machine-learning algorithms
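Template matching, listed under structural analysis above, can be illustrated with a brute-force normalized cross-correlation in plain NumPy. This is a sketch of the idea behind OpenCV's template matching, not its actual implementation, and `match_template` is an invented helper name:

```python
import numpy as np

def match_template(image, template):
    """Brute-force template matching by normalized cross-correlation.
    Returns the (row, col) of the best-scoring window."""
    th, tw = template.shape
    t = template - template.mean()
    best, best_pos = -2.0, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            w = image[r:r+th, c:c+tw]
            w = w - w.mean()
            denom = np.linalg.norm(w) * np.linalg.norm(t)
            score = (w * t).sum() / denom if denom else 0.0
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

img = np.zeros((10, 10))
img[4:7, 5:8] = np.arange(9).reshape(3, 3)   # embed a distinctive patch
tpl = np.arange(9, dtype=float).reshape(3, 3)
print(match_template(img, tpl))              # (4, 5)
```

The double loop is quadratic in the image size; production libraries compute the same score far faster, e.g. via FFT-based correlation.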
MODULES
 OpenCV Functionality
 more than 350 algorithms
OPENCV FOR MOBILES
 A complete port of OpenCV is required.
 A few libraries have to be removed to decrease the size of
the library.
 Image-processing support in WinCE is very limited.

 Alternatively, cloud computing can be used for the image
processing.

 A few of the image-processing, data-structure, and XML-support libraries have already been ported.
CHALLENGES
 Accuracy of gesture-recognition software.
 Image noise – input will not necessarily be captured under
consistent lighting, or in the same location; items in the
background or distinctive features of the users may make
recognition more difficult.
 Hardware requirements, such as a 3D depth mapper.

 Processing power required for real-time vision.

FEW APP IDEAS

 EyeSight's hand-waving, gesture-based UI
 A simple hand gesture over the phone silences incoming calls during a meeting, scrolls
between photos in the photo gallery, or skips tracks in the media player.
