K_q = \sum_{q=1}^{n} W_q C_q            ... (1)

where:
K_q = the q-th main criterion,
n = the number of sub-criteria in the q-th criterion,
C_q = the fuzzy value of the q-th parameter,
W_q = the weight of the relative parameter.
To calculate the overall score of the decision hierarchy, the equation is re-defined as:

Total Score = K_1 + K_2 + K_3 + ... + K_11            ... (2)

It implies that:

Total Score = \sum_{i=1}^{11} K_i            ... (3)
where:
W_i = the expert weight of the i-th main criterion,
K_i = the value of the i-th main criterion obtained through equation (1).
From equations (1) and (3), we derive equation (4) as below:

Total Score = \sum_{i=1}^{11} \sum_{q=1}^{n} (W_q C_q)            ... (4)
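As a minimal illustration of equations (1)-(4), the following Python sketch aggregates weighted fuzzy values per criterion and sums across criteria; the criteria, weights and fuzzy values here are invented for the example and are not taken from the paper.

    def criterion_score(weights, fuzzy_values):
        # Equation (1): K_q = sum of W_q * C_q over the sub-criteria
        return sum(w * c for w, c in zip(weights, fuzzy_values))

    def total_score(criteria):
        # Equations (2)-(4): sum the criterion scores K_i of all main criteria
        return sum(criterion_score(w, c) for w, c in criteria)

    # Two hypothetical main criteria, each as (weights, fuzzy membership values)
    criteria = [
        ([0.046, 0.030], [1.0, 0.75]),
        ([0.025, 0.040], [0.5, 0.25]),
    ]
    print(total_score(criteria))  # 0.046 + 0.0225 + 0.0125 + 0.01 = 0.091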
A process evaluation score cannot be obtained efficiently using a simple yes/no (i.e., 1 or 0), because the values of the parameters are qualitative in nature. For these qualitative parameters we use fuzzy logic. Zadeh [22] used fuzzy logic to measure continuous values.
Since the parameter weights assigned from expert opinion remain constant throughout the decision-making process, the final ranking score computed from the overall decision hierarchy varies with the input parameter values entered by users. The value of a parameter may be 0 or 1 [22]: if a particular parameter is present, its weight is multiplied by 1; otherwise it is multiplied by 0.

[Figure: Architecture of the system - system users and the knowledge engineer interact through dialogs with the user interface; a DDE facility connects to an external database (SQL Server or MS Access); the knowledge base (a collection of facts) is driven by the control mechanism (inference engine).]
According to Zadeh [22, 24], variables whose values are words or sentences are called linguistic variables; variables that represent a gradual transition, from high to low or true to false, are called fuzzy variables, and a set containing these variables is a fuzzy set. The degree of membership lies in [0, 1], where 1 represents the highest membership and 0 represents no membership.
We defined a fuzzy variable set for the conceptual framework model as:
Fuzzy Set = {Extremely Strong, Strong, Moderate, Weak,
Extremely Weak}
The corresponding fuzzy membership values are: Fuzzy Membership Value = {1.0, 0.75, 0.5, 0.25, 0}
Table I depicts the fuzzy variables with their respective degrees of membership. In the table, from top to bottom, a gradual transition is represented from extremely strong to extremely weak, together with the respective degree of membership values.
TABLE I. PARAMETERS OF FUZZY VALUES
Fuzzy variable Degree of Membership
Extremely Strong 1.0
Strong 0.75
Moderate 0.50
Weak 0.25
Extremely Weak 0
These fuzzy values are the input parameter values provided by the users during consultation, and the final ranking score of a particular process evaluation is calculated by the system at run time; an example of the computation of the decision-making score is given here. If the qualitative value of a parameter is extremely strong, the corresponding numeric value 1.0 is multiplied by the parameter's weight.

Suppose a parameter, risk analysis, is assigned a weight of 0.046 by the experts; the fuzzy decision score can then be calculated as shown in Table II.
The overall weights of all the parameters are calculated by
Equation (4), and the resultant score will be the final score for
decision making.
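A small sketch of the membership lookup and the single-parameter score from Tables I and II; the 0.046 weight is the risk-analysis example above, while the code itself is only illustrative.

    MEMBERSHIP = {"Extremely Strong": 1.0, "Strong": 0.75,
                  "Moderate": 0.5, "Weak": 0.25, "Extremely Weak": 0.0}

    def fuzzy_score(linguistic_value, weight):
        # Multiply the parameter's weight by its degree of membership (Table II)
        return MEMBERSHIP[linguistic_value] * weight

    print(fuzzy_score("Strong", 0.046))  # 0.0345, reported as 0.034 in Table II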
The decision score is mapped to a linguistic description, which is the output of the intelligent system; this mapping, used to rank the selected software process model, is depicted in Table III.
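A sketch of the Table III mapping from the overall score to a recommended process model; the boundary handling is our assumption, since the original inequality symbols were lost in extraction.

    def recommend_model(x):
        # Linguistic mapping of the overall decision score (Table III)
        if x < 0.20: return "Prototyping Process Model"
        if x < 0.40: return "RAD Process Model"
        if x < 0.60: return "Evolutionary Process Models"
        if x < 0.80: return "Waterfall Process Model"
        return "Component-Based Development"

    print(recommend_model(0.091))  # Prototyping Process Model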
IV. CONCLUSION
This research is a promising step toward solving the problems associated with the existing approaches to intelligent framework modeling. These models will become a base for the selection of an appropriate process model for expert-systems development (i.e., ESPMS). There exist neither strict rules for selecting a software process model nor any consultative system to guide a novice user. This is an attempt to integrate various technologies, such as expert systems, AHP, fuzzy logic and decision making, to solve real-world problems.
TABLE II. FUZZY SCORE CALCULATION

Parameter's fuzzy value | Membership value | Parameter's weight | Fuzzy score
Extremely Strong | 1.0 | 0.046 | 0.046
Strong | 0.75 | 0.046 | 0.034
Moderate | 0.5 | 0.046 | 0.023
Weak | 0.25 | 0.046 | 0.011
Extremely Weak | 0 | 0.046 | 0

TABLE III. LINGUISTIC DESCRIPTION OF PROPOSED SYSTEM WITH OUTPUT (X)

System Output | Linguistic Description
X < 0.20 | Prototyping Process Model
0.20 ≤ X < 0.40 | RAD Process Model
0.40 ≤ X < 0.60 | Evolutionary Process Models (Incremental, Spiral, Win-Win Spiral & Concurrent Development Model)
0.60 ≤ X < 0.80 | Waterfall Process Model
X ≥ 0.80 | Component-Based Development
V. FUTURE SCOPE
Following the decision issues and the accompanying models presented in this paper, a prototype ESPMS can easily be developed. This prototype ESPMS can be linked with an external database and other software to develop a full-fledged expert system for final decision making in the selection of a process model for a particular software project. This work may become a base for solving other similar problems.
REFERENCES
[1] Pressman R. S., Software Engineering: A Practitioner's Approach, Fifth Edition, McGraw Hill, 2001.
[2] Reddy A. R. M., Govindarajulu P., Naidu M., A Process Model for Software Architecture, IJCSNS, Vol. 7, No. 4, April 2007.
[3] Jorge L. Díaz-Herrera, Artificial Intelligence (AI) and Ada: Integrating AI with Mainstream Software Engineering, 1994.
[4] Rech J., Althoff K. D., Artificial Intelligence and Software
Engineering: Status and Future Trends, 2004.
[5] Durkin, J., Application of Expert Systems in the Sciences, Ohio J. Sci., Vol. 90 (5), pp. 171-179, 1990.
[6] Kazaz, A., Application of an Expert System on the Fracture Mechanics of Concrete, Artificial Intelligence Review, 19, 177-190, 2003.
[7] Raza F. N., Artificial Intelligence Techniques in Software
Engineering (AITSE), IMECS 2009, Vol-1, Hong Kong, 2009.
[8] Ruhe G., Learning Software Organizations, Fraunhofer Institute for Experimental Software Engineering (IESE), Volume 2, Issue 3-4 (October-November 2000), pp. 349-367, ISSN 1387-3326, 2000.
[9] Armenise P., Bandinelli S, Ghezzi C. & Morzenti A, A Survey and
Assessment of Software Process Representation Formalisms, CEFRIEL
Politecnico di Milano, 1993.
[10] Lonchamp J. (1993), A Structured Conceptual and Terminological
Framework for Software Process Engineering, (CRIN), France, 1993.
[11] Yu E. S. K. & Mylopoulos J., Understanding Why in Software
Process Modelling, Analysis, and Design, Proc. 16th Int. Conf. Software
Engineering, 1994.
[12] Grobelny P., The Expert System Approach in Development of Loosely Coupled Software with Use of Domain Specific Language, Proceedings of the International Multiconference on Computer Science and Information Technology, pp. 119-123, ISSN 1896-7094, 2008.
[13] Canfora G., García F., Piattini M., Ruiz F. & Visaggio C. A., Applying a framework for the improvement of software process maturity, Softw. Pract. Exper., 36:283-304, 2006.
[14] Atan R., Ghani A. A. A, Selamat M. H., & Mahmod R, Software
Process Modelling using Attribute Grammar, IJCSNS VOL.7 No.8,
2007.
[15] Kim J., & Gil Y., Knowledge Analysis on Process Models, Information
Sciences Institute University of Southern California, (IJCAI-2001),
Seattle, Washington, USA. 2001
[16] Liao L., Yuzhong Qu Y. & Leung H. K. N., A Software Process
Ontology and Its Application. 2005.
[17] Turban, E., Expert Systems and Applied Artificial Intelligence. New York:
Macmillan Publishing Company, 1992.
[18] Awad, E.M. Building Experts Systems: Principals, Procedures, and
Applications, New York: West Publishing Company, 1996.
[19] Abdur Rashid Khan, Zia Ur Rehman, Factors Identification of Software Process Model Using a Questionnaire, (IJCNS) International Journal of Computer and Network Security, Vol. 2, No. 5, May 2010.
[20] Prolog Development Center, Expert System Shell for Text Animation (ESTA), version 4.5, A/S. H. J. Volst Vej 5A, DK-2605 Broendby, Denmark, copyright 1992-1998.
[21] Saaty T. L., The analytic hierarchy process: planning, priority setting
and resource allocation. New York: McGraw-Hill, 1980.
[22] Zadeh, L.A., Fuzzy sets, Information and Control, 8, 338-353, 1965.
[23] Khan, A.R., Expert System for Investment Analysis of Agro-based
Industrial Sector, Bishkek 720001, Kyrgyz Republic, 2005
[24] Zadeh, L.A., The concept of a linguistic variable and its application to approximate reasoning, Information Sciences, 8, 43-80, 1975.
[25] Futrell R. T., Shafer L. I, Shafer D. F, Quality Software Project
Management, Low Price Edition, 2004.
[26] Anderson S., & Felici M., Requirement Engineering Questionnaire,
Version 1.0, Laboratory for Foundation of Computer Science, Edinburgh
EH9 3JZ, Scotland, UK 2001.
[27] Skonicki M., QA/QC Questionnaire for Software Suppliers, January
2006.
[28] Energy U. S. D., Project Planning Questionnaire,
www.cio.energy.gov/Plnquest.pdf (last access, 04 August 2009)
[29] Liu & Perry (2004), On the Meaning of Software Architecture,
Interview Questionnaire, Version 1.2, July, 2004.
AUTHORS PROFILE
Abdur Rashid Khan is presently working as an Associate Professor at ICIT, Gomal University, D.I.Khan, Pakistan. He received his PhD degree from Kyrgyz Technical University, Kyrgyz Republic, in 2004. He has published more than 23 research papers in national and international journals and conferences. His research interests include ES, DSS, MIS and software engineering.
Zia Ur Rehman received his MCS in Computer Science from the Institute of Information Technology, Kohat University of Science & Technology (KUST), Kohat, Pakistan, in 2005. He is currently pursuing his MS degree in Computer Science at the same institute. His areas of interest include software engineering, AI, knowledge engineering, expert systems, and applications of fuzzy logic.

Hafeez Ullah Amin is a research student at the Institute of Information Technology, Kohat University of Science & Technology, Kohat 26000, KPK, Pakistan. He completed a BS (Hons) in Information Technology and an MS in Computer Science in 2006 and 2009, respectively, from the above-cited institution. His current research interests include artificial intelligence, information systems, and databases.
Modelling & Designing Land Record Information
System Using Unified Modelling Language
Kanwalvir Singh Dhindsa
CSE & IT Department,
B.B.S.B.Engg.College,
Fatehgarh Sahib,Punjab,India
[email protected]
Himanshu Aggarwal
Department of Computer Engg.,
Punjabi University,
Patiala, Punjab,India
[email protected]
Abstract - Automation of Land Records is one of the most
important initiatives undertaken by the revenue department to
facilitate the landowners of the state of Punjab. A number of
such initiatives have been taken in different States of the
country. Recently, there has been a growing tendency to adopt UML (Unified Modeling Language) for different modeling needs and domains, and it is widely used for designing and modelling information systems. UML diagramming practices have been applied to the design and modeling of the land record information system so as to improve technical accuracy and the understanding of the requirements related to this information system. We have
applied a subset of UML diagrams for modeling the land record
information system. The case study of Punjab state has been
taken up for modelling the current scenario of land record
information system in the state. Unified Modeling Language
(UML) has been used as the specification technique. This paper
proposes a refined software development process combined
with modeled process of UML and presents the comparison
study of the various tools used with UML.
Keywords - Information system; Unified Modeling Language (UML); software modelling; software development process; UML tools.
I. INTRODUCTION
Computerization of Land Records is one of the most
important initiatives undertaken by the Revenue Department
to facilitate the landowners of the State. A number of such
initiatives have been taken in different States of India. The
paper proposes a UML-based approach in which non-functional requirements are defined as reusable aspects for design and analysis. UML offers vocabulary and rules for communication and focuses on conceptual and physical representations of a system. UML uses an object-oriented approach to model systems, unifying data and functions (methods) into software components called objects. Various diagrams are used to show objects and their relationships as well as objects and their responsibilities (behaviors). UML is a standard for object-oriented modeling notations endorsed by the Object Management Group (OMG), an industrial consortium on object technologies. UML became a standard after combining and taking advantage of a number of object-oriented design methodologies (Kobryn, 1999) and is currently positioned as a modeling language rather than a design process.
A. Process of Data Digitisation
The automation of projects related to information systems is underway in many government sectors. With the use of the funds, 153 Fard Kendras will be established in the tehsils of the State to provide certified copies of the revenue records to the general public. Some fard centres have already been opened in a few tehsils and sub-tehsils for public use. The land records (Jamabandi etc.) are generally updated every 5 years. The legacy land records to be digitized are: Jamabandi, Mutation, Roznamcha Waqiati, Khasra Girdawari and Field Book. At present there is a lack of faith and there are undefined procedures regarding the services being provided to citizens. The change from paper records to digital records will facilitate the farmers, maintain better transparency of the revenue records, lead to a drastic reduction in fraudulent practices, corruption and procedural hassles relating to the management of the land records, reduce time delays, and also work as a faith-building measure in providing service to citizens.
II. UNIFIED MODELLING LANGUAGE
UML (Unified Modelling Language) is a complete language for capturing knowledge (semantics) about a subject and expressing knowledge (syntax) regarding the subject for the purpose of communication. It applies to modeling and systems. Modeling involves a focus on understanding a subject (system) and being able to communicate this knowledge. UML is the result of unifying the information systems and technology industry's best engineering practices (principles, techniques, methods and tools). It is used for both database and software modeling.
UML attempts to combine the best of the best from: Data
Modeling concepts (Entity Relationship Diagrams),
Business Modeling (work flow), Object Modeling and
Component Modeling. UML is defined as "a graphical language for visualizing, specifying, constructing, and documenting the artifacts of a software-intensive system" [Booch]. Software architecture is an area of software engineering directed at developing large, complex applications in a manner that reduces development costs, increases quality and facilitates evolution [8]. A central and critical problem software architects face is
how to efficiently design and analyze software architecture to meet non-functional requirements.
The structural things in UML are Class, Interface, Collaboration and Use Case; the behavioral things comprise Interaction and State Machine; the grouping things comprise Packages and Notes.
a) Things: important modeling concepts.
b) Relationships: ties between individual things (i.e., their concepts).
c) Diagrams: groupings of interrelated collections of things and relationships.
The artifacts included in standard UML consist of: Use case diagram, Class diagram, Collaboration diagram, Sequence diagram, State diagram, Activity diagram, Component diagram and Deployment diagram (OMG, 1999). There are different ways of using UML, in terms of design methodologies, to accomplish different project objectives.
III. SYSTEM ANALYSIS & DESIGN
Unified Modeling Language (UML) is used as a
specification technique for the system analysis and design
process involved in the software development life cycle.
A. Modelling & Designing Using UML
1) Case Scenario : Land Record Information System
UML is built upon the MOF metamodel for object-oriented modeling. A modeling method comprises a language and also a procedure for using the language to construct models; here the language is the Unified Modeling Language (UML). Modeling is the only way to visualize one's design and check it against requirements before developers start to code. The land record information system is modeled using the use-case, sequence, class, and component diagrams offered by the Unified Modeling Language.
a) Use-Case Diagram: Use case diagrams describe what a system does from the standpoint of an external observer [17]. Use case diagrams describe the functionality of a system and the users of the system, and contain the following elements:
- Actors, which represent users of a system, including human users and other systems.
- Use Cases, which represent functionality or services provided by a system to users.
{ *as modeled in StarUML }
b) Class Diagrams: Being the most important entity in modeling object-oriented software systems, the class diagram is used to depict the classes and the static relationships among them [3]. Class diagrams describe the
static structure of a system, or how it is structured rather
than how it behaves. These diagrams contain the following
elements:
- Classes, which represent entities with common characteristics or features. These features include attributes, operations and associations.
- Associations, which represent relationships that relate two or more other classes, where the relationships have common characteristics or features.
c) Object Diagrams: describe the static structure of a system
at a particular time. Whereas a class model describes all
possible situations, an object model describes a particular
situation. Object diagrams contain the following elements:
- Objects, which represent particular entities. These are instances of classes.
- Links, which represent particular relationships between objects. These are instances of associations.
{ *as modeled in StarUML }
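As a rough illustration of the class and object views above, the following Python sketch defines two classes with an association and then instantiates particular objects and links; the class and attribute names (Jamabandi, Mutation, khasra_no, etc.) are our simplification of the land record domain, not a model taken from the paper.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Mutation:                  # a change-of-ownership record
        mutation_id: int
        description: str

    @dataclass
    class Jamabandi:                 # record of rights for a land parcel
        khasra_no: str
        owner_name: str
        mutations: List[Mutation] = field(default_factory=list)  # association

    # Object-diagram view: particular instances (objects) and links
    j = Jamabandi("123/4", "A. Singh")
    j.mutations.append(Mutation(1, "Sale registered"))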
d) Collaboration Diagrams & Component Diagrams: The component diagram is one of UML's architectural diagrams, used to effectively describe complex architectures as a hierarchy of components (subsystems) communicating through defined interfaces [6]. Collaboration diagrams describe interactions among classes and associations. These interactions are modeled as exchanges of messages between classes through their associations. Collaboration diagrams are a type of interaction diagram and contain the following elements:
i) Class roles, which represent roles that objects
may play within the interaction.
ii) Association roles, which represent roles that
links may play within the interaction.
iii) Message flows, which represent messages sent
between objects via links. Links transport or
implement the delivery of the message.
{*as modeled in StarUML }
e) Deployment Diagrams: Deployment diagrams describe the configuration of processing resource elements and the mapping of software implementation components onto them. These diagrams contain components and nodes, which represent processing or computational resources, including computers, printers, etc. Each cube icon is a node representing a physical system. All the system requirements are shown in the architecture used for the land record information system. All the modules of the information system have been developed using Visual Basic with SQL Server at the backend. The web components are hosted on an Apache web server and use Java Servlets. The modeled components* are shown in the deployment diagram.
{ *as modeled in StarUML }
IV. MODELLING TOOLS USED IN UML
The various types of tools used for modelling in Unified Modeling Language (UML) are:
a) Modeling tools: Rational Rose, ArgoUML, Together, Umbrello
b) Drawing tools: Visio, Dia
c) Metamodels: Eclipse UML2, NSUML, OMF
d) Renderers: Graphviz, UMLDoc
e) IDEs: Visual Studio 2005, XCode 2, Rational XDE
A. Comparison of UML Tools
The Unified Modelling Language (UML) tools used for modeling the design of various information systems are compared on some vital parameters which distinguish them, fairly indicating the advantages of one tool over another.
TABLE I. COMPARISON OF UML TOOLS

Tool | Strength/Stability | Cost | Additional Features | Current Status
Rational Rose | Full-strength industrial modeling suite | Expensive | Office for UML, add-ons, plug-ins, scripting interface; plugs in to MS Visual Studio and Eclipse | Re-developed as Rational XDE
Together | Supports most UML diagrams | Mid-range cost | Can reverse engineer C++ and Java; generates source code for C++ and Java; exports to PNG | -
ArgoUML | Open-source UML modeling application written in Java | Free to download | Supports most diagram types, reverse engineering and code generation for Java | Forked into the commercial product Poseidon
Umbrello | Open-source modeling application for KDE, written in C++ | Free to download | Supports data modeling for SQL, reverse engineering and code generation | Under active development
MS Visio | Fairly compliant with the UML metamodel | Not interoperable | Used for creating 2D schematics and diagrams | -
Dia | Open-source graphics drawing program from GNOME | Free | Supports the creation of some UML diagram types | Used frequently by open-source developers
Graphviz and UMLDoc | AT&T Graphviz accepts graph-specification input and generates PNG/PDF layouts of graphs; UMLDoc parses Java comments to produce diagrams | Mid-range | Generates PNG and PDF layouts of graphs; UMLDoc actually uses Graphviz to create diagrams | -
MS Visual Studio | Supports UML-like diagrams for .NET languages (e.g., C#) | New part of VS 2005 | Provides support for roundtrip engineering and documentation generation | New part of VS 2005
XCode 2 | Claims support for C, C++, and Java | - | Provides UML-like class diagrams for Objective-C; can be used for roundtrip engineering | Provides UML-like class diagrams for Objective-C
Rational XDE | Visual modeling suite for UML | Costly | Plugs in to many different IDEs (Visual Studio .NET, Eclipse, IBM WebSphere); supports roundtrip engineering | Provides features of Rational Rose
V. UML IN INFORMATION SYSTEMS: ITS
APPLICATIONS
- Any type of application, running on any type and combination of hardware, operating system, programming language, and network, can be modeled in UML.
- UML Profiles (that is, subsets of UML tailored for specific purposes) help to model transactional, real-time, and fault-tolerant systems in a natural way.
- UML is effective for modeling large, complex software systems.
- It is simple to learn for most developers, but provides advanced features for expert analysts, designers and architects.
- It can specify systems in an implementation-independent manner.
- Structural modeling specifies a skeleton that can be refined and extended with additional structure and behavior.
- Use case modeling specifies the functional requirements of a system in an object-oriented manner. Existing source code can be analyzed and reverse-engineered into a set of UML diagrams.
UML is currently used for applications other than drawing
designs in the fields of Forward engineering, Reverse
engineering, Roundtrip engineering and Model-Driven
Architecture (MDA). A number of tools on the market
generate Test and Verification Suites from UML models.
VI. CONCLUSION & FUTURE SCOPE
UML tools provide support for working with the UML language in the development of various types of information systems. From the paper, it is concluded that each UML tool has its own functionality and can be used according to the needs of the software development cycle for the development of information systems. The three different views of using UML are: documenting design up front, maintaining design documentation after the fact, and generating refinements or source code from models. This paper concludes that information systems can be modeled using UML due to its flexibility and inherent nature, and that the tools add to the ever-increasing demand for its use in the development of information systems. UML can further be considered as part of a mobile development strategy, and further planning can be done to apply the unified modeling principles to later stages of enhancement of the land record information system.

Future work that could be pursued includes applying the software process to large-scale m-commerce application systems and generating their model diagrams with UML, specially tailored for the software development process, thus providing a backbone to the analysis and design phases of the SDLC.
REFERENCES
[1] A. Gurd, Using UML 2.0 to Solve Systems Engineering Problems,
White Paper, Telelogic,2003.
[2] Blaha, M. & Premerlani, W., Object-Oriented Modeling and Design for
Database Applications, Prentice Hall, New Jersey,1998.
[3] R. Miller, Practical UML: A Hands-On Introduction for
Developers, White Paper, Object Mentor Publications,1997.
[4] B. Graham, Developing embedded and mobile Java technology-based
applications using UML, White Paper, IBM Developerworks,2003.
[5] M. Fowler, UML Distilled: A Brief Guide to the Standard Object Modeling Language, 3rd ed., Addison-Wesley, 2004.
[6] P. Jalote, A. Palit, P. Kurien, V. T. Peethamber, Timeboxing: A Process Model for Iterative Software Development, Journal of Systems and Software (JSS), Volume 70, Number 1-2, pp. 117-127, 2004.
[7] Nikolaidou M., Anagnostopoulos D., A Systematic Approach for
Configuring Web-Based Information Systems, Distributed and Parallel
Database Journal, Vol 17, pp 267-290, Springer Science, 2005.
[8] M. Shaw, and D. Garlan, Software Architecture: Perspectives on an
Emerging Discipline, Prentice Hall, 1996.
[9] Kobryn, C., UML 2001: a standardization odyssey, Comm. of the ACM, Vol. 42, No. 10, October, pp. 29-37, 1999.
[10] OMG UML Revision Task Force,OMG-Unified
Modeling Language Specification, http://uml.systemhouse.mci.com/
[11] Jeusfeld, M.A. et al.: ConceptBase: Managing conceptual models
about information systems. Handbook of Information Systems,
Springer-Verlag ,pp. 265-285,1998.
[12] Berardi D., Calvanese D., and De Giacomo G.: Reasoning on
UML class diagrams, Artificial Intelligence, 168, 70-118,2005.
AUTHORS PROFILE
Er. Kanwalvir Singh Dhindsa is currently an Assistant Professor in the CSE & IT department of B.B.S.B.Engg.College, Fatehgarh Sahib (Punjab), India. He received his M.Tech. from Punjabi University, Patiala (Punjab) and is currently pursuing a Ph.D. degree in Computer Engineering from the same university. His research interests are information systems, relational database systems and modelling languages. He is a member of IEI, ISTE and ACEEE.

Prof. (Dr.) Himanshu Aggarwal is currently a Reader in the department of Computer Engg. of Punjabi University, Patiala (Punjab). He received his Ph.D. degree in Computer Engineering from Punjabi University in 2007. His research interests are information systems, parallel computing and software engineering. He has contributed 14 papers to reputed journals and 35 papers to national and international conferences. He is also on the editorial board of some international journals.
An Algorithm to Reduce the Time Complexity of
Earliest Deadline First Scheduling Algorithm in
Real-Time System
Jagbeer Singh
Dept. of Computer Science and
Engineering
Gandhi Institute of Engg. & Tech.
Gunupur, Rayagada, India-765022
[email protected]
Bichitrananda Patra
Dept. of Information Technology
Gandhi Institute of Engg. & Tech.
Gunupur, Rayagada, India-765022
[email protected]
Satyendra Prasad Singh
Dept. of Master of Computer
Application
Gandhi Institute of Compt. Studies
Gunupur, Rayagada, India-765022
[email protected]
Abstract - In this paper we study how to reduce the time complexity of Earliest Deadline First (EDF), a global scheme for scheduling real-time tasks on a multiprocessor system. Several admission control algorithms for earliest deadline first are presented, both for hard and soft real-time tasks. The average performance of these admission control algorithms is compared with the performance of known partitioning schemes. We have applied some modifications to the global earliest deadline first algorithm to decrease the number of task migrations and also to add predictability to its behavior. The aim of this work is to provide a sensitivity analysis for task deadlines in the context of a multiprocessor system by using a new approach, the EFDF (Earliest Feasible Deadline First) algorithm. In order to decrease the number of migrations we prevent a job from moving from one processor to another if it is among the m highest-priority jobs; a job will therefore continue its execution on the same processor if possible (processor affinity). The results of these comparisons outline some situations where one scheme is preferable over the other: partitioning schemes are better suited for hard real-time systems, while a global scheme is preferable for soft real-time systems.
Keywords - Real-time system; task migration; earliest deadline first; earliest feasible deadline first.
I. INTRODUCTION
Real-time systems are those whose correct operation depends not only on the logical results, but also on the time at which these results are produced. These are high-complexity systems that are executed in environments such as military process control, robotics, avionics systems, distributed systems and multimedia.
Real-time systems use scheduling algorithms to decide an
order of execution of the tasks and an amount of time assigned
for each task in the system so that no task (for hard real-time
systems) or a minimum number of tasks (for soft real-time
systems) misses their deadlines. In order to verify the
fulfillment of the temporal constraints, real-time systems use
different exact or inexact schedulability tests. The
schedulability test decides if a given task set can be scheduled
such that no tasks in the set miss their deadlines. Exact
schedulability tests usually have high time complexities and
may not be adequate for online admission control where the
system has a large number of tasks or a dynamic workload. In
contrast, inexact schedulability tests provide low complexity
sufficient schedulability tests.
The first schedulability test known was introduced by Liu
and Layland with the Rate Monotonic Scheduling Algorithm
[Liu, 1973] (RM). Liu and Layland introduced the concept of
achievable utilization factor to provide a low complexity test
for deciding the schedulability of independent periodic and
preemptable task sets executing on one processor.
In Earliest Deadline First scheduling, at every scheduling
point the task having the shortest deadline is taken up for
scheduling. The basic principle of this algorithm is very
intuitive and simple to understand. The schedulability test for
EDF is also simple. A task set is schedulable under EDF if and only if the total processor utilization (U) due to the task set does not exceed 1. When scheduling periodic processes that have deadlines equal to their periods, EDF has a utilization bound of 100%. Thus, the schedulability test for EDF is:

U = \sum_{i=1}^{n} C_i / T_i \le 1,

where the C_i are the worst-case computation times of the n processes and the T_i are their respective inter-arrival periods (assumed to be equal to the relative deadlines).
The schedulability test introduced by Liu and Layland for RM states that a task set will not miss any deadline if it meets the following condition: U \le n(2^{1/n} - 1). Liu and Layland provided a schedulability test that fails to identify many schedulable task sets when the system is heavily loaded.
After the work of Liu and Layland, many researchers have
introduced improvements on the schedulability condition for
RM for one and multi processors. These improvements include
the introduction of additional timing parameters in the
schedulability tests and transformations on the task sets. It is a
well-known fact that when more timing parameters are
introduced in the schedulability condition better performance
can be achieved.
For example, consider 3 periodic processes scheduled using EDF; the following acceptance test shows that all deadlines will be met.
Table 1: Task Parameters

Process | Execution Time = C | Period = T
P1 | 1 | 8
P2 | 2 | 5
P3 | 4 | 10
The utilization will be:

U = 1/8 + 2/5 + 4/10 = 0.125 + 0.4 + 0.4 = 0.925.

The theoretical limit for any number of processes is 100%, and so the system is schedulable.
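For concreteness, a tiny Python check of the utilization-based EDF test for the task set of Table 1 (our own illustration, not code from the paper):

    # (C_i, T_i) pairs for P1..P3 from Table 1
    tasks = [(1, 8), (2, 5), (4, 10)]
    U = sum(c / t for c, t in tasks)
    print(U, U <= 1.0)   # 0.925 True -> schedulable under EDF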
EDF has been proven to be an optimal uniprocessor scheduling algorithm [8]. This means that if a set of tasks is unschedulable under EDF, then no other scheduling algorithm can feasibly schedule this task set. The EDF algorithm chooses for execution, at each instant in time, the currently active job(s) with the nearest deadlines. The EDF implementation upon uniform parallel machines follows these rules [2]: no processor is idled while there are active jobs waiting for execution; when fewer than m jobs are active, they are required to execute on the fastest processors while the slowest are idled; and higher-priority jobs are executed on faster processors.
A formal verification which guarantees all deadlines in a real-time system would be best. This verification is called a feasibility test. Three different kinds of tests are available:
- Exact tests with long execution times or simple models [11], [12], [13].
- Fast sufficient tests, which fail to accept feasible task sets, especially those with high utilizations [14], [15].
- Approximations, which allow an adjustment of performance and acceptance rate [1], [8].
For many applications an exact test or an approximation with a high acceptance rate must be used. For many task sets a fast sufficient test is adequate.
EDF is an appropriate algorithm for online scheduling on uniform multiprocessors. However, its implementation suffers from a great number of migrations due to the vast fluctuations caused by the finishing or arrival of jobs with relatively nearer deadlines. Task migration cost might be very high; for example, in a loosely coupled system such as a cluster of workstations, a migration is performed so slowly that the overhead resulting from excessive migration may prove unacceptable [3]. Another disadvantage of EDF is that its behavior becomes unpredictable in overloaded situations; the performance of EDF drops in overloaded conditions to the point that it cannot be considered for use. In this paper we present a new approach, called Earliest Feasible Deadline First (EFDF), which is used to reduce the time complexity of the earliest deadline first algorithm under some assumptions.
II. BACKGROUND AND REVIEW OF RELATED WORKS
Each processor in a uniform multiprocessor machine is characterized by a speed or computing capacity, with the interpretation that a job executing on a processor with speed s for t time units completes (s * t) units of execution. The Earliest Deadline First scheduling of real-time systems upon uniform multiprocessor machines is considered. It is known that online algorithms tend to perform very poorly in scheduling such real-time systems on multiprocessors; resource-augmentation techniques are presented here that permit online algorithms in general (EDF in particular) to perform better than may be expected given these inherent limitations.

Generalizing the definition of utilization from periodic tasks to nonperiodic tasks has been studied in [23] and [24]. In deriving the utilization bound for the rate monotonic scheduler with multiframe and general real-time task models, Mok and Chen in [25] and [26] proposed a maximum average utilization which measures utilization in an infinite measuring window. To derive the utilization bound for nonperiodic tasks and multiprocessor systems, the authors in [23] and [24] proposed a utilization definition based on the relative deadlines of tasks instead of their periods. It is shown that EDF scheduling upon uniform multiprocessors is robust with respect to both job execution requirements and processor computing capacity.
III. SCHEDULING ON MULTIPROCESSOR SYSTEM
Meeting the deadlines of a real-time task set in a multiprocessor system requires a scheduling algorithm that determines, for each task in the system, on which processor it must be executed (the allocation problem), and when and in which order, with respect to other tasks, it must start its execution (the scheduling problem). This problem has a difficult solution because (i) some research results for a single processor cannot always be applied to multiple processors [17], [18], (ii) on multiple processors different scheduling anomalies appear [19], [21], [20], and (iii) the solution to the allocation problem requires algorithms with a high computational complexity.
The scheduling of real-time tasks on multiprocessors can be
carried out under the partitioning scheme or under the global
scheme. In the partitioning scheme (Figure 1.a) all the instances
(or jobs) of a task are executed on the same processor. In
contrast, in the global scheme (Figure 1.b), a task can migrate
from one processor to another during the execution of different
instances. Also, an individual job of a task that is preempted
from some processor, may resume execution in a different
processor. Nevertheless, in both schemes parallelism is
prohibited, that is, no job of any task can be executed at the
same time on more than one processor.
On both schemes, the admission control mechanism not
only decides which tasks must be accepted, but also it must
create a feasible allocation of tasks to processors (i.e., on each
processor, all allocated tasks must meet their deadlines). For the partitioning and global schemes, task sets can be scheduled using static or dynamic schedulers. In any case, the computational complexity associated with the admission control must remain as low as possible, especially for the dynamic case.
The partitioning scheme has received greater attention than the global scheme, mainly because the scheduling problem can be reduced to scheduling on single processors, for which a great variety of scheduling algorithms exist.
been proved by Leung and Whitehead [18] that the partitioned
and global approaches to static-priority scheduling on identical
multiprocessors are incomparable in the sense that (i) there are
task sets that are feasible on identical processors under the
partitioned approach but for which no priority assignment
exists which would cause all jobs of all tasks to meet their
deadlines under global scheduling on the same processors,
and (ii) there are task sets that are feasible on identical
processors under the global approach, which cannot be
partitioned into distinct subsets such that each individual
partition is feasible on a single static-priority uniprocessor.
Fig. 1. (a). Partitioning and (b). Global Scheduling Schemes
IV. OUR PROPOSED STRATEGY
We have applied some modifications to the global Earliest Deadline First algorithm to decrease the number of task migrations and also to add predictability to its behavior. In order to decrease the number of migrations we prevent a job from moving to another processor if it is among the m highest-priority jobs. Scheduling algorithms can be classified as static or dynamic. In a static scheduling algorithm, all scheduling decisions are provided a priori. Given a set of timing constraints and a schedulability test, a table is constructed, using one of many possible techniques (e.g., various search techniques), to identify the start and completion times of each task, such that no task misses its deadline. This is a highly predictable approach, but it is static in the sense that when the characteristics of the task set change, the system must be re-started and its scheduling table re-computed.
In a dynamic scheduling algorithm, the scheduling decision
is executed at run-time based on task's priorities. The dynamic
scheduling algorithms can be classified in algorithms with fixed
priorities and algorithms with variable priorities. In the
scheduling algorithms with fixed priorities, the priority of each
task of the system remains static during the complete execution
of the system, whereas in an algorithm with variable priorities
the priority of a task is allowed to change at any moment.
The schedulability test in static scheduling algorithms can only be performed off-line, but in dynamic scheduling algorithms it can be performed off-line or on-line. In the off-line schedulability test, there is complete knowledge of the set of tasks executing in the system, as well as the restrictions imposed on each of the tasks (deadlines, precedence restrictions, execution times), before the start of their execution; no new tasks are allowed to arrive in the system. In our approach, a job will continue its execution on the same processor if possible (processor affinity).
A. The Strategy
In Earliest Deadline First scheduling, at every scheduling
point the task having the shortest deadline is taken up for
scheduling. The basic principle of this algorithm is very
intuitive and simple to understand. The schedulability test for
Earliest Deadline First is also simple. A task set is schedulable under EDF if and only if the total processor utilization due to the task set does not exceed 1. For a set of periodic real-time tasks {T_1, T_2, ..., T_n}, the EDF schedulability criterion can be expressed as:

\sum_{i=1}^{n} u_i = \sum_{i=1}^{n} e_i / p_i \le 1,

where e_i is the execution time and p_i the period of task T_i, u_i = e_i / p_i is the average utilization due to task T_i, and n is the total number of tasks in the set.
EDF has been proven to be an optimal uniprocessor scheduling algorithm [8]. This means that if a set of tasks is unschedulable under EDF, then no other scheduling algorithm can feasibly schedule this task set. In the simple schedulability test for EDF we assumed that the period of each task is the same as its deadline. In practical problems, however, the period of a task may at times be different from its deadline. In such cases, the schedulability test needs to be changed. If p_i > d_i, then each task needs e_i amount of computing time every min(p_i, d_i) duration. Therefore we can write:

\sum_{i=1}^{n} e_i / min(p_i, d_i) \le 1.

However, if p_i < d_i, it is possible that a set of tasks is EDF-schedulable even when it fails to meet this expression.
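A sketch of this extended sufficient test in Python (the tuple layout and function name are ours; the min(p_i, d_i) denominator follows the expression above):

    def edf_sufficient_test(tasks):
        # tasks: iterable of (e_i, p_i, d_i); sufficient condition when
        # periods may differ from relative deadlines
        return sum(e / min(p, d) for e, p, d in tasks) <= 1.0

    print(edf_sufficient_test([(1, 8, 6), (2, 5, 5), (4, 12, 10)]))  # True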
B. Mathematical Representation
Our motivation for exploiting processor affinity derives from the observation that, for many parallel applications, the time spent bringing data into the local memory or cache is a significant source of overhead, ranging between 30% and 60% of the total execution time [3]. While migration is unavoidable in the
global schemes, it is possible to minimize the migration caused by a poor assignment of tasks to processors.

By scheduling a task on the processor whose local memory or cache already contains the necessary data, we can significantly reduce the execution time and thus the overhead of the system. It is worth mentioning that a job might still migrate to another processor when there are two or more jobs that were last executed on the same processor. A migration might also happen when the number of ready jobs becomes less than the number of processors. This means that our proposed algorithm is work conserving.
In order to give the scheduler more predictable behavior, we first perform a feasibility check to see whether a job has a chance to meet its deadline, using some existing algorithm such as Yao's [16]. If so, the job is allowed to execute. Knowing the deadline of a task and its remaining execution time, it is possible to verify whether it has the opportunity to meet its deadline. More precisely, this verification can be done by examining a task's laxity. The laxity of a real-time task T_i at time t, L_i(t), is defined as:

L_i(t) = D_i(t) - E_i(t),

where D_i(t) is the deadline by which the task T_i must be completed and E_i(t) is the amount of computation remaining to be performed. In other words, laxity is a measure of the available flexibility for scheduling a task. A laxity of L_i(t) means that if task T_i is delayed by at most L_i(t) time units, it will still have the opportunity to meet its deadline.
A task with zero laxity must be scheduled right away and executed without preemption, or it will fail to meet its deadline. A negative laxity indicates that the task will miss its deadline no matter when it is picked up for execution. We call this novel approach the Earliest Feasible Deadline First (EFDF).
C. EFDF Scheduling Algorithm
Let m denote the number of processing nodes and n (n ≥ m) the number of available tasks in a uniform parallel real-time system. Let s_1, s_2, ..., s_m denote the computing capacities of the available processing nodes, indexed in a non-increasing manner: s_j ≥ s_{j+1} for all j, 1 ≤ j < m. We assume that all speeds are positive, i.e. s_j > 0 for all j. In this section we present the five steps of the EFDF algorithm. Obviously, each task which is picked up for execution is not considered for execution by other processors. The steps of our new approach are as follows:
1. Perform a feasibility check to identify the tasks which have a chance to meet their deadline and put them into a set A; put the remaining tasks into a set B. The task set can be partitioned by any existing approach.
2. Sort both task sets A and B according to their deadlines in non-descending order, using any existing sorting algorithm. Let k denote the number of tasks in set A, i.e. the number of tasks that have the opportunity to meet their deadline.
3. For each processor j (j ≤ min(k, m)), check whether a task which was last running on the j-th processor is among the first min(k, m) tasks of set A. If so, assign it to the j-th processor. At this point there might be some processors to which no task has been assigned yet.
4. For all j (j ≤ min(k, m)), if no task is assigned to the j-th processor, select the task with the earliest deadline from the remaining tasks of set A and assign it to the j-th processor. If k ≥ m, each processor now has a task to process and the algorithm is finished.
5. If k < m, for all j (k < j ≤ m), assign the task with the smallest deadline from B to the j-th processor. This last step is optional, as all the tasks from B will miss their deadlines.
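The following Python sketch puts the five steps together under simplifying assumptions (unit-speed processors, known remaining execution times); the data layout and helper names are ours, not from the paper.

    # tasks: list of (task_id, deadline, remaining_exec)
    # last_cpu: task_id -> processor index the task last ran on (affinity)
    def efdf_schedule(tasks, m, now, last_cpu):
        # Step 1: feasibility check via laxity L_i(t) = D_i(t) - E_i(t)
        A = [t for t in tasks if t[1] - now - t[2] >= 0]  # can still finish
        B = [t for t in tasks if t[1] - now - t[2] < 0]   # will miss regardless
        # Step 2: sort both sets by deadline, non-descending
        A.sort(key=lambda t: t[1]); B.sort(key=lambda t: t[1])
        k = len(A)
        top = A[:min(k, m)]
        assign = {}
        # Step 3: keep a top-priority job on its previous processor if possible
        for t in list(top):
            j = last_cpu.get(t[0])
            if j is not None and j < min(k, m) and j not in assign:
                assign[j] = t[0]; top.remove(t)
        # Step 4: fill remaining processors with earliest deadlines from A
        for j in range(min(k, m)):
            if j not in assign and top:
                assign[j] = top.pop(0)[0]
        # Step 5 (optional): if k < m, give idle processors tasks from B
        for j in range(k, m):
            if B: assign[j] = B.pop(0)[0]
        return assign  # processor index -> task_id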
D. Experimental Evaluation
We conducted simulation-based experimental studies to
validate our analytical results on EFDF overhead. We consider
an SMP machine with four processors. We consider four tasks
running on the system. Their execution times and periods are
given in Table 2. The total utilization is approximately 1.5,
which is less than 4, the capacity of processors. Therefore,
LLREF can schedule all tasks to meet their deadlines. Note that the maximum utilization in this task set (i.e., max_i {u_i}) is 0.818, but it does not affect the performance of EFDF, as opposed to that of global EDF [22].
Table 2: Task Parameters (4 Task Set)

Process P_i | Execution Time C_i | Period T_i | U_i
P1 | 9 | 11 | 0.818
P2 | 5 | 25 | 0.2
P3 | 3 | 30 | 0.1
P4 | 5 | 14 | 0.357
Figure 1: Scheduler Invocation Frequency with 4 Tasks
In Figure 1, the upper-bound on the scheduler invocation
frequency and the measured frequency are shown as a dotted
line and a fluctuating line, respectively. We observe that the
actual measured frequency respects the upper bound.
Table 3: Task Parameters (8 Task Set)

Process P_i | Execution Time C_i | Period T_i | U_i
P1 | 3 | 7 | 0.429
P2 | 1 | 16 | 0.063
P3 | 5 | 19 | 0.263
P4 | 4 | 5 | 0.8
P5 | 2 | 26 | 0.077
P6 | 15 | 26 | 0.577
P7 | 20 | 29 | 0.69
P8 | 14 | 17 | 0.824
Figure 2: Scheduler Invocation Frequency with 8 Tasks
Figure 2 shows the upper-bound on the invocation
frequency and the actual frequency for the 8-task set.
Consistently with the previous case, the actual frequency never
moves beyond the upper-bound. We also observe that the
average invocation frequencies of the two cases are
approximately 1.0 and 4.0, respectively. As expected the
number of tasks proportionally affects EFDF overhead.
E. Complexity and Performance of the Partitioning
Algorithms
In Table 2 below we compare the standard and simulated complexities of different algorithms against that of our proposed algorithm; the complexity and performance of the partitioning algorithms are introduced. Note that the algorithms with the lowest complexity are RMNF-L&L, RMGT/M, and EDF-NF, while the algorithm with the highest complexity is RBOUND-MP. The rest of the algorithms have complexity O(n log n). The algorithms with the best theoretical performance are RM-FFDU, RMST, RMGT, RMGT/M, EDF-FF and EDF-BF [16].

TABLE 2: COMPLEXITY AND PERFORMANCE OF THE MULTIPROCESSOR PARTITIONING ALGORITHMS
F. Complexity Analysis
The Earliest Deadline First algorithm maintains all tasks that are ready for execution in a queue. Any freshly arriving task is inserted at the end of the queue. Each task insertion is achieved in O(1) (constant) time, but task selection (to run next) and its deletion require O(n) time, where n is the number of tasks in the queue. Alternatively, EDF can maintain all ready tasks in a sorted priority queue implemented as a heap data structure. When a task arrives, a record for it can be inserted into the heap in O(log_2 n) time, where n is the total number of tasks in the priority queue. Therefore, the time complexity of Earliest Deadline First is equal to that of a typical sorting algorithm, which is O(n log_2 n). In EFDF, by contrast, the number of distinct deadlines that tasks in an application can have is restricted.
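A tiny illustration of the heap-backed ready queue described above (our example, using Python's heapq):

    import heapq

    ready = []                            # min-heap ordered by absolute deadline
    heapq.heappush(ready, (25, "T2"))     # O(log n) insertion
    heapq.heappush(ready, (11, "T1"))
    heapq.heappush(ready, (30, "T3"))
    print(heapq.heappop(ready))           # (11, 'T1'): earliest deadline first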
In our approach, whenever a task arrives, its absolute deadline is computed from its release time and its relative deadline. A separate first-in first-out (FIFO) queue is maintained for each distinct relative deadline that a task can have. The scheduler inserts a newly arrived task at the end of the corresponding relative-deadline queue, so the tasks in each queue are ordered according to their absolute deadlines. To find the task with the earliest absolute deadline, the scheduler only needs to search among the heads of the FIFO queues; since the number of queues maintained by the scheduler is restricted, the order of searching is O(1). The time to insert a task is also O(1). So finally the time complexities of the five steps of Earliest Feasible Deadline First (EFDF) are O(n), O(n log_2 n), O(m), O(m), O(m), respectively.
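A minimal sketch of this per-relative-deadline queue structure (the queue deadlines and names are invented for illustration):

    from collections import deque

    # One FIFO queue per distinct relative deadline (a small, fixed set)
    queues = {5: deque(), 10: deque(), 20: deque()}

    def add_task(task_id, release_time, rel_deadline):
        # O(1): append with the precomputed absolute deadline
        queues[rel_deadline].append((release_time + rel_deadline, task_id))

    def pick_earliest():
        # Search only the queue heads; O(1) for a fixed number of queues
        heads = [q[0] for q in queues.values() if q]
        return min(heads) if heads else None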
V. CONCLUSION AND FUTURE WORK
This work focused on some modifications to the global Earliest Deadline First algorithm to decrease the number of task migrations and also to add predictability to its behavior. The Earliest Feasible Deadline First algorithm presented has the least complexity according to the performance analysis. Experimental results show that the Earliest Feasible Deadline First (EFDF) algorithm reduces the time complexity in comparison with the Earliest Deadline First algorithm for real-time scheduling on a multiprocessor system, while performing feasibility checks to identify the tasks which have a chance to meet their deadline.
When Earliest Feasible Deadline First is used to schedule a set of real-time tasks, unacceptably high overheads might have to be incurred to support resource sharing among the tasks without making tasks miss their respective deadlines, which again costs time. Our future research will investigate other lower-complexity algorithms and also reduce the overhead for different priority assignments for global scheduling, which will, consequently, lead to different bounds.
As an advance on this work, in future we intend to work on different deployment approaches by developing stronger and more innovative algorithms to address the time complexity of Earliest Deadline First. Moreover, as our proposed algorithm is a generalized one, we plan to extend our idea to the existing Rate Monotonic algorithm in real-time systems for calculating the minimum time complexity. We also aim to explore further methodologies to implement the concept of this paper in the real world, and to investigate fault-tolerant task scheduling algorithms for finding task dependencies in single-processor or multiprocessor systems, reducing the time taken by faults as well as the risk of fault and damage.
ACKNOWLEDGMENTS
The authors thank the reviewers of drafts of this paper. It is with profound gratitude and immense regard that we acknowledge Dr. S.P. Panda, Chairman, GGI, and Prof. N.V.J. Rao, Dean (Admin), GGI, for their confidence, support and blessing, without which none of this would have been possible. Also a note of thanks to all the professors here in GIET for the wisdom and knowledge they have given us, all of which came together in the making of this paper. We express our gratitude to all our friends and colleagues as well for all their help and guidance.
REFERENCES
[1] S. Baruah, S. Funk, and J. Goossens, Robustness Results Concerning EDF Scheduling upon Uniform Multiprocessors, IEEE Transactions on Computers, Vol. 52, No. 9, pp. 1185-1195, September 2003.
[2] E. P. Markatos and T. J. LeBlanc, Load Balancing versus Locality Management in Shared-Memory Multiprocessors, The 1992 International Conference on Parallel Processing, August 1992.
[3] S. Lauzac, R. Melhem, and D. Mossé, Comparison of Global and Partitioning Schemes for Scheduling Rate Monotonic Tasks on a Multiprocessor, The 10th EUROMICRO Workshop on Real-Time Systems, Berlin, pp. 188-195, June 17-18, 1998.
[4] Vahid Salmani and Mohsen Kahani, Deadline Scheduling with Processor Affinity and Feasibility Check on Uniform Parallel Machines, Seventh International Conference on Computer and Information Technology (CIT), IEEE, 2007.
[5] S. K. Dhall and C. L. Liu, On a real-time scheduling problem, Operations Research, 26(1):127-140, 1978.
[6] Y. Oh and S. Son, Allocating fixed-priority periodic tasks on multiprocessor systems, Real-Time Systems Journal, 9:207-239, 1995.
[7] J. Lehoczky, L. Sha, and Y. Ding, The rate monotonic scheduling algorithm: exact characterization and average case behavior, IEEE Real-Time Systems Symposium, pp. 166-171, 1989.
[8] C. M. Krishna and K. G. Shin, Real-Time Systems, Tata McGraw-Hill, 1997.
[9] S. Chakraborty, S. Künzli, L. Thiele, Approximate Schedulability Analysis, 23rd IEEE Real-Time Systems Symposium (RTSS), IEEE Press, 159-168, 2002.
[10] J.A. Stankovic, M. Spuri, K. Ramamritham, G.C. Buttazzo. Deadline
Scheduling for Real-Time Systems EDF and Related Algorithms. Kluwer
Academic Publishers, 1998.
[11] S. Baruah, D. Chen, S. Gorinsky, A. Mok. Generalized Multiframe
Tasks. The International Journal of Time-Critical Computing Systems,
17, 5-22, 1999.
[12] S. Baruah, A. Mok, L. Rosier. Preemptive Scheduling Hard-Real-Time
Sporadic Tasks on One Processor. Proceedings of the Real- Time
Systems Symposium, 182-190, 1990.
[13] K. Gresser. Echtzeitnachweis Ereignisgesteuerter Realzeitsysteme.
Dissertation (in german), VDI Verlag, Dsseldorf, 10(286), 1993.
[14] M. Devi. An Improved Schedulability Test for Uniprocessor Periodic
Task Systems. Proceedings of the 15th Euromicro Conference on Real-
Time Systems, 2003.
[15] C. Liu, J. Layland. Scheduling Algorithms for Multiprogramming in
Hard Real-Time Environments. Journal of the ACM, 20(1), 46-61, 1973
[16] Omar U. Pereira Zapata, Pedro Meja Alvarez EDF and RM
Multiprocessor Scheduling Algorithms: Survey and Performance
Evaluation Report No. CINVESTAV-CS-RTG-02. CINVESTAV-IPN,
Seccin de Computacin.
[17] S. K. Dhall and C. L. Liu, On a Real-Time Scheduling Problem,
Operation Research, vol. 26, number 1, pp. 127-140, 1978.
[18] J. Y.-T. Leung and J. Whitehead, On the Complexity of Fixed-Priority
Scheduling of Periodic Real-Time Tasks, Performance Evaluation,
number 2, pp. 237-250,1982.
[19] R. L. Graham, Bounds on Multiprocessing Timing Anomalies,SLAM
Journal of Applied Mathematics, 416-429, 1969.
[20] R. Ha and J. Liu, Validating Timing Constraints in Multiprocessor and
Distributed Real-Time Systems, Intl Conf. on Distributed Computing
system, pp. 162-171, June 21-24, 1994.
[21] B. Andersson, Static Priority Scheduling in Multiprocessors, PhD
Thesis, Department of Comp.Eng., Chalmers University, 2003.
[22] J. Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen,An
Optimal Real-Time Scheduling Algorithm for Multiprocessors, IEEE
Conference Proceedings, SIES 2007: 9-16.
[23] B. Anderson,Static-priority scheduling on multiprocessors, PhD
dissertation, Dept. of Computer eng., Chalmers Univ. of
Technology,2003.
[24] T.Abdelzaher and C.Lu,Schedubility Analysis and Utilization Bound of
highly Scalable Real-Time Services, Proc. 15
th
Euro-micro Conf. Real
Time Systems,pp.141-150,july 2003.
[25] A.K.Moc and D.Chen, A General Model for Real Time Tasks,
Technical Report TR-96-24,Dept. of Computer Sciences, Univ.of Texas
at Austin, Oct.1996.
[26] A.K.Moc and D.Chen, A multiframe Model for Real Time
Tasks,IEEE Trans. Software Eng., vol. 23 ,no.10,pp.635-645,Oct 1997.
AUTHORS PROFILE
Jagbeer Singh received a bachelor's degree in Computer Science and Engineering from Dr. B.R.A. University, Agra, Uttar Pradesh (India), in 2000. In 2006 he received a master's degree in Computer Science from the Gandhi Institute of Engineering and Technology, Gunupur, under Biju Patnaik University of Technology, Rourkela, Orissa (India). He has been an Asst. Professor in the Department of Computer Science at the Gandhi Institute of Engineering and Technology, Gunupur, since 2004. His research interests are in real-time systems, on the topic of fault-tolerant task scheduling in single-processor and multiprocessor systems. He has published 3 peer-reviewed and 6 scientific papers, presented 5 research papers at international/national conferences, organized national conferences/workshops, serves as a reviewer for 3 journals, conferences and workshops, and holds membership of professional bodies such as ISTE, CSI and IAENG.
Bichitrananda Patra is an assistant professor in the Department of Information Technology Engineering, Gandhi Institute of Engineering and Technology, Gunupur, Orissa, India. He received his master's degrees in Physics and Computer Science from Utkal University, Bhubaneswar, Orissa, India. His research interests are in soft computing, algorithm analysis, statistics and neural networks. He has published 8 research papers in international journals and conferences, has organized national workshops and conferences, and holds membership of professional bodies such as ISTE and CSI.
Satyendra Prasad Singh holds M.Sc., MCA and Ph.D. degrees in Statistics and has been working as Professor and Head of the Department of MCA, Gandhi Institute of Computer Studies, Gunupur, Rayagada, Orissa, India, since 2007. He worked as a Research Associate in the Defence Research and Development Organisation, Ministry of Defence, Government of India, New Delhi, for 2 years and has also worked in different universities. He received a Young Scientist Award in 2001 from the International Academy of Physical Sciences for the best research paper in CONIAPS-IV, 2001. He has published more than 10 papers in reputed international/national journals and presented 15 papers at international/national conferences in the fields of reliability engineering, cryptology and pattern recognition. He has guided many M.Tech and MCA project theses.
Analysis of Software Reliability Data using
Exponential Power Model
Ashwini Kumar Srivastava
Department of Computer Application,
S.K.P.G. College, Basti, U.P., India
[email protected]
Vijay Kumar
Departments of Mathematics & Statistics,
D.D.U. Gorakhpur University, Gorakhpur, U.P., India
[email protected]
Abstract—In this paper, the Exponential Power (EP) model is proposed for analyzing software reliability data, and the present work attempts to show that it can serve as a software reliability model. The approximate MLE using the Artificial Neural Network (ANN) method and Markov chain Monte Carlo (MCMC) methods are used to estimate the parameters of the EP model. A procedure is developed to estimate the parameters of the EP model using the MCMC simulation method in OpenBUGS by incorporating a module into OpenBUGS. R functions are developed to study the various statistical properties of the proposed model and to analyze the output of the MCMC samples generated from OpenBUGS. A real software reliability data set is considered for illustration of the proposed methodology under an informative set of priors.

Keywords- EP model, Probability density function, Cumulative distribution function, Hazard rate function, Reliability function, Parameter estimation, MLE, Bayesian estimation.
I. INTRODUCTION
Exponential models play a central role in analyses of
lifetime or survival data, in part because of their convenient
statistical theory, their important 'lack of memory' property and
their constant hazard rates. In circumstances where the one-
parameter family of exponential distributions is not sufficiently
broad, a number of wider families such as the gamma, Weibull
and lognormal models are in common use. Adding parameters
to a well-established family of models is a time honoured
device for obtaining more flexible new families of models. The
Exponential Power model is introduced by [14] as a lifetime
model. This model has been discussed by many authors [4], [9]
and [12].
A model is said to be an Exponential Power model with shape parameter α > 0 and scale parameter λ > 0 if the survival function of the model is given by

R(x; α, λ) = exp{1 − e^{(λx)^α}};  (α, λ) > 0, x ∈ (0, ∞).
A. Model Analysis
For α > 0 and λ > 0, the two-parameter Exponential Power model has the distribution function

F(x; α, λ) = 1 − exp{1 − e^{(λx)^α}};  (α, λ) > 0, x ≥ 0.  (1)

The probability density function (pdf) associated with eq. (1) is given by

f(x; α, λ) = α λ^α x^{α−1} e^{(λx)^α} exp{1 − e^{(λx)^α}};  (α, λ) > 0, x ≥ 0.  (2)

We shall write EP(α, λ) to denote the Exponential Power model with parameters α and λ, where α is treated as the shape parameter by [4] and [14]. The R functions dexp.power( ) and pexp.power( ) given in the SoftreliaR package can be used for the computation of the pdf and cdf, respectively.
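For readers without the SoftreliaR package at hand, a minimal R sketch of eqs. (1) and (2) follows; the names dep( ) and pep( ) are illustrative stand-ins for dexp.power( ) and pexp.power( ), and x is assumed positive.

# Sketch of the EP(alpha, lambda) pdf and cdf of eqs. (1)-(2);
# dep()/pep() are stand-ins for dexp.power()/pexp.power().
dep <- function(x, alpha, lambda) {
  z <- (lambda * x)^alpha
  alpha * lambda^alpha * x^(alpha - 1) * exp(z) * exp(1 - exp(z))
}
pep <- function(x, alpha, lambda) {
  1 - exp(1 - exp((lambda * x)^alpha))   # F(x) = 1 - exp{1 - e^[(lambda*x)^alpha]}
}

For example, curve(dep(x, alpha = 2, lambda = 1), 0, 2) reproduces the general shape of one of the densities in Figure 1.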
Some typical EP density functions for different values of α and for λ = 1 are depicted in Figure 1. It is clear from Figure 1 that the density function of the Exponential Power model can take different shapes.

Figure 1. Plots of the probability density function of the Exponential Power model for λ = 1 and different values of α
1) Mode
The mode can be obtained by solving the non-linear equation

(α − 1) + α(λx)^α (1 − e^{(λx)^α}) = 0.  (3)
2) The quantile function
For a continuous distribution F(x), the p-th percentile (also referred to as fractile or quantile) x_p, for a given p, 0 < p < 1, is a number such that

P(X ≤ x_p) = F(x_p) = p.  (4)
The quantiles for p = 0.25 and p = 0.75 are called the first and third quartiles, and the p = 0.50 quantile is called the median (Q2). The five parameters Minimum(x), Q1, Q2, Q3, Maximum(x) are often referred to as the five-number summary of exploratory data analysis. Together, these parameters give a great deal of information about the model in terms of its centre, spread, and skewness. Graphically, the five numbers are often displayed as a boxplot. The quantile function of the Exponential Power model can be obtained by solving

1 − exp{1 − e^{(λ x_p)^α}} = p,

which gives

x_p = (1/λ) [log{1 − log(1 − p)}]^{1/α};  0 < p < 1.  (5)
For the computation of quantiles, the R function qexp.power( ), given in the SoftreliaR package, can be used. In particular, for p = 0.5 we get the median

Median(x_{0.5}) = (1/λ) [log{1 − log(0.5)}]^{1/α}.  (6)
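A direct R transcription of eq. (5) can serve in place of qexp.power( ) when the package is unavailable (the name qep below is hypothetical):

# Quantile function of EP(alpha, lambda), eq. (5); qep() is a
# hypothetical stand-in for qexp.power() from SoftreliaR.
qep <- function(p, alpha, lambda) {
  (log(1 - log(1 - p)))^(1 / alpha) / lambda
}
qep(0.5, alpha = 0.9, lambda = 0.06)   # the median of eq. (6)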
3) The random deviate generation
Let U be a uniform(0, 1) random variable and F(·) a cdf for which F^{−1}(·) exists. Then F^{−1}(u) is a draw from the distribution F(·). Therefore, a random deviate can be generated from EP(α, λ) by

x = (1/λ) [log{1 − log(1 − u)}]^{1/α};  0 < u < 1,  (7)

where u has the U(0, 1) distribution. The R function rexp.power( ), given in the SoftreliaR package, generates random deviates from EP(α, λ).
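Equation (7) is simply the inverse-cdf method; a sketch (the name rep.ep is hypothetical, with runif( ) supplying the U(0, 1) draws, and the parameter values merely illustrative):

# Random deviates from EP(alpha, lambda) by inversion, eq. (7);
# rep.ep() is a hypothetical stand-in for rexp.power().
rep.ep <- function(n, alpha, lambda) {
  u <- runif(n)
  (log(1 - log(1 - u)))^(1 / alpha) / lambda
}
set.seed(1)
x <- rep.ep(86, alpha = 0.9, lambda = 0.06)   # illustrative sample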
4) Reliability function/survival function
The reliability/survival function is

S(x; α, λ) = exp{1 − exp((λx)^α)};  (α, λ) > 0, x ≥ 0.  (8)

The R function sexp.power( ) given in the SoftreliaR package computes the reliability/survival function.
5) The hazard function
The hazard function of the Exponential Power model is given by

h(x; α, λ) = α λ^α x^{α−1} exp{(λx)^α};  (α, λ) > 0, x ≥ 0,  (9)

and the allied R function hexp.power( ) is given in the SoftreliaR package. The shape of h(x) depends on the value of the shape parameter α: when α ≥ 1, the failure rate function is increasing; when α < 1, the failure rate function is of bathtub shape. Thus the shape parameter α plays an important role for the model.

Differentiating equation (9) with respect to x, we have

h′(x) = (h(x)/x) {(α − 1) + α(λx)^α}.  (10)
Setting h′(x) = 0 and simplifying, we obtain the change point

x₀ = (1/λ) ((1 − α)/α)^{1/α}.  (11)

It easily follows that the sign of h′(x) is determined by (α − 1) + α(λx)^α, which is negative for all x < x₀ and positive for all x > x₀.
Figure 2. Plots of the hazard function of the Exponential Power model for λ = 1 and different values of α

Some typical Exponential Power hazard functions for different values of α and for λ = 1 are depicted in Figure 2. It is clear from Figure 2 that the hazard function of the Exponential Power model can take different shapes.
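The hazard shapes of Figure 2 are easy to reproduce; a sketch of eq. (9) and the change point of eq. (11) (hep( ) is a hypothetical stand-in for hexp.power( ), and the parameter values are illustrative):

# Hazard of EP(alpha, lambda), eq. (9), and change point, eq. (11).
hep <- function(x, alpha, lambda) {
  alpha * lambda^alpha * x^(alpha - 1) * exp((lambda * x)^alpha)
}
x0 <- function(alpha, lambda) ((1 - alpha) / alpha)^(1 / alpha) / lambda
curve(hep(x, alpha = 0.5, lambda = 1), from = 0.01, to = 2)  # bathtub shape for alpha < 1
abline(v = x0(0.5, 1), lty = 2)                              # minimum of the bathtub at x0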
6) The cumulative hazard function
The cumulative hazard function, defined as

H(x) = −log{1 − F(x)},  (12)

can be obtained with the help of the pexp.power( ) function given in the SoftreliaR package by choosing the arguments lower.tail = FALSE and log.p = TRUE, i.e.

−pexp.power(x, alpha, lambda, lower.tail = FALSE, log.p = TRUE)
7) Failure rate average (fra) and conditional survival function (crf)
Two other relevant functions useful in reliability analysis are the failure rate average (fra) and the conditional survival function (crf). The failure rate average of X is given by

FRA(x) = H(x)/x,  x > 0,  (13)

where H(x) is the cumulative hazard function.
The survival function (s.f.) and the conditional survival function of X are defined by

R(x) = 1 − F(x)

and

R(x | t) = R(x + t)/R(x),  t > 0, x > 0, R(·) > 0,  (14)

respectively, where F(·) is the cdf of X. Similarly to h(x) and FRA(x), the distribution of X belongs to the new better than used (NBU), exponential, or new worse than used (NWU) class when R(x | t) < R(x), R(x | t) = R(x), or R(x | t) > R(x), respectively.
The R functions hra.exp.power( ) and crf.exp.power( ) given in the SoftreliaR package can be used for the failure rate average (fra) and the conditional survival function (crf), respectively.
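With the pep( ) stand-in defined earlier, eqs. (12)-(14) translate directly (a sketch; SoftreliaR users would call pexp.power( ), hra.exp.power( ) and crf.exp.power( ) instead):

# Cumulative hazard (12), failure rate average (13) and
# conditional survival (14) for EP(alpha, lambda).
Hep <- function(x, alpha, lambda) -log(1 - pep(x, alpha, lambda))
fra <- function(x, alpha, lambda) Hep(x, alpha, lambda) / x
crf <- function(x, t, alpha, lambda) {
  (1 - pep(x + t, alpha, lambda)) / (1 - pep(x, alpha, lambda))  # R(x+t)/R(x)
}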
II. MAXIMUM LIKELIHOOD ESTIMATION AND INFORMATION MATRIX
Let x = (x₁, …, x_n) be a sample from a distribution with cumulative distribution function (1). The log-likelihood function log L(α, λ) is given by

log L(α, λ) = n log α + nα log λ + (α − 1) Σᵢ log xᵢ + Σᵢ (λxᵢ)^α + n − Σᵢ exp{(λxᵢ)^α},  (15)

where all sums run over i = 1, …, n.
Therefore, to obtain the MLEs of α and λ we can maximize eq. (15) directly with respect to α and λ, or we can solve the following two non-linear equations using an iterative procedure [2], [4]:

∂log L/∂α = n/α + n log λ + Σᵢ log xᵢ + Σᵢ (λxᵢ)^α log(λxᵢ) [1 − exp{(λxᵢ)^α}] = 0,  (16)

∂log L/∂λ = nα/λ + (α/λ) Σᵢ (λxᵢ)^α [1 − exp{(λxᵢ)^α}] = 0.  (17)
Let θ̂ = (α̂, λ̂) denote the MLE of θ = (α, λ). It is not possible to obtain the exact variances of θ̂, but the asymptotic variances of θ̂ can be obtained from the following asymptotic property of the MLE:

(θ̂ − θ) → N₂(0, I(θ)⁻¹),  (18)

where I(θ) is Fisher's information matrix, given by

I(θ) = − [ E(∂²ln L/∂α²)    E(∂²ln L/∂α∂λ)
           E(∂²ln L/∂λ∂α)   E(∂²ln L/∂λ²) ].  (19)
In practice, the asymptotic variance I(θ)⁻¹ of the MLE cannot be used directly because we do not know θ. Hence, we approximate the asymptotic variance by plugging in the estimated values of the parameters. The common procedure is to use the observed Fisher information matrix O(θ̂) (as an estimate of the information matrix I(θ)), given by

O(θ̂) = − [ ∂²ln L/∂α²    ∂²ln L/∂α∂λ
            ∂²ln L/∂λ∂α   ∂²ln L/∂λ² ] |_{θ = θ̂} = −H(θ)|_{θ = θ̂},  (20)

where H is the Hessian matrix, θ = (α, λ) and θ̂ = (α̂, λ̂). The observed Fisher information is evaluated at the MLE rather than by taking the expectation of the Hessian over the data; it is simply the negative of the Hessian of the log-likelihood at the MLE. If the Newton-Raphson algorithm is used to maximize the likelihood, then the observed information matrix can easily be calculated. Therefore, the variance-covariance matrix is given by

[ Var(α̂)       cov(α̂, λ̂)
  cov(α̂, λ̂)   Var(λ̂) ] = (−H(θ)|_{θ = θ̂})⁻¹.  (21)
Hence, from the asymptotic normality of MLEs, approximate 100(1 − γ)% confidence intervals for α and λ can be constructed as

α̂ ± z_{γ/2} √Var(α̂)   and   λ̂ ± z_{γ/2} √Var(λ̂),  (22)

where z_{γ/2} is the upper (γ/2)-th percentile of the standard normal variate.
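The whole of Section II can be carried out numerically with optim( ); the sketch below maximizes eq. (15), inverts the observed information for eq. (21), and forms approximate 95% intervals per eq. (22). The data vector x is assumed to be already loaded, and the initial values follow the ones used later in the paper.

# MLE of EP(alpha, lambda): minimize the negative log-likelihood of eq. (15).
negll <- function(par, x) {
  a <- par[1]; l <- par[2]
  if (a <= 0 || l <= 0) return(Inf)   # keep the search inside the parameter space
  z <- (l * x)^a
  n <- length(x)
  -(n * log(a) + n * a * log(l) + (a - 1) * sum(log(x)) + sum(z) + n - sum(exp(z)))
}
fit <- optim(c(0.5, 0.06), negll, x = x, hessian = TRUE)
vc  <- solve(fit$hessian)             # variance-covariance matrix, eq. (21)
se  <- sqrt(diag(vc))
cbind(est = fit$par, lower = fit$par - 1.96 * se, upper = fit$par + 1.96 * se)  # eq. (22)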
III. BAYESIAN ESTIMATION IN OPENBUGS
The most widely used piece of software for applied Bayesian inference is OpenBUGS, a fully extensible modular framework for constructing and analyzing Bayesian full probability models. This open-source software requires the incorporation of a module (code) to estimate the parameters of the Exponential Power model. A module, dexp.power_T(alpha, lambda), written in Component Pascal, enables full Bayesian analysis of the Exponential Power model in OpenBUGS using the method described in [15] and [16].
A. Implementation of the Module dexp.power_T(alpha, lambda)
The developed module is implemented to obtain the Bayes estimates of the Exponential Power model using the MCMC method. The main function of the module is to generate an MCMC sample from the posterior distribution under an informative set of priors, i.e. Gamma priors.
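The module itself is written in Component Pascal and is not reproduced here; purely to illustrate the idea of MCMC sampling from the EP posterior under Gamma priors, a random-walk Metropolis sketch in R follows (it reuses the negll( ) function above; the hyperparameters and step sizes are arbitrary choices, not the authors' settings).

# Illustrative random-walk Metropolis sampler; NOT the OpenBUGS module.
logpost <- function(th, x, a0 = 1, b0 = 1) {
  if (any(th <= 0)) return(-Inf)
  -negll(th, x) + sum(dgamma(th, shape = a0, rate = b0, log = TRUE))
}
mcmc <- function(x, n.iter = 5000, th = c(0.5, 0.06), step = c(0.05, 0.005)) {
  out <- matrix(NA, n.iter, 2, dimnames = list(NULL, c("alpha", "lambda")))
  lp  <- logpost(th, x)
  for (i in seq_len(n.iter)) {
    prop <- th + rnorm(2, 0, step)    # random-walk proposal
    lpp  <- logpost(prop, x)
    if (log(runif(1)) < lpp - lp) { th <- prop; lp <- lpp }
    out[i, ] <- th
  }
  out
}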
1) Data Analysis
The software reliability data set SYS2.DAT, consisting of 86 time-between-failures observations [10], is considered for illustration of the proposed methodology. In this real data set, time-between-failures is converted to time-to-failures and scaled.
B. Computation of the MLE and Approximate ML Estimates using ANN
The Exponential Power model is used to fit this data set. We started the iterative procedure by maximizing the log-likelihood function given in eq. (15) directly, with initial guesses α = 0.5 and λ = 0.06, far away from the solution. We used the optim( ) function in R with the Newton-Raphson option. The iterative process stopped after only 7 iterations. We obtain α̂ = 0.905868898,
Cᵢ = Σ_{n=1}^{m} Pₙ Wₙ,  (1)

where Cᵢ = i-th main attribute, m = number of sub-factors in the i-th attribute, Pₙ = fuzzy value of the n-th input parameter, and Wₙ = expert weight of the relative input parameter.
TABLE IV. EXAMPLES OF INPUT CASES
Three different examples as input cases to the Fuzzy Expert System.

All sub-factors (weight) | Case A | Case B | Case C
Proficiency in teaching (0.0054) | H | M | L
Personal interest in teaching (0.0075) | H | M | L
Presentation & comm. skills (0.0063) | VH | H | L
Speaking style & body lang. (0.0050) | M | L | M
Content knowledge (0.0059) | M | L | M
Lecture preparation (0.0067) | H | M | H
Language command (0.0059) | VH | H | VH
Response to student queries (0.0067) | VH | VH | H
Question tackling (0.0050) | M | L | M
Courses taught (nature) (0.0038) | M | M | L
Students' performance (0.0029) | VH | M | M
Work load (0.0046) | H | H | H
Fairness in marking (0.0071) | M | M | M
The above input data (Table IV) are entered into the Fuzzy Expert System through a built-in interface for computing the decision score, as shown in Figure 3. The numbers 5, 4, 3, 2, 1 are entered as input, representing 5 = Very High, 4 = High, 3 = Medium, 2 = Low, 1 = Very Low. Figure 3 also shows two interface buttons, Explain and Why, which provide explanation of inputs and reasoning capabilities, respectively. After completion of the input data, the Fuzzy Expert System uses the scale in Table V to rank the three cases A, B and C.
TABLE V. DECISION-MAKING SCALE TO A LINGUISTIC DESCRIPTION (MAX WEIGHT 0.0729)

Fuzzy Expert System output | Linguistic description
X < 0.0109 | Poor
0.0109 ≤ X < 0.0218 | Satisfied
0.0218 ≤ X < 0.0328 | Good
0.0328 ≤ X < 0.0437 | Very Good
0.0437 ≤ X < 0.0546 | Excellent
X ≥ 0.0546 | Outstanding
According to the scale developed in Table V, the Fuzzy Expert System mapped the calculated numeric results of the three cases from qualitative input data into linguistic output descriptions, as shown in Table VI.
Fig. 2 Fuzzy Expert System Model
Fig. 3 User Interface
TABLE VI
THREE CASES RANKING
Case System Calculation Description
A 0.0573 Outstanding
B 0.0465 Excellent
C 0.0451 Excellent
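The scoring mechanics of eq. (1) and the mapping of Table V are easy to replicate; the R sketch below uses the Table IV weights together with an assumed VH/H/M/L/VL membership scale of 1/0.75/0.5/0.25/0. Because the system's internal membership scale is not fully specified in this fragment, the sketch need not reproduce the exact figures of Table VI.

# Weighted fuzzy score, eq. (1), and linguistic mapping per Table V.
fuzzy <- c(VH = 1, H = 0.75, M = 0.5, L = 0.25, VL = 0)   # assumed memberships
w     <- c(0.0054, 0.0075, 0.0063, 0.0050, 0.0059, 0.0067, 0.0059,
           0.0067, 0.0050, 0.0038, 0.0029, 0.0046, 0.0071)  # Table IV weights
caseA <- c("H","H","VH","M","M","H","VH","VH","M","M","VH","H","M")
score <- sum(w * fuzzy[caseA])                               # eq. (1)
cut(score, c(-Inf, 0.0109, 0.0218, 0.0328, 0.0437, 0.0546, Inf),
    labels = c("Poor","Satisfied","Good","Very Good","Excellent","Outstanding"))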
(Fig. 2, referenced above, depicts the fuzzy expert system model: knowledge engineers elicit a set of attributes from a pool of highly qualified and experienced subject experts, using a questionnaire as the knowledge-collection tool; the extracted knowledge is represented as fuzzy rules; fuzzy logic handles the linguistic mapping; and an inference engine with an explanation facility draws conclusions, while users supply inputs and receive results through the user interface. Input sources include ACR, students, HoD, colleagues, teachers and personal data.)
V. CONCLUSION & FUTURE DIRECTION
Regular teachers' assessment is suggested to maintain quality in higher education, and the literature clearly depicts that there is vast potential for applications of fuzzy logic and expert systems in teachers' assessment. Expert system technology using fuzzy logic is very well suited to the evaluation of qualitative facts. A model of a fuzzy expert system is proposed to evaluate teachers' performance on the basis of various key performance attributes that have been validated previously through subject experts. The fuzzy scale has been designed to map and control the input data values from absolute truth to absolute falsehood. The qualitative variables are mapped into numeric results by implementing the fuzzy expert system model through various input examples, providing a basis for using the system's ranking in further decision making. Thus, the uncertain and qualitative knowledge of the problem domain has been handled effectively through the integration of expert system technology with the fuzzy logic concept.

The proposed model produced a significant basis for performance assessment and adequate support in decision making, so research on this issue can be continued. An important aspect that future work could focus on is extending the fuzzy expert system model to the assessment of all types of employees, in universities as well as in other government and private organizations.
AUTHORS PROFILE
Dr. Abdur Rashid Khan is presently working as Associate Professor at the Institute of Computing & Information Technology, Gomal University, Dera Ismail Khan, Pakistan. He completed his PhD in Computer Science in the Kyrgyz Republic in 2004 and has published a number of articles in national and international journals. His current interests include Expert Systems, Software Engineering, Management Information Systems, and Decision Support Systems.

Hafeez Ullah Amin
Mr. Hafeez is a research student at the Institute of Information Technology, Kohat University of Science & Technology, Kohat 26000, KPK, Pakistan. He completed a BS (Hons) in Information Technology and an MS in Computer Science in 2006 and 2009, respectively, from the above-cited institution. His current research interests include Artificial Intelligence, Information Systems, and Databases.

Zia ur Rehman
Mr. Zia ur Rehman is currently working as a Lecturer in Computer Science at Fauji Foundation School & College, Kohat, Pakistan. He completed his MS in Computer Science at the Institute of Information Technology, Kohat University of Science & Technology, Kohat, KPK, Pakistan. His current research interests include Software Engineering, Expert System Development, and Information Systems.
Dynamic Approach to Enhance Performance of
Orthogonal Frequency Division Multiplexing
(OFDM) In a Wireless Communication Network
James Agajo (M.Eng.)¹, Isaac O. Avazi Omeiza (Ph.D.)², Idigo Victor Eze (Ph.D.)³, Okhaifoh Joseph (M.Eng.)⁴
¹ Dept. of Electrical and Electronic Engineering, Federal Polytechnic, Auchi, Edo State, Nigeria
² Dept. of Electrical and Electronics, University of Abuja, Nigeria
³ Dept. of Electronics/Computer Engineering, Nnamdi Azikiwe University, Awka, Anambra State, Nigeria
⁴ Dept. of Electrical and Electronic Engineering, Federal University of Petroleum Resources, Warri, Delta State, Nigeria
Email: [email protected]
Abstract—In the mobile radio environment, signals are usually impaired by fading and multipath delay. In such channels, severe fading of the signal amplitude and inter-symbol interference (ISI) due to the frequency selectivity of the channel cause an unacceptable degradation of error performance. Orthogonal frequency division multiplexing (OFDM) is an efficient scheme to mitigate the effect of the multipath channel. This work models and simulates OFDM in a wireless environment and illustrates adaptive modulation and coding over a dispersive multipath fading channel, with the simulation varying the result dynamically. The dynamic approach entails adopting a probabilistic approach to determining channel allocation. First, an OFDM network environment is modeled to give a clear picture of the OFDM concept. Next, disturbances such as noise are deliberately introduced into systems that are both OFDM-modulated and non-OFDM-modulated to see how the system reacts; this enables comparison of the effect of noise on OFDM and non-OFDM-modulated signals. Finally, efforts are made, using digital encoding schemes such as QAM and DPSK, to reduce the effects of such disturbances on the transmitted signals.

Keywords- OFDM, Inter-Carrier Interference, IFFT, multipath, Signal.
I. INTRODUCTION
Mobile radio communication systems are increasingly demanded to provide a variety of high-quality services to mobile users. To meet this demand, a modern mobile radio transceiver system must be able to support high capacity, variable-bit-rate information transmission and high bandwidth efficiency. In the mobile radio environment, signals are usually impaired by fading and multipath delay. In such channels, severe fading of the signal amplitude and inter-symbol interference (ISI) due to the frequency selectivity of the channel cause an unacceptable degradation of error performance. Orthogonal frequency division multiplexing (OFDM) is an efficient scheme to mitigate the effect of the multipath channel, since it eliminates ISI by inserting a guard interval (GI) longer than the delay spread of the channel [1], [2]. Therefore, OFDM is generally known as an effective technique for high-data-rate services. Moreover, OFDM has been chosen for several broadband WLAN standards such as IEEE 802.11a, IEEE 802.11g and European HIPERLAN/2, and for terrestrial digital audio broadcasting (DAB) and digital video broadcasting (DVB); it was also proposed for broadband wireless multiple-access systems such as the IEEE 802.16 wireless MAN standard and interactive DVB-T [3]. In OFDM systems, pilot-signal-averaging channel estimation is generally used to identify the channel state information (CSI) [5]. In this case, many pilot symbols are required to obtain an accurate CSI; as a result, the total transmission rate is degraded due to the transmission of a large number of pilot symbols. Recently, carrier interferometry (CI) has been proposed to identify the CSI of multiple-input multiple-output (MIMO) systems. However, CI uses only one phase-shifted pilot signal to distinguish all the CSI for the combinations of transmitter and receiver antenna elements [3], [4]. In this case, without noise whitening, each detected channel impulse response is affected by noise [6]. Therefore, the pilot-signal-averaging process is necessary for improving the accuracy of the CSI [7]. To reduce this problem, time-frequency interferometry (TFI) for OFDM has been proposed [8]-[10].
The main problem with the reception of radio signals is fading caused by multipath propagation; inter-symbol interference (ISI), shadowing, etc. also make link quality vary. As a result of multipath propagation, there are many reflected signals, which arrive at the receiver at different times. Some of these reflections can be avoided by using a directional antenna, but that is impractical for a mobile user. A solution could be the use of antenna arrays, but this technology is still being developed. This is why research and development of OFDM have received considerable attention and have made a great deal of progress in all parts of the world. OFDM is a wideband modulation scheme that is specifically able to cope with the problems of multipath reception; this is achieved by transmitting many narrowband overlapping digital signals in parallel inside one wide band. [5]
A. Objective of Project
The aim of this project is to simulate the physical layer of
an OFDM system. It investigates the OFDM system as a whole
and provides a simple working model on which subsequent
research can be built. Hence the successful completion of this
work shall involve;
1) Practical description of the OFDM system.
2) Algorithm development based on mathematical analysis
of the OFDM scheme.
3) Modeling of the algorithm and a software test based on
the MATLAB/Simulink environment.
Thus a typical OFDM system modeling the data source, the transmitter, the air channel and the receiver side of the system is simulated. This project is intended to model and simulate an OFDM network environment. A simple data source is provided to serve as the input; likewise, the transmitter, channel and receiver are modeled using appropriate block-sets in Simulink. A simple representation of the OFDM system is modeled, though with slight deviations from a real implementation; every effort has been made in this work to reduce the effects of such deviations. [6]
B. OVERVIEW
Orthogonal frequency division is a scheme in which the spacing between carriers is equal to the speed (bit rate) of the message. In earlier multiplexing literature, a multiplexer was primarily used to allow many users to share a communications medium, such as a phone trunk between two telephone central offices. In OFDM it is typical to assign all carriers to a single user; hence multiplexing is not used in its generic meaning. Orthogonal frequency division multiplexing is then the concept of establishing a communications link using a multitude of carriers, each carrying an amount of information identical to the separation between the carriers. In comparing OFDM and single-carrier communication systems (SCCM), the total speed in bits per second is the same for both, 1 Mbit/s (Mbps) in this example. For single-carrier systems, there is one carrier frequency, and the 1 Mbps message is modulated onto this carrier, resulting in a 1 MHz bandwidth spread on both sides of the carrier. For OFDM, the 1,000,000 bps message is split into 10 separate messages of 100,000 bit/s each, each with a 100 kHz bandwidth spread on both sides of its carrier. [7]
To illustrate how frequencies change with time, we can use the analogy of the sounds of an orchestra or band. One carrier wave is analogous to one instrument playing one note, whereas many carriers are analogous to many instruments playing at once. A single-carrier system using a high-speed message is analogous to a drum roll where the sticks are moving fast. A more detailed understanding of orthogonality arises when we observe that the bandwidth of a modulated carrier has a so-called sinc shape (sin x / x) with nulls spaced by the bit rate. In OFDM, the carriers are spaced at the bit rate, so that each carrier fits in the nulls of the other carriers.
II. CHOICE OF APPROACH
The bottom-up design approach is chosen for this work
because of its concise form and ease of explanation.
A. Modeling the OFDM system
For the simulation, the Signal Processing and the
Communication Block-sets are used. The OFDM network can
be divided into three parts i.e. the transmitter, receiver and
channel. A data source is also provided which supplies the
signal to be transmitted in the network. Thereafter, the bit error
rate can be calculated by comparing the original signal at the
input of the transmitter and the signal at the output of the
receiver.
Transmitter
Convolutional encoder. In order to decrease the error rate of the system, a simple convolutional encoder of rate 1/2 is used for channel coding.
Interleaver.
The interleaver rearranges input data such that consecutive
data are split among different blocks. This is done to avoid
bursts of errors. An interleaver is presented as a matrix. The
stream of bits fills the matrix row by row. Then, the bits leave
the matrix column by column. The depth of interleaver can be
adjusted.
Modulation.
A modulator transforms a set of bits into a complex number corresponding to a signal constellation. The modulation order depends on the subcarrier: a subchannel with high SNR will be assigned more bits than a subchannel with low SNR. The modulations implemented here are QPSK, 16-QAM and 64-QAM.
Symmetrical IFFT. Data are transformed into time-
domain using IFFT. The total number of subcarriers
translates into the number of points of the IFFT/FFT. A
mirror operation is performed before IFFT in order to get
real symbols as output
Cyclic Prefix (CP). To preserve the orthogonality
property over the duration of the useful part of signal, a
cyclic prefix is then added. The cyclic prefix is a copy of
the last elements of the frame.
D/A. Converts digital symbols to analog signals. This operation is done using the AIC codec inside the DSP.
Channel [8]
The channel must have the same characteristics as the pair of twisted wires found in the telephone network. In order to achieve this, we use telephone line emulation hardware. We also have the possibility of using the adjustable filter ZePo and the noise generator, which can be very useful for testing system performance.
Fig. 2.0 OFDM system
B. Receiver
A/D. Convert analog signals to digital symbols for
processing.
Synchronization. Due to the clock difference between
transmitter and receiver, a synchronization algorithm is
needed to find the first sample in the OFDM frame.
Remove cyclic prefix. This block simply removes the
cyclic prefix added in the transmitter.
Symmetrical FFT. Data are transformed back to
frequency-domain using FFT. Then the
complex conjugate mirror added in the transmitter is
removed.
Channel estimation. The estimation is achieved by pilot
frames.
Channel compensation. The channel estimation is used
to compensate for channel distortion.
Bit loading. The receiver computes the bit allocation and sends it to the transmitter. [9]
Demodulation. Symbols are transformed back into bits. The inverse of the estimated channel response is used to compensate the channel gain.
Deinterleaver (interleaving inverse operation). The stream of bits fills the matrix column by column; then the bits leave the matrix row by row.
Convolutional decoder. The decoder performs the Viterbi decoding algorithm to recover the transmitted bits from the coded bits.
We assume that the propagation channel consists of L discrete paths with different time delays. The impulse response h(τ, t) is represented as

h(τ, t) = Σ_{l=0}^{L−1} h_l(t) δ(τ − τ_l),  (2.1)

where h_l and τ_l are the complex channel gain and the time delay of the l-th propagation path, respectively. The channel transfer function H(f, t) is the Fourier transform of h(τ, t) and is given by

H(f, t) = ∫₀^∞ h(τ, t) exp(−j2πfτ) dτ  (2.2)
        = Σ_{l=0}^{L−1} h_l(t) exp(−j2πf τ_l).  (2.3)

The transmission pulse g(t) is given by

g(t) = 1 for −T_g < t < T_s, and 0 otherwise.  (2.4)

The guard interval T_g is inserted in order to eliminate the ISI due to multipath fading, and hence we have

T = T_s + T_g.  (2.5)

In OFDM systems, T_g is generally taken as T_s/4 or T_s/5; thus we assume T_g = T_s/4 in this paper.
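Although the simulations in this paper are built in MATLAB/Simulink, the frequency selectivity implied by eqs. (2.1)-(2.3) is easy to visualize; a short R sketch, consistent with the other listings in this issue, with made-up path gains and delays:

# H(f) = sum_l h_l * exp(-j*2*pi*f*tau_l), eq. (2.3), for an
# illustrative 3-path channel (gains and delays are invented).
h   <- c(1 + 0i, 0.5 - 0.3i, 0.2 + 0.1i)   # complex path gains h_l
tau <- c(0, 1e-6, 2.5e-6)                  # path delays tau_l (seconds)
Hf  <- function(f) sapply(f, function(ff) sum(h * exp(-2i * pi * ff * tau)))
f   <- seq(0, 2e6, length.out = 512)
plot(f, Mod(Hf(f)), type = "l", xlab = "f (Hz)", ylab = "|H(f)|")  # fading dips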
C. The Simulation Process
The simulation process is carried out in stages using different digital modulators and considering different effects of the wireless interface. Hence, the wireless channel effects are varied, as is the type of digital modulator used for the OFDM modulation. The effect of additive white Gaussian noise (AWGN) is considered on a signal which is QAM-modulated. Finally, the combined effect of phase noise and AWGN on a QAM-modulated OFDM signal is modeled and simulated. This enables comparison of the effect of noise on OFDM signals using QAM modulation. [11]
D. Mathematical analysis of the OFDM system
This system compares the error rates of an OFDM-modulated signal with those of a non-OFDM-modulated signal; the error rates of the OFDM-modulated signal are expected to be lower. Each carrier can be written as

s_c(t) = A_c(t) cos[ω_c t + φ_c(t)],  (2.6)

so that an OFDM signal consisting of N carriers is

s_s(t) = (1/N) Σ_{n=0}^{N−1} A_n(t) cos[ω_n t + φ_n(t)],  where ω_n = ω_0 + nΔω.  (2.7)

This is of course a continuous signal. If we consider the waveforms of each component of the signal over one symbol period, then the variables A_c(t) and φ_c(t) take on fixed values, which depend on the frequency of that particular carrier; describing each carrier as a complex wave [3], the signal can be rewritten as

s_s(t) = (1/N) Σ_{n=0}^{N−1} A_n e^{j(ω_n t + φ_n)}.  (2.8)

Fig 2.1 Examples of OFDM spectrum: (a) a single sub-channel; (b) five carriers

If the signal is sampled using a sampling frequency of 1/T, then the resulting signal is represented by

s_s(kT) = (1/N) Σ_{n=0}^{N−1} A_n e^{j[(ω_0 + nΔω)kT + φ_n]}.  (2.9)

At this point, we have restricted the time over which we analyze the signal to N samples. It is convenient to sample over the period of one data symbol; thus we have the relationship τ = NT. If we now simplify eqn. (2.9), without loss of generality, by letting ω_0 = 0, the signal becomes

s_s(kT) = (1/N) Σ_{n=0}^{N−1} A_n e^{jφ_n} e^{j nΔω kT}.  (2.10)

Now eq. (2.10) can be compared with the general form of the inverse Fourier transform,

g(kT) = (1/N) Σ_{n=0}^{N−1} G(n/(NT)) e^{j2πnk/N}.  (2.11)

In eqs. (2.10) and (2.11), the function A_n e^{jφ_n} is no more than a definition of the signal in the sampled frequency domain, and s_s(kT) is the time-domain representation. Eqns. (2.10) and (2.11) are equivalent if

Δf = Δω/(2π) = 1/(NT) = 1/τ.  (2.12)
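The equivalence of eqs. (2.10) and (2.11) is precisely what allows OFDM modulation to be performed with an IFFT and undone with an FFT; a minimal round-trip sketch (in R, where fft(..., inverse = TRUE) must be divided by N to match eq. (2.11); the subcarrier symbols are arbitrary unit-amplitude phasors):

# OFDM modulation/demodulation via the (I)DFT, eqs. (2.10)-(2.11).
N  <- 8
X  <- exp(1i * 2 * pi * runif(N))      # A_n e^{j phi_n}: one symbol per subcarrier
x  <- fft(X, inverse = TRUE) / N       # time-domain OFDM symbol (IDFT)
X2 <- fft(x)                           # receiver FFT recovers the subcarrier symbols
max(Mod(X2 - X))                       # ~0, up to floating-point error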
E. Factors influencing the control system
1) Signal-to-noise ratio
AWGN is additive, which means that the noise signal adds to the existing signal, resulting in a distorted version of the original signal. It is possible to determine the quality of a digitally modulated signal influenced by AWGN using the probability density function and the standard deviation of the signal. The signal-to-noise ratio is defined as the ratio [4] of the signal power to the noise power,

SNR = (S/N)_power = (S_rms(t, f_c) / n_rms(t, f_c))²,  (2.13)

or, in decibels,

SNR_dB = 10 log₁₀(SNR).  (2.14)
F. Probability of error in QPSK modulation
Because of the randomness of AWGN, it is impossible to predict the exact locations of incorrectly decoded bits; it is, however, possible to theoretically predict the proportion of incorrectly decoded bits in the long run and, from that, to calculate error probabilities such as the symbol error rate and the bit error rate.

QPSK encodes two data bits into a sinusoidal carrier wave by altering the carrier wave's phase. The probability that a QPSK decoder will incorrectly decode a symbol as U₀₀, given that the correct transmitted symbol was S₁₀, is the Gaussian tail integral [5]

P(U₀₀ | S₁₀) = ∫_{A/√2}^{∞} (1/√(2π)) exp(−x²/2) dx.  (2.15)

The QPSK signal-to-noise ratio in decibels is

SNR_QPSK-dB = 10 log₁₀[(QPSK signal power)/(noise power)].  (2.16)
Unfortunately, this integral is not directly solvable, and look-up tables are used to determine the results. There exists a function, though, that is closely related to it: the Q-function [5]. The total symbol error rate of a QPSK decoder can finally be calculated as the average of the conditional symbol error probabilities P(U₀₀ | S₁₀), P(U₀₁ | S₀₁), P(U₁₀ | S₁₀) and P(U₁₁ | S₁₀).  (2.16.1)

The SER is given as

SER = 2Q(A/√2),  (2.17)

and the bit error rate of a QPSK decoder is given as

BER = Q(A/√2).  (2.18)

Expressing the SER and BER as functions of the SNR, we have

SER = 2Q(√SNR),  (2.19)

BER = Q(√SNR).  (2.20)

From the relationships in eqns. (2.19) and (2.20), the BER and SER of a QPSK-modulated signal are tabulated in Table 2.1 below.
Figure 2.2 shows the state transition diagram for the model of a cell operating under the proposed algorithm.

Since the IEEE 802.11a OFDM signal has N_st = 52 QPSK subcarriers (2.22), the signal has 52 times more power, so that

SNR_OFDM-dB = 10 log₁₀[52 × (QPSK signal power)/(noise power)]
            = 10 log₁₀[(QPSK signal power)/(noise power)] + 10 log₁₀(52).  (2.21)

TABLE 2.1: THE OFDM SYMBOL DECODING PROCESS
SNR | BER | SER
2 | .11 | .05
4 | .33 | .28
6 | .55 | .45
8 | .69 | .63
10 | .75 | .70
12 | .80 | .74
14 | .85 | .76
16 | .89 | .79
18 | .91 | .80
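Equations (2.19) and (2.20) can also be evaluated directly (a sketch; Q(x) is computed here as the upper tail of the standard normal distribution, SNR is taken as a linear power ratio, and the computed error rates decrease as the SNR grows):

# SER and BER of QPSK versus SNR via the Q-function, eqs. (2.19)-(2.20).
Q   <- function(x) pnorm(x, lower.tail = FALSE)   # Gaussian upper-tail probability
snr <- 10^(seq(0, 18, 2) / 10)                    # 0..18 dB as linear ratios
data.frame(snr.db = seq(0, 18, 2), ser = 2 * Q(sqrt(snr)), ber = Q(sqrt(snr)))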
SNR_OFDM-dB = SNR_QPSK-dB + 10 log₁₀(52) dB ≈ SNR_QPSK-dB + 17.2 dB.  (2.23)
G. STATE TRANSITION DIAGRAM
A state transition diagram is a very useful pictorial representation that clearly shows the operation of a protocol (rule). Thus, it is a tool for designing the electronics that implement the protocol and for troubleshooting communication problems. In a state diagram, all possible activity states of the system are shown as nodes. At each state node, the system must respond to some event occurring and then proceed to the appropriate next state. [9]
H. Probabilistic Channel Allocation in OFDM
For a cell having S channels, the model has 2S + 1 states, namely 0, 1, 2, ..., S, I₀, I₁, ..., I_{S−1}. A cell is defined as a cold cell if it is in state K, for 0 ≤ K ≤ n, whereas a cell is called a hot cell if it is in a state greater than n. If a data call with rate λ_d (respectively, a voice call with rate λ_v) arrives in a cold cell in state K, then the cell enters state I_K (respectively, state K + 1). On the other hand, if a data call with rate λ_m or a new voice call with rate λ_vn arrives in a hot cell in state j, for j > n, then the cell enters state I_j. A handoff voice call is always assigned a whole channel; a new voice call, however, is assigned a whole channel only if the cell is a cold cell.

Let P_j denote the steady-state probability that the process is in state j, for j = 0, 1, 2, ..., S, assuming that all channel holding times are exponentially distributed.

For j = 0 (i.e., states 0 and I₀):

(λ_d + λ_v) P₀ = (μ_d + μ_v) P₁,  (2.24)

λ_d P₀ = μ₀ P_{I₀}.  (2.25)

It follows from (2.24) that

P₁ = [(λ_d + λ_v)/(μ_v + μ_d)] P₀ = ρ P₀.  (2.26)

For j = 1, 2, ..., n:

(λ_d + λ_v) P_j + j (μ_v + μ_d) P_j = λ_v P_{j−1} + (j + 1)(μ_v + μ_d) P_{j+1} + μ_{j−1} P_{I_{j−1}},  (2.27)

λ_d P_j = μ_j P_{I_j}.  (2.28)

Eq. (2.28) implies that λ_d P_{j−1} can be substituted for μ_{j−1} P_{I_{j−1}} in (2.27). Hence, eq. (2.27) can be rewritten as

(λ_d + λ_v) P_j + j (μ_v + μ_d) P_j = (λ_d + λ_v) P_{j−1} + (j + 1)(μ_d + μ_v) P_{j+1}.

Solving this recursively for j = 1, 2, 3, ..., n − 1 yields P₂ = (1/2!) ρ² P₀, P₃ = (1/3!) ρ³ P₀, P₄ = (1/4!) ρ⁴ P₀, P₅ = (1/5!) ρ⁵ P₀ and P_n = (1/n!) ρⁿ P₀, respectively. Therefore, for 0 ≤ j ≤ n, we obtain

P_j = (1/j!) ρ^j P₀.  (2.29)

For j = n + 1, n + 2, ..., S − 1, we have the following balance equation in the equilibrium case:

(λ_m + λ_vh) P_j + j (μ_v + μ_d) P_j = λ_vh P_{j−1} + (j + 1)(μ_v + μ_d) P_{j+1} + λ_m P_{I_{j−1}},  (2.30)

where λ_d + λ_v = λ_m + λ_vh. Note that equation (2.30) then has the same form as equation (2.27); therefore eq. (2.29) also holds for j = n + 1, n + 2, ..., S − 1.

For j = S:

S (μ_v + μ_d) P_S = λ_vh P_{S−1} + μ_{S−1} P_{I_{S−1}} = (λ_vh + λ_m) P_{S−1} = (λ_d + λ_v) P_{S−1},  (2.31)

so that

P_S = (1/S!) ρ^S P₀.  (2.32)

Thus, for 0 ≤ j ≤ S, the steady-state probability P_j is

P_j = (1/j!) ρ^j P₀.  (2.33)

The P₀ values for the handoff dropping probability P_d and the new-call blocking probability P_b are equal to the same steady-state probability P_s, that is,

P₀ for P_d = P_b = P_s.  (2.34)

If a handoff call requests free packet slots of a channel and one is available, then, based on the proposed algorithm,

P_d = (1/3) (1/j!) ρ^j P₀.  (2.35)

When a new call is assigned a channel, the call is also assigned a channel holding time, which is generated by an exponential distribution function with a mean value of 15 time slots. Call arrival is modeled with a Markov chain as a Poisson process with different mean arrival rates, and the call duration is exponentially distributed with a mean value of 15 time slots. The traffic is characterized by the arrival rate of new calls and by the transition probabilities of handoff calls. It is assumed that the base station has a buffer of substantially large capacity to avoid significant packet loss.
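Under the stated assumptions, the distribution of eq. (2.33) is straightforward to evaluate; the sketch below normalizes P₀ over the states j = 0, ..., S only (the intermediate I-states are ignored for simplicity), with illustrative values of S and the offered load ρ:

# Steady-state probabilities P_j = (rho^j / j!) P_0 of eq. (2.33),
# normalized over j = 0..S; S and rho are illustrative values.
S   <- 8; rho <- 3
pj  <- rho^(0:S) / factorial(0:S)
pj  <- pj / sum(pj)        # normalization determines P_0
pj[S + 1]                  # probability that all S channels are busy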
III. SYSTEM IMPLEMENTATION
A. Software Subsystem Implementation
The OFDM system was modeled and simulated using MATLAB and Simulink to allow various parameters of the system to be varied and tested, including those established by the standard, as shown in Fig 5.1. The simulation includes all the stages of the transmitter, channel and receiver, according to the standard. Because of the MATLAB sampling time, the transmission was implemented in baseband to avoid long simulation runs. Considering additive white Gaussian noise (AWGN) and multipath Rayleigh fading effects, a good approximation to the real performance can be observed, above all in the degradation of the BER. At the transmitter, OFDM signals are generated by a Bernoulli binary source and mapped by one of the modulation techniques. Then, using sequence generation, the transmitter section converts the digital data to be transmitted into a mapping of subcarrier amplitude and phase. It then transforms this spectral representation of the data into the time domain using an Inverse Fast Fourier Transform (IFFT). The OFDM symbol is equal to the length of the IFFT size used to generate the signal (1024) and has an integer number of cycles. The cyclic prefix and multiple parameters were added before the signal conversion from parallel to serial mode. The addition of a guard period to the start of each symbol further mitigates the effect of ISI on an OFDM signal. To generate an OFDM signal, all model parameters were set to suitable values in order to obtain a smooth generated signal for transmission. The channel simulation allows us to examine the effects of noise and multipath on the OFDM scheme. Noise can be simulated by adding a small amount of random AWGN data to the transmitted signal. Random data are generated at a bit rate that varies during the simulation; the varying data rate is accomplished by enabling a source block periodically for a duration that depends on the desired data rate.
non-OFDM modulated signal. This result is to be compared
with the result of fig (4.3) so as to draw a comparison between
the effect of noise on a non-OFDM modulated data signals and
that of an OFDM modulated signal.
It is expected that OFDM performs better in noisy and
disturbed environment than any other modulation technique
compared with it here. It is also expected that the Bit Error
Rate (BER) and Symbol Error Rate (SER) of OFDM
modulated signals is always less than that of a non-OFDM
modulated signals. The theoretical symbol error probability of
PSK is Where erfc is the complementary error function,
O S
N E / is the ratio of energy in a symbol to noise power
spectral density, and M is the number of symbols.
..(3.1)
To determine the bit error probability, the symbol error
probability, PE, needs to be converted to its bit error
equivalent. There is no general formula for the symbol to bit
error conversion. Upper and lower limits are nevertheless easy
to establish. The actual bit error probability, Pb, can be shown
to be bounded by
..(3.2)
The lower limit corresponds to the case where the symbols
have undergone Gray coding. The upper limit corresponds to
the case of pure binary coding. Because increasing the value of
Eb/No lowers the number of errors produced, the length of
each simulation must be increased to ensure that the statistics
of the errors remain stable
Using the sim command to run a Simulink simulation from
the MATLAB Command Window, the following code
generates data for symbol error rate and bit error rate curves. It
considers Eb/No values in the range 0 dB to 12 dB, in steps of
2 dB.
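The MATLAB listing itself is not reproduced in the text; the same Eb/No sweep can be sketched against the theoretical bound of eq. (3.1) (here in R, consistent with the earlier sketches; erfc is expressed through the normal tail, and M = 4 corresponds to QPSK):

# Sweep Eb/No from 0 to 12 dB in 2 dB steps and tabulate the
# theoretical M-PSK symbol error rate of eq. (3.1).
erfc <- function(x) 2 * pnorm(x * sqrt(2), lower.tail = FALSE)
M    <- 4                               # QPSK
ebno <- seq(0, 12, 2)                   # Eb/No in dB
esno <- 10^(ebno / 10) * log2(M)        # Es/No = (Eb/No) * log2(M), linear
pe   <- erfc(sqrt(esno) * sin(pi / M))  # eq. (3.1)
cbind(ebno, pe, pb.gray = pe / log2(M)) # Gray-coded lower bound of eq. (3.2)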
IV. SIMULATION RESULT
The importance of modulation using OFDM can be seen in the above simulations, since the unmodulated signal always performs more poorly than the modulated signal. Hence, modulation makes a signal more conducive to transmission over the transmission medium (in this case, the wireless channel). It is also observed that an appropriate choice of modulation technique can either increase or decrease the error rates of the signals. Hence, a DPSK-modulated OFDM signal (Fig. 4.6) is much more conducive to transmission over the wireless channel than any other type of modulation tested. The effect of additive white Gaussian noise (AWGN) is observed by modeling a signal passing through a noisy channel without any form of modulation; afterwards, OFDM-modulated signals (using digital modulators such as QAM) are passed through the same channel, and the error rates are compared.

A MATLAB file (see Appendix A) is written to vary the signal-to-noise ratio (SNR) and plot the graph of the bit error rate (BER). The BER of the unmodulated signal is found to be constant at 0.4904. The effect of a noisy channel on a QAM signal is modeled as shown in Fig. 4.4. It is observed that an unmodulated signal has a BER of about 50%, whereas OFDM modulation reduces the BER significantly. It was also observed that the DPSK-modulated OFDM signal reduces the BER significantly. The graph in Fig. 4.7 shows a probabilistic approach to the comparison of the outage probability with the signal-to-interference ratio of CCI. Appendix 1 and Appendix 2 present classical representations of the OFDM and QPSK simulations.
Fig 4.1 Modeling an OFDM network environment
Fig 4.2 Graph of Transmission spectrum
Fig 4.3 Graph of Receiver constellation
Fig 4.4 Receiver constellation
Fig 4.5 BER probability graph of an OFDM-modulated signal
Fig 4.6 BER probability graph of an OFDM-modulated signal
Fig 4.7 Comparison of the outage probability with the signal to
interference ratio of CCI
A. Deployment
Modulation is a very important aspect of data transmission, since it makes a signal more conducive to transmission over the transmission medium (in this case, the wireless channel). Hence OFDM should be widely applied to broadband wireless access networks, most especially in situations where the effects of multipath fading and noise have to be eliminated.

V. CONCLUSION
This work was able to show that modulation using the OFDM technique is very important in broadband wireless access networks and noisy environments. The importance of modulation can be seen in the above simulations, since the unmodulated signal always performs more poorly than the modulated signal. Hence, modulation makes a signal more conducive to transmission over the transmission medium (in this case, the wireless channel). This work strongly recommends OFDM as a strong candidate for broadband wireless access networks.
APPENDIX 1
APPENDIX 2
AUTHORS PROFILE
Engr. James Agajo is enrolled in a Ph.D. programme in the field of Electronic and Computer Engineering. He has a Master's degree in Electronic and Telecommunication Engineering from Nnamdi Azikiwe University, Awka, Anambra State, and also possesses a Bachelor's degree in Electronics and Computer Engineering from the Federal University of Technology, Minna, Nigeria. His interest is in intelligent system development, with a strong flair for engineering and scientific research. He designed and implemented the most recent computer-controlled robotic arm with a working grip mechanism (2006), which was aired on national television, and he has carried out work on using Bluetooth technology to communicate with microcontrollers. He has also worked on thumbprint technology to develop high-tech security systems, among much more. He is presently on secondment with UNESCO TVE as a supervisor and a resource person. James is presently a member of the following associations: the Nigeria Society of Engineers (NSE), the International Association of Engineers (IAENG) UK, REAGON, MIRDA and MIJICT.
Dr. Isaac Avazi Omeiza holds B.Eng, M.Eng and Ph.D degrees in Electrical/Electronics Engineering. His lecturing career at the university level has spanned about two decades. He has lectured at the Nigerian Defence Academy, Kaduna (the Nigerian military university), the University of Ilorin, and the Capital-City-University of Nigeria, University of Abuja. He has also been a member of the Nigerian Society of Engineers (NSE) and a member of the Institute of Electrical and Electronics Engineers of America (IEEE). He has supervised several undergraduate final-year projects in electronic design and has done a number of research works in digital image processing, fingerprint processing and the processing of video signals.
Engr. Joseph Okhaifoh is enrolled in a Ph.D. programme. He holds a Master's degree in Electronics and Telecommunication Engineering and a Bachelor's degree in Electrical and Electronics Engineering. He is presently a member of the Nigeria Society of Engineers.
Dr. V.E. Idigo holds Ph.D, M.Eng and B.Eng degrees in Communication Engineering and is a member of IAENG, MNSE and COREN. He is presently the Head of the Department of Electrical/Electronics Engineering at Nnamdi Azikiwe University, Awka, Anambra State, Nigeria.
Sectorization of Full Kekre's Wavelet Transform for Feature Extraction of Color Images
H.B.Kekre
Sr. Professor
MPSTME, SVKM's NMIMS (Deemed-to-be University)
Vile Parle West, Mumbai -56,INDIA
[email protected]
Dhirendra Mishra
Associate Professor & PhD Research Scholar
MPSTME, SVKM's NMIMS (Deemed-to-be University)
Vile Parle West, Mumbai -56,INDIA
[email protected]
Abstract - An innovative idea of sectorization of Full Kekre's Wavelet transformed (KWT) [1] images for extracting features has been proposed. The paper discusses two planes, i.e. the forward plane (even plane) and the backward plane (odd plane). These two planes are sectored into 4, 8, 12 and 16 sectors. An innovative concept of the sum of absolute difference (AD) has been proposed as the similarity measuring parameter and compared with the well-known Euclidean distance (ED). The performances of sectorization of the two planes into different sector sizes, in combination with the two similarity measures, are checked. Class-wise retrieval performance of all sectors with respect to the similarity measures, i.e. ED and AD, is analyzed by means of class (randomly chosen 5 images) average precision-recall crossover points, overall average (average of class averages) precision-recall crossover points, and two new parameters, i.e. LIRS and LSRR.
Keywords - CBIR, Kekre's Wavelet Transform (KWT), Euclidean Distance, Sum of Absolute Difference, LIRS, LSRR, Precision and Recall.
I. INTRODUCTION
Content-based image retrieval (CBIR) [2-6] is a well-known technology being used, and researched, for the retrieval of images from large image databases. CBIR has proved to be a much-needed technology to research due to its applicability in various applications such as face recognition, fingerprint recognition, pattern matching [7][8][9], and verification/validation of images. The concept of CBIR can be easily understood from Figure 1 below. Every CBIR system needs functionality for the feature extraction of an image, viz. shape, color and texture, which can represent the uniqueness of the image for the purpose of the best match in the database to be searched. The features of the query image are compared with the features of all images in the feature database using various mathematical constructs known as similarity measures. These mathematical similarity measuring techniques check the similarity of the extracted features to classify the images into relevant and irrelevant classes. Research in CBIR needs to explore two aspects: first, better methods of feature extraction having the maximum components of uniqueness, and second, faster and more accurate mathematical models of similarity measures. Figure 1 shows the example of a query image of a horse being provided to a CBIR system, with images of the relevant class retrieved. Relevance feedback of the retrieval is used for machine learning to check the accuracy of the retrieval, which in turn helps one focus on modifications to the current approach to improve performance.
Figure 1. The CBIR System [2]
Many researchers are currently working in the very open and demanding field of CBIR. This research focuses on generating better methodologies for feature extraction in both the spatial domain and the frequency domain. Some methodologies, like block truncation coding [10-11], various transforms: FFT [12-14], the Walsh transform [15-21], DCT [22] and DST [23], and other approaches like hashing [24], vector quantization [25] and the contourlet transform [5], have already been developed.
In this paper we have introduced a novel concept of sectorization of Full Kekre's Wavelet transformed color images for feature extraction. Two different similarity measure parameters, i.e. the sum of absolute difference and the Euclidean distance, are used. Average precision, recall, LIRS and LSRR are used to study the performance of these approaches.
II. KEKRE'S WAVELET [1]
Kekre's Wavelet transform is derived from Kekre's transform. From an NxN Kekre's transform matrix, we can generate Kekre's Wavelet transform matrices of size (2N)x(2N), (3N)x(3N), ..., (N²)x(N²). For example, from a 5x5 Kekre's transform matrix, we can generate Kekre's Wavelet transform matrices of size 10x10, 15x15, 20x20 and
25x25. In general, an MxM Kekre's Wavelet transform matrix can be generated from an NxN Kekre's transform matrix, such that M = N * P, where P is any integer between 2 and N, that is, 2 ≤ P ≤ N. The Kekre's Wavelet transform matrix satisfies [K][K]^T = [D], where D is a diagonal matrix, and hence it is orthogonal. The diagonal matrix values of a Kekre's transform matrix of size NxN can be computed as
(1)
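Since Equation (1) is not reproduced above, the property can at least be checked numerically. The sketch below assumes the Kekre transform definition given in [1] (entries on and above the diagonal are 1, the entry just below the diagonal in row x is -N + x - 1 with 1-based indices, and all other entries are 0); it is illustrative, not the authors' code:

import numpy as np

def kekre_transform(n):
    # Build the NxN Kekre transform matrix under the assumed definition.
    k = np.zeros((n, n))
    for x in range(1, n + 1):          # 1-based row index
        for y in range(1, n + 1):      # 1-based column index
            if y >= x:
                k[x - 1, y - 1] = 1
            elif y == x - 1:
                k[x - 1, y - 1] = -n + x - 1
    return k

K = kekre_transform(5)
D = K @ K.T
# Off-diagonal entries vanish, confirming [K][K]^T = [D] and hence orthogonality.
assert np.allclose(D, np.diag(np.diag(D)))
print(np.diag(D))   # the diagonal values that Equation (1) expresses in closed form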
III. PLANE FORMATION AND ITS SECTORIZATION
[12-19],[22-23]
The components of the Full KWT transformed image shown in the red-bordered area (see Figure 2) are used to generate feature vectors. The averages of the zeroth row and column and the last row and column components are augmented to the feature vector generated. Color codes are used to differentiate the coefficients: those plotted on the forward (even) plane are shown with a light red background, and those belonging to the backward (odd) plane with a light blue background. The coefficients with a light red background, i.e. at positions (1,1),(2,2); (1,3),(2,4) etc., are taken as X1 and Y1 respectively and plotted on the even plane. The coefficients with a light blue background, i.e. at positions (2,1),(1,2); (2,3),(1,4) etc., are taken as X2 and Y2 respectively and plotted on the odd plane.
Figure 2: KWT component arrangement in a Transformed Image.
The even plane of the Full KWT is generated by taking all X(i,j), Y(i+1,j+1) components into consideration, and the odd plane by taking all X(i+1,j), Y(i,j+1) components, as shown in Figure 3. Henceforth, for convenience, we will refer to X(i,j) = X1, Y(i+1,j+1) = Y1, X(i+1,j) = X2 and Y(i,j+1) = Y2.
X(i,j) Y(i,j+1)
X(i+1,j) Y(i+1,j+1)
Figure 3: Snapshot of Components considered for Even/Odd Planes.
As shown in Figure 3, the even plane of the Full KWT considers X1, i.e. all light red background cells (1,1), (2,2), (1,3), (2,4) etc., on the X axis and Y1 on the Y axis. The odd plane of the Full KWT considers X2, i.e. all light blue background cells (2,1), (1,2), (2,3), (1,4) etc., on the X axis and Y2 on the Y axis.
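A minimal sketch of the sectorization step itself (assigning a coefficient pair plotted on either plane to one of 4, 8, 12 or 16 equal angular sectors) is shown below; the paper's full feature-vector construction is richer, and the function name is ours:

import math

def sector_index(x, y, n_sectors):
    # Angle of the plotted point in [0, 2*pi), then the equal-width sector it falls in.
    angle = math.atan2(y, x) % (2 * math.pi)
    return int(angle / (2 * math.pi / n_sectors))

# Example: the pair (1.0, 1.0) lies at 45 degrees, i.e. sector 2 of 16.
print(sector_index(1.0, 1.0, 16))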
IV. RESULTS AND DISCUSSION
The augmented Wang image database [4] has been used for the experiment. The database consists of 1055 images of 12 different classes, such as Flower, Sunset, Barbie, Tribal, Cartoon, Elephant, Dinosaur, Bus, Scenery, Monuments, Horses and Beach. The class-wise distribution of all images in the database is shown in Figure 7.
Figure 7: Class-wise distribution of images in the image database: Sunset: 51, Cartoon: 46, Flower: 100, Elephants: 100, Barbie: 59, Mountains: 100, Horse: 100, Bus: 100, Tribal: 100, Beaches: 99, Monuments: 100, Dinosaur: 100.
Figure 8. Query Image
A query image of the class dinosaur is shown in Figure 8. For this query image, the results of retrieval are given for both approaches, i.e. sectorization of the even and odd planes of the Full KWT transformed image. Figure 9 shows the first 20 retrieved images for sectorization of the Full KWT forward (even) plane (16 sectors) with the sum of absolute difference as the similarity measure. There are two retrievals from irrelevant classes: the first irrelevant image occurs at the 15th position and the second at the 20th position (shown with red boundaries) in the even plane sectorization. The result of odd plane sectorization is shown in Figure 10; the retrieval of the first 20 images contains 2 irrelevant retrievals, but the first irrelevant image occurs at the 17th position.
Figure 9: First 20 Retrievals of Full KWT Forward (Even) plane sectorization
into 16 Sectors with sum of absolute difference as similarity measure.
Figure 10: First 20 Retrievals of Full KWT Backward (Odd) plane sectorization into 16 Sectors with sum of absolute difference as similarity measure.
The feature database includes the feature vectors of all images in the database. Five random query images of each class were used to search the database. An image with an exact match gives the minimum sum of absolute difference and Euclidean distance. To check the effectiveness of the work and its retrieval performance, we have calculated the overall average precision and recall and their crossover values, plotted class-wise. Equations (2) and (3) are used for the precision and recall calculation, whilst two new parameters, i.e. LIRS (length of initial relevant string of images) and LSRR (length of string to recover all relevant images), are used as shown in Equations (4) and (5).
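The equations themselves are not reproduced above. Definitions consistent with the surrounding text, assuming the standard formulations used in the authors' earlier sectorization work, are:

Precision = (number of relevant images retrieved) / (total number of images retrieved)   (2)
Recall = (number of relevant images retrieved) / (total number of relevant images in the database)   (3)
LIRS = (length of the initial string of relevant images) / (total number of relevant images in the database)   (4)
LSRR = (length of the string needed to recover all relevant images) / (size of the database)   (5)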
All these parameters lie between 0 and 1, and hence can be expressed as percentages. The newly introduced parameters indicate better performance for a higher value of LIRS and a lower value of LSRR [8-13].
Figure 11: Class wise Average Precision and Recall cross over points of
Forward Plane (Even) sectorization of Full KWT Wavelet with sum of
Absolute Difference (AD) and Euclidean Distance (ED) as similarity measure.
Figure 12: Class wise Average Precision and Recall cross over points of
Backward Plane (Odd) sectorization of Full KWT Wavelet with Absolute
Difference (AD) and Euclidean Distance (ED) as similarity measure.
Figure 13: Comparison of Overall Precision and Recall cross over points of
sectorization of Full KWT Wavelet with sum of Absolute Difference (AD)
and Euclidean Distance (ED) as similarity measure.
Figure 14: The LIRS plot of sectorization of the forward plane of Full KWT transformed images. Overall average LIRS performances (shown with horizontal lines): 0.082 (4 Sectors ED), 0.052 (4 Sectors AD), 0.071 (8 Sectors ED), 0.051 (8 Sectors AD), 0.075 (12 Sectors ED), 0.069 (12 Sectors AD), 0.053 (16 Sectors ED), 0.053 (16 Sectors AD).
Figure 15: The LIRS plot of sectorization of the backward plane of Full KWT transformed images. Overall average LIRS performances (shown with horizontal lines): 0.081 (4 Sectors ED), 0.054 (4 Sectors AD), 0.073 (8 Sectors ED), 0.050 (8 Sectors AD), 0.064 (12 Sectors ED), 0.049 (12 Sectors AD), 0.056 (16 Sectors ED), 0.042 (16 Sectors AD).
(Chart for Figure 13. Y-axis: Overall Average Precision and Recall crossover point. X-axis: Methods (combination of plane and similarity measures). Title: Sectorization of Kekre's Wavelet (Full), with augmentation. Legend: Sector 4, Sector 8, Sector 12, Sector 16.)
Figure 16: The LSRR plot of sectorization of the forward plane of Full KWT transformed images. Overall average LSRR performances (shown with horizontal lines): 0.77 (4 Sectors ED), 0.71 (4 Sectors AD), 0.76 (8 Sectors ED), 0.71 (8 Sectors AD), 0.76 (12 Sectors ED), 0.73 (12 Sectors AD), 0.74 (16 Sectors ED), 0.71 (16 Sectors AD).
Figure 17: The LSRR plot of sectorization of the backward plane of Full KWT transformed images. Overall average LSRR performances (shown with horizontal lines): 0.77 (4 Sectors ED), 0.729 (4 Sectors AD), 0.76 (8 Sectors ED), 0.725 (8 Sectors AD), 0.756 (12 Sectors ED), 0.726 (12 Sectors AD), 0.759 (16 Sectors ED), 0.727 (16 Sectors AD).
V. CONCLUSION
This work, experimented on a 1055-image database of 12 different classes, discusses the performance of sectorization of Full KWT wavelet transformed color images for image retrieval. The work has been performed with both approaches, i.e. sectorization of the forward (even) plane and of the backward (odd) plane. The performance of the proposed methods is checked with respect to various sector sizes and similarity measuring approaches, viz. Euclidean distance and the sum of absolute difference. We calculated the average precision and recall crossover point of 5 randomly chosen images of each class; the overall average is the average of these class averages. The observation is that sectorization of both planes of Full KWT wavelet transformed images gives less than 30% overall average retrieval of relevant images, as shown in Figure 13. The class-wise plots of these average precision and recall crossover points, shown in Figure 11 and Figure 12 for both approaches, depict that the retrieval performance varies from class to class and from method to method, wherein the horses, flower and dinosaur classes have retrieval above 50%; they perform above the average of all methods, as shown by the horizontal lines. The new parameters LIRS and LSRR give a good platform for performance evaluation, judging how early all relevant images are retrieved (LSRR) and how many relevant images are retrieved as part of the first set of relevant retrievals (LIRS). The value of LIRS should be maximum and that of LSRR minimum for a particular class if the overall precision and recall crossover point of that class is maximum. This can be clearly seen in Figures 14 to 17. The observation is clearest for the dinosaur class, though the differences in LIRS and LSRR for the other classes vary. The sum of absolute difference is recommended as the similarity measure due to its lower complexity and better retrieval performance compared to the Euclidean distance.
REFERENCES
[1] H.B.Kekre, Archana Athawale and Dipali Sadavarti, Algorithm to generate Kekre's Wavelet transform from Kekre's Transform, International Journal of Engineering, Science and Technology, Vol.2, No.5, 2010, pp.756-767.
[2] Dr. Qi, Semantic based CBIR (content based image retrieval), http://cs.usu.edu/htm/REU-Current-Projects.
[3] Kato, T., Database architecture for content based image retrieval, in Image Storage and Retrieval Systems (Jambardino A and Niblack W eds), Proc SPIE 2185, pp 112-123, 1992.
[4] Ritendra Datta, Dhiraj Joshi, Jia Li and James Z. Wang, Image retrieval: Ideas, influences and trends of the new age, ACM Computing Surveys, Vol 40, No.2, Article 5, April 2008.
[5] Ch. Srinivasa Rao, S. Srinivas Kumar, B.N. Chatterji, Content based image retrieval using contourlet transform, ICGST-GVIP Journal, Vol.7, No. 3, Nov 2007.
[6] John Berry and David A. Stoney, The history and development of fingerprinting, in Advances in Fingerprint Technology, Henry C. Lee and R. E. Gaensslen, Eds., pp. 1-40. CRC Press Florida, 2nd edition, 2001.
[7] Arun Ross, Anil Jain, James Reisman, A hybrid fingerprint matcher,
Intl conference on Pattern Recognition (ICPR), Aug 2002.
[8] A. M. Bazen, G. T. B.Verwaaijen, S. H. Gerez, L. P. J. Veelenturf, and
B. J. van der Zwaag, A correlation-based fingerprint verification
system, Proceedings of the ProRISC2000 Workshop on Circuits,
Systems and Signal Processing, Veldhoven, Netherlands, Nov 2000.
[9] H.B.Kekre, Tanuja K. Sarode, Sudeep D. Thepade, DST Applied to
Column mean and Row Mean Vectors of Image for Fingerprint
Identification, International Conference on Computer Networks and
Security, ICCNS-2008, 27-28 Sept 2008, Vishwakarma Institute of
Technology, Pune.
[10] H.B.Kekre, Sudeep D. Thepade, Using YUV Color Space to Hoist the
Performance of Block Truncation Coding for Image Retrieval, IEEE
International Advanced Computing Conference 2009 (IACC09), Thapar
University, Patiala, INDIA, 6-7 March 2009.
[11] H.B.Kekre, Sudeep D. Thepade, Image Retrieval using Augmented
Block Truncation Coding Techniques, ACM International Conference
on Advances in Computing, Communication and Control (ICAC3-2009),
pp.: 384-390, 23-24 Jan 2009, Fr. Conceicao Rodrigous College of
Engg., Mumbai. Available online at ACM portal.
[12] H. B. Kekre, Dhirendra Mishra, Digital Image Search & Retrieval using FFT Sectors, published in proceedings of National/Asia pacific conference on Information communication and technology (NCICT 10), 5th & 6th March 2010, SVKM'S NMIMS, Mumbai.
[13] H.B.Kekre, Dhirendra Mishra,Digital Image Search & Retrieval using
FFT Sectors of Color Images published in International Journal of
Computer Science and Engineering (IJCSE) Vol.
02,No.02,2010,pp.368-372 ISSN 0975-3397 available online at
http://www.enggjournals.com/ijcse/doc/IJCSE10-02- 02-46.pdf
[14] H.B.Kekre, Dhirendra Mishra, CBIR using upper six FFT Sectors of
Color Images for feature vector generation published in International
Journal of Engineering and Technology(IJET) Vol. 02, No. 02, 2010,
49-54 ISSN 0975-4024 available online at
http://www.enggjournals.com/ijet/doc/IJET10-02- 02-06.pdf
[15] H.B.Kekre, Dhirendra Mishra, Four walsh transform sectors feature
vectors for image retrieval from image databases, published in
international journal of computer science and information technologies
(IJCSIT) Vol. 1 (2) 2010, 33-37 ISSN 0975-9646 available online at
http://www.ijcsit.com/docs/vol1issue2/ijcsit2010010201.pdf
[16] H.B.Kekre, Dhirendra Mishra, Performance comparison of four, eight
and twelve Walsh transform sectors feature vectors for image retrieval
from image databases, published in international journal of
Engineering, science and technology(IJEST) Vol.2(5) 2010, 1370-1374
ISSN 0975-5462 available online at
http://www.ijest.info/docs/IJEST10-02-05-62.pdf
[17] H.B.Kekre, Dhirendra Mishra, density distribution in walsh transfom
sectors ass feature vectors for image retrieval, published in international
journal of compute applications (IJCA) Vol.4(6) 2010, 30-36 ISSN
0975-8887 available online at
http://www.ijcaonline.org/archives/volume4/number6/829-1072
[18] H.B.Kekre, Dhirendra Mishra, Performance comparison of density
distribution and sector mean in Walsh transform sectors as feature
vectors for image retrieval, published in international journal of Image
Processing (IJIP) Vol.4(3) 2010, ISSN 1985-2304 available online at
http://www.cscjournals.org/csc/manuscript/Journals/IJIP/Volume4/Issue
3/IJIP-193.pdf
[19] H.B.Kekre, Dhirendra Mishra, Density distribution and sector mean
with zero-sal and highest-cal components in Walsh transform sectors as
feature vectors for image retrieval, published in international journal of
Computer scienece and information security (IJCSIS) Vol.8(4) 2010,
ISSN 1947-5500 available online http://sites.google.com/site/ijcsis/vol-
8-no-4-jul-2010
[20] H.B.Kekre, Vinayak Bharadi, Walsh Coefficients of the Horizontal &
Vertical Pixel Distribution of Signature Template, In Proc. of Int.
Conference ICIP-07, Bangalore University, Bangalore. 10-12 Aug 2007.
[21] J. L. Walsh, A closed set of orthogonal functions American Journal of
Mathematics, Vol. 45, pp.5-24,year 1923.
[22] H.B.Kekre, Dhirendra Mishra, DCT sectorization for feature vector
generation in CBIR, International journal of computer application
(IJCA),Vol.9, No.1,Nov.2010,ISSN:1947-5500
http://ijcaonline.org/archives/volume9/number1/1350-1820
[23] H.B.Kekre, Dhirendra Mishra, DST Sectorization for feature vector
generation, Universal journal of computer science and engineering
technology(Unicse),Vol.1,No.1Oct 2010 available at
http://www.unicse.oeg/index.php?option=com content and
view=article&id=54&itemid=27
[24] H.B.Kekre, Dhirendra Mishra, Content Based Image Retrieval using Weighted Hamming Distance Image hash Value, published in the proceedings of the international conference on contours of computing technology, pp. 305-309 (Thinkquest 2010), 13th & 14th March 2010.
[25] H.B.Kekre, Tanuja K. Sarode, Sudeep D. Thepade, Image Retrieval
using Color-Texture Features from DST on VQ Codevectors
obtained by Kekres Fast Codebook Generation, ICGST International
Journal on Graphics, Vision and Image Processing (GVIP),
Available online at http://www.icgst.com/gvip
AUTHORS PROFILE
H. B. Kekre has received a B.E. (Hons.) in Telecomm. Engg. from Jabalpur University in 1958, an M.Tech (Industrial Electronics) from IIT Bombay in 1960, an M.S.Engg. (Electrical Engg.) from the University of Ottawa in 1965 and a Ph.D. (System Identification) from IIT Bombay in 1970. He worked for over 35 years as Faculty and H.O.D. of Computer Science and Engg. at IIT Bombay. For the last 13 years he has worked as a professor in the Dept. of Computer Engg. at Thadomal Shahani Engg. College, Mumbai. He is currently a senior professor working with Mukesh Patel School of Technology Management and Engineering, SVKM's NMIMS University, Vile Parle West, Mumbai. He has guided 17 Ph.D.s, 150 M.E./M.Tech projects and several B.E./B.Tech projects. His areas of interest are digital signal processing, image processing and computer networking. He has more than 350 papers in national/international conferences and journals to his credit. Recently, ten students working under his guidance received best paper awards. Two research scholars working under his guidance have been awarded Ph.D. degrees by NMIMS University. Currently he is guiding 10 Ph.D. students. He is a life member of ISTE and a Fellow of IETE.
Dhirendra Mishra has received his B.E. (Computer Engg) and M.E. (Computer Engg) degrees from the University of Mumbai, Mumbai, India. He is a Ph.D. research scholar and works as an Associate Professor in the Computer Engineering department of Mukesh Patel School of Technology Management and Engineering, SVKM's NMIMS University, Mumbai, India. He is a life member of the Indian Society of Technical Education (ISTE), a member of the International Association of Computer Science and Information Technology (IACSIT), Singapore, and a member of the International Association of Engineers (IAENG). His areas of interest are image processing, image databases, pattern matching, operating systems, and information storage and management.
Dominating Sets and Spanning Tree based
Clustering Algorithms for Mobile Ad hoc Networks
R Krishnam Raju Indukuri
Dept of Computer Science
Padmasri Dr.B.V.R.I.C.E
Bhimavaram, A.P, India
[email protected]
Suresh Varma Penumathsa
Dept of Computer Science
Adikavi Nannaya University
Rajamundry, A.P, India.
[email protected]
Abstract - The infrastructure-less and dynamic nature of mobile ad hoc networks (MANETs) requires efficient clustering algorithms to improve network management and to design hierarchical routing protocols. Clustering algorithms in mobile ad hoc networks build a virtual backbone for network nodes. Dominating sets and spanning trees are widely used in clustering networks. Dominating set and spanning tree based MANET clustering algorithms are suitable for medium-size networks with respect to time and message complexities. This paper presents different clustering algorithms for mobile ad hoc networks based on dominating sets and spanning trees.
Keywords: mobile ad hoc networks, clustering, dominating sets and spanning trees
I. INTRODUCTION
MANETs do not have any fixed infrastructure and consist
of wireless mobile nodes that perform various data
communication tasks. MANETs have potential applications
in rescue operations, mobile conferences, battlefield
communications etc. Conserving energy is an important
issue for MANETs as the nodes are powered by batteries
only.
Clustering has become an important approach to managing MANETs. In large, dynamic ad hoc networks, it is very hard to construct an efficient network topology. By clustering the entire network, one can decompose the problem into small-sized clusters. Clustering has many advantages in mobile networks: it makes the routing process easier, and, by clustering the network, one can build a virtual backbone which makes multicasting faster. However, the overhead of cluster formation and maintenance is not trivial. In a typical clustering scheme, the MANET is first partitioned into a number of clusters by a suitable distributed algorithm. A Cluster Head (CH) is then allocated for each cluster, which performs various tasks on behalf of the members of the cluster. The performance metrics of a clustering algorithm are the number of clusters and the count of the neighbour nodes, which are the adjacent nodes between the clusters that are formed.
In this paper we discuss various clustering algorithms based on dominating sets [1] [4] [11] [14] [16] and spanning trees [6] [8] [15].
II. DOMINATING SETS BASED CLUSTERING ALGORITHMS
A dominating set [9] is a subset S of a graph G such that every vertex in G is either in S or adjacent to a vertex in S. Dominating sets are widely used in clustering networks. Dominating sets can be classified into three main classes: i) Independent Dominating Set, ii) Weakly Connected Dominating Set and iii) Connected Dominating Set.
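As a minimal sketch of these definitions (the graph encoding and names are ours, not from the cited papers):

def is_dominating_set(adj, s):
    # Every vertex must be in S or have at least one neighbour in S.
    s = set(s)
    return all(v in s or s & set(adj[v]) for v in adj)

def is_independent(adj, s):
    # No two vertices of S may be adjacent.
    s = set(s)
    return all(not (s & set(adj[v])) for v in s)

# Path graph 1-2-3-4-5: {2, 4} dominates every vertex and is independent, i.e. an IDS.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(is_dominating_set(adj, {2, 4}), is_independent(adj, {2, 4}))  # True True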
A. Independent Dominating Set (IDS)
An IDS [6] [11] is a dominating set S of a graph G in which there are no adjacent vertices. Fig. 1 shows a sample independent dominating set.
Figure 1. Independent Dominating Set.
B. Weakly Connected Dominating Sets (WCDS)
Given a subset S of a graph G, the weakly induced subgraph Sw contains the vertices of S, their neighbors, and all edges of the original graph G with at least one endpoint in S. A subset S is a weakly-connected dominating set (WCDS) [10] [12] if S is dominating and Sw is connected. Fig. 2 shows a weakly connected dominating set.
Figure 2. Weakly Connected Dominating Set.
C. Connected Dominating Set (CDS)
A CDS [11] [13] is a subset S of a graph G such that S forms a dominating set and S is connected. Fig. 3 shows a connected dominating set.
D. Determining Dominating Sets
Algorithms that construct a CDS in ad hoc networks can be divided into two categories: centralized algorithms, which depend on network-wide information or coordination, and decentralized algorithms, which depend on local information only. Centralized algorithms usually yield a smaller CDS than decentralized algorithms, but their application is limited due to the high maintenance cost.
Decentralized algorithms can be further divided into cluster-based algorithms and pure localized algorithms. Cluster-based algorithms have a constant approximation ratio in unit disk graphs and relatively slow convergence (O(n) in the worst case). Pure localized algorithms take constant steps to converge and produce a small CDS on average, but have no constant approximation ratio.
cluster-based algorithm usually contains two phases. In the
first phase, the network is partitioned into clusters and a
clusterhead is elected for each cluster. In the second phase,
clusterheads are interconnected to form a CDS. Several
clustering algorithms [2] [4] [7] have been proposed to
elect clusterheads that have the minimal id, maximal degree,
or maximal weight. A host v is a clusterhead if it has the
minimal id (or maximal degree or weight) in its 1-hop
neighbourhood. A clusterhead and its neighbours form a
cluster and these hosts are covered. The election process
continues on uncovered hosts and, finally, all hosts are
covered.
Wu and Li [9] proposed a simple and efficient localized
algorithm that can quickly determine a CDS in ad hoc
networks. This approach uses a marking process where
hosts interact with others in the neighbourhood.
Specifically, each host is marked true if it has two unconnected neighbours. These hosts achieve a desired global objective: the set of marked hosts forms a small CDS.
Figure 4. Example of ad hoc networks.
In Wu and Li's approach, the resultant dominating set derived from the marking process is further reduced by applying two dominant pruning rules. According to dominant pruning Rule 1, a marked host can unmark itself if its neighbour set is covered by another marked host; that is, if all neighbours of a gateway are connected with each other via another gateway, it can relinquish its responsibility as a gateway. In Fig. 4, either u or w can be unmarked (but not both). According to Rule 2, a marked host can unmark itself if its neighbourhood is covered by two other directly connected marked hosts. The combination of Rules 1 and 2 is fairly efficient in reducing the number of gateways while still maintaining a CDS.
III. LOCALIZED DOMINATING SET FORMATION ALGORITHM
A. Localized Dominating Set Formation
Fei Dai and Jie Wu [9] proposed a generalized dominant pruning rule, called Rule k, which can unmark gateways covered by k other gateways, where k can be any number. Rule k can be implemented in a restricted way with local neighbourhood information, with the same complexity as Rule 1 and, surprisingly, less complexity than Rule 2.
Given a simple directed graph G = (V,E), where V is a set of vertices (hosts) and E is a set of directed edges (unidirectional links), a directed edge from u to v is denoted by an ordered pair (u,v). If (u,v) is an edge in G, we say that u dominates v and v is an absorbent of u. The dominating neighbour set N_d(u) of vertex u is defined as {w : (w,u) ∈ E}, and the absorbent neighbour set N_a(u) as {v : (u,v) ∈ E}; N(u) = N_d(u) ∪ N_a(u). In Fig. 5, vertex x dominates vertex u, y is an absorbent of u, and v is both a dominating and an absorbent neighbour of u. The dominating neighbour set of vertex u is N_d(u) = {v,x}, the absorbent neighbour set of u is N_a(u) = {v,y}, and the neighbour set of u is N(u) = {v,x,y}. The general disk graph and the unit disk graph are special cases of directed graphs.
Figure 5. Example of dominating set reduction.
A set V' ⊆ V is a dominating set of G if every vertex v ∈ V - V' is dominated by at least one vertex u ∈ V'. Also, a set V' ⊆ V is called an absorbent set if for every vertex u ∈ V - V' there exists a vertex v ∈ V' which is an absorbent of u. For example, the vertex set {u,v} in Fig. 5 is both a dominating and an absorbent set of the corresponding directed graph. The following marking process can quickly find a strongly connected dominating and absorbent set in a given directed graph.
Algorithm: Marking process
1: Initially assign marker F to each u in V.
2: Each u exchanges its neighbour sets N_d(u) and N_a(u) with all its neighbours.
3: u changes its marker m(u) to T if there exist vertices v and w such that (w,u) ∈ E and (u,v) ∈ E, but (w,v) ∉ E.
The marking process is a localized algorithm in which hosts only interact with others in the neighbourhood. Unlike clustering algorithms, there is no sequential propagation of information. The marking process marks every vertex in G; m(v) is a marker for vertex v ∈ V, which is either T (marked) or F (unmarked). Suppose the marking process is applied to the network represented by Fig. 5: host u will be marked because (x,u) ∈ E and (u,y) ∈ E, but (x,y) ∉ E; host v will also be marked because (u,v) ∈ E and (v,z) ∈ E, but (u,z) ∉ E. All other hosts remain unmarked because no such pair of neighbour hosts can be found. V' is the set of vertices that are marked T in V; that is, V' = {v : v ∈ V ∧ m(v) = T}. The induced graph G' is the subgraph of G induced by V'; that is, G' = G[V']. Wu [9] showed that the marked vertices form a strongly connected dominating and absorbent set and, furthermore, can connect any two vertices with minimum hops.
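A centralized sketch of the marking process on the Fig. 5 example (the encoding is ours; an edge (u,v) means u dominates v):

def marking_process(vertices, edges):
    edges = set(edges)
    nd = {u: {w for (w, x) in edges if x == u} for u in vertices}  # dominating neighbours
    na = {u: {v for (x, v) in edges if x == u} for u in vertices}  # absorbent neighbours
    marked = set()
    for u in vertices:
        # Mark u when some w dominates u, u dominates some v, and (w, v) is missing.
        if any((w, v) not in edges for w in nd[u] for v in na[u] if w != v):
            marked.add(u)
    return marked

vertices = ['u', 'v', 'x', 'y', 'z']
edges = [('x', 'u'), ('u', 'y'), ('u', 'v'), ('v', 'u'), ('v', 'z')]
print(marking_process(vertices, edges))  # {'u', 'v'}, as derived in the text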
B. Dominating Set Reduction
In the marking process, a vertex is marked T because it may be the only connection between two of its neighbours. However, if there are multiple connections available, it is not necessary to keep all of them. We say a vertex is covered if its neighbours can reach each other via other connected marked vertices. The two dominant pruning rules below rest on the following observation: if a vertex is covered by no more than two connected vertices, removing this vertex from V' will not compromise its functionality as a CDS. To avoid the simultaneous removal of two vertices covering each other, a vertex is removed only when it is covered by vertices with higher ids. The node id, id(v), of each vertex v ∈ V serves as a priority. Nodes with high priorities have a high probability of becoming gateways. Id uniqueness is not necessary, but equal ids will produce more gateways.
Rule 1: Consider two vertices u and v in G'. If N_d(u) - {v} ⊆ N_d(v) and N_a(u) - {v} ⊆ N_a(v) in G and id(u) < id(v), change the marker of u to F; that is, G' is changed to G' - {u}.
Rule 2: Assume that v and w are bi-directionally connected in G'. If N_d(u) - {v,w} ⊆ N_d(v) ∪ N_d(w) and N_a(u) - {v,w} ⊆ N_a(v) ∪ N_a(w) in G and id(u) < min{id(v), id(w)}, then change the marker of u to F.
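A sketch of the Rule 1 test in the same style (nd, na and ids are the neighbour sets and priorities from the marking sketch above; names are ours):

def rule1_unmark(u, v, nd, na, ids):
    # u may be unmarked when v covers both of u's neighbour sets and has the higher id.
    covered = (nd[u] - {v}) <= nd[v] and (na[u] - {v}) <= na[v]
    return covered and ids[u] < ids[v]

# Tiny example: v dominates and absorbs everything u does, and v has the higher id.
nd = {'u': {'x', 'v'}, 'v': {'x', 'u'}}
na = {'u': {'y', 'v'}, 'v': {'y', 'u'}}
print(rule1_unmark('u', 'v', nd, na, {'u': 1, 'v': 2}))  # True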
C. Generalized Pruning Rule
Assume G' = (V',E') is the subgraph of a given directed graph G = (V,E) induced by the marked vertex set V'. In the following dominant pruning rule, N_d(V_k) (respectively N_a(V_k)) represents the dominating (absorbent) neighbour set of a vertex set V_k; that is, N_d(V_k) = ∪_{u_i ∈ V_k} N_d(u_i).
Rule k: Let V_k = {v_1, v_2, ..., v_k} be the vertex set of a strongly connected subgraph in G'. If N_d(u) - V_k ⊆ N_d(V_k) and N_a(u) - V_k ⊆ N_a(V_k) in G and id(u) < min{id(v_1), id(v_2), ..., id(v_k)}, then change the marker of u to F.
Rules 1 and 2 are special cases of Rule k, where |V_k| is restricted to 1 and 2, respectively. Note that V_k may contain two subsets: V_k1, which really covers u's neighbour set, and V_k2, which acts as the glue making them a connected set. Obviously, if a vertex can be removed from V' by applying Rule 1 or Rule 2, it can also be removed by applying Rule k. On the other hand, a vertex removed by Rule k is not necessarily removable via Rule 1 or Rule 2. For example, in Fig. 6(a), both vertices u and v can be removed using Rule k (for k >= 3) because they are covered by vertices w, x, y and z; in Fig. 6(b), vertex u can be removed because it is covered by vertices w, x and y. Note that, although x and y are not bi-directionally connected, they can reach each other via vertex w. However, none of these vertices can be removed via Rule 1 or Rule 2.
Figure 6. Limitation of Rule 1 and 2.
D. Performance Analysis
The restricted Rule k is a more efficient dominant
pruning rule than the combination of the restricted Rules 1
and 2, especially in dense networks with a relatively high
percentage of unidirectional links. For these networks, the
resultant dominating set can be greatly reduced by Rule k
without any performance or resource penalty. One
advantage of the marking process and the dominant pruning
rules is their capability to support unidirectional links. For
networks without unidirectional links, the marking process
and the restricted Rule k is as efficient as several cluster-
based schemes and another pure localized algorithm, in
terms of the size of the dominating set; this is achieved with
lower cost and higher converging speed.
IV. A ZONAL CLUSTERING ALGORITHM
The zonal distributed algorithm [3] finds a small weakly-connected dominating set of the input graph G = (V,E). The graph is first partitioned into non-overlapping regions. Then a greedy approximation algorithm [1] is executed to find a small weakly-connected dominating set of each region. Taking the union of these weakly-connected dominating sets, we obtain a dominating set of G. Some additional vertices from region borders are added to the dominating set to ensure that the zonal dominating set of G is weakly connected.
A. Graph partitioning using minimum spanning forests
The first phase of the zonal distributed clustering algorithm partitions a given graph G = (V,E) into non-overlapping regions. This is done by growing a spanning forest of the graph. At the end of this phase, the subgraph induced by each tree defines a region. This phase is based on the algorithm of Gallager, Humblet and Spira (GHS) [8], which is in turn based on Kruskal's classic centralized algorithm for the Minimum Spanning Tree (MST), assuming all edge weights are distinct and breaking ties using the vertex IDs of the endpoints.
The MST is unique for a given graph with distinct edge
weights. The algorithm maintains a spanning forest.
Initially, the spanning forest is a collection of trees of single
vertices. At each step the algorithm merges two trees by
including an edge in the spanning forest. During the process
of the algorithm, an edge can be in any of the three states:
tree edge, rejected edge, or candidate edge. All edges are
candidate edges at the beginning of the algorithm. When an
edge is included in the spanning forest, it becomes a tree
edge. If the addition of a particular edge would create a
cycle in the spanning forest, the edge is called a rejected
edge and will not be considered further. In each iteration,
the algorithm looks for the candidate edge with minimum
weight, and changes it to a tree edge merging two trees into
one. During the algorithm, the tree edges and all the vertices
form a spanning forest. The algorithm terminates when the
forest becomes a single spanning tree.
The partitioning process consists of a partial execution
of the GHS algorithm [8], which terminates before the MST
is fully formed. The size of components is controlled by
picking a value x. Once a component has exceeded size x, it
no longer participates.
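A centralized sketch of this partitioning phase (a Kruskal-style stand-in for the distributed GHS execution, assuming distinct edge weights; names are ours):

def partition_into_regions(n, weighted_edges, x):
    parent, size = list(range(n)), [1] * n

    def find(a):                      # union-find with path compression
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for w, u, v in sorted(weighted_edges):   # candidate edges in weight order
        ru, rv = find(u), find(v)
        # Merge only distinct trees that have not yet exceeded the size bound x.
        if ru != rv and size[ru] <= x and size[rv] <= x:
            parent[ru] = rv
            size[rv] += size[ru]
    return [find(u) for u in range(n)]       # region id per vertex

edges = [(1, 0, 1), (2, 1, 2), (3, 3, 4), (4, 4, 5), (5, 2, 3)]   # (weight, u, v)
print(partition_into_regions(6, edges, 2))   # two regions of three vertices each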
B. Computing Weakly-Connected Dominating Sets of the
Regions
Once the graph G is partitioned into regions and a spanning tree has been determined for each region, the following algorithm runs within each region. This color-based algorithm is a distributed implementation of the centralized greedy algorithm for finding small weakly-connected dominating sets [10] [12] in graphs.
Given a graph G = (V,E), a color (white, gray, or black) is assigned to each vertex. All vertices are initially white and change color as the algorithm progresses. The algorithm is essentially an iteration of the process of choosing a white or gray vertex to dye black. When any vertex is dyed black, any neighbouring white vertices are changed to gray. At the end of the algorithm, the black vertices constitute a weakly-connected dominating set.
The term piece refers to a particular substructure of the graph. A white piece is simply a white vertex. A black piece contains a maximal set of black vertices whose weakly induced subgraph is connected, plus any gray vertices that are adjacent to at least one of the black vertices of the piece. The improvement of a (non-black) vertex u is the number of distinct pieces within the closed neighborhood of u; that is, the improvement of u is the number of pieces that would be merged into a single black piece if u were dyed black. In each iteration, the algorithm chooses a single white or gray vertex to dye black. The vertex is chosen greedily so as to reduce the number of pieces as much as possible, until there is only one piece left. In particular, a vertex with maximum improvement value is chosen (with ties broken arbitrarily). The black vertices are the required weakly-connected dominating set S.
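A centralized sketch of this greedy colouring (assuming a connected graph given as an adjacency list; names are ours, and the distributed bookkeeping of the zonal algorithm is omitted):

def greedy_wcds(adj):
    color = {v: 'white' for v in adj}

    def pieces():
        ps, seen = [], set()
        for b in (v for v in adj if color[v] == 'black'):
            if b in seen:
                continue
            piece, stack = set(), [b]
            while stack:
                v = stack.pop()
                if v in piece:
                    continue
                piece.add(v)
                if color[v] == 'black':
                    seen.add(v)
                    stack.extend(adj[v])      # every edge out of a black vertex is kept
                else:                         # gray vertices link pieces via black neighbours
                    stack.extend(w for w in adj[v] if color[w] == 'black')
            ps.append(piece)
        return ps + [{v} for v in adj if color[v] == 'white']   # white pieces

    while True:
        ps = pieces()
        if len(ps) <= 1:
            return {v for v in adj if color[v] == 'black'}
        # Dye black the non-black vertex whose closed neighbourhood meets the most pieces.
        best = max((v for v in adj if color[v] != 'black'),
                   key=lambda u: sum(1 for p in ps if p & ({u} | set(adj[u]))))
        color[best] = 'black'
        for w in adj[best]:
            if color[w] == 'white':
                color[w] = 'gray'

adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5], 5: [4]}
print(greedy_wcds(adj))   # e.g. {3, 4}: black vertices forming a WCDS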
C. Fixing the Borders
After calculating a small weakly-connected dominating set S_i for each region R_i of G, combining these solutions does not necessarily give a weakly-connected dominating set of G. It is likely necessary to include some additional vertices from the borders of the regions in order to obtain a weakly-connected dominating set of G. The edges of G are either dominated (that is, they have at least one endpoint in some dominating set S_i) or free (in which case neither endpoint is in a dominating set). Two regions R_i and R_j joined by a dominated edge can comprise a single region with dominating set S_i ∪ S_j, and do not need to have their shared border fixed.
The root of region R can learn, by polling all the vertices in its region, which regions are adjacent and can determine which neighbouring regions are not joined by a dominated edge. For each such pair of adjacent regions, one of the regions must "fix the border". To break ties, the region with the lower region ID takes control of this process, where the region ID is the vertex ID of the region root. In other words, if neighbouring regions R_i and R_j are not joined by a shared dominated edge, the region with the lower region ID adds a new vertex from the R_i/R_j border into the dominating set.
Figure 7. Fixing Borders.
For example, in Fig. 7, the regions have weakly-connected dominating sets indicated by the solid black vertices. Region R1 is adjacent to regions R2, R3, R4 and R5. Among these, regions R2 and R3 do not share dominated edges with R1. As R1 has a lower region ID than either R2 or R3, R1 is responsible for fixing these borders. The root of R1 adds u and v into the dominating set. R2 is adjacent to two regions, R1 and R3, but it is only responsible for fixing the R2/R3 border, due to the region IDs. The root of R2 adds w to the dominating set. A detailed description of this process for a given region R follows. The goal is for the root r to find a small number of dominated vertices within R to add to the dominating set. Here every vertex knows the vertex ID, color, and region ID of all of its neighbors (this can be done with a single round of information exchange). Root r collects the above neighborhood information from all of the border vertices of R.
We define a problem region with regard to R to be any region R' that is adjacent to R, does not share dominated edges with R, and has a higher region ID than R. Region R is responsible for fixing its border with each problem region.
Figure 8. Bipartite Graph
A bipartite graph B(X,Y,E) can be constructed from the collected information for root r. Vertex set X contains a vertex for each problem region with regard to R, and vertex set Y contains a vertex for every border vertex in R. There is an edge between vertices y_i and x_j iff y_i is adjacent to a vertex in problem region x_j in the original graph. Fig. 8 shows the bipartite graph constructed by region R_1 in the example of Fig. 7. In this bipartite graph, X = {R_2, R_3} and Y = {u, y, v}. In this case, {u,v} is a possible solution for R_1 to add to the weakly-connected dominating set in order to fix its borders with R_2 and R_3. To find the smallest possible set of vertices to add to the dominating set, r must find a minimum-size subset of Y that dominates X.
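Finding a minimum-size subset of Y that dominates X is a set cover problem; a greedy sketch is shown below (the adjacency for the Fig. 8 example is assumed, since the figure itself is not reproduced, and names are ours):

def fix_borders(border_adj, problem_regions):
    # border_adj maps each border vertex of Y to the problem regions of X it touches.
    uncovered, chosen = set(problem_regions), []
    while uncovered:   # assumes every problem region touches some border vertex
        y = max(border_adj, key=lambda v: len(border_adj[v] & uncovered))
        chosen.append(y)
        uncovered -= border_adj[y]
    return chosen

# Assumed Fig. 8 adjacency: u and y border R2, v borders R3; {u, v} covers X.
print(fix_borders({'u': {'R2'}, 'y': {'R2'}, 'v': {'R3'}}, ['R2', 'R3']))  # ['u', 'v']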
D. Performance Analysis
The execution time of this algorithm is O(x(log x + |S_max|)) and it generates O(m + n(log x + |S_max|)) messages, where S_max is the largest weakly-connected dominating set generated by any region and can be trivially bounded by O(x) from above. This zonal algorithm is regulated by a single parameter x, which controls the size of the regions. When x is small, the algorithm finishes quickly but with a large weakly-connected dominating set. When x is large, it behaves more like the non-localized algorithm and generates smaller weakly-connected dominating sets.
V. CLUSTERING USING A MINIMUM SPANNING TREE
An undirected graph is defined as G = (V,E), where V is a finite nonempty set of nodes v and E ⊆ V × V is a set of edges e. A graph G is connected if there is a path between any two distinct nodes. A graph G_S = (V_S,E_S) is a spanning subgraph of G = (V,E) if V_S = V. A spanning tree [6] [8] [15] of a graph is an undirected, connected, acyclic spanning subgraph. Intuitively, a spanning tree has the minimum number of edges needed to maintain connectivity, and a minimum spanning tree (MST) is a spanning tree of minimum total edge weight.
Gallager, Humblet and Spira [8] proposed a distributed algorithm which determines a minimum-weight spanning tree for an undirected graph that has distinct finite weights for every edge. The aim of the algorithm is to combine small fragments into larger fragments using outgoing edges. A fragment of an MST is a subtree of the MST. An outgoing edge is an edge of a fragment if one node connected to the edge is in the fragment and one connected node is not. The combination rules of fragments are related to levels. A fragment with a single node has level L = 0. Suppose two fragments, F at level L and F' at level L':
If L < L', then fragment F is immediately absorbed as part of fragment F'. The expanded fragment is at level L'.
Else, if L = L' and fragments F and F' have the same minimum-weight outgoing edge, then the fragments combine immediately into a new fragment at level L + 1.
Else, fragment F waits until fragment F' reaches a high enough level for combination.
Under the above rules, the combining edge is called the core of the new fragment. The two essential properties of MSTs for the algorithm are:
Property 1: Given a fragment of an MST, let e be a minimum-weight outgoing edge of the fragment. Then joining e and its adjacent non-fragment node to the fragment yields another fragment of an MST.
Property 2: If all the edges of a connected graph have different weights, then the MST is unique.
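A sketch of the level-based combination rules (the fragment representation is ours):

def combine(f, g):
    # Rule 1: absorb the lower-level fragment into the higher-level one.
    if f['level'] < g['level']:
        return {'level': g['level'], 'nodes': f['nodes'] | g['nodes']}
    # Rule 2: equal levels sharing the minimum-weight outgoing edge merge one level up.
    if f['level'] == g['level'] and f['min_out_edge'] == g['min_out_edge']:
        return {'level': f['level'] + 1, 'nodes': f['nodes'] | g['nodes']}
    return None   # Rule 3: f must wait for g to reach a high enough level

f1 = {'level': 0, 'nodes': {1}, 'min_out_edge': (1, 2)}
f2 = {'level': 0, 'nodes': {2}, 'min_out_edge': (1, 2)}
print(combine(f1, f2))   # a level-1 fragment {1, 2} whose core is edge (1, 2)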
The algorithm defines three different states of operation for a node: Sleeping, Find and Found. The states determine which of the following seven messages are sent and how a node reacts to them: Initiate, Test, Reject, Accept, Report(W), Connect(L) and Change-core. The identifier of a fragment is the core edge, that is, the edge that connects the two fragments together. A sample MANET and a minimum spanning tree constructed with Gallager, Humblet and Spira's algorithm can be seen in Fig. 9, where the nodes other than the leaf nodes (shown in black) depict a connected set of nodes. The upper bound on the number of messages exchanged during the execution of the algorithm is 5N log2 N + 2E, where N is the number of nodes and E is the number of edges in the graph. The worst-case time for this algorithm is O(N log N).
Dagdeviren et al. proposed the Merging Clustering Algorithm (MCA) [6], which finds clusters in a MANET by merging clusters to form higher-level clusters, as in Gallager et al.'s algorithm [8]. However, they focused on the clustering operation, discarding the minimum spanning tree. This reduces the message complexity from O(n log n) to O(n). The second contribution is the use of upper and lower bound parameters for the clustering operation, which results in a balanced number of nodes in the clusters formed. The lower bound is limited by a parameter K and the upper bound by 2K.
Figure 9. A MANET and its Spanning Tree.
VI. CONCLUSIONS
In this paper we discussed dominating set and spanning tree based clustering in mobile ad hoc networks and its performance analysis. The efficiency of dominating set based routing mainly depends on the overhead introduced in the formation of the dominating set and on the size of the dominating set. We discussed two algorithms which have low overhead in dominating set formation. Finally, we discussed the spanning tree approach to clustering MANETs. Distributed spanning tree and dominating set approaches can be merged to improve clustering in MANETs.
VII. FUTURE WORK
An interesting open problem in mobile ad hoc networks is to study the efficient dynamic updating of the backbone when nodes are moving at a reasonable speed, integrating the mobility of the nodes. The work can be extended to develop connected dominating set construction algorithms for networks in which hosts have different transmission ranges.
REFERENCES
[1] Baruch Awerbuch, Optimal Distributed Algorithms for Minimum Weight Spanning Tree, Counting, Leader Election and Related Problems, 19th Annual ACM Symposium on Theory of Computing, (1987).
[2] Fabian, Ivan, Connectivity Based k-Hop Clustering in Wireless Networks, Telecom Systems, (2003).
[3] Chen, Y. P., Liestman, A. L., A Zonal Algorithm for Clustering Ad Hoc Networks, International Journal of Foundations of Computer Science, vol. 14(2), (2003).
[4] Chan, H., Luk, M., Perrig, A., Using Clustering Information for Sensor Network Localization, DCOSS (2005).
[5] Das, B., Bharghavan, V., Routing in ad-hoc networks using minimum connected dominating sets, Communications, ICC '97, (1997).
[6] Dagdeviren, O., Erciyes, K., Cokuslu, D., Merging Clustering Algorithms in Mobile Ad hoc Networks, ICDCIT (2005).
[7] Gerla, M., Tsai, J., Multicluster, mobile, multimedia radio network, Wireless Networks, vol. 1(3), (1995).
[8] Gallager, R. G., Humblet, P. A., Spira, P. M., A Distributed Algorithm for Minimum-Weight Spanning Trees, ACM Transactions on Programming Languages and Systems, (1983).
[9] Fei Dai, Jie Wu, An extended localized algorithm for connected dominating set formation in ad hoc wireless networks, IEEE Transactions on Parallel and Distributed Systems, (2004).
[10] Yuanzhu Peter Chen, Arthur L. Liestman, Maintaining weakly-connected dominating sets for clustering ad hoc networks, Elsevier, (2005).
[11] Deniz Cokuslu, Kayhan Erciyes, Orhan Dagdeviren, A Dominating Set Based Clustering Algorithm for Mobile Ad hoc Networks, Springer Computational Science, ICCS (2006).
[12] Bo Han, Weijia Jia, Clustering wireless ad hoc networks with weakly connected dominating set, Elsevier, (2007).
[13] K. Alzoubi, P. J. Wan, O. Frieder, New Distributed Algorithm for Connected Dominating Set in Wireless Ad Hoc Networks, Proceedings of the 35th Annual Hawaii International Conference on System Sciences (HICSS'02), Volume 9, (2002).
[14] Stojmenovic, I., Dominating sets and neighbor elimination-based broadcasting algorithms, IEEE Transactions on Parallel and Distributed Systems, (2001).
[15] P. Victer Paul, T. Vengattaraman, P. Dhavachelvan, R. Baskaran, Improved Data Cache Scheme Using Distributed Spanning Tree in Mobile Adhoc Network, International Journal of Computer Science & Communication, Vol. 1, No. 2, July-December (2010).
[16] G. N. Purohit, Usha Sharma, Constructing Minimum Connected Dominating Set: Algorithmic approach, International Journal on Applications of Graph Theory in Wireless Ad hoc Networks and Sensor Networks (GRAPHHOC), (2010).
AUTHORS PROFILE
Dr. Suresh Varma Penumathsa is currently a Principal and Professor in Computer Science at Adikavi Nannaya University, Rajahmundry, Andhra Pradesh, India. He received a Ph.D. in Computer Science and Engineering, with specialization in Communication Networks, from Acharya Nagarjuna University in 2008. His research interests include communication networks and ad hoc networks. He has several publications in reputed national and international journals. He is a member of ISTE, ORSI, ISCA, IISA and AMIE.
Mr. R Krishnam Raju Indukuri is currently working as Sr. Asst. Professor in the Department of CS, Padmasri Dr. B.V.R.I.C.E, Bhimavaram, Andhra Pradesh, India. He is a member of ISTE. He has presented and published papers in several national and international conferences and journals. His areas of interest are ad hoc networks and design and analysis of algorithms.
Distributed Group Key Management with Cluster
based Communication for Dynamic Peer Groups
Rajender Dharavath
Department of Computer Science & Engineering
Aditya Engineering College
Kakinada, India.
[email protected]
K Bhima
2
Department of Computer Science & Engineering
Brilliant Institute of Engineering &Technology
Hyderabad, India.
[email protected]
Abstract- Secure group communication is an increasingly popular research area that has received much attention in recent years. Group key management is a fundamental building block for secure group communication systems. This paper introduces a new family of protocols addressing cluster based communication and distributed group key agreement for secure group communication in dynamic peer groups. In this scheme, group members are divided into subgroups called clusters. We propose three cluster based communication protocols with tree-based group key management. The protocols (1) provide communication within a cluster by generating a common group key within the cluster, (2) provide communication between clusters by generating a common group key between the clusters, and (3) provide communication among all clusters by generating a common group key among all clusters. In our approach the group key is updated for each session or when a user joins or leaves a cluster. Moreover, we use a Certificate Authority, which guarantees key authentication and protects our protocol from a wide range of attacks.
Keywords- Secure Group Communication; Key Agreement; Key Tree; Dynamic Peer Groups; Cluster.
I. INTRODUCTION
As a result of the increased popularity of group-oriented applications such as pay-TV, distributed interactive games, video conferencing and chat rooms, there is a growing demand for security services to achieve secure group communication. A common method is to encrypt messages with a group key, so that entities outside the group cannot decode them. A satisfactory group communication system would possess the properties of group key security, forward secrecy, backward secrecy, and key independence [1, 2, 3].
In this paper, research effort has been put into the design of a group key management scheme and three different cluster based communication protocols. There are three approaches to generating such group keys: centralized, decentralized, and distributed. Centralized key distribution uses a dedicated key server, resulting in simpler protocols. However, centralized methods fail entirely once the server is compromised, so the central key server makes a tempting target for adversaries. In addition, centralized key distribution is not suitable for dynamic peer groups, in which all nodes play the same function and role; it is therefore unreasonable to make one node the key server and place all trust in it. In the decentralized approach, multiple entities are responsible for managing the group, as opposed to a single entity. In contrast to both approaches, distributed key management requires each member to contribute a share to generate the group key, resulting in more complex protocols, and each member is equally responsible for generating and maintaining the group key.
In this paper the group key, or common key, is generated using the distributed key management approach. The group key is updated on every membership change and for every session, to preserve forward and backward secrecy [1, 2, 3]; this is called group rekeying.

To reduce the number of rekeying operations, Wong et al. [7] proposed a logical data structure called a key tree. Kim et al. [1] proposed a tree-based key agreement protocol, TGDH, which combines a key tree with Diffie-Hellman key exchange to generate and maintain the group key; however, it suffers from impersonation attacks because keys are not updated regularly, and it generates unnecessary messages. Building on these two ideas, Zhou and Ravishankar [6] proposed AFTD (Authenticated Fault-tolerant Tree-based Diffie-Hellman key exchange protocol), which combines key trees and Diffie-Hellman key exchange for group key generation.
Assume that the total network topology is considered as a group, which can be divided into subgroups called clusters. The group is divided into clusters based on the location identification numbers (LIDs) of users, and each cluster is assigned a cluster identification number (CID); both are given by the Certificate Authority (CA) at the time a user joins a cluster or the group. Issuing the location identification number and the public key certificate to a new user are offline actions performed by the CA.
Each cluster member maintains its own cluster key tree and generates the cluster group key for secure communication. We assume that in every cluster, every node can receive a message broadcast from the other nodes. Each cluster is headed by a cluster head, or sponsor, who is responsible for generating the cluster group key; the sponsor is the shallowest rightmost node (in the cluster key tree) relative to the user who joins or leaves the cluster.
The cluster group key, or cluster common key, is shared by all the cluster members, who communicate using it. Authentication is provided by the certificate authority, which issues the public key certificate and the location identification number (LID) prior to the user joining the cluster or group.
The rest of the paper is organized as follows. Section 2 reviews related work in this field. We present our proposed scheme in Section 3; communication protocols and group key management techniques are discussed in Section 4. Dynamic network peer groups are presented in Section 5 and the security analysis in Section 6. Finally, we conclude in Section 7.
II. RELATED WORK
Key trees [6] were first proposed for centralized key distribution, and Kim et al. [1] adapted them to the distributed key agreement protocol TGDH. In TGDH [1] every group member creates a key tree separately. Each leaf node is associated with a real group member, while each non-leaf node is considered a virtual member. In TGDH, every node on the key tree has a Diffie-Hellman key pair based on the prime $p$ and generator $\alpha$, used to generate the group key. The secret-public key pair for real member $M_i$ is:

\{KM_i,\ BKM_i = \alpha^{KM_i} \bmod p\} \qquad (1)
The secret-public key pair for virtual member $V_i$ is:

\{KV_i,\ BKV_i = \alpha^{KV_i} \bmod p\} \qquad (2)
The public key $BKM_i$ is also called the blinded key. Consider a node $M_v$ whose left child is $M_{lv}$ and whose right child is $M_{rv}$ (to simplify the description, we do not distinguish real members from virtual members here). The secret key of $M_v$ can be computed in the usual Diffie-Hellman key exchange fashion as follows:

KM_v = (BKM_{lv})^{KM_{rv}} \bmod p = (BKM_{rv})^{KM_{lv}} \bmod p \qquad (3)
With all blinded keys well known, each group member can compute the secret keys of all nodes on its key path, comprising the nodes from its leaf node up to the root. The root node's secret key KV0 is known to all group members and becomes the group key. In Figure 2, cluster member U12 knows the key pairs of U12, V11 and V10; V10's secret key is the cluster group key.
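As a concrete illustration, the following is a minimal Java sketch of the blinded-key and node-key arithmetic of equations (1)-(3); the group parameters, key sizes and class name are illustrative assumptions rather than values from the paper.

import java.math.BigInteger;
import java.security.SecureRandom;

public class TgdhNodeKeys {
    private static final SecureRandom RNG = new SecureRandom();
    // Illustrative group parameters; a real deployment would use standardized ones.
    private static final BigInteger P = BigInteger.probablePrime(512, RNG);
    private static final BigInteger ALPHA = BigInteger.valueOf(2);

    // Blinded key BK = alpha^K mod p, as in equations (1) and (2).
    static BigInteger blind(BigInteger secret) {
        return ALPHA.modPow(secret, P);
    }

    // Parent secret: (sibling's blinded key)^(own secret) mod p, as in equation (3).
    static BigInteger parentSecret(BigInteger ownSecret, BigInteger siblingBlinded) {
        return siblingBlinded.modPow(ownSecret, P);
    }

    public static void main(String[] args) {
        BigInteger kLeft = new BigInteger(256, RNG);   // left child's secret
        BigInteger kRight = new BigInteger(256, RNG);  // right child's secret
        // Both children derive the same parent secret from the other's blinded key.
        BigInteger fromLeft = parentSecret(kLeft, blind(kRight));
        BigInteger fromRight = parentSecret(kRight, blind(kLeft));
        System.out.println(fromLeft.equals(fromRight)); // prints: true
    }
}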
In AFTD [6], as the group size increases, the group rekeying operation becomes complex; this degrades performance and generates more messages to distribute the group key, which is the main limitation of the AFTD protocol.

Renuka A. and K. C. Shet [9] proposed cluster based communications, which differ from our approach in key management and in the communication protocols. Our detailed communication protocols and key management scheme are discussed in this paper.
Lee et al. [4, 5] have designed several tree-based distributed key agreement protocols, reducing the rekeying complexity by performing interval based rekeying. They also present an authenticated key agreement protocol. As the success of their scheme is partially based on a certificate authority, their protocol encounters the same problems as centralized trust mechanisms.

Nen-Chung Wang and Shian-Zhang Fang [10] proposed a hierarchical key management scheme for secure group communications in mobile ad hoc networks. Their scheme involves a very complex process for forming clusters and for communication.

Gouda et al. [11] describe a new use of key trees. They are concerned with using the existing subgroup keys in the key tree to securely multicast data to different subgroups within the group. Unlike their approach, which depends on a centralized key server to maintain the unique key tree and manage all keys, our paper solves this problem in a distributed fashion.
III. PROPOSED SCHEME
A. System Model
To overcome the limitations of the AFTD [6] protocol, the entire set of group members in the network is divided into a number of subgroups called clusters; the layout of the network is shown in Figure 1.

A cluster is formed based on the location identification numbers (LIDs) of the users, and clusters are assigned cluster identification numbers (CIDs), which are given offline by the Certificate Authority (CA). If a user's LID is equal to a cluster's CID, then that user belongs to that particular cluster. The CID and LID are unique for each cluster.
In this paper each cluster member maintains its own cluster key tree as shown in Figure 2 (a, b, c); the leaf nodes in the cluster key tree are the cluster users (real users), and the non-leaf nodes are virtual users. We propose three different types of communication protocols with distributed tree-based group key management.
The cluster communication protocols are given below:
- Intra Cluster Communication protocol (ICC),
- Inter Cluster Communication protocol (IRCC), and
- Global Communication (GC) protocol.
Communication among the users within a cluster is called Intra Cluster Communication. Communication between two clusters is called Inter Cluster Communication; when IRCC occurs between clusters, the corresponding reduced cluster key tree is generated as shown in Figure 4 for generating the group key. Communication among all clusters is called Global Communication, and the corresponding cluster key tree is generated as shown in Figure 5 for generating the group key. The communications are illustrated in Figure 3.
Figure 1. Network Layout and Initialization

a. Cluster C1 Key Tree   b. Cluster C2 Key Tree   c. Cluster C3 Key Tree
Figure 2. Key trees of clusters

Figure 3. Illustration of communications.
B. Group Key Management Scheme
In fact, an update of a blinded key need be sent only to a cluster's members, instead of the entire group (all clusters), depending on the type of communication. We send each node's blinded keys only to its cluster members. In this paper each cluster member constructs a key tree independently. Each real user $U_{ij}$ of a cluster $C_i$ has two key pairs. The first is a Diffie-Hellman key pair, used to generate the group key:

\{KU_{ij},\ BKU_{ij} = \alpha^{KU_{ij}} \bmod p\} \qquad (4)
The second is an RSA secret-public key pair $\{D_{ij}, E_{ij}\}$, used to provide source authentication. In the key tree, non-leaf nodes are virtual users (virtual clusters in the case of global or inter cluster communication) and have only a Diffie-Hellman key pair:

\{KV_{ij},\ BKV_{ij} = \alpha^{KV_{ij}} \bmod p\} \qquad (5)
Group key management for user communication occurs in two phases:
- Initialization phase
- Group key generation and distribution phase
1) Initialization Phase
The certificate authority (CA) distributes the appropriate public key certificates to clusters; it does not issue renewed public key certificates for existing group members during the process of cluster or group key updating.

A new member wishing to join the group may obtain a joining certificate and an LID (based on the location where the user wants to join) from the CA at any time prior to joining.
The CA uses an RSA secret-public key pair $\{Sk, Pk\}$ and establishes a public key certificate for each cluster user $U_{ij}$ by signing $U_{ij}$'s public key with its secret key $Sk$. User $U_{ij}$'s public key certificate $\langle U_{ij}, PUBU_{ij}, E_{ij} \rangle_{Sk}$ is then distributed to its cluster users. Since the public key $Pk$ is well known, any cluster user can verify this certificate and obtain $U_{ij}$'s public key. A sketch of this signing step is given below.
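The following is a minimal sketch, using the standard java.security API, of how the CA's signing and a cluster user's verification of such a certificate might look; the key size, the SHA256withRSA algorithm choice and the class name are illustrative assumptions, as the paper does not fix them.

import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class CaCertificateSketch {
    public static void main(String[] args) throws Exception {
        // CA's RSA pair {Sk, Pk} and a user's RSA pair {Dij, Eij}; sizes illustrative.
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair ca = gen.generateKeyPair();
        KeyPair user = gen.generateKeyPair();

        // The CA signs the user's encoded public key with its secret key Sk.
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(ca.getPrivate());
        signer.update(user.getPublic().getEncoded());
        byte[] certificateSignature = signer.sign();

        // Any cluster user holding the well-known Pk can verify the certificate.
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(ca.getPublic());
        verifier.update(user.getPublic().getEncoded());
        System.out.println(verifier.verify(certificateSignature)); // prints: true
    }
}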
2) Group Key Generation and Distribution Phase
Group key generation and distribution for cluster
communication occurs in three different ways.
- Group key generation and distribution in ICC.
- Group key generation and distribution in IRCC.
- Group key generation and distribution in GC.
The above group key generation and distribution techniques for cluster communication are implemented in the respective communication protocols and in dynamic peer groups (Section 5).
IV. COMMUNICATION PROTOCOLS
The communication protocols are as follows.
- Intra Cluster Communications (ICC).
- Inter Cluster Communications (IRCC).
- Global Communications (GC).
A. Intra Cluster Communications (ICC)
Communication among the users within a cluster is called Intra Cluster Communication. An example of intra cluster communication is shown in Figure 3, and the corresponding cluster key tree is shown in Figure 2.

In order for users to communicate with each other within the cluster, they need a common cluster group key, which is generated from their cluster key tree in Diffie-Hellman key exchange fashion.
Steps for the generation and distribution of the cluster group key in ICC (algorithm for cluster common key generation in ICC):
- Select the cluster in which Intra Cluster Communication is to be done.
- Each cluster Ci generates its own cluster key tree.
- The secret key KVij of the root node Vij of cluster Ci is generated in DH key exchange fashion from its leaf nodes (the generation of the cluster group key, or common key, is explained under dynamic peer groups).
- The root node Vij's secret key KVij becomes the cluster group key, or common key, for cluster Ci and is shared by all members of the cluster.
- For each session the cluster group key is changed by the members changing their contributions.
- The newly generated cluster group key KVij is distributed among all members of the cluster.
B. Inter Cluster Communications (IRCC)
Communication of one cluster with another cluster is called Inter Cluster Communication. An example of IRCC is shown in Figure 3, and the corresponding reduced cluster key tree is generated as shown in Figure 4. In this figure VC0 is a virtual cluster, and it has only a DH key pair:

\{KVC_i,\ BKVC_i = \alpha^{KVC_i} \bmod p\} \qquad (6)

The secret-public key pair of virtual cluster VCi is for generating the clusters' common key, which is generated in DH key fashion and distributed to both clusters so that they can communicate with each other.
Figure 4. Reduced IRCC Key Tree.
The steps for generation and distribution of the common key for clusters in IRCC (algorithm for group key generation in IRCC):
- Select the clusters for IRCC and form the reduced cluster key tree as shown in Figure 4.
- Each cluster has its own cluster group key, or cluster common key, generated from its cluster key tree in DH key fashion.
- Cluster Ci's and cluster Cj's secret keys KCi and KCj are calculated respectively (as explained in the intra cluster communication algorithm).
- Using KCi and KCj, the root node VCi (the parent node of Ci and Cj, or virtual cluster) calculates its secret key KVCi in DH key exchange fashion.
- The root node VCi's secret key KVCi is the common key for both cluster Ci and cluster Cj.
- KVCi is distributed to both clusters and is shared by all members of each cluster so they can communicate with each other.
- For each session the common key for the clusters is recalculated by the members of each cluster changing their shares, and it is distributed to all members of both clusters.
C. Global Communication (GC)
Communication among all clusters in a group is called Global Communication. When clusters C1, C2 and C3 communicate, the reduced global communication key tree is generated as shown in Figure 5, and the common global key is generated in DH key exchange fashion. In this figure, leaf nodes are real clusters and non-leaf nodes are virtual clusters.
Figure 5. Reduced GC Key Tree
Steps for global key generation and distribution in GC (algorithm for global key generation and distribution in GC):
- Each cluster generates its own cluster key tree.
- For each cluster key tree, the root's secret key is generated; these are the common keys for the respective clusters.
- Cluster Ci's, Cj's and Ck's secret keys KCi, KCj and KCk are calculated respectively from their cluster key trees.
- With these three clusters, the reduced Global Communication key tree is formed as shown in Figure 5.
- The root node VCi's secret key KVCi (from the reduced GC key tree) is calculated in DH key fashion; it is the common key for all clusters Ci, Cj and Ck.
- VCi's secret key KVCi is distributed to all clusters and is shared by all members of each cluster for communicating globally.
- For each session the global key is recalculated by the members of each cluster changing their shares, and it is distributed to all members of all clusters.
V. DYNAMIC PEER GROUPS
The number of nodes or clusters in the network is not necessarily fixed. A new node (user) or cluster may join the network, or existing nodes or clusters may leave it.
A. User Joins the Cluster
Assume that a new user Uij+1 wishes to join a k-user cluster {U1, U2, ..., Uk}. Uij+1 is required to authenticate itself by presenting a join request signed with Sk. Uij+1 may obtain a signature on its join request by establishing credentials with the offline certificate authority.

When the users of the cluster receive the joining request, they independently determine Uij+1's insertion node in the key tree, defined as in [1]: the shallowest rightmost node, or the root node when the key tree is well balanced. They also independently determine a real user called the join sponsor Us [1] to take responsibility for coordinating the join; this is the rightmost leaf node in the subtree rooted at the insertion node.

No keys in the key tree change at a join, except the blinded keys for the nodes on the sponsor node's key path. The sponsor simply recomputes the cluster group key and sends updates for the blinded keys on its own key path to their corresponding clusters. The join works as shown below.
Steps for group key, or cluster common key, generation and distribution when a user joins a cluster (algorithm for user join):
- The new user Uij+1 obtains the LID and public key certificate from the CA.
- User Uij+1 selects the appropriate cluster by comparing its LID with the CID (LID = CID).
- The user Uij+1 broadcasts the signed join request to its cluster Ci.
- Cluster Ci's members determine the insertion point and update their key trees by creating a new intermediate node and promoting it to become the parent of the insertion node and Uij+1.
- Each cluster member adjusts the cluster key tree by adding Uij+1 adjacent to the insertion point.
- The sponsor Us computes the new cluster group key, or cluster common key.
- The sponsor Us then sends the updated blinded keys of the nodes on its key path to their corresponding clusters.
- These messages are signed by the sponsor Us.
- Uij+1 takes the public keys needed for generating the cluster group key and generates the group key.
The cluster group key (for cluster C3), or cluster common key, for Figure 6 is generated as follows (steps for group key or common key generation):
- Let U31's secret share be KU31; then the secret-public key pair of U31 (following DH key fashion) is:

\{KU_{31},\ BKU_{31} = \alpha^{KU_{31}} \bmod p\} \qquad (7)

- Let U32's secret share be KU32; then the secret-public key pair of U32 is:

\{KU_{32},\ BKU_{32} = \alpha^{KU_{32}} \bmod p\} \qquad (8)

- Let U33's secret share be KU33; then the secret-public key pair of U33 is:

\{KU_{33},\ BKU_{33} = \alpha^{KU_{33}} \bmod p\} \qquad (9)

- Let U34's secret share be KU34; then the secret-public key pair of U34 is:

\{KU_{34},\ BKU_{34} = \alpha^{KU_{34}} \bmod p\} \qquad (10)

- Let U35's secret share be KU35; then the secret-public key pair of U35 is:

\{KU_{35},\ BKU_{35} = \alpha^{KU_{35}} \bmod p\} \qquad (11)
- Now V33's secret and public keys (KV33, BKV33) are calculated as follows (in DH key exchange fashion from U31 and U32):

KV_{33} = (BKU_{31})^{KU_{32}} \bmod p = (BKU_{32})^{KU_{31}} \bmod p \qquad (12)
BKV_{33} = \alpha^{KV_{33}} \bmod p \qquad (13)

- Now V32's secret and public keys (KV32, BKV32) are calculated as follows (from U34 and U35):

KV_{32} = (BKU_{34})^{KU_{35}} \bmod p = (BKU_{35})^{KU_{34}} \bmod p \qquad (14)
BKV_{32} = \alpha^{KV_{32}} \bmod p \qquad (15)

- Now V31's secret-public key pair (from V33 and U33) is:

KV_{31} = (BKV_{33})^{KU_{33}} \bmod p = (BKU_{33})^{KV_{33}} \bmod p \qquad (16)
BKV_{31} = \alpha^{KV_{31}} \bmod p \qquad (17)

- Finally, V30's secret-public key pair (from V31 and V32) is:

KV_{30} = (BKV_{31})^{KV_{32}} \bmod p = (BKV_{32})^{KV_{31}} \bmod p \qquad (18)
BKV_{30} = \alpha^{KV_{30}} \bmod p \qquad (19)

- The root node V30's secret key is taken as cluster C3's group key, or cluster common key, through which communication is done.
- This common cluster key is distributed to all cluster members.
Following the same steps for group key, or common key, generation, the common key is generated for all the different cluster communications and for dynamic peer groups. A sketch of this bottom-up computation is given below.
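The following is a minimal Java sketch of the bottom-up computation of equations (7)-(19) for the cluster C3 key tree; the group parameters, key sizes and names are illustrative assumptions.

import java.math.BigInteger;
import java.security.SecureRandom;

public class ClusterC3GroupKey {
    static final SecureRandom RNG = new SecureRandom();
    static final BigInteger P = BigInteger.probablePrime(512, RNG);
    static final BigInteger ALPHA = BigInteger.valueOf(2);

    static BigInteger fresh() { return new BigInteger(256, RNG); }
    static BigInteger blind(BigInteger k) { return ALPHA.modPow(k, P); }
    // DH combination of two sibling secrets: (alpha^b)^a mod p, as in (12)-(19).
    static BigInteger combine(BigInteger a, BigInteger b) {
        return blind(b).modPow(a, P);
    }

    public static void main(String[] args) {
        BigInteger ku31 = fresh(), ku32 = fresh(), ku33 = fresh(),
                   ku34 = fresh(), ku35 = fresh();   // members' secret shares
        BigInteger kv33 = combine(ku31, ku32);       // V33 from U31 and U32
        BigInteger kv32 = combine(ku34, ku35);       // V32 from U34 and U35
        BigInteger kv31 = combine(ku33, kv33);       // V31 from V33 and U33
        BigInteger kv30 = combine(kv31, kv32);       // V30: cluster C3's group key
        System.out.println("group key bit length: " + kv30.bitLength());
    }
}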
In Figure 6, a new user U36 wants to join cluster C3. The join sponsor U33 creates a new intermediate node V34 in the key tree and promotes it to become the parent of U33 and U36. The sponsor U33 computes the new cluster group key and sends the updated BKV34 and BKV31 to the remaining members {U31, U32, U34, U35} of cluster C3.
Figure 6. User joins in Cluster C3
B. User Leaves the Cluster
Assume that a member Uij wishes to leave an n-member cluster. First, Uij initiates the leave protocol by sending a leave request. When the other users of the cluster receive the request, they independently determine the sponsor node, which is the rightmost leaf node of the subtree rooted at the leaving member's sibling node, as defined in [1]. The leave protocol works as given below.

Steps for group key generation and distribution when a user leaves the cluster (algorithm for user leave):
- User Uij broadcasts its leave request to the remaining users of its cluster Ci.
- The former sibling node of Uij is promoted to replace Uij's parent node.
- The size of the cluster that formerly contained Uij is decreased by one.
- The sponsor Us picks a new secret key KUs, computes the new cluster group key, and sends the updated blinded keys of the nodes on its key path to their corresponding cluster users.
- These messages are signed by the sponsor Us.
- The group key is prepared in DH key exchange fashion, as explained under dynamic peer groups.
In Figure 7, U36 leaves cluster C3. The sponsor U33 picks a new secret key KU33, computes the new group key, and sends the updated BKU33, BKV31 and BKV30 to the cluster users {U31, U32, U34, U35}.
Figure 7. User U36 leaves the Cluster C3.
C. Updating Secret Shares & RSA Keys
In this scheme, each group user is required to update its Diffie-Hellman keys before each group session, or during a session when it is selected as a sponsor on a user's leaving. Source authentication of the updated blinded keys is guaranteed by the sender's RSA signature. Further, to ensure the long-term secrecy of the RSA keys, each group user is required to renew its RSA key pair periodically and send it to its cluster users securely using its current RSA secret key.
VI. SECURITY ANALYSIS
Users in a network group are usually considered part of the security problem, since there are no fixed nodes to perform the authentication service. The Certificate Authority, which may be distributed, is online during initialization but remains offline subsequently. During initialization, the CA distributes key certificates and location IDs, so that the function of key authentication can be realized and distributed across the appropriate clusters.
A. Forward Secrecy
If a hacker (or old member) can compromise any node and obtain its key, it is possible for the hacker to start a new key agreement protocol by impersonating the compromised node. For our scheme we can conclude that a passive hacker who knows a contiguous subset of old group keys cannot discover any subsequent group key. In this way, forward secrecy is achieved.
B. Backward Secrecy
A passive hacker (or newly joined member) who knows a contiguous subset of group keys cannot discover any previous group key when the group key is changed upon a group join or leave.
C. Key Independence
This is the strongest property for dynamic peer groups. It guarantees that a passive adversary who knows some previous group keys cannot determine new group keys.
VII. CONCLUSION
In this paper, we have presented three communication protocols with distributed group key management for dynamic peer groups using key trees, dividing the group into subgroups called clusters. We provided strong authentication with LIDs and CIDs for cluster formation, and source authentication of users in communication with RSA keys. DH secret-public key pairs are used for common key generation. The Certificate Authority provides the RSA keys and LIDs for all users, and CIDs for all clusters, for all types of cluster communication.

In future work we can extend this application with cluster head communications, sponsor coordination, and cluster merging or disjoining in dynamic networks.
ACKNOWLEDGMENT
We would like to thank K Sahadeviah for helpful discussions about different key management schemes and modes of providing authentication. We thank Krishna Prasad for discussions on effective presentation of concepts. We also thank our friends for the design of the network framework.
REFERENCES
[1] Kim, Y., Perrig, A., Tsudik, G.: Simple and fault-tolerant key agreement for dynamic collaborative groups. In: Proceedings of CCS'00 (2000).
[2] Steiner, M., Tsudik, G., Waidner, M.: Key agreement in dynamic peer groups. IEEE Transactions on Parallel and Distributed Systems 11 (2000).
[3] Perrig, A.: Efficient collaborative key management protocols for secure autonomous group communication. In: Proceedings of CrypTEC'99 (1999).
[4] Lee, P., Lui, J., Yau, D.: Distributed collaborative key agreement protocols for dynamic peer groups. In: Proceedings of ICNP'02 (2002).
[5] Lee, P., Lui, J., Yau, D.: Distributed collaborative key agreement protocols for dynamic peer groups. Technical report, Dept. of Computer Science and Engineering, Chinese University of Hong Kong (2002).
[6] Zhou, L., Ravishankar, C.V.: Efficient, authenticated, and fault-tolerant key agreement for dynamic peer groups. Technical Report 88, Dept. of Computer Science and Engineering, University of California, Riverside (2004).
[7] Wong, C., Gouda, M., Lam, S.: Secure group communication using key graphs. In: Proceedings of ACM SIGCOMM'98, Vancouver, Canada (1998).
[8] Steiner, M., Tsudik, G., Waidner, M.: Cliques: A new approach to group key agreement. In: Proceedings of ICDCS'98, Amsterdam, Netherlands (1998).
[9] Renuka, A., Shet, K.C.: Cluster based group key management in mobile ad hoc networks (2009).
[10] Wang, N.-C., Fang, S.-Z.: A hierarchical key management scheme for secure group communications in mobile ad hoc networks (2007).
[11] Gouda, M.G., Huang, C., Elnozahy, E.N.: Key trees and the security of interval multicast. In: Proceedings of ICDCS'02, Vienna, Austria (2002).
[12] Wallner, D., Harder, E., Agee, R.: Key management for multicast: Issues and architecture. Internet Draft, draft-wallner-key-arch-01.txt (1998).
[13] Ateniese, G., Steiner, M., Tsudik, G.: New multiparty authentication services and key agreement protocols. IEEE Journal on Selected Areas in Communications 18 (2000).
[14] Pereira, O., Quisquater, J.: A security analysis of the cliques protocol suites. In: Proceedings of the 14th IEEE Computer Security Foundations Workshop (2001).
[15] Zhou, L., Haas, Z.J.: Securing ad hoc networks. IEEE Network Magazine 13(6) (1999).
[16] Huh, E.-N., Sultana, N.: Application driven cluster based group key management with identifier in mobile wireless sensor networks (2007).
[17] Balenson, D., McGrew, D., Sherman, A.: Key management for large dynamic groups: One-way function trees and amortized initialization. IETF (Feb 1999).
[18] Kim, Y., Perrig, A., Tsudik, G.: Communication-efficient group key agreement. In: Proceedings of IFIP SEC 2001, pp. 229-244 (2001).
[19] Del Valle Torres, G., Gomez Cardenas, R.: Overview of key management in ad hoc networks (2004).
[20] Rafaeli, S., Hutchison, D.: A survey of key management for secure group communication. ACM Computing Surveys 35(3), pp. 309-329 (2003).
[21] Wu, B., Wu, J., Dong, Y.: An efficient group key management scheme for mobile ad hoc networks. Int. J. Security and Networks (2008).
AUTHORS PROFILE
Mr. Rajendar Dharavath is currently an Assistant Professor in the Department of Computer Science and Engineering, Aditya Engineering College, Kakinada, Andhra Pradesh, India. He completed his B.Tech in CSE from CJITS Jangaon, Warangal, and his M.Tech in CSE from JNTU Kakinada. His research interests include mobile ad hoc networks, network security, and data mining & data warehousing.

Mr. Bhima K is currently an Associate Professor and Head of the Department of Computer Science and Engineering, Brilliant Institute of Engineering and Technology, Hyderabad, Andhra Pradesh, India. He completed his B.Tech in CSE from RVR&JC Engg. College, Guntur, and his M.Tech in SE from NIT Allahabad. His research interests include mobile ad hoc networks, network security, computer networks, and software engineering.
Extracting Code Resource from OWL by Matching
Method Signatures using UML Design Document
UML Extractor
Gopinath Ganapathy¹
¹Department of Computer Science
Bharathidasan University
Trichy, India.
[email protected]

S. Sagayaraj²
²Department of Computer Science
Sacred Heart College
Tirupattur, India
[email protected]
Abstract- Software companies develop projects in various domains, but hardly archive their programs for future use. In the proposed approach, method signatures are stored in an OWL ontology and the source code components are stored in HDFS; the OWL considerably reduces software development cost. The design phase generates many artifacts; one such artifact is the UML class diagram for the project, which contains classes, methods, attributes, relations, etc., as metadata. Methods needed for a project can be extracted from the OWL using this UML metadata. The UML class diagram is given as input and the metadata about each method is extracted. The method signature is searched in the OWL for similar method prototypes, and the appropriate code components are extracted from HDFS and reused in a project. Through this process the time, manpower, system resources and cost of software development are reduced.

Keywords- Unified Modeling Language; XML; XMI Metadata Interchange; Metadata; Web Ontology Language; Jena framework.
I. INTRODUCTION
The World Wide Web has changed the way people communicate with each other. The term Semantic Web comprises techniques that promise to dramatically improve the current Web and its use. Today's Web content is huge and not well suited for human consumption; the machine-processable Web is called the Semantic Web. The Semantic Web will not be a new global information highway parallel to the existing World Wide Web; instead it will gradually evolve out of the existing Web [1]. Ontologies are built in order to represent generic knowledge about a target world [2]. In the Semantic Web, ontologies can be used to encode meaning into a web page, which will enable intelligent agents to understand the contents of the page. Ontologies increase the efficiency and consistency of describing resources by enabling more sophisticated functionality in the development of knowledge management and information retrieval applications. From the knowledge management perspective, current technology suffers in searching, extracting, maintaining and viewing information. The aim of the Semantic Web is to allow much more advanced knowledge management systems.
To develop such knowledge management systems, software companies can make use of already developed code; that is, they can develop new software projects with reusable code. The concept of reuse is not a new one; it is, however, relatively new to the software profession. Every engineering discipline, from mechanical, industrial and hydraulic to electrical engineering, understands the concept of reuse. Software engineers, however, often feel the need to be creative and like to design one-time-use components; in effect they produce a unique solution for every problem. Reuse is a process, an applied concept and a paradigm shift for most people. There are many definitions of reuse. In plain and simple words, reuse is the process of creating new software systems from existing software assets rather than building new ones.

Systematic reuse of previously written code is a way to increase software development productivity as well as the quality of the software [3, 4, 5]. Reuse of software has been cited as the most effective means of improving productivity in software development projects [6, 7]. Many artifacts can be reused, including code, documentation, standards, test cases, objects, components and design models. Few organizations dispute the benefits of reuse, although these benefits certainly vary from organization to organization and, to a degree, in economic rationale. Some general reusability guidelines, which are quite often similar to general software quality guidelines, include [8]: ease of understanding, functional completeness, reliability, good error and exception handling, information hiding, high cohesion and low coupling, portability, and modularity. Reuse can provide improved profitability, higher productivity and quality, reduced project costs, quicker time to market and better use of resources. The challenge is to quantify these benefits.
For every new project, software teams design new components and code by employing new developers. If the company archives completed code and components, they can be reused with no further testing, unlike open source code and components. This has a recursive effect on development time, testing, deployment and developers, so there is a basic necessity to create a system that minimizes these factors.
Code reusability is a direct solution to this problem: it avoids re-developing and re-testing existing work. As the developed code has undergone a rigorous software development life cycle, it is robust and error free; there is no need to reinvent the wheel. To reuse code, a tool can be created that extracts metadata such as function, definition, type, arguments, brief description, author, and so on from the source code and stores it in OWL, while the source code itself is stored in the HDFS repository. For a new project, the developer can search for components in the OWL and retrieve them with ease. The OWL represents the company's knowledgebase of reusable code.

The project metadata is stored in OWL and the source code is stored in the Hadoop Distributed File System (HDFS) [9]. The client and the developer decide on and approve the design document; in this paper the UML class diagram is the design document considered as input to the system. The method metadata is extracted from the UML and passed to SPARQL to extract the available methods from the OWL. On selecting an appropriate method from the list, the code component is retrieved from HDFS. The purpose of using a UML diagram as input is that, before developing software, this tool can be used to estimate how many methods can be obtained by extraction rather than developed. The UML diagram is a powerful tool that acts between the developer and the user; it is like a contract where both parties agree on the software to be developed. After extracting the methods from the UML diagram, these methods are matched in the OWL. From the retrieved methods the developer can account for how many are already available in the repository and how many are still to be developed. The more methods retrieved, the shorter the development time. To obtain more method matches, the company should store more projects: as projects are uploaded into the OWL and HDFS, the corporate knowledge grows, and developers use more reused code rather than developing it themselves. Using reused code, development cost comes down, development time becomes shorter, resource utilization is lower and quality goes up.
The paper begins with a note on the related technology in Section 2. The detailed features and framework of the Source Code Retriever are found in Section 3. The Keyword Extractor for UML is in Section 4, the Method Retriever based on the Jena framework in Section 5, and the Source Retriever for HDFS in Section 6. The implementation scenario is in Section 7. Section 8 deals with the findings and future work of the paper.
II. RELATED WORK
A. Metadata
Metadata is defined as data about data, or descriptions of stored data. Metadata definition is about defining, creating, updating, transforming and migrating all types of metadata that are relevant and important to a user's objectives. Some metadata, such as file dates and file sizes, can be seen easily by users, while other metadata can be hidden. Metadata standards include not only those for modeling and exchanging metadata, but also the vocabulary and knowledge for ontology [10]. Many efforts have been made to standardize metadata, but all of these efforts belong to some specific group or class. The Dublin Core Metadata Initiative (DCMI) [11] is perhaps the largest candidate for defining metadata. It is a simple yet effective element set for describing a wide range of networked resources and comprises 15 elements; Dublin Core is most suitable for document-like objects. IEEE LOM [12] is a metadata standard for learning objects, with approximately 100 fields to define any learning object. Medical Core Metadata (MCM) [13] is a standard metadata scheme for health resources. The MPEG-7 [14] multimedia description schemes provide metadata structures for describing and annotating multimedia content. Standard knowledge ontology is also needed to organize such types of metadata as content metadata and data usage metadata.
B. Hadoop & HDFS
The Hadoop project promotes the development of open source software, and it supplies a framework for the development of highly scalable distributed computing applications [15]. Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment, and it also supports data-intensive distributed applications. Hadoop is designed to process large volumes of information efficiently [16]. It connects many commodity computers so that they can work in parallel, tying smaller and lower-priced machines into a compute cluster. It offers a simplified programming model that allows the user to write and test distributed systems quickly, and it performs efficient, automatic distribution of data and work across machines, in turn exploiting the underlying parallelism of the CPU cores. The monitoring system re-replicates data in response to system failures that can result in partial storage. Even though file parts are replicated and distributed across several machines, they form a single namespace, so their contents are universally accessible. MapReduce [17] is a functional abstraction which provides an easy-to-understand model for designing scalable, distributed algorithms.
C. Ontology
The key component of the Semantic Web is the collection of information called ontologies. Ontology is a term borrowed from philosophy that refers to the science of describing the kinds of entities in the world and how they are related. Gruber defined an ontology as a specification of a conceptualization [18]. An ontology defines the basic terms and their relationships, comprising the vocabulary of an application domain and the axioms for constraining the relationships among terms [19]. This definition explains what an ontology looks like [20]. The most typical kind of ontology for the Web has a taxonomy and a set of inference rules. The taxonomy defines classes of objects and relations among them. Classes, subclasses and relations among entities are a very powerful tool for Web use.
III. SOURCE CODE RETRIEVER FRAMEWORK
The Source Code Retriever makes use of the OWL constructed for the project, and the source code of the project is stored in HDFS [21]. All the project information of a software company is stored in the OWL. The project source can run to terabytes, and the corporate branches are
spread over various geographical locations, so the source is stored in a Hadoop repository to provide a distributed computing environment. The Source Code Retriever is a framework that takes a UML class diagram or XMI (XML Metadata Interchange) file as input from the user and suggests reusable methods for the given class diagram. The Source Code Retriever consists of three components: Keyword Extractor for UML, Method Retriever and Source Retriever. The process of the Source Code Retriever framework is presented in Fig. 1. The Keyword Extractor for UML extracts the metadata from the UML class diagram; the class diagram created by the Umbrello tool is passed as its input.

The input for the framework can be an existing UML class diagram or one created with the tool. Both types of input are loaded into Umbrello, and the file type for storing the UML class diagram is the XMI format. The file is parsed for metadata extraction; the parser extracts method signatures from the XMI file and passes them to the Method Retriever component.

Figure 1. Process of Source Retriever

The Method Retriever component retrieves the matched methods from the repository, constructing a SPARQL query to retrieve the matched results. The user selects the appropriate method from the list of methods and retrieves the source code through the Source Retriever component, which interacts with HDFS and displays the source code.
IV. KEYWORD EXTRACTOR FOR UML
Unified Modeling Language (UML) is a visual language for specifying, constructing and documenting the artifacts of systems. It is a standardized general-purpose modeling language in the field of software engineering. To create the UML class diagram, the Umbrello UML Modeller open source tool is used, and the diagram is stored in XMI format. Umbrello UML Modeller is a Unified Modeling Language diagram program for KDE. UML allows the user to create diagrams of software and other systems in a standard format, and Umbrello can support the software development process, especially during the analysis and design phases. Software ideas can be represented in UML using different types of diagrams. Umbrello UML Modeller 1.2 supports Class Diagrams, Sequence Diagrams, Collaboration Diagrams, Use Case Diagrams, State Diagrams, Activity Diagrams, Component Diagrams and Deployment Diagrams.

XMI is an Object Management Group (OMG) standard for exchanging metadata information using XML. The initial proposal of XMI "specifies an open information interchange model that is intended to give developers working with object technology the ability to exchange programming data over the Internet in a standardized way, thus bringing consistency and compatibility to applications created in collaborative environments." The main purpose of XMI is to enable easy interchange of metadata between modeling tools, and between tools and metadata repositories, in distributed heterogeneous environments. XMI integrates three key industry standards: (a) XML, a W3C standard; (b) UML, an OMG standard; and (c) MOF (Meta Object Facility), an OMG modeling and metadata repository standard. The integration of these three standards into XMI marries the best of OMG and W3C metadata and modeling technologies, allowing developers of distributed systems to share object models and other metadata over the Internet.
The process flow of the Keyword Extractor for UML is given in Fig. 2. The XMI or UML file is parsed with the help of the SAX (Simple API for XML) parser. SAX is a sequential-access parser API for XML; it provides a mechanism for reading data from an XML document. SAX loads the XMI or UML file, obtains the list of tags by name, and gets the attribute values of the tags through the attributes.getValue(<name of the attribute>) method. The methods used to retrieve the attributes are parse, Attributes and getValue(nameOfAttribute). The parse() method parses the XMI file, the Attributes object holds the attribute values, and getValue(nameOfAttribute) returns the class information, method information and parameter information of the attribute.

Figure 2. Process of Keyword Extractor for UML (inputs a UML or XMI file; outputs class name and scope, method information (name, type) and parameter information)

The XMI file consists of XML tags. Class information, method information and parameter information are identified with the appropriate tags, as given in Table I.
TABLE I. TAGS USED TO EXTRACT METADATA FROM XMI FILE

Tag: UML:DataType
Purpose: Holds the data type information.

Tag: UML:Class
Purpose: Holds the class information, such as the name of the class and its visibility.

Tag: UML:Attribute
Purpose: A sub-tag of class; holds information about the class attributes, such as the name, type and visibility of each attribute.

Tag: UML:Operation
Purpose: Holds the method information of the class, such as the name of the method, its return type and its visibility.

Tag: UML:BehavioralFeature.parameter
Purpose: Holds information about the method parameters, such as the name and data type of each parameter.
Using these tags, the metadata of the UML or XMI file is extracted. The extracted metadata, classes, methods, attributes and so on, is passed to the Method Retriever component. A sketch of such a SAX-based extractor is given below.
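The following is a minimal sketch of such a SAX handler, matching the tags of Table I; the file name design.xmi and the printed layout are illustrative assumptions.

import java.io.File;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class XmiMethodExtractor {
    public static void main(String[] args) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // React only to the tags listed in Table I.
                if (qName.equals("UML:Class")) {
                    System.out.println("class: " + attrs.getValue("name"));
                } else if (qName.equals("UML:Operation")) {
                    System.out.println("  method: " + attrs.getValue("name"));
                } else if (qName.equals("UML:BehavioralFeature.parameter")) {
                    System.out.println("    parameter: " + attrs.getValue("name"));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                        .parse(new File("design.xmi"), handler);
    }
}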
V. METHOD RETRIEVER
The Method Retriever component interacts with the OWL and returns the methods available in the OWL for the given class diagram, as represented diagrammatically in Fig. 3. The information extracted from the UML file by the Keyword Extractor for UML is passed to the Method Retriever component, which interacts with the OWL and retrieves the matching method information using a SPARQL query. SPARQL is a query language for RDF, and the SPARQL query is executed on the OWL file. Jena is a Java framework for building Semantic Web applications; it provides a programmatic environment for RDF, RDFS, OWL and SPARQL, and includes a rule-based inference engine. Jena can manipulate ontologies defined in RDFS and OWL Lite [22] and is a leading Semantic Web toolkit [23] for Java programmers. Jena1 and Jena2 were released in 2000 and August 2003, respectively. The main contribution of Jena1 was the rich Model API, around which Jena1 provided various tools, including I/O modules for RDF/XML [24], [25], N3 [26] and N-Triples [27], and the query language RDQL [28]. In response to various issues, Jena2 has a more decoupled architecture than Jena1 and provides inference support for both the RDF semantics [29] and the OWL semantics [30].
SPARQL is an RDF query language; its name is a recursive acronym that stands for SPARQL Protocol and RDF Query Language, and it is used here to retrieve information from the OWL. SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware. SPARQL contains capabilities for querying required and optional graph patterns along with their conjunctions and disjunctions. SPARQL also supports extensible value testing and constraining of queries by source RDF graph. The results of SPARQL queries can be result sets or RDF graphs.
A. Query processor
A query processor executes the SPARQL query and retrieves the matching results. The SPARQL Query Language for RDF [31] and the SPARQL Protocol for RDF [32] are increasingly used as a standardized query API for providing access to datasets on the public Web and within enterprise settings. The SPARQL query takes method parameters and returns the results. The retrieved results contain project details, such as the name and version of the project, and method details, such as the package name, class name, method name, method return type and method parameters. The query processor takes the extracted method name and method parameters as input and retrieves the method and project information from the OWL.
Figure 3. Method Retriever Process (the query processor takes the extracted method, queries the OWL, and returns the matched project and method details)
B. SPARQL query
The SPARQL query is constructed to extract the project name, project version, package name, class name, method name, return type, return identifier name, and method parameter names and types. The sample query is as follows:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?pname ?version ?packname ?cname ?mname
       ?rType ?identifier ?paramName ?parmDT ?parmT
WHERE {
  ?project rdf:type base:Project .
  ?project base:Name ?pname .
  ?project base:Project_Version ?version .
  ?project base:hasPackage ?pack .
  ?pack base:Name ?packname .
  ?pack base:hasClass ?class .
  ?class base:Name ?cname .
  ?class base:hasMethod ?subject .
  ?subject base:Name ?mname .
  ?subject base:Returns ?rType .
  ?subject base:Identifier ?identifier .
  ?subject base:hasParameter ?parameter .
  ?parameter base:Name ?paramName .
  ?parameter base:DataType ?parmDT .
  ?parameter base:DataType ?parmT .
  FILTER regex ( ?mname , "add" , "i" ) .
  FILTER regex ( ?parmT , "java.lang.String" , "i" ) .
}
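A minimal sketch of executing such a query with the Jena framework follows; the package names are those of current Apache Jena (older releases used the com.hp.hpl.jena prefix), and the OWL file name and the stand-in query string are illustrative assumptions.

import org.apache.jena.query.Query;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class QueryProcessorSketch {
    public static void main(String[] args) {
        // Load the project ontology into an in-memory model.
        Model model = ModelFactory.createDefaultModel();
        model.read("projects.owl");

        // Substitute the method-matching query shown above (with its base: PREFIX).
        String queryString = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10";
        Query query = QueryFactory.create(queryString);
        try (QueryExecution qexec = QueryExecutionFactory.create(query, model)) {
            ResultSet results = qexec.execSelect();
            while (results.hasNext()) {
                QuerySolution row = results.next();
                System.out.println(row); // one matched binding per row
            }
        }
    }
}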
VI. SOURCE RETRIEVER
The Source Retriever component retrieves the source code of the user-selected method from HDFS, the primary storage system used by Hadoop applications. HDFS creates multiple replicas of data blocks and distributes them on compute nodes throughout a cluster to enable reliable, extremely rapid computation. The source code file location within the Hadoop repository path is obtained from the OWL, and the file is retrieved from HDFS by the copyToLocal(fromFilePath, localFilePath) method.
QDox is a high speed, small footprint parser for extracting class, interface and method definitions from source files. When a Java source file, or a folder containing Java source files, is loaded into QDox, it automatically performs the iteration. The loaded information is stored in the JavaBuilder object, from which the list of packages is returned as an array of strings. This package list is looped over to obtain the class information, and from the class information the method information is extracted as an array of JavaMethod objects. From each JavaMethod, information such as the scope of the method, the name of the method, the return type of the method and the parameter information is extracted.

QDox finds the methods in the source code. The file retrieved from HDFS is stored in a local temporary file, which is passed to the QDox addSource() method for parsing. Through QDox, each method is retrieved one by one, and the retrieved methods are compared with the method for which the user requested source code retrieval. If a method matches, its source code is retrieved by the getSourceCode() method, and the temporary file is deleted after the process. In the Hadoop repository, files are organized in the same hierarchy as Java source folders, so the tool obtains the source location from the OWL, retrieves the Java source file into a temp file, loads the temporary file into QDox to identify methods, compares each method with the method being searched for, and, on a match, retrieves the source code of the method.
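A minimal sketch of this retrieval step follows, assuming current Hadoop and QDox 2.x APIs (FileSystem.copyToLocalFile and JavaProjectBuilder; the copyToLocal and JavaBuilder names above appear to be informal references to these). The HDFS path and the method name are illustrative assumptions.

import java.io.File;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import com.thoughtworks.qdox.JavaProjectBuilder;
import com.thoughtworks.qdox.model.JavaClass;
import com.thoughtworks.qdox.model.JavaMethod;

public class SourceRetrieverSketch {
    public static void main(String[] args) throws Exception {
        // Copy the source file from HDFS to a local temporary file.
        FileSystem fs = FileSystem.get(new Configuration());
        File temp = File.createTempFile("retrieved", ".java");
        fs.copyToLocalFile(new Path("/repo/com/cbr/my/engine/Login.java"),
                           new Path(temp.getAbsolutePath()));

        // Parse the local copy with QDox and print the requested method's body.
        JavaProjectBuilder builder = new JavaProjectBuilder();
        builder.addSource(temp);
        for (JavaClass cls : builder.getClasses()) {
            for (JavaMethod m : cls.getMethods()) {
                if (m.getName().equals("validateLogin")) {
                    System.out.println(m.getSourceCode());
                }
            }
        }
        temp.delete(); // remove the temporary copy after use
    }
}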
VII. CASE STUDY
The input for the framework is a UML class diagram; a sample class diagram is given below.

The entire process of the framework is given in Table II. The Keyword Extractor for UML uses the class diagram and retrieves the method validateLogin(username: string). The output is given to the Method Retriever, which generates the SPARQL query and extracts the matching methods listed in Table III. From the list, the appropriate method is selected, and QDox retrieves the source code from HDFS and displays the method definition of the selected method, as shown in the Source Retriever output in Table II.
TABLE II. PROCESS FLOW OF THE FRAMEWORK

Process: UML Extraction
Input: Given class diagram
Output: Method information - Name: validateLogin; Return: Boolean; Visibility: public; Parameter: UserName; DataType: username

Process: Method Retriever
Input: validateLogin(String userName)
Output: Refer to Table III

Process: Source Retriever
Input: validateLogin(String userName)
Output:
boolean returnStatus = false;
DatabaseOperation databaseOperation = new DatabaseOperation();
String strQuery = "SELECT * FROM login WHERE uname='" + userId + "'";
ResultSet resultSet = databaseOperation.selectFromDatabase(strQuery);
try {
    while (resultSet.next()) {
        returnStatus = true;
    }
} catch (SQLException e) {
    e.printStackTrace();
}
return returnStatus;
To test the performance of this framework, reusable OWL files were created by uploading completed projects. The first OWL file was loaded with the first Java project; the second OWL file with the first and second Java projects; and the third OWL file with the first, second and third Java projects. Similarly, five OWL files were constructed. The purpose of creating these OWL files is to show how reusability increases as the knowledgebase grows. A sample new project is considered, containing ten methods to be developed. The OWL files are listed with their numbers of packages, classes, methods and parameters. The new methods are matched against the OWL files, and the number of matches is listed in Table IV.
TABLE III. METHOD RETRIEVER OUTPUT

1. Project Name: CBR_1.0; Package: com.cbr.my.engine; Class Name: Login
   Method Name: ValidateLogin; Parameters: UserName; Return Type: boolean

2. Project Name: RBR_1.0; Package: com.my.rbr.utils.engine; Class Name: LoginManger
   Method Name: LoginLog; Parameters: UserName, ActivityCode; Return Type: Boolean
   Method Name: LoginContol; Parameters: UserName, password; Return Type: Boolean

3. Project Name: BHR_1.0; Package: com.boscoits.BHR.utils.Action; Class Name: ControlManager
   Method Name: ManageLogin; Parameters: UserName, password, memberId, ActionId; Return Type: Boolean
   Method Name: ValidateLogin; Parameters: UserName, password; Return Type: Boolean
The rows of Table IV show the number of matched methods. The reusability graph in Fig. 4 shows how the matches increase as the number of projects in the OWL grows. For the graph, only five of the ten new method names listed in Table IV are used. The X-axis represents the OWL file number and the Y-axis represents the number of methods matched for each new method legend. This progression shows that uploading more projects into the knowledgebase can provide nearly one hundred percent of the methods for reuse during software development.
TABLE IV. NEW METHOD MATCHES WITH VARIOUS KNOWLEDGEBASES

                 1 OWL   2 OWL   3 OWL   4 OWL   5 OWL
Classes             86     116     129     297     321
Methods             50    1088    1130    3405    3697
Packages            12      15      22      27      31
Parameters         765    1119    1174    4552    4802
Method Name
ValidateLogin        5      26      27      46      40
getUserType          0       0       0       0       2
addStudent           4       6       6      18      18
ManageRole           6      14       0      28      29
connect              4       5       8      11      16
InsertQuery          2       3       3       5       6
deleteQuery          2       3       3       5       6
updateQuery          2       3       3       5       6
selectQuery          2       3       3       5       6
connect              2       3       3       5       6
Figure 4. Number of method matches against the projects in the OWL
VIII. CONCLUSION
This paper presents a framework to extract method code components from the OWL using the UML design document. OWL is semantically much more expressive than is needed for our search results. With these sample tests the paper argues that it is indeed possible to extract code from the OWL using the UML class diagram. The purpose of the paper is to achieve code reusability for software development. The OWL for the source code has already been created; this paper searches for and extracts the code and components and reuses them to shorten the software development life cycle. Before the coding phase begins, the framework helps the software development team assess how much code can be reused and how much must be developed. This assessment can help the project manager allocate resources to the project and reduce cost, time and effort. Software companies can use this framework to develop projects quickly and to win projects at a lower cost than their competitors.

After developing the OWL ontology and storing the source code in the HDFS, the code components can be reused. This paper takes a design document from the user as input, extracts the method signatures, and searches for matches in the OWL. As the knowledgebase is loaded with more and more projects, the reuse rate grows. Future work can take the SRS as input: text mining can be performed to extract the keywords as classes and the processes as methods. The SRS artifact belongs to a much earlier phase than the UML, so a considerable amount of time can be saved compared with using UML as input. The method prototype can be used to search and match within the OWL, and the required method definition can be retrieved from the HDFS. The purpose of storing the metadata in OWL is to minimize factors such as development time, testing time, deployment time and developer effort; creating OWL files with this framework can reduce all of these.
REFERENCES
[1] Grigoris Antoniou and Frank van Harmelen, A Semantic Web Primer, PHI Learning Private Limited, New Delhi, 2010, pp. 1-3.
[2] Bunge, M., Treatise on Basic Philosophy. Ontology I: The Furniture of the World. Vol. 3, Boston: Reidel.
[3] Gaffney Jr., J. E., Durek, T. A., Software reuse - key to enhanced productivity: Some quantitative models, Information and Software Technology 31(5): 258-267.
[4] Banker, R. D., Kauffman, R. J., Reuse and Productivity in Integrated Computer-Aided Software Engineering: An Empirical Study, MIS Quarterly 15(3): 374-401.
[5] Basili, V. R., Briand, L. C., Melo, W. L., How Reuse Influences Productivity in Object-Oriented Systems, Communications of the ACM 39(10): 104-116.
[6] Boehm, B. W., Pendo, M., Pyster, A., Stuckle, E. D., and William, R. D., An Environment for Improving Software Productivity, IEEE Computer, June 1984.
[7] Paul, R. A., Metric-Guided Reuse, in Proceedings of the 7th International Conference on Tools with Artificial Intelligence (TAI'95), 5-8 November 1995, pp. 120-127.
[8] Poulin, Jeffrey S., Measuring Software Reusability, in Proceedings of the 3rd International Conference on Software Reuse, Brazil, 1-4 November 1994, pp. 126-138.
[9] Gopinath Ganapathy and S. Sagayaraj, Automatic Ontology Creation by Extracting Metadata from the Source Code, Global Journal of Computer Science and Technology, Vol. 10, Issue 14 (Ver. 1.0), Nov. 2010, pp. 310-314.
[10] Won Kim, On Metadata Management Technology: Status and Issues, Journal of Object Technology, vol. 4, no. 2, 2005, pp. 41-47.
[11] Dublin Core Metadata Initiative, http://dublincore.org/documents/, 2002.
[12] IEEE Learning Technology Standards Committee, IEEE Standards for Learning Object Metadata, http://ltsc.ieee.org/wg12.
[13] Darmoni, Thirion, Metadata Scheme for Health Resources, American Medical Informatics Association, 2000 Jan-Feb; 7(1): 108-109.
[14] MPEG-7 Overview: ISO/IEC JTC1/SC29/WG11 N4980, Klagenfurt, July 2002.
[15] Jason Venner, Pro Hadoop: Build Scalable, Distributed Applications in the Cloud, Apress, 2009.
[16] Gopinath Ganapathy and S. Sagayaraj, Circumventing Picture Archiving and Communication Systems Server with Hadoop Framework in Health Care Services, Science Publication 6(3), 2010, pp. 310-314.
[17] Tom White, Hadoop: The Definitive Guide, O'Reilly Media, Inc., 2009.
[18] Gruber, T., What is an Ontology? (September 2010): http://www.ksl-stanford.edu/kst/what-is-an-ontology.html.
[19] Yang, X., Ontologies and How to Build Them. (January 2011): http://www.ics.uci.edu/~xwy/publications/area-exam.ps.
[20] Bugaite, D., O. Vasilecas, Ontology-Based Elicitation of Business Rules, in A. G. Nilsson, R. Gustas, W. Wojtkowski, W. G. Wojtkowski, S. Wrycza, J. Zupancic (eds.), Information Systems Development: Proc. of ISD2004, Springer-Verlag, Sweden, 2006, pp. 795-806.
[21] Gopinath Ganapathy and S. Sagayaraj, To Generate the Ontology from Java Source Code, International Journal of Advanced Computer Science and Applications (IJACSA), Volume 2, No. 2, February 2011.
[22] McCarthy, P., Introduction to Jena, www-106.ibm.com/developerworks/java/library/j-jena/, 22.01.2011.
[23] B. McBride, Jena, IEEE Internet Computing, July 2002.
[24] J. J. Carroll, CoParsing of RDF & XML, HP Labs Technical Report HPL-2001-292, 2001.
[25] J. J. Carroll, Unparsing RDF/XML, WWW2002, http://www.hpl.hp.com/techreports/2001/HPL-2001-292.html.
[26] T. Berners-Lee et al., Primer: Getting into RDF & Semantic Web using N3, http://www.w3.org/2000/10/swap/Primer.html.
[27] J. Grant, D. Beckett, RDF Test Cases, 2004, W3C.
[28] L. Miller, A. Seaborne, and A. Reggiori, Three Implementations of SquishQL, a Simple RDF Query Language, 2002, p. 423.
[29] P. Hayes, RDF Semantics, 2004, W3C.
[30] P. F. Patel-Schneider, P. Hayes, I. Horrocks, OWL Semantics & Abstract Syntax, 2004, W3C.
[31] Prud'hommeaux, E., Seaborne, A., SPARQL Query Language for RDF, W3C Recommendation, Retrieved November 20, 2010, http://www.w3.org/TR/rdf-sparql-query/.
[32] Kendall, G. C., Feigenbaum, L., Torres, E. (2008), SPARQL Protocol for RDF, W3C Recommendation, Retrieved November 20, 2009, http://www.w3.org/TR/rdf-sparql-protocol/.
AUTHORS PROFILE
Gopinath Ganapathy is the Professor & Head, Department of Computer Science and Engineering, Bharathidasan University, India. He obtained his under-graduation and post-graduation from Bharathidasan University, India, in 1986 and 1988 respectively. He submitted his Ph.D in 1996 at Madurai Kamaraj University, India. He received the Young Scientist Fellow Award for the year 1994 and subsequently did research work at IIT Madras. He has published around 20 research papers. He is a member of IEEE, ACM, CSI, and ISTE. He was a consultant for 8.5 years with international firms in the USA and the UK, including IBM, Lucent Technologies (Bell Labs) and Toyota. His research interests include Semantic Web, NLP, Ontology, and Text Mining.

S. Sagayaraj is Associate Professor in the Department of Computer Science, Sacred Heart College, Tirupattur, India. He did his Bachelor's degree in Mathematics at Madras University, India, in 1985, completed his Master of Computer Applications at Bharathidasan University, India, in 1988, and received the Master of Philosophy in Computer Science from Bharathiar University, India, in 2001. He registered for the Ph.D. programme at Bharathidasan University, India, in 2008. His research interests include Data Mining, Ontologies and the Semantic Web.
Magneto-Hydrodynamic Antenna Design and Development Analysis with Prototype

Rajveer S Yaduvanshi
Electronics and Communication Department, AIT, Govt of Delhi, India-110031
E-mail: [email protected]

Harish Parthasarathy
Electronics and Communication Department, NSIT, Govt of Delhi, India-110075
E-mail: [email protected]

Asok De
Principal, AIT, Govt of Delhi, India-110031
E-mail: [email protected]
Abstract: A new class of antenna based on the magnetohydrodynamic technique is presented. A magneto-hydrodynamic antenna, using an electrically conducting fluid such as NaCl solution under controlled electromagnetic fields, is formulated and developed. The fluid resonator volume decides the resonant frequency, and the electric field together with the magnetic field decides the return loss, making the antenna tuneable in the frequency range 4.5 to 9 GHz. The Maxwell equations, the Navier-Stokes equations and the equations of mass conservation for the conducting fluid and field have been set up. These are expressed as partial differential equations, first order in time, for the stream function and the electric and magnetic fields. By discretizing these equations, we are able to numerically evaluate the velocity field of the fluid in the near-field region and the electromagnetic field in the far-field region. We propose to design, develop, formulate and fabricate a prototype MHD antenna [1-3]. Formulations of a rotating fluid frame, the evolution of the Poynting vector, and the permeability and permittivity of the MHD antenna have been worked out. The proposed work presents a tuning mechanism for the resonant frequency and dielectric constant for frequency agility and reconfigurability. Measured results show that the prototype antenna possesses a return loss of up to -51.1 dB at the 8.59 GHz resonant frequency, while the simulated resonant frequency comes out to be 10.5 GHz.

Keywords: Frequency agility; reconfigurability; MHD; radiation pattern; saline water.
I. INTRODUCTION
An MHD antenna uses fluid as its dielectric. The word magnetohydrodynamics (MHD) is derived from magneto-, meaning magnetic field, hydro-, meaning liquid, and -dynamics, meaning movement. MHD is the study of the flow of electrically conducting liquids in electric and magnetic fields [3-5]. Here we have developed and tested a magneto-hydrodynamic prototype antenna with detailed physics. Ting and King determined in 1970 that a dielectric tube can resonate. To our knowledge no work has been done on an MHD antenna as described here. Based on our own developed theory, we have proposed this prototype model with return loss results. A fluid antenna has the advantages of shape reconfigurability and better coupling of the electromagnetic signal with the probe, as no air is present in between [12]. We have developed the physics, as per equations (1)-(12), for electromagnetic wave coupling with a conducting fluid in the presence of electric and magnetic fields. The design and testing stages of the MHD antenna are shown in Figs. 1-13. Here we demonstrate how the directivity, radiation resistance and total energy radiated by this magnetohydrodynamic antenna can be computed by elementary surface integrals. We have developed equations for the rotating frame of the conducting fluid, the velocity field, electric field, magnetic field, Poynting vector, current density, permittivity, permeability and vector potentials to realise an MHD antenna [6-8]. We have used saline water, ionised with a DC voltage applied through electrodes, in the presence of a permanent magnetic field. The fluid acts as the radiating element in the PPR (polypropylene random copolymer) cylindrical tube. An SMA connector is used to supply the RF input. The volume and shape of the fluid decide the resonant frequency. Excellent radiation parameters were obtained from measurements of return loss and radiation pattern on the prototype, as listed in Tables 1-5. We have divided this paper into five parts. The first part is an introduction to the MHD antenna system. The second part deals with the formulations [9-11]. Section three briefly explains the prototype development. The fourth section describes the working of the prototype system. Section five presents the conclusion, possible applications and scope for future work.
II. FORMULATIONS
A. Motion of fluid in rotating frame
The equation of motion of a fluid in a frame rotating uniformly with angular velocity $\vec{\Omega}$ is given by

$$\vec{v}_{,t} + (\vec{v}\cdot\nabla)\vec{v} + 2\vec{\Omega}\times\vec{v} + \vec{\Omega}\times(\vec{\Omega}\times\vec{r}) = -\frac{1}{\rho}\nabla p + \nu\nabla^{2}\vec{v}. \qquad (1)$$

Assuming the flow to be two-dimensional and the fluid incompressible, we obtain an equation for the stream function. The velocity of the fluid is

$$\vec{v} = v_x(t,x,y)\,\hat{x} + v_y(t,x,y)\,\hat{y}, \qquad (2)$$

and the angular velocity is $\vec{\Omega} = \Omega\,\hat{z}$. The incompressibility condition $\nabla\cdot\vec{v} = 0$ gives

$$v_x = \psi_{,y}(t,x,y), \qquad v_y = -\psi_{,x}(t,x,y) \qquad (3)$$

for some scalar function $\psi$ called the stream function. As we know, the vorticity is

$$\vec{\omega} = \nabla\times\vec{v} = -\nabla^{2}\psi\,\hat{z}. \qquad (4)$$

Using this in the equation obtained by taking the curl of the Navier-Stokes equation, we have

$$\vec{\omega}_{,t} + \nabla\times(\vec{\omega}\times\vec{v}) + 2\,\nabla\times(\vec{\Omega}\times\vec{v}) + \nabla\times(\vec{\Omega}\times(\vec{\Omega}\times\vec{r})) = \nu\nabla^{2}\vec{\omega}. \qquad (5)$$

Note that

$$\vec{\Omega}\times(\vec{\Omega}\times\vec{r}) = \vec{\Omega}(\vec{\Omega}\cdot\vec{r}) - \vec{r}\,\Omega^{2}, \qquad (6)$$

so that

$$\nabla\times(\vec{\Omega}\times(\vec{\Omega}\times\vec{r})) = 0. \qquad (7)$$

Since $\vec{\Omega}$ is assumed to be constant, the Navier-Stokes equation thus gives

$$\vec{\omega}_{,t} + \nabla\times(\vec{\omega}\times\vec{v}) + 2\,\nabla\times(\vec{\Omega}\times\vec{v}) = \nu\nabla^{2}\vec{\omega}. \qquad (8)$$
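To illustrate the discretization mentioned in the abstract, the sketch below advances equation (8) by one explicit finite-difference step on a uniform grid. It is a minimal illustration under stated assumptions, not the authors' solver: grid size, time step and viscosity are arbitrary sample values, boundary handling is simplified, and for constant rotation and 2D incompressible flow the Coriolis curl term drops out, leaving only advection and diffusion of vorticity.

public class VorticityStep {
    static int N = 64;                       // grid points per side (assumption)
    static double h = 0.01, dt = 1e-4, nu = 1e-3;

    // One forward-Euler update of vorticity w, given stream function psi.
    static void step(double[][] w, double[][] psi) {
        double[][] wNew = new double[N][N];
        for (int i = 1; i < N - 1; i++) {
            for (int j = 1; j < N - 1; j++) {
                double u  =  (psi[i][j + 1] - psi[i][j - 1]) / (2 * h);  // v_x = psi_y
                double v  = -(psi[i + 1][j] - psi[i - 1][j]) / (2 * h);  // v_y = -psi_x
                double wx = (w[i + 1][j] - w[i - 1][j]) / (2 * h);
                double wy = (w[i][j + 1] - w[i][j - 1]) / (2 * h);
                double lap = (w[i + 1][j] + w[i - 1][j] + w[i][j + 1]
                            + w[i][j - 1] - 4 * w[i][j]) / (h * h);
                // w_t = -(v . grad) w + nu * laplacian(w); the Coriolis curl term
                // vanishes for constant rotation in 2D incompressible flow.
                wNew[i][j] = w[i][j] + dt * (nu * lap - u * wx - v * wy);
            }
        }
        for (int i = 0; i < N; i++) w[i] = wNew[i];
    }

    // A few Jacobi sweeps for the Poisson equation laplacian(psi) = -w,
    // which follows from equation (4).
    static void solvePsi(double[][] psi, double[][] w, int sweeps) {
        for (int s = 0; s < sweeps; s++)
            for (int i = 1; i < N - 1; i++)
                for (int j = 1; j < N - 1; j++)
                    psi[i][j] = 0.25 * (psi[i + 1][j] + psi[i - 1][j]
                              + psi[i][j + 1] + psi[i][j - 1] + h * h * w[i][j]);
    }
}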
B. Far Field Radiation Pattern
Here $r$ is the radial distance, $\theta$ the angle of elevation and $\phi$ the azimuth angle, so that

$$x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta.$$

For the fluid, $\vec{v}\times\vec{B}$ contributes to the Poynting vector; $\vec{E}\times\vec{H}$ gives the Poynting vector, where $\vec{H}$ embeds the effect of the velocity $\vec{v}$ of the conducting fluid. Solving the resulting field equations gives the $\theta$ and $\phi$ components of the far field, and hence the resulting Poynting vector

$$\vec{S} = \vec{E}\times\vec{H}. \qquad (9)$$

From this we can evaluate the total magnitude of the radiated energy per unit frequency per unit volume. This spectral density can be evaluated by applying Parseval's theorem (the mathematics of the DFT). The electric field is $\vec{E} = -j\omega\vec{A} - \nabla\Phi$, from which the energy spectral density is computed; on integration we can evaluate the total radiated energy, and likewise the $x$, $y$ and $z$ components of the Poynting vector. Here $\vec{r}'$ denotes the source point and $\vec{r}$ the far-field point; the potentials are functions of the retarded argument $t - |\vec{r}-\vec{r}'|/c$ and, at large distance,

$$\vec{H} = \frac{1}{\mu}\nabla\times\vec{A},$$

assuming that the real part effectively contributes. The radial unit vector is

$$\hat{r} = \cos\phi\sin\theta\,\hat{x} + \sin\phi\sin\theta\,\hat{y} + \cos\theta\,\hat{z}.$$

Our objective is to evaluate the total energy radiated per unit frequency per unit volume. Writing the Poynting vector in terms of its $\theta$ and $\phi$ components, the energy spectral density is evaluated by applying Parseval's theorem,

$$\int_{-\infty}^{\infty} |E(t)|^{2}\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty} |E(\omega)|^{2}\,d\omega, \qquad (10)$$

which provides the total energy radiated by the MHD antenna system.
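The Parseval step behind equation (10) can be checked numerically: for N samples of a field component, the time-domain energy equals the DFT energy divided by N. The following self-contained sketch, using arbitrary test samples rather than measured fields, illustrates this identity.

public class ParsevalCheck {
    public static void main(String[] args) {
        int n = 256;
        double[] e = new double[n];
        for (int k = 0; k < n; k++) e[k] = Math.cos(2 * Math.PI * 5 * k / n);

        double timeEnergy = 0;                     // sum of |e(t)|^2
        for (double s : e) timeEnergy += s * s;

        double freqEnergy = 0;                     // naive DFT, O(n^2)
        for (int k = 0; k < n; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double ang = -2 * Math.PI * k * t / n;
                re += e[t] * Math.cos(ang);
                im += e[t] * Math.sin(ang);
            }
            freqEnergy += (re * re + im * im) / n; // Parseval normalisation
        }
        // Both printed values agree (here 128.0 for a pure cosine).
        System.out.printf("time %.6f  freq %.6f%n", timeEnergy, freqEnergy);
    }
}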
C. Permeability of MHD Antenna
We evaluate the permeability of the MHD antenna taking the conductivity and permittivity as constant. The permeability $\mu$ then becomes a polynomial function of $(E, H, v)$ in the MHD system, where $E$ is the electric field, $H$ the magnetic field and $v$ the velocity of the fluid:

$$\mu = \sum_{p,q,r} c_{pqr}\, E^{p} H^{q} v^{r},$$

where $p$, $q$, $r$ are integers and the index $a = 1, 2, 3$ labels the field components. From Maxwell's equations we have

$$\nabla\times\vec{E} = -\frac{\partial(\mu\vec{H})}{\partial t}.$$

Thus we observe that the permeability becomes a coupled function of $E$, $H$ and $v$. We can minimise the difference, or error, $H - \hat{H}$, where $\hat{H}$ is the desired outcome, by the variational method with Lagrange multipliers $\lambda_i$. For the 3D analysis the fields are functions of $(t, x, y, z)$. Taking the inner product and applying the Lagrange multipliers in the variational method gives the stationarity conditions for $i = 1, 2, \ldots, r$, whose solution yields the error value, the Lagrange multipliers, and $E$, $H$, $v$ as functions of the permeability.
D. Permittivity of MHD Antenna
As per Maxwell's equations,

$$\nabla\times\vec{E} = -\frac{\partial\vec{B}}{\partial t},$$

$$\nabla\times(\nabla\times\vec{E}) = \nabla(\nabla\cdot\vec{E}) - \nabla^{2}\vec{E},$$

and

$$\nabla\times\vec{H} = \vec{J} + \frac{\partial(\epsilon\vec{E})}{\partial t}.$$

On substitution we obtain an equation in which the permittivity $\epsilon$ is a function of $(E, H, v)$ (11), with the coupling expressed through a cyclic tensor. Summing over the component indices $a$, $b$ and inverting the resulting matrix of elements gives the permittivity; the solution can be worked out by the difference method or by the test-function method (12).
III. PROTOTYPE DEVELOPMENT
Two cylindrical PPR tubes, of diameters 10.5 cm and 6.2 cm and length 6.5 cm, were mounted on a copper-coated circular plate of 35 cm diameter serving as the ground plane. An SMA connector was mounted on the outer tube for the RF input. Two tin electrodes of size 1.2 cm x 4.1 cm were mounted on the inside wall of the bigger tube, in direct contact with the conducting fluid. A DC voltage of 5-25 V was applied from a DC source through a Bias TEE arrangement. The copper-coated plate, of 0.2 mm thickness, was connected 0.9 cm to the ground of the SMA connector; the copper-coated circular plate was used to create the ground plane. The probe, 0.08 cm in diameter and protruding 0.75 cm from the SMA connector, was inserted in such a way that it makes direct contact with the conducting fluid. Two permanent bar magnets of 15 cm x 4 cm x 2 cm were placed perpendicular to the electric field to produce the Lorentz force that creates the fluid flow. Inside the tube, saline water with a TDS (total dissolved solids) value of 1200-9000 was used in the ionised state to produce radiation. A 300 ml volume of saline water was used for perfect impedance matching at the resonant frequency. The RF signal, with the DC voltage mixed in, was supplied to the fluid by the network analyser through the SMA connector. S11 parameters were recorded as per Figs. 3-5.
IV. DETAILED DESCRIPTION OF MHD ANTENNA
In this antenna, only the ionised currents in the conducting fluid contribute to the radiated energy. The radiation resistance and resonant frequency depend on the shape of the fluid inside the tube and on the nanoparticles in the fluid. An external magnetic field was applied to the tube; it interacts with the electric field to produce Lorentz forces, resulting in fluid flow with velocity v. There are then three main fields, i.e. the electric field, the magnetic field and the velocity field, which are responsible for the possible radiation. The radiated energy and its pattern are functions of the RF input excitation, the applied fields, the fluid shape and the nanoparticles in the fluid. Hence an adaptive mechanism can be built into the antenna to produce versatility in the radiation pattern and broadband effects, due to dynamic material perturbations.

We have formulated equations (1)-(12) to focus on the physics of the design analysis of an MHD antenna. Here we describe the complete mechanism of beam formation, radiating patterns and resonance. The radiation pattern in the far field depends not only on the electromagnetic field but also on the fluid velocity field. We have described mathematical relations for the permeability as a function of E, H and v when the conductivity and permittivity are kept constant. With proper filtering techniques, the MHD antenna can be made to operate at one single frequency. The fluid shape together with the fields decides the resonant frequency. The effective permeability can be controlled by applying a static magnetic field. This leads to the possibility of magnetically tuning the polarisation of the antenna. The polarisation tuning of the antenna was measured as a function of field strength for magnetisation parallel to the x- and y-directions. The effects of magnetic bias on the antenna have been investigated.

The principle of this class of antenna is essentially that of a dielectric resonator, where the salt (in solution) and the electric field modify the dielectric properties. The resonator column shape determines the operating frequency, allowing the impedance match and frequency of operation to be fully tuneable. Figure 1 presents the complete test set-up of the MHD antenna under electric and magnetic fields, with the RF input for S11 measurements. Figures 1-13 present the results obtained and the steps of prototype development. A VNA-L5230 was used to measure the return loss at the resonant frequency. We varied the fluid salinity, electric field, magnetic field and fluid height for all possible radiation measurements in the experiments. We recorded the return loss and radiation patterns for all the combinations listed in Tables 1-2.

This antenna with conducting fluid may have multiple advantages, viz. reconfigurability, frequency agility, polarisation agility, broadband operation and beam-steering capability. Here we developed control of polarisation with magnetic field biasing, frequency control with fluid height, and return loss control with the electric field. The non-reflecting stealth property of the fluid when no field is present makes it most suitable for military applications.
Fig. 1. Measurement of return loss/resonant frequency of the MHD antenna with VNA and power supply with Bias TEE.
Fig. 2. Return loss -33.1 dB at resonant frequency 8.59 GHz, with TDS 9000 and 15 V DC electric field applied, with permanent magnetic field.
Fig. 3. Return loss -51.1 dB at resonant frequency 8.59 GHz, with TDS 9000 and 17 V DC electric field applied, with permanent magnetic field.
Fig. 4. Return loss -49.1 dB at resonant frequency 8.59 GHz, with TDS 9000 and 16.9 V DC electric field applied, with permanent magnetic field.
Fig. 5. Return loss -34.1 dB at resonant frequency 8.59 GHz, with TDS 9000 and 15.0 V DC electric field applied, with permanent magnetic field.
Fig. 6. Complete set-up for measurement of VSWR on the MHD antenna with additional magnetic field.
Fig. 7. MHD antenna with Bias TEE.
Fig. 8. Fabricated MHD antenna, SMA connector and filled saline water, top view.
Fig. 9. View of the fabricated MHD antenna without ground plane.
Fig. 10. View of the outer part of the MHD antenna tube with two tin electrodes attached.
Fig. 11. View of the fabricated ground plane of the MHD antenna with copper coating.
Fig. 12. HFSS-generated cylindrical antenna.
Fig. 13. HFSS-simulated resonant frequency of 10 GHz.
Table 1
Freq       TDS    Electric field   Return Loss
8.59 GHz   9000   17.0 V           -51.1 dB
4.50 GHz   same   same             -16.7 dB
2.09 GHz   same   same             -11.7 dB

Table 2
Freq       TDS    Electric field   Return Loss
8.59 GHz   9000   16.9 V           -49.1 dB
4.50 GHz   same   same             -16.1 dB
2.09 GHz   same   same             -11.9 dB

Table 3
Freq       TDS    Electric field   Return Loss
8.59 GHz   9000   15.0 V           -39.1 dB
4.50 GHz   same   same             -16.2 dB
2.09 GHz   same   same             -11.7 dB

Table 4
Freq       TDS    Electric field   Return Loss
8.59 GHz   9000   13.2 V           -35.1 dB
4.50 GHz   same   same             -15.2 dB
2.09 GHz   same   same             -11.1 dB

Table 5
TDS    Fluid Height   Resonant frequency
5000   3.5 cm         4.59 GHz
7000   6.0 cm         8.58 GHz
For measuring return loss, VSWR and resonant frequency, we used a PNA-L network analyser (10-40 GHz) with a DC power supply. The resonant frequency for which the antenna was designed was 8.59 GHz; however, this antenna offers frequency agility and reconfigurability. The fluid column height was varied from 2.5 cm to 6 cm and the electric field from 2 V DC to 17 V DC, and the relevant VSWR and return loss results were recorded.

We measured the return loss with an Agilent VNA (vector network analyser); the fluid tube height was kept fixed at 6 cm and the resonant frequency at 8.58 GHz, while the DC voltage was varied from 9 V to 17 V. The return loss was found to vary in proportion to the electric and magnetic fields. Also, when the TDS was increased from 200 to 9000, a significant improvement in return loss was observed. The mixed DC and RF signal was fed to the SMA connector of the antenna through the Bias TEE. This test set-up extended safety to the network analyser.
V. CONCLUSION
It was observed from the measured results that there is a significant improvement in return loss when the salinity of the fluid is increased. The return loss also improved with the electric and magnetic field intensity. We observed that the electric field has a significant impact on the return loss; these measured results are given in Tables 1-5. A Bias TEE was used to feed the mixed signal from the same port. The return loss was significantly high at 17 V DC. The height of the fluid column (fluid shape) and the nanoparticles in the fluid contribute to the resonant frequency of the fluid antenna. When the height of the fluid was 3.5 cm, our antenna resonated at 4.59 GHz, and when the height was increased to 6.0 cm, the same antenna resonated at 8.59 GHz. We have also simulated the antenna in HFSS, taking saline water as the dielectric, for resonant frequency evaluation as per Figs. 12-13. We could thus achieve reconfigurability and frequency agility in this antenna. It has a stealth property, as the reflector is voltage dependent, and hence can be
most suitable for military applications. We can also use this antenna for MIMO (multiple-input, multiple-output) operation. More work towards micro-fluidic frequency reconfiguration and fluidic tuning of matching networks for bandwidth enhancement needs to be explored.

As future work, we will investigate the radiation patterns of this cylindrical antenna as a special case, with the detailed physics involved.
VI. ACKNOWLEDGEMENT
The authors thank Prof Raj Senani, Director, NSIT, who inspired this research work and provided all the necessary resources at the college. Special thanks go to the lab technician Mr Raman, who helped in the lab in developing this prototype MHD antenna.
REFERENCES
[1] Rajveer S Yaduvanshi and Harish Parthasarathy, Design, Development and Simulations of MHD Equations with its prototype implementations, (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 1, No. 4, October 2010.
[2] Rajveer S Yaduvanshi and Harish Parthasarathy, EM Wave transport 2D and 3D investigations, (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 1, No. 6, December 2010.
[3] Rajveer S Yaduvanshi and Harish Parthasarathy, Exact solution of 3D Magnetohydrodynamic system with nonlinearity analysis, IJATIT, Jan 2011.
[4] E. M. Lifshitz and L. D. Landau, Theory of Elasticity, 3rd edition, Elsevier.
[5] E. M. Lifshitz and L. D. Landau, Classical Theory of Fields, 4th edition, Elsevier.
[6] Bahadir, A. R. and T. Abbasov (2005), A numerical investigation of the liquid flow velocity over an infinity plate which is taking place in a magnetic field, International Journal of Applied Electromagnetics and Mechanics 21, 1-10.
[7] E. M. Lifshitz and L. D. Landau, Electrodynamics of Continuous Media, Butterworth-Heinemann.
[8] E. M. Lifshitz and L. D. Landau, Fluid Mechanics, Vol. 6, Butterworth-Heinemann.
[9] E. M. Lifshitz and L. D. Landau, Theory of Fields, Vol. 2, Butterworth-Heinemann.
[10] J. D. Jackson, Classical Electrodynamics, third edition, Wiley.
[11] C. A. Balanis, Antenna Theory, Wiley.
[12] Gregory H. Huff, David L. Rolando, Phillip Walters and Jacob McDonald, A Frequency Reconfigurable Dielectric Resonator Antenna using Colloidal Dispersions, IEEE Antennas and Wireless Propagation Letters, Vol. 9, 2010.
AUTHORS PROFILE
Author: Rajveer S Yaduvanshi, Asst Professor. The author has 21 years of teaching and research experience. He has successfully implemented fighter aircraft arresting barrier projects at select flying stations of the Indian Air Force. He has worked on indigenization projects of 3D radars at BEL and visited France for radar modernisation as Senior Scientific Officer in the Ministry of Defence. Currently he is working on MHD projects. He teaches in the ECE Dept. of AIT, Govt of Delhi-110031. He is a fellow member of IETE. He has published ten research papers in international journals and conferences.

Co-author: Prof Harish Parthasarathy is an eminent academician and researcher. He is a professor in the ECE Dept. at NSIT, Dwarka, Delhi. He has an extraordinary research instinct and is a prolific author in the field of signal processing. He has published more than ten books and has supervised seven PhD scholars in the ECE Dept. of NSIT, Delhi.

Co-author: Prof Asok De is an eminent researcher and effective administrator. He has set up an engineering college of repute under the Delhi Government. Currently he is the Principal of AIT. His research interests are in microstrip antenna design.
An Architectural Decision Tool Based on Scenarios and Nonfunctional Requirements

Mr. Mahesh Parmar
Department of Computer Engineering, Lakshmi Narayan College of Tech. (LNCT), Bhopal (MP), INDIA
Email: [email protected]

Prof. W. U. Khan
Department of Computer Engineering, Shri G.S. Institute of Tech. & Science (SGSITS), Indore (M.P.), INDIA
Email: [email protected]

Dr. Binod Kumar
HOD & Associate Professor, MCA Department, Lakshmi Narayan College of Tech. (LNCT), Bhopal (MP), INDIA
Email: [email protected]
Abstract: Software architecture design is often based on the architect's intuition and previous experience. Little methodological support is available, and there are still no effective solutions to guide architectural design. The most difficult activity is the transformation from a non-functional requirement specification into a software architecture. To address this, we propose an architectural decision tool based on scenarios and nonfunctional requirements. In the proposed tool, scenarios are first utilized to gather information from the user. Each scenario is created to have a positive or negative effect on a non-functional quality attribute. The non-functional quality attributes are then computed and compared to one another to identify a set of design principles relevant to the system. Finally, the optimal architecture is selected by finding its compatibility with the design principles.
Keywords: Software Architecture; Automated Design; Non-functional Requirements; Design Principle.
I. INTRODUCTION
Software architecture is the very first step in the software lifecycle in which the nonfunctional requirements are addressed [7, 8]. The nonfunctional requirements (e.g., security) are the ones that are blamed for a system re-engineering, and they are orthogonal to system functionality [7]. Therefore, software architecture must be confined to a particular structure that best meets the quality of interest, because the structure of a system plays a critical role in the process (i.e., strategies) and the product (i.e., notations) utilized to describe and provide the final solution.

In this paper, we discuss an architectural decision tool based on the software quality model discussed in [14] in order to select the software architecture of a system. In [14], we proposed a method that attempted to bridge the chasm between the problem domain, namely requirement specifications, and the first phase in the solution domain, namely software architecture. The proposed method is a systematic approach based on the fact that the functionality of any software system can be met by all kinds of structures, but the structure that also supports and embodies non-functional requirements (i.e., quality) is the one that best meets user needs. To this end, we have developed a method based on the nonfunctional requirements of a system. The method applies a scenario-based approach. Scenarios are first utilized to gather information from the user. Each scenario is created to have a positive or negative effect on a non-functional quality attribute. When creating scenarios, we decided to start with some basic scenarios involving only a single quality attribute; multiple scenarios were then mapped to each attribute so as to have a positive or negative effect when the user found the scenario to be true. Finally, it became clear to us that we needed to allow each scenario to affect an attribute positively or negatively in varying degrees.

In this work, we have studied and classified architectural styles in terms of design principles and a subset of nonfunctional requirements. These classifications, in turn, can be utilized to correlate styles, design principles, and quality. Once we establish the relationship between qualities, design principles, and styles, we should be able to establish the proper relationship between styles and qualities, and hence we should be able to select an architectural style for a given set of requirements [8], [13].
II. NON-FUNCTIONAL REQUIREMENTS
Developers of critical systems are responsible for identifying the requirements of the application, developing software that implements the requirements, and allocating appropriate resources (processors and communication networks). It is not enough to merely satisfy functional requirements. A non-functional requirement is a requirement that specifies criteria that can be used to judge the operation of a system, rather than specific behaviours. This should be contrasted with functional requirements, which define specific behaviour or functions. Functional requirements define what a system is supposed to do, whereas non-functional requirements define how a system is supposed to be. Non-functional requirements are often called the qualities of a system. Critical systems in general must satisfy non-functional requirements such as security, reliability, modifiability, performance, and other, similar requirements as well. Software quality is the degree to which software possesses a desired combination of attributes [15].
III. SCENARIOS
Scenarios are widely used in product line software engineering: abstract scenarios capture behavioural requirements, and quality-sensitive scenarios specify architecturally significant quality attributes. Making a scenario system-specific means translating it into concrete terms for the particular quality requirement. Thus, a general scenario is "A request arrives for a change in functionality, and the change must be made at a particular time within the development process within a specified period." A system-specific version might be "A request arrives to add support for a new browser to a Web-based system, and the change must be made within two weeks." Furthermore, a single scenario may have many system-specific versions: the same system that has to support a new browser may also have to support a new media type. A quality attribute scenario is a quality-attribute-specific requirement.

The assessment of a software quality using scenarios is done in these steps:

A. Define a Representative Set of Scenarios
A set of scenarios is developed that concretizes the actual meaning of the attribute. For instance, the maintainability quality attribute may be specified by scenarios that capture typical changes in requirements, underlying hardware, etc.

B. Analyse the Architecture
Each individual scenario defines a context for the architecture. The performance of the architecture in that context for this quality attribute is assessed by analysis. Posing typical questions [15] for the quality attributes can be helpful.

C. Summarise the Results
The results from each analysis of the architecture and scenario are then summarized into overall results, e.g., the number of accepted scenarios versus the number not accepted.

We have proposed a set of six independent high-level non-functional characteristics, which are defined as a set of attributes of a software product by which its quality is described and evaluated. In practice, some influence may appear among the characteristics; however, they will be considered independent to simplify our presentation. The quality characteristics are used as the targets for validation (external quality) and verification (internal quality) at the various stages of development. They are refined (see Figure 1) into sub-characteristics, until the quality attributes are obtained. Sub-characteristics (maturity, fault tolerance, confidentiality, changeability, etc.) are refined into scenarios. Each non-functional characteristic may have more than one sub-characteristic, and each sub-characteristic is refined into a set of scenarios. When a particular attribute is characterized, a set of scenarios is developed to describe it.
Figure 1. Analysis Scenario Diagram
IV. THE APPROACH
The goal is to establish the correct relationship between architectural styles and non-functional requirements. The proposed recommendation tool consists of four activities:
- Create a set of simple scenarios, each relevant to a single nonfunctional requirement.
- Identify the scenarios that may have positive or negative impacts on one or more nonfunctional requirements.
- Establish a relationship between the set of quality attributes obtained in step 2 and a set of universally accepted design principles (tactics).
- Select a software architecture style that supports the set of design principles identified in step 3.
A. Quality Attribute
Product considerations and market demands impose expectations, or qualities, that must be fulfilled by a system's architecture. These expectations normally have to do with how the system performs a set of tasks (i.e., quality) rather than what the system does (i.e., functionality). The functionality of a system, which is its ability to perform correctly the work for which it was intended, and the quality of a system are orthogonal to one another.

In general, the quality attributes of a system are divided into two groups: 1) operational quality attributes, such as performance, and 2) non-operational ones, such as modifiability [8]. In this study, we have selected both operational and non-operational quality attributes as follows:
- Reliability (the extent to which we can expect a system to do what it is supposed to do at any given time)
- Security (the extent to which we can expect the system to be secure from tampering/illegal access)
- Modifiability (how difficult or time consuming it is to perform a change on the system)
- Performance (how fast the system will run, i.e., throughput, latency, number of clock cycles spent finishing a task)
- Usability (the ease with which the user can interact with the system in order to accomplish a task)
- Availability (the extent to which we expect the system to be up and running)
- Reusability (the extent to which a part or the entire system can be utilized)

Usability involves both architectural and nonarchitectural aspects of a system. Examples of nonarchitectural features include the graphical user interface (GUI); examples of architectural features include undo, cancel, and redo. Modifiability involves the decomposition of system functionality and the programming techniques utilized within a component. In general, a system is modifiable if changes involve the minimum number of decomposed units. Performance involves the complexity of a system, which is the dependency (e.g., structure, control, and communication) among the elements of a system, and the way system resources are scheduled and/or allocated. In general, the quality of a system can never be achieved in isolation. Therefore, the satisfaction of one quality may contribute (or be contrary) to the satisfaction of another quality [12]. For example, consider security and availability: security strives for minimality while availability strives for maximality, and it is difficult to achieve a highly secure system without compromising its availability. In this case security contradicts availability. This can be resolved by negotiating with the user to make up her/his mind. Another example has to do with security and usability: security inhibits usability because the user must do additional things, such as creating a password. Table I documents the correlation among quality attributes.

To summarise, five types of quality attribute relationships are identified, together with a value for "not available". These relationships are defined by numerical values between 0 and 1: Very Strong (0.9), Strong (0.7), Average (0.5), Below Average (0.3), Very Low (0.1), Not Available (0.0).
TABLE I. QUALITY VS QUALITY

S.N.  Quality Attribute  Related Attribute  Relationship    Value
1)    Reliability        Performance        Very Strong     0.9
                         Security           Very Strong     0.9
2)    Performance        Reliability        Very Strong     0.9
                         Security           Below Average   0.3
3)    Security           Reliability        Average         0.5
                         Performance        Very Low        0.1
B. Design Principle
According to [1, 2], a design can be evaluated in many ways using different criteria. The exact selection of criteria depends heavily on the application domain. In this work, we adopted what are known as commonly accepted design principles [3, 6, 7] and a set of design decisions known as tactics [3, 8, 13, 14]. Tactics are a set of proven design decisions and are orthogonal to particular software development methods. Tactics and design principles have been around for years and were originally advocated by people like Parnas and Dijkstra. Our set of design principles and tactics includes: 1) generality (or abstraction), 2) locality and separation of concerns, 3) modularity, 4) concurrency, 5) replicability, 6) operability, and 7) complexity.

Examples of design principles and tactics include: a high degree of parallelism and asynchronized communication is needed in order to partially meet the performance requirement; a high degree of replicability (e.g., data, control, computation replicability) is needed in order to partially meet availability; a high degree of locality, modularity, and generality is needed in order to achieve modifiability and understandability; a high degree of controllability, such as authentication and authorization, is needed in order to achieve security and privacy; and a high degree of locality and operability (i.e., the efficiency with which a system can be utilized by end-users) is needed in order to achieve usability. Table II shows the correlation among qualities and tactics.
C. Architecture Styles
In order to extract the salient features of each style, we have compiled its description, advantages and disadvantages. This information was later utilized to establish a link between styles and design principles. We have chosen, for the sake of this work, main/subroutine, object-oriented, pipe/filter, blackboard, client/server, and layered systems.

A main/subroutine (MS) architectural style advocates a top-down design strategy by decomposing the system into components (calling units) and connectors (caller units). The coordination among the units is highly synchronized and interactions are done by parameter passing.

An object-oriented (OO) system is described in terms of components (objects) and connectors (method invocations). Objects are responsible for the integrity of their internal representation. The coordination among the units is highly asynchronized and interactions are done by method invocations. The style supports reusability, usability, modifiability, and generality.

A pipe/filter (P/F) style advocates a bottom-up design strategy by decomposing a system in terms of filters (data transformation units) and pipes (data transfer mechanisms). The coordination among the filters is asynchronized, control being transferred upon the arrival of data at the input. Upstream filters typically have no control over this behavior.

A client/server (C/S) system is decomposed into two sets of components: clients (or masters) and servers (or slaves). The interactions among components use a remote procedure call (RPC) type of communication protocol. The coordination and control transformation among the units are highly synchronized.

A blackboard (BKB) system is similar to a database system; it decomposes a system into components: storage and computational units known as knowledge sources (KSs). In a blackboard system, the interaction among units is done through shared memory. The coordination among the units is, for the most part, asynchronized when there is no race for a particular data item; otherwise it is highly synchronized. The blackboard style enjoys some level of replication, of data (e.g., the distributed database and distributed blackboard systems) and of computation.
TABLE II. TACTICS VS QUALITIES

S.N.  Tactic      Quality Attribute  Relationship    Value
1)    Generality  Reliability        Very Strong     0.9
                  Security           Average         0.5
                  Performance        Very Strong     0.9
2)    Locality    Reliability        Very Strong     0.9
                  Security           Not Available   0.0
                  Performance        Very Strong     0.9
3)    Modularity  Reliability        Very Strong     0.9
                  Security           Very Strong     0.9
                  Performance        Strong          0.7
A layered (LYR) system typically decomposes a system into a group of components (subtasks). The communication between layers is achieved by protocols that define how the layers interact. The coordination and control transformation among the units (or subtasks) is highly synchronized, and interactions are done by parameter passing. A layered system incurs a performance penalty stemming from the rigid chain of hierarchy among the layers. Table III illustrates the relationships among architectural styles and design principles/tactics.
V. PROPOSED WORK
The implementation of our tool consists of six different modules. The average weight module calculates the average weight corresponding to the selected scenarios. The effective weight module calculates the effective weight of each nonfunctional requirement; each nonfunctional requirement has a list of scenarios, and the scenarios and their corresponding weights are selected by the user. The quality attribute weight module calculates the quality attribute weight, depending on the responses of the average and effective weight modules. The quality attribute rank module calculates the rank of each quality attribute, depending on the quality attribute weight module. The tactics rank module calculates the tactics rank, and the architecture style rank module calculates the architecture style rank.
These are described in detail as follows.

1) Calculate the average weight of each quality attribute selected by the user. The user first selects the scenarios corresponding to a non-functional requirement and chooses a weight for each according to his choice. The average weight for each non-functional requirement is then calculated as

$$AQA_i = \frac{\sum_{n=1}^{N} QWt_n}{N}$$

where $AQA_i$ is the average weight of the $i$th quality attribute, $QWt_n$ is the weight of the $n$th selected scenario, and $N$ is the total number of selected scenarios.

2) Calculate the effective weight of each quality attribute. Each scenario may affect more than one other scenario. All affected scenario questions are stored in the effect table in the database, which maintains the list of affected scenario questions. The effective weight for each quality attribute is calculated as

$$EWtQA_i = \sum_{n=1}^{m}\ \sum_{j=1}^{e} EQ_j \times QWt_n$$

where $EWtQA_i$ is the effective weight of the $i$th quality attribute, $EQ_j$ is the $j$th effective scenario, $e$ is the number of effective scenarios, and $m$ is the number of scenarios.

3) Calculate the quality attribute weight for each quality attribute. Using the outputs of steps 1 and 2,

$$QAWt_i = EWtQA_i + AQA_i$$

where $QAWt_i$ is the $i$th quality attribute weight.

4) Calculate the quality attribute rank. The quality-to-quality relationship table stored in the database maintains the relationship values between quality attributes. The quality attribute rank is calculated as

$$QAR_i = \sum_{q=1}^{Q} QAWt_i \times QtoQ_q$$

where $QAR_i$ is the $i$th quality attribute rank, $Q$ is the number of quality attributes, and $QtoQ_q$ is the $q$th quality-to-quality relationship value.

5) Calculate the tactics rank. The quality-to-tactics relationship table stored in the database maintains the relationship values between qualities and tactics. The tactics rank is calculated as

$$TR_i = \sum_{t=1}^{T} QAR_i \times QtoT_t$$

where $TR_i$ is the $i$th tactics rank, $T$ is the number of tactics, and $QtoT_t$ is the $t$th quality-to-tactics relationship value.

6) Calculate the architecture style rank. The tactics-to-architecture-style relationship table stored in the database maintains the relationship values between tactics and architecture styles. The architecture style rank is calculated as

$$ASR_i = \sum_{a=1}^{A} TR_i \times TtoAS_a$$

where $ASR_i$ is the $i$th architecture style rank, $A$ is the number of architecture styles, and $TtoAS_a$ is the $a$th tactics-to-architecture-style relationship value.
TABLE III. ARCHITECTURAL STYLES VS TACTICS

S.N.  Architecture Style  Tactic      Relationship  Value
1)    Pipe & Filter       Generality  Very Strong   0.9
                          Locality    Very Strong   0.9
                          Modularity  Average       0.5
2)    Black Board         Generality  Very Strong   0.9
                          Locality    Very Strong   0.9
                          Modularity  Very Strong   0.9
3)    Object Oriented     Generality  Very Strong   0.9
                          Locality    Very Strong   0.9
                          Modularity  Very Strong   0.9
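To make the six-step pipeline concrete, the sketch below wires the formulas together in Java. It is a minimal illustration under stated assumptions, not the tool's actual implementation: the relationship rows are hard-coded sample values drawn from the tables above, and since the combining operator in the quality-attribute-weight formula is not fully recoverable from the text, the sketch assumes QAWt_i = EWtQA_i + AQA_i.

public class StyleRanker {
    // AQA_i: mean of the user-selected scenario weights.
    static double averageWeight(double[] qwt) {
        double sum = 0;
        for (double w : qwt) sum += w;
        return sum / qwt.length;
    }

    // EWtQA_i: sum over scenarios and their effected scenarios of EQ * QWt.
    static double effectiveWeight(double[] qwt, double[] eq) {
        double sum = 0;
        for (double w : qwt)
            for (double e : eq) sum += e * w;
        return sum;
    }

    // QAR_i, TR_i and ASR_i all share this shape: a weight multiplied by a
    // row of relationship values and summed.
    static double rank(double weight, double[] relationshipRow) {
        double sum = 0;
        for (double rel : relationshipRow) sum += weight * rel;
        return sum;
    }

    public static void main(String[] args) {
        double[] qwt = {0.9, 0.7, 0.5};   // user-selected scenario weights (sample)
        double[] eq  = {0.3, 0.1};        // weights of effected scenarios (sample)
        double qaWt  = effectiveWeight(qwt, eq) + averageWeight(qwt); // QAWt_i (assumed operator)
        double[] qToQ = {0.9, 0.9};       // row of Table I for this attribute
        double qar = rank(qaWt, qToQ);    // quality attribute rank
        double[] qToT = {0.9, 0.5, 0.9};  // row of Table II
        double tr = rank(qar, qToT);      // tactics rank
        double[] tToAS = {0.9, 0.9, 0.5}; // row of Table III
        System.out.println("ASR = " + rank(tr, tToAS)); // architecture style rank
    }
}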
The user first opens the main page and selects the non-functional requirements. He then selects the scenario questions and the corresponding weights according to his requirements; a single user can select more than one non-functional requirement. When the user presses the submit button, the result page opens, showing the Average Weight, Effective Weight, Quality Attribute Weight, Quality Attribute Rank, Tactics Rank, and the Architectural Styles with their Ranks.
VI. RELATED WORK
The work in this paper is inspired by the original work in the area of architectural design guidance tools by Thomas Lane [4], and it is partially influenced by the research in [2, 5], [6], [7], [8], [9], [11], [12], [13], and [14]. In [5], NFRs such as accuracy, security, and performance have been utilized to study software systems.

Figure 2. A simple form-based scenario
Figure 3. The evaluation results

In [6], the authors analyzed the architectural styles using modifiability, performance, and reusability. Their study provided preliminary support for the usefulness of architectural styles. The work by Bass et al. [8] introduces the notion of design principles and scenarios that can be utilized to identify and implement the quality characteristics of a system. In [7], the authors discussed the identification of the architecturally significant requirements and their impact and role in assessing and recovering software architecture. In [9], the authors proposed an approach to elicit NFRs and a process by which software architects can obtain the conceptual models.

In [14], the authors proposed a systematic method to extract architecturally significant requirements and the manner in which these requirements would be integrated into the conceptual representation of the system under development. The method works with the computation, communication, and coordination aspects of a system to select the most suitable generic architecture. The selected architecture is then deemed the starting point and is subjected to further assessment and/or refinement to meet all other user expectations.

In [13], the authors developed a set of systematic approaches based on tactics that can be applied to select appropriate software architectures. More specifically, they developed a set of methods, namely ATAM (Architecture Tradeoff Analysis Method), SAAM (Software Architecture Analysis Method), and ARID (Active Reviews for Intermediate Designs). Our approach has been influenced by [13]; we applied tactics and QAs to select an optimal architecture. However, the main differences between our approach and the methods developed by Clements et al. [13] are: 1) our method utilizes a different set of design principles and proven designs, 2) it establishes the correlation within QAs and tactics using tables, 3) it establishes the proper correlation between QAs, tactics, and architectural styles using a set of tables, and 4) it implements scenarios, which are meant to increase the accuracy of the evaluation and of the architectural recommendations.
VII. CONCLUSIONS AND FUTURE WORK
In this paper, we created a tool based on a set of scenarios that allows the user to select an architecture based on non-functional requirements. Non-functional requirements are mapped to tactics using weighting, and the architecture is then selected by its compatibility with the highest-scoring design principles. We believe this approach has a lot of merit. However, more research work will be required to create a complete set of scenarios with a closer coupling to quality attributes. Additional work may also be required in fine-tuning the mappings between nonfunctional and functional requirements.

Currently, our tool can be utilized to derive and/or recommend architectural styles based on NFRs. To validate the practicality and usefulness of our approach, we plan to conduct a series of experiments in the form of case studies in which the actual architectural recommendations from our tool will be compared to the design recommendations of architects. We have discussed some quality attributes, some design tactics and some architecture styles; more research work is needed on other quality attributes, tactics and architecture styles, and further work on non-functional requirements might be done by project members in the future. Our tool provides facilities for the addition, deletion and modification of new non-functional requirements, new tactics, and new architecture styles.
REFERENCES
[1] R. Aris, Mathematical Modeling Techniques, London; San Francisco: Pitman (Dover, New York), 1994.
[2] N. Medvidovic, P. Gruenbacher, A. Egyed, and B. Boehm, Proceedings of the 13th International Conference on Software Engineering and Knowledge Engineering (SEKE'01), Buenos Aires, Argentina, June 2001.
[3] F. Buschmann, R. Meunier, H. Rohnert, P. Sommerlad, and M. Stal, Pattern-Oriented Software Architecture: A System of Patterns, John Wiley, 1996.
[4] T. Lane, User Interface Software Structure, Ph.D. thesis, Carnegie Mellon University, May 1990.
[5] L. Chung and B. Nixon, Dealing with Non Functional Requirements: Three Experimental Studies of a Process-Oriented Approach, Proceedings of the International Conference on Software Engineering (ICSE'95), Seattle, USA, 1995.
[6] M. Shaw and D. Garlan, Software Architecture: Perspectives on an Emerging Discipline, Prentice Hall, 1996.
[7] M. Jazayeri, A. Ran, and F. Linden, Software Architecture for Product Families, Addison-Wesley, 2000.
[8] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice, second edition, Addison-Wesley, 2003.
[9] L. Cysneiros and J. Leite, Nonfunctional Requirements: From Elicitation to Conceptual Models, IEEE Transactions on Software Engineering, vol. 30, no. 5, May 2004.
[10] B. Boehm, A. Egyed, J. Kwan, D. Port, A. Shah, and R. Madachy, Using the WinWin Spiral Model: A Case Study, IEEE Computer, 1998.
[11] A. Lamsweerde, From System Goals to Software Architecture, in Formal Methods for Software Architecture, LNCS 2804, Springer-Verlag, 2003.
[12] L. Chung, B. Nixon, E. Yu, and J. Mylopoulos, Nonfunctional Requirements in Software Engineering, Kluwer Academic, Boston, 2000.
[13] P. Clements, R. Kazman, and M. Klein, Evaluating Software Architectures: Methods and Case Studies, Addison-Wesley, 2002.
[14] H. Reza and E. Grant, Quality Oriented Software Architecture, The IEEE International Conference on Information Technology: Coding and Computing (ITCC'05), Las Vegas, USA, April 2005.
[15] J. A. McCall, Quality Factors, Software Engineering Encyclopedia, Vol. 2, J. J. Marciniak ed., Wiley, 1994, pp. 958-971.
[16] B. Boehm and H. Hoh, Identifying Quality-Requirement Conflicts, IEEE Software, pp. 25-36, Mar. 1996.
[17] M. C. Paulk, The ARC Network: A Case Study, IEEE Software, vol. 2, pp. 61-69, May 1985.
[18] M. Chen and R. J. Norman, A Framework for Integrated CASE, IEEE Software, vol. 9, pp. 18-22, March 1992.
[19] S. T. Albin, The Art of Software Architecture: Design Methods and Techniques, John Wiley and Sons, 2003.
[20] P. Bengtsson, Architecture-Level Modifiability Analysis, Doctoral Dissertation Series No. 2002-2, Blekinge Institute of Technology, 2002.
AUTHORS PROFILE
Mr. Mahesh Parmar is an Assistant Professor in the CSE Department at LNCT
Bhopal, with 2 years of academic and professional experience. He has
published 5 papers in international journals and conferences. He received
his M.E. degree in Computer Engineering from SGSITS Indore in July 2010;
his other qualifications include a B.E. (Computer Science and Engineering,
2006). His areas of expertise are Software Architecture and Software
Engineering.
Dr. W.U. Khan holds a PhD (Computer Engg.) and a Post Doctorate (Computer
Engg.). He is a Professor in the Computer Engineering Department at Shri
G.S. Institute of Technology and Science, Indore, India.
Dr. Binod Kumar is HOD and Associate Professor in the MCA Department at
LNCT Bhopal, with 12.5 years of academic and professional experience. He
is an editorial board member and technical reviewer for seven (07)
international journals in computer science, and has published 11 papers in
international and national journals. He received his Ph.D. degree in
Computer Science from Saurashtra University in June 2010; his other
qualifications include an M.Phil (Computer Sc., 2006), MCA (1998) and
M.Sc. (1995). His areas of expertise are Data Mining, Bioinformatics and
Software Engineering.
To Generate the Ontology from Java Source Code: OWL Creation
Gopinath Ganapathy (1)
(1) Department of Computer Science,
Bharathidasan University,
Trichy, India.
[email protected]

S. Sagayaraj (2)
(2) Department of Computer Science,
Sacred Heart College,
Tirupattur, India
[email protected]
Abstract: Software development teams design new components and code by
employing new developers for every new project. If the company archives
the completed code and components, they can be reused with no further
testing, unlike open source code and components. Program file components
can be extracted from the application files and folders using APIs. The
proposed framework extracts the metadata from the source code using the
QDox code generator and stores it in OWL using the Jena framework
automatically. The source code is stored in the HDFS repository, and code
stored in the repository can be reused for software development. Archiving
all the project files into one ontology enables developers to reuse the
code efficiently.

Keywords: Metadata; QDox; Parser; Jena; Ontology; Web Ontology Language;
Hadoop Distributed File System
I. INTRODUCTION
Today's Web content is huge and designed primarily for human
consumption; it is not well-suited for machine processing. An
alternative approach is to represent Web content in a form that is
more easily machine-processable by using intelligent techniques.
This machine-processable Web is called the Semantic Web. The Semantic
Web will not be a new global information highway parallel to the
existing World Wide Web; instead it will gradually evolve out of the
existing Web [1]. Ontologies are built in order to represent generic
knowledge about a target world [2]. In the Semantic Web, ontologies
can be used to encode meaning into a web page, which enables
intelligent agents to understand the contents of the page. Ontologies
increase the efficiency and consistency of describing resources,
enabling more sophisticated functionality in the development of
knowledge management and information retrieval applications. From the
knowledge management perspective, current technology suffers in
searching, extracting, maintaining and viewing information. The aim
of the Semantic Web is to allow much more advanced knowledge
management systems.
For every new project, software teams design new components and
code by employing new developers. If the company archives the
completed code and components, they can be reused with no further
testing, unlike open source code and components. File content
metadata can be extracted from the application files and folders
using APIs. During development, each developer follows his or her
own methods and logic to perform a task, so different code exists
for the same functionality. For instance, the code to calculate a
factorial can be recursive or non-recursive, and written with
different logic. At the organizational level, a lot of time is spent
re-doing work that has already been done, which has a cumulative
effect on the time spent in development, testing and deployment, and
on the developers themselves. So there is a basic necessity to
create a system that minimizes these factors.
Code reusability is the solution to this problem. It avoids
re-developing and re-testing existing work; as the archived code has
already undergone a rigorous software development life cycle, it will
be robust and error-free. There is no need to re-invent the wheel.
Code reusability has been studied for more than two decades, but it
is still largely syntactic in nature. The aim of this paper is to
extract the methods of a project and store the metadata about those
methods in OWL; the OWL ontology stores the structure of the methods.
The code itself is then stored in a distributed environment, so that
a software company located in various geographical areas can access
it. To reuse the code, a tool can be created that extracts metadata
such as function, definition, type, arguments, brief description,
author, and so on from the source code and stores it in OWL, while
the source code is stored in the HDFS repository. For a new project,
developers can search for components in the OWL and retrieve them
with ease [3].
The paper begins with a note on the related technology in Section 2.
The features and framework of the source code extractor are described
in Section 3. The metadata extraction from the source code is covered
in Section 4, and the storage of the extracted metadata in OWL using
the Jena framework in Section 5. The implementation scenario is
presented in Section 6. Section 7 deals with the findings and future
work of the paper.
II. RELATED WORK
A. Metadata
Metadata is defined as data about data, or descriptions of stored
data. Metadata management is about defining, creating, updating,
transforming, and migrating all types of metadata that are relevant
and important to a user's objectives. Some metadata can be seen
easily by users, such as file dates and file sizes, while other
metadata can be hidden. Metadata standards include not only those for
modeling and exchanging metadata, but also the vocabulary and
knowledge for ontology [4]. Many efforts have been made to
standardize metadata, but all of them belong to some specific group
or class. The Dublin Core Metadata Initiative (DCMI) [5] is perhaps
the largest candidate for defining metadata. It is a simple yet
effective element set for describing a wide range of networked
resources and comprises 15 elements; Dublin Core is most suitable for
document-like objects. IEEE LOM [6] is a metadata standard for
learning objects, with approximately 100 fields to define any
learning object. Medical Core Metadata (MCM) [7] is a standard
metadata scheme for health resources. The MPEG-7 [8] multimedia
description schemes provide metadata structures for describing and
annotating multimedia content. Standard knowledge ontology is also
needed to organize such types of metadata as content metadata and
data usage metadata.
B. Hadoop & HDFS
The Hadoop project promotes the development of open source software
and supplies a framework for the development of highly scalable
distributed computing applications [9]. Hadoop is a free, Java-based
programming framework that supports the processing of large data sets
in a distributed computing environment, as well as data-intensive
distributed applications. Hadoop is designed to process large volumes
of information efficiently [10]. It connects many commodity computers
so that they can work in parallel, tying these smaller, low-priced
machines into a compute cluster. It offers a simplified programming
model which allows the user to write and test distributed systems
quickly, and it distributes data automatically and efficiently across
machines, in turn exploiting the underlying parallelism of the CPU
cores.
In a Hadoop cluster, data is distributed to all the nodes of the
cluster as it is being loaded in. The Hadoop Distributed File System
(HDFS) breaks large data files into smaller parts which are managed
by different nodes in the cluster. In addition, each part is
replicated across several machines, so that a single machine failure
does not make any data unavailable. The monitoring system
re-replicates the data in response to system failures which would
otherwise result in partial storage. Even though the file parts are
replicated and distributed across several machines, they form a
single namespace, so their contents are universally accessible.
MapReduce [11] is a functional abstraction which provides an
easy-to-understand model for designing scalable, distributed
algorithms.
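To make the MapReduce abstraction concrete, the classic word-count job
below shows the mapper/reducer pair at the heart of the model: mappers
emit (token, 1) pairs in parallel across the cluster, and reducers sum
the counts per token. This is a minimal illustrative sketch using the
standard Hadoop Java API; it is not part of the proposed framework.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Word count: Hadoop schedules many mapper instances in parallel,
// one per HDFS block of the input file.
public class WordCount {

    public static class TokenMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(line.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, ONE);   // emit (token, 1)
            }
        }
    }

    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> counts,
                              Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) {
                sum += c.get();
            }
            context.write(key, new IntWritable(sum)); // (token, total)
        }
    }
}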
C. Ontology
The key component of the Semantic Web is the collection of
information called ontologies. Ontology is a term borrowed from
philosophy that refers to the science of describing the kinds of
entities in the world and how they are related. Gruber defined an
ontology as a "specification of a conceptualization" [12]. An
ontology defines the basic terms and their relationships comprising
the vocabulary of an application domain, and the axioms for
constraining the relationships among terms [13]. This definition
explains what an ontology looks like [14]. The most typical kind of
ontology for the Web has a taxonomy and a set of inference rules.
The taxonomy defines classes of objects and relations among them.
Classes, subclasses and relations among entities are a very powerful
tool for Web use.
A large number of relations among entities can be expressed by
assigning properties to classes and allowing subclasses to inherit
such properties. Inference rules in ontologies supply further power.
An ontology may express rules on the classes and relations in such a
way that a machine can deduce some conclusions. The computer does not
truly "understand" any of this information, but it can now manipulate
the terms much more effectively in ways that are useful and
meaningful to the human user. More advanced applications will use
ontologies to relate the information on a page to the associated
knowledge structures and inference rules.
III. SOURCE CODE EXTRACTOR FRAMEWORK
After the completion of a project, all the project files are sent to
the source code extraction framework, which extracts metadata from
the source code. Only Java projects are used in this framework. A
Java source file, or a folder containing Java files, is passed as
input along with project information such as the description and
version of the project. The framework extracts the metadata from the
source code using the QDox code generator and stores it in OWL using
the Jena framework. The source code is stored in Hadoop's HDFS. A
sketch of the source code extractor tool is shown in Fig. 1.
The source code extraction framework performs two processes:
extracting metadata from the source code using QDox, and storing the
metadata in OWL using Jena. Both operations are performed through
APIs, and the source code extractor integrates them in sequence. The
pseudo code given below describes the entire process of the
framework.
Figure 1. The process of Semantic Stimulus Tool
The framework takes a project folder as input and counts the number
of packages. Each package's information is stored in the OWL. Each
package contains various classes, and each class has many methods;
the class and method information is stored in the OWL. For each
method, information such as the return type, parameters and parameter
types is stored in the OWL. The framework places all this information
in a persistence model, which is then written to the OWL file.
1. Get the package count by passing the file path.
2. Initialize packageCounter to zero.
3. While packageCounter is less than the package count:
   3.1 Store the package[packageCounter] information into the OWL model.
   3.2 Initialize classCounter to zero.
   3.3 Get the class count.
   3.4 While classCounter is less than the class count:
       3.4.1 Store the class[classCounter] information into the OWL model.
       3.4.2 Initialize methodCounter to zero.
       3.4.3 Get the method count of class[classCounter].
       3.4.4 While methodCounter is less than the method count:
             3.4.4.1 Store the method[methodCounter] information into the OWL model.
             3.4.4.2 Store the modifier information of method[methodCounter].
             3.4.4.3 Store the return type of method[methodCounter].
             3.4.4.4 Initialize paramCounter to zero.
             3.4.4.5 Get the parameter count of method[methodCounter].
             3.4.4.6 While paramCounter is less than the parameter count:
                     3.4.4.6.1 Store the parameter[paramCounter] information into the OWL model.
                     3.4.4.6.2 Increase paramCounter by one.
             3.4.4.7 Increase methodCounter by one.
       3.4.5 Increase classCounter by one.
   3.5 Increase packageCounter by one.
4. Write the OWL model to the OWL file.
IV. EXTRACTING METADATA
QDox is a high speed small footprint parser for extracting
classes, interfaces, and method definitions from the source
code. It is designed to be used by active code generators or
documentation tools. This tool extracts the metadata from the
given java source code. To extract the meta-data of the source,
the given order has to be followed. When the java source file or
folder that has the java source file is loaded to QDox, it
automatically performs the iteration. The loaded information is
stored in the JavaBuilder object. From the java builder object
the list of packages, as an array of string, are returned. This
package list has to be looped to get the class information. From
the class information, the method information is extracted. It
returns the array of JavaMethod. Out of these methods, the
information like scope of the method, name of method, return
type of the method and parameter information is extracted.
The QDox process uses its own methods to extract various
metadata from the source code. The getPackage() method lists
all the available packages for a given source. The getClasses()
method lists all the available classes in the package. The
getMethods() method lists all the available methods in a class.
The getReturns() method returns the return type of the method.
The getParameters() method lists all the parameters available
for the method. The getType() method returns the type of the
method. And when the getComment() method is used with
packages, classes and methods, it returns the appropriate
comments. Using the above methods the project informations
such as package, class, method, retune type of the method,
parameters of the method, method type and comments are
extracted by the QDox. These metadata are passed to the next
section for storing in the OWL.
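The loop below is a minimal sketch of this extraction sequence. It
assumes the JavaProjectBuilder entry point of a recent QDox release
(the exact builder class and accessor names vary slightly across QDox
versions; the JavaBuilder object mentioned above plays the same role),
and the source path is the illustrative one from Fig. 2.

import java.io.File;

import com.thoughtworks.qdox.JavaProjectBuilder;
import com.thoughtworks.qdox.model.JavaClass;
import com.thoughtworks.qdox.model.JavaMethod;
import com.thoughtworks.qdox.model.JavaParameter;

public class MetadataDumper {
    public static void main(String[] args) {
        // Parse every .java file under the project source folder.
        JavaProjectBuilder builder = new JavaProjectBuilder();
        builder.addSourceTree(new File("/home/prathap/SourceExtractor/src/"));

        for (JavaClass cls : builder.getClasses()) {
            System.out.println("package: " + cls.getPackageName());
            System.out.println("class:   " + cls.getName()
                    + " (comment: " + cls.getComment() + ")");
            for (JavaMethod m : cls.getMethods()) {
                System.out.println("  method: " + m.getName()
                        + " returns " + m.getReturnType().getFullyQualifiedName());
                for (JavaParameter p : m.getParameters()) {
                    System.out.println("    param: " + p.getName()
                            + " : " + p.getType().getFullyQualifiedName());
                }
            }
        }
    }
}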
V. STORING METADATA IN OWL
To store the metadata extracted by QDox, the Jena framework is used.
Jena is a Java framework for manipulating ontologies defined in RDFS
and OWL Lite [15], and a leading Semantic Web toolkit [16] for Java
programmers. Jena1 and Jena2 were released in 2000 and August 2003
respectively. The main contribution of Jena1 was its rich Model API;
around this API, Jena1 provided various tools, including I/O modules
for RDF/XML [17], [18], N3 [19] and N-Triple [20], and the query
language RDQL [21]. Jena2 has a more decoupled architecture than
Jena1 and provides inference support for both the RDF semantics [22]
and the OWL semantics [23].
Jena contains many APIs, of which only a few are used in this
framework, such as addProperty(), createIndividual() and the write
methods. The addProperty() method stores data and object properties
in the OWL ontology, and createIndividual() creates an individual of
a particular concept. Jena uses an in-memory model to hold the
persistent data, so the model has to be written to the OWL ontology
using the write() method.
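The fragment below is a minimal sketch of these three calls against
the Jena 2 ontology API (the same com.hp.hpl.jena packages used in the
case-study code of Section VI). The namespace and the getModel
individual are illustrative values, not output of the framework.

import java.io.FileOutputStream;

import com.hp.hpl.jena.ontology.DatatypeProperty;
import com.hp.hpl.jena.ontology.Individual;
import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.rdf.model.ModelFactory;

public class OntoWriter {
    static final String NS = "http://www.owl-ontologies.com/SourceExtractorj.owl#";

    public static void main(String[] args) throws Exception {
        OntModel model = ModelFactory.createOntologyModel();

        // A concept for methods and a datatype property for their names.
        OntClass methodClass = model.createClass(NS + "Method");
        DatatypeProperty name = model.createDatatypeProperty(NS + "Name");

        // One individual per extracted method, annotated with its name.
        Individual m = methodClass.createIndividual(NS + "getModel");
        m.addProperty(name, "getModel");

        // Flush the in-memory model to an OWL file.
        model.write(new FileOutputStream("SourceExtractorj.owl"), "RDF/XML-ABBREV");
    }
}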
The OWL construction is done with Protégé, an open source tool for
managing and manipulating OWL [24]. Protégé [25] is the most
complete, supported and widely used framework for building and
analyzing ontologies [26, 27, 28]. The result generated in Protégé is
a static ontology definition [29] that can be analyzed by the end
user. Protégé provides a growing user community with a suite of tools
to construct domain models and knowledge-based applications with
ontologies. At its core, Protégé implements a rich set of
knowledge-modeling structures and actions that support the creation,
visualization, and manipulation of ontologies in various
representation formats. Protégé can be customized to provide
domain-friendly support for creating knowledge models and entering
data. Further, Protégé can be extended by way of a plug-in
architecture and a Java-based API for building knowledge-based tools
and applications.
Based on a study of Java source code, the ontology domain is created
with the following attributes. To store the extracted metadata, the
ontology is created with project, package, class, method and
parameter concepts. The project is a concept that holds information
such as the name, project repository location, project version and
the packages. The package is a concept that holds information such as
the name and the classes. The class is a concept that holds class
information such as the author, class comment, class path,
identifier, name and the methods. The method is a concept that holds
information such as the method name, method comment, method
identifier, isConstructor, return type, and the parameters. The
parameter is a concept that holds information such as the name and
the data type.
Concepts/classes provide an abstraction mechanism for grouping
resources with similar characteristics. Project, package, class,
method and parameter are the concepts in the source code extractor
ontology. An individual is an instance of a concept/class.
A property describes the relation between concepts and objects. It is
a binary relationship on individuals, and each property has a domain
and a range. There are two types of property, namely object
properties and datatype properties.
An object property links individuals to individuals. In the source
code ontology, the object properties are hasClass, hasMethod,
hasPackage and hasParameter. hasClass is an object property with
domain Package and range Class. hasMethod is an object property with
domain Class and range Method. hasPackage is an object property with
domain Project and range Package. hasParameter is an object property
with domain Method and range Parameter.
A datatype property links individuals to data values. Author is a
datatype property with domain Class and range string. ClassComment is
a datatype property with domain Class and range string. DataType is a
datatype property with domain Parameter and range string. Identifier
is a datatype property with domain Method and Class, whose range is
an enumeration of the modifier strings (public, private, protected).
IsConstructor is a datatype property with domain Method and range
string. MethodComment is a datatype property with domain Method and
range string. Name is a datatype property with domain Project,
Package, Class, Method and Parameter, and range string. Project_Date,
Project_Description, Project_Repository_Location and Project_Version
are datatype properties with domain Project and range string. Returns
is a datatype property with domain Method and range string.
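The paper builds this ontology template in Protégé; as an assumption
for illustration, the fragment below shows how the same kind of
schema, one object property and one datatype property with their
domains and ranges, could equally be declared programmatically with
Jena.

import com.hp.hpl.jena.ontology.DatatypeProperty;
import com.hp.hpl.jena.ontology.ObjectProperty;
import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.vocabulary.XSD;

public class SchemaBuilder {
    static final String NS = "http://www.owl-ontologies.com/SourceExtractorj.owl#";

    public static OntModel buildSchema() {
        OntModel model = ModelFactory.createOntologyModel();
        OntClass clazz  = model.createClass(NS + "Class");
        OntClass method = model.createClass(NS + "Method");

        // hasMethod links a Class individual to its Method individuals.
        ObjectProperty hasMethod = model.createObjectProperty(NS + "hasMethod");
        hasMethod.setDomain(clazz);
        hasMethod.setRange(method);

        // Author is a datatype property from Class to a string literal.
        DatatypeProperty author = model.createDatatypeProperty(NS + "Author");
        author.setDomain(clazz);
        author.setRange(XSD.xstring);
        return model;
    }
}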
VI. CASE STUDY
To evaluate the proposed framework, the following simple Java code is used.
package com.sourceExtractor.ontology;

import org.apache.log4j.Logger;

import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.shared.Lock;
import com.hp.hpl.jena.util.FileManager;

/**
 * Manages the ontology-related operations.
 * @author Sagayaraj
 */
public class OntoManager {

    private static final Logger LOGGER = Logger.getLogger(OntoManager.class);

    /** Reads an OWL file from the given location into an in-memory ontology model. */
    public OntModel getModel(String modelLocation) {
        OntModel ontModel = ModelFactory.createOntologyModel();
        ontModel.read(FileManager.get().open(modelLocation), "");
        return ontModel;
    }

    /**
     * Creates an individual of the given concept in the OWL model.
     * addNameSpace() (not shown in this excerpt) prepends the ontology
     * namespace to a local name.
     */
    public void createIndividual(OntModel model, String concept, String individual) {
        OntClass ontClass = model.getOntClass(addNameSpace(concept, model));
        model.enterCriticalSection(Lock.WRITE);
        try {
            if (ontClass != null) {
                ontClass.createIndividual(addNameSpace(individual, model));
            } else {
                LOGGER.error("Direct Class is null");
            }
        } finally {
            model.leaveCriticalSection();
        }
    }
}
The sample Java code is given as input to the QDox document generator
through the graphical user interface (GUI) shown in Fig. 2.
Figure 2. GUI for locating folder
Using the QDox APIs, the metadata is extracted as given in Table I.
The output of QDox stores the metadata in the form of strings. To
store the metadata, an OWL ontology template is created using
Protégé; the strings are passed to the Jena framework, whose APIs
place the metadata into the OWL ontology. The entire project folder,
stored in the HDFS, is linked to the method signatures in the OWL
ontology for retrieval purposes, so the components can be reused
appropriately in new projects. The resulting OWL ontology loads
successfully in both the Protégé editor and Altova SemanticWorks. A
sample OWL file is given below as the output of the framework.
<owl:Ontology rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#"/>
<owl:Class rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Package"/>
<owl:Class rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Project"/>
<owl:Class rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Class"/>
<owl:Class rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Method"/>
<owl:Class rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Parameter"/>
<owl:ObjectProperty rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#hasPackage">
  <rdfs:domain rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Project"/>
  <rdfs:range rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Package"/>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#hasParameter">
  <rdfs:domain rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Method"/>
  <rdfs:range rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Parameter"/>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#hasClass">
  <rdfs:domain rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Package"/>
  <rdfs:range rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Class"/>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#hasMethod">
  <rdfs:range rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Method"/>
  <rdfs:domain rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Class"/>
</owl:ObjectProperty>
<owl:DatatypeProperty rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Project_Date"/>
<owl:DatatypeProperty rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Identifier">
  <rdfs:domain>
    <owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <owl:Class rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Method"/>
        <owl:Class rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Class"/>
      </owl:unionOf>
    </owl:Class>
  </rdfs:domain>
  <rdfs:range>
    <owl:Class>
      <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#DataRange"/>
      <owl:oneOf rdf:parseType="Resource">
        <rdf:first rdf:datatype="http://www.w3.org/2001/XMLSchema#string">public</rdf:first>
        <rdf:rest rdf:parseType="Resource">
          <rdf:first rdf:datatype="http://www.w3.org/2001/XMLSchema#string">private</rdf:first>
          <rdf:rest rdf:parseType="Resource">
            <rdf:first rdf:datatype="http://www.w3.org/2001/XMLSchema#string">protected</rdf:first>
            <rdf:rest rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
          </rdf:rest>
        </rdf:rest>
      </owl:oneOf>
    </owl:Class>
  </rdfs:range>
</owl:DatatypeProperty>
VII. CONCLUSION AND FUTURE WORK
This paper presents an approach for generating ontologies from source
code using the source code extractor tool. The approach helps to
integrate source code into the Semantic Web. OWL is semantically much
more expressive than needed for the results of our mapping. With
these sample tests, the paper argues that it is indeed possible to
transform source code into OWL using the Source Code Extractor
framework. The OWL created by the framework will increase the
efficiency and consistency of developing knowledge management and
information retrieval applications. The purpose of the paper is to
achieve code reusability for software development. With OWL created
for the source code, future work will be to search for and extract
code and components and reuse them to shorten the software
development life cycle. Open source code can also be used to create
OWL, so that a huge number of components become available for reuse.
By storing projects in OWL and in the HDFS, the corporate knowledge
grows, and developers will reuse code rather than develop it
themselves. Through code reuse, development cost comes down,
development time becomes shorter, resource utilization is lower and
quality goes up.
After developing the OWL and storing the source code in the HDFS, the
code components can be reused. The future work can take off in two
ways. First, one can take a design document from the user as input,
extract the method signatures, and search for matches in the OWL; if
the user is satisfied with a method definition, it can be retrieved
from the HDFS where the source code is stored. Second, one can take
the project specification as input and perform text mining to extract
the keywords as classes and the processes as methods; the method
prototypes can then be used to search and match in the OWL, and the
required method definitions retrieved from the HDFS. The purpose of
storing the metadata in OWL is to minimize factors such as the time
of development, testing and deployment, and the number of developers;
creating OWL using this framework can reduce these factors.
REFERENCES
[1] Grigoris Antoniou and Frank van Harmelen, A Semantic Web Primer, PHI Learning Private Limited, New Delhi, 2010, pp. 1-3.
[2] M. Bunge, Treatise on Basic Philosophy. Ontology I: The Furniture of the World, Vol. 3, Boston: Reidel.
[3] Gopinath Ganapathy and S. Sagayaraj, "Automatic Ontology Creation by Extracting Metadata from the Source Code," in Global Journal of Computer Science and Technology, Vol. 10, Issue 14 (Ver. 1.0), Nov. 2010, pp. 310-314.
[4] Won Kim, "On Metadata Management Technology: Status and Issues," in Journal of Object Technology, vol. 4, no. 2, 2005, pp. 41-47.
[5] Dublin Core Metadata Initiative, <http://dublincore.org/documents/>, 2002.
[6] IEEE Learning Technology Standards Committee, http://ltsc.ieee.org/wg12, IEEE Standard for Learning Object Metadata (1484.12.1).
[7] Darmoni, Thirion, "A Standard Metadata Scheme for Health Resources," Journal of the American Medical Informatics Association, 2000 Jan-Feb; 7(1): 108-109.
[8] MPEG-7 Overview: ISO/IEC JTC1/SC29/WG11 N4980, Klagenfurt, July 2002.
[9] Jason Venner, Pro Hadoop: Build Scalable, Distributed Applications in the Cloud, Apress, 2009.
[10] Gopinath Ganapathy and S. Sagayaraj, "Circumventing Picture Archiving and Communication Systems Server with Hadoop Framework in Health Care Services," in Journal of Social Science, Science Publications 6 (3): pp. 310-314.
[11] Tom White, Hadoop: The Definitive Guide, O'Reilly Media, Inc., 2009.
[12] Gruber, T., "What is an Ontology?," September 2005: http://www.ksl-stanford.edu/kst/what-is-an-ontology.html.
[13] Yang, X., "Ontologies and How to Build Them," (March 2006): http://www.ics.uci.edu/~xwy/publications/area-exam.ps.
[14] Bugaite, D., O. Vasilecas, "Ontology-Based Elicitation of Business Rules," in A. G. Nilsson, R. Gustas, W. Wojtkowski, W. G. Wojtkowski, S. Wrycza, J. Zupancic (eds.), Information Systems Development: Proc. of ISD2004, Springer-Verlag, Sweden, 2006, pp. 795-806.
[15] McCarthy, P., "Introduction to Jena," www-106.ibm.com/developerworks/java/library/j-jena/, 22.02.2005.
[16] B. McBride, "Jena," IEEE Internet Computing, July/August 2002.
[17] J.J. Carroll, "CoParsing of RDF & XML," HP Labs Technical Report HPL-2001-292, 2001.
[18] J.J. Carroll, "Unparsing RDF/XML," WWW2002: http://www.hpl.hp.com/techreports/2001/HPL-2001-292.html.
[19] T. Berners-Lee et al., "Primer: Getting into RDF & Semantic Web using N3," http://www.w3.org/2000/10/swap/Primer.html.
[20] J. Grant, D. Beckett, "RDF Test Cases," 2004, W3C.
[21] L. Miller, A. Seaborne, and A. Reggiori, "Three Implementations of SquishQL, a Simple RDF Query Language," 2002, p. 423.
[22] P. Hayes, "RDF Semantics," 2004, W3C.
[23] P.F. Patel-Schneider, P. Hayes, I. Horrocks, "OWL Semantics & Abstract Syntax," 2004, W3C.
[24] Protégé Semantic Web Framework, http://protege.stanford.edu/overview/protege-owl.html, accessed 16th October 2010.
[25] Protégé, http://protege.stanford.edu/ontologies/ontologyOfScience.
[26] 9th Intl. Protégé Conference, July 23-26, 2006, Stanford, California: http://protege.stanford.edu/conference/2006.
[27] 10th Intl. Protégé Conference, July 15-18, 2007, Budapest, Hungary: http://protege.stanford.edu/conference/2007.
[28] 11th Intl. Protégé Conference, June 23-26, 2009, Amsterdam, Netherlands: http://protege.stanford.edu/conference/2009.
[29] Hai H. Wang, Natasha Noy, Alan Rector, Mark Musen, Timothy Redmond, Daniel Rubin, Samson Tu, Tania Tudorache, Nick Drummond, Matthew Horridge, and Julian Sedenberg, "Frames and OWL Side by Side," in 10th International Protégé Conference, Budapest, Hungary, July 2007.
AUTHORS PROFILE
Gopinath Ganapathy is the Professor & Head, Department of Computer Science
and Engineering, Bharathidasan University, India. He obtained his
under-graduation and post-graduation from Bharathidasan University, India
in 1986 and 1988 respectively, and submitted his Ph.D in 1996 at Madurai
Kamaraj University, India. He received the Young Scientist Fellow Award
for the year 1994 and subsequently did research work at IIT Madras. He has
published around 20 research papers. He is a member of IEEE, ACM, CSI, and
ISTE. He was a consultant for 8.5 years in international firms in the USA
and the UK, including IBM, Lucent Technologies (Bell Labs) and Toyota. His
research interests include the Semantic Web, NLP, Ontology, and Text
Mining.
S. Sagayaraj is an Associate Professor in the Department of Computer
Science, Sacred Heart College, Tirupattur, India. He received his
Bachelor's degree in Mathematics from Madras University, India in 1985,
completed his Master of Computer Applications at Bharathidasan University,
India in 1988, and obtained a Master of Philosophy in Computer Science
from Bharathiar University, India in 2001. He registered for the Ph.D.
programme at Bharathidasan University, India in 2008. His research
interests include Data Mining, Ontologies and the Semantic Web.
TABLE I. METADATA EXTRACTED FROM THE SAMPLE CODE

Project
  Project Name: Ontology_Learn
  Project Version: 1.0.0
  Project Date: 10/10/10
  Repository Location: /opt/SourceCodeExtrctor/
  HasPackage: com.sourceExtractor.ontology

Package
  Name: com.sourceExtractor.ontology
  HasClass: OntoManager

Class
  Name: OntoManager
  Class Comment: It manages the ontology operation
  Class Path: /SampleOntology/com/sourceExtractor/ontology/OntoManager.java
  Author: Sagayaraj
  Identifier: Public
  HasMethod: getModel, createIndividual

Method              getModel        | createIndividual
  Identifier:       Public          | Public
  Returns:          OntModel        | Void
  Method Comment:   -undefined-     | To add the data property in owl file
  IsConstructor:    FALSE           | FALSE
  HasParameter:     modelLocation   | model, concept, individual

Parameter
  Name: modelLocation   Data Type: java.lang.String
  Name: individual      Data Type: java.lang.String
  Name: model           Data Type: OntModel
  Name: concept         Data Type: java.lang.String
Query based Personalization in Semantic Web Mining
Mahendra Thakur
Department of CSE
Samrat Ashok Technological
Institute
Vidisha, M.P., India
Yogendra Kumar Jain
Department of CSE
Samrat Ashok Technological
Institute
Vidisha, M.P., India
Geetika Silakari
Department of CSE
Samrat Ashok Technological
Institute
Vidisha, M.P., India
Abstract: To provide personalized support in online course resource
systems, a Semantic Web-based personalized learning service is
proposed to enhance the learner's learning efficiency. When a
personalization system relies solely on usage-based results, however,
valuable information conceptually related to what is finally
recommended may be missed. Moreover, the structural properties of the
web site are often disregarded. In this paper, we present a
personalized Web search system which helps users to get relevant web
pages based on their selection from a domain list. In the first part
of our work we present Semantic Web Personalization, a
personalization system that integrates usage data with content
semantics, expressed in ontology terms, in order to compute
semantically enhanced navigational patterns and effectively generate
useful recommendations. To the best of our knowledge, the proposed
technique is the only semantic web personalization system that may be
used by non-semantic web sites. In the second part of our work, we
present a novel approach for enhancing the quality of recommendations
based on the underlying structure of a web site. We introduce UPR
(Usage-based PageRank), a PageRank-style algorithm that relies on the
recorded usage data and link analysis techniques, based on the user's
interested domains and the user's query.

Keywords: Semantic Web Mining; Personalized Recommendation;
Recommender System
I. INTRODUCTION
Compared with the traditional face-to-face learning style, e-learning
is indeed a revolutionary way to provide education in the life-long
term. However, different learners have different learning styles,
goals, previous knowledge and other preferences; the traditional
"one-size-fits-all" learning method is no longer enough to satisfy
the needs of learners. Nowadays more and more personalized systems
are being developed, trying to personalize the learning process,
which affects the learning outcome.
The Semantic Web is not a separate web but an extension of the
current one, in which information is given well-defined meaning,
better enabling computers and people to work in cooperation [1]. In a
Semantic Web-based learning system, the learning information is
well-defined, and the machine can understand and deal with the
semantics of the learning contents, providing adaptable learning
services with powerful technical support.
Figure 1: The web personalization process
The problem of providing recommendations to the visitors of a web
site has received a significant amount of attention in the related
literature. Most of the research efforts in web personalization
correspond to the evolution of extensive research in web usage
mining, taking into consideration only the navigational behavior of
the (anonymous or registered) visitors of the web site. Pure
usage-based personalization, however, presents certain shortcomings.
This may happen when, for instance, there is not enough usage data
available in order to extract patterns related to certain
navigational actions, or when the web site's content changes and new
pages are added but are not yet included in the web logs. Moreover,
taking into consideration the temporal characteristics of the web in
terms of its usage, such systems are very vulnerable to the training
data used to construct the predictive model. As a result, a number of
research approaches integrate other sources of information, such as
the web content or the web structure, in order to enhance the web
personalization process [1] and [2].
As already implied, the user's navigation is largely driven by
semantics. In other words, in each visit, the user usually aims at
finding information concerning a particular subject. Therefore, the
underlying content semantics should be a dominant factor in the
process of web personalization. The web site's content
characterization process involves feature extraction from the web
pages. Usually these features are keywords subsequently used to
retrieve similarly characterized content. Several methods for
extracting keywords that characterize web content have been proposed.
The similarity between documents is usually based on exact matching
between these terms. This way, however, only a binary matching
between documents is achieved, and no actual semantic similarity is
taken into consideration. The need for a more abstract representation
that will enable a uniform and more flexible document matching
process imposes the use of semantic web structures, such as
ontologies. By mapping the keywords to the concepts of an ontology,
or topic hierarchy, the problem of binary matching can be surpassed
through the use of the hierarchical relationships and/or the semantic
similarities among the ontology terms, and therefore the documents.
Finally, we should take into consideration that the web is not just a
collection of documents browsed by its users. The web is a directed
labeled graph, including a plethora of hyperlinks that interconnect
its web pages. Both the structural characteristics of the web graph
and the underlying semantics of the web pages and hyperlinks are
important and determinative factors in the user's navigational
process. The main contribution of this paper is a set of novel
techniques and algorithms aimed at improving the overall
effectiveness of the web personalization process through the
integration of the content and the structure of the web site with the
users' navigational patterns. In the first part of our work we
present the Semantic Web Personalization system, which integrates
usage data with content semantics in order to compute semantically
enhanced navigational patterns and effectively generate useful
recommendations. Similar to previously proposed approaches, the
proposed personalization framework uses ontology terms to annotate
the web content and the users' navigational patterns. The key
departure from earlier approaches, however, is that Semantic Web
Personalization is the only web personalization framework that
employs automated keyword-to-ontology mapping techniques, while
exploiting the underlying semantic similarities between ontology
terms. Apart from the novel recommendation algorithms we propose, we
also emphasize a hybrid structure-enhanced method for annotating web
content. To the best of our knowledge, Semantic Web Personalization
is the only semantic web personalization system that can be used by
any web site, given only its web usage logs and a domain-specific
ontology [3] and [4].
II. BACKGROUND
The main data source in the web usage mining and personalization
process is the information residing in the web site's logs. Web logs
record every visit to a page of the web server hosting the site. The
entries of a web log file consist of several fields which represent
the date and time of the request, the IP address of the visitor's
computer (client), the URI requested, the HTTP status code returned
to the client, and so on. The web log file format is based on the
so-called extended log format.
Prior to processing the usage data using web mining or
personalization algorithms, the information residing in the web logs
should be preprocessed. Web log data preprocessing is an essential
phase in the web usage mining and personalization process, and an
extensive description of it can be found in the literature. In the
sequel, we provide a brief overview of the most important
preprocessing techniques, providing the related terminology in
parallel. The first issue in the preprocessing phase is data
preparation. Depending on the application, the web log data may need
to be cleaned of entries involving page accesses that returned, for
example, an error, or accesses to graphics files. Furthermore,
crawler activity usually should be filtered out, because such entries
do not provide useful information about the site's usability. A very
common problem to be dealt with is web page caching. When a web
client accesses an already cached page, this access is not recorded
in the web site's log, so important information concerning web path
visits is missed. Caching is heavily dependent on the client-side
technologies used and therefore cannot be dealt with easily. In such
cases, cached pages can usually be inferred using the referrer
information from the logs and certain heuristics, in order to
re-construct the user paths, filling in the missing pages. After all
page accesses are identified, page view identification should be
performed. A page view is defined as the visual rendering of a web
page in a specific environment at a specific point in time. In other
words, a page view consists of several items, such as frames, text,
graphics and scripts, that construct a single web page. Therefore,
the page view identification process involves determining the
distinct log file accesses that contribute to a single page view.
Again, such a decision is application-oriented. In order to
personalize a web site, the system should be able to distinguish
between different users or groups of users. This process is called
user profiling. In case no information other than what is recorded in
the web logs is available, this process results in the creation of
aggregate, anonymous user profiles, since it is not feasible to
distinguish among individual visitors. However, if user registration
is required by the web site, the information residing in the web log
data can be combined with the users' demographic data, as well as
with their individual ratings or purchases. The final stage of log
data preprocessing is the partition of the web log into distinct user
and server sessions. A user session is defined as a delimited set of
user clicks across one or more web servers, whereas a server session,
also called a visit, is defined as a collection of user clicks to a
single web server during a user session. If no other means of session
identification, such as cookies or session ids, is used, session
identification is performed using time heuristics, such as setting a
minimum timeout and assuming that consecutive accesses within it
belong to the same session, or a maximum timeout, assuming that two
consecutive accesses that exceed it belong to different sessions [1]
and [5] and [6].
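As an illustration of the maximum-timeout heuristic, the sketch below
groups cleaned log hits into sessions per IP address, starting a new
session whenever the gap between consecutive hits from the same
visitor exceeds 20 minutes (the limit used in Section IV). The Hit
record and class names are illustrative, not part of the described
system.

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class Sessionizer {
    static final long MAX_GAP_MS = 20 * 60 * 1000; // 20-minute timeout

    /** A single cleaned log entry: visitor IP and request timestamp. */
    public static class Hit {
        final String ip;
        final long timestamp;
        Hit(String ip, long timestamp) { this.ip = ip; this.timestamp = timestamp; }
    }

    /** Splits hits (assumed sorted by time) into sessions per IP. */
    public static List<List<Hit>> sessionize(List<Hit> hits) {
        Map<String, List<Hit>> open = new LinkedHashMap<>();
        List<List<Hit>> sessions = new ArrayList<>();
        for (Hit h : hits) {
            List<Hit> current = open.get(h.ip);
            if (current == null
                    || h.timestamp - current.get(current.size() - 1).timestamp > MAX_GAP_MS) {
                current = new ArrayList<>();   // gap exceeded: start a new session
                sessions.add(current);
                open.put(h.ip, current);
            }
            current.add(h);
        }
        return sessions;
    }
}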
A. Web Usage Mining and Personalization:
Web usage mining is the process of identifying representative trends
and browsing patterns describing the activity on the web site, by
analyzing the users' behavior. Web site administrators can then use
this information to redesign or customize the web site according to
the interests and behavior of its visitors, or to improve the
performance of their systems. Moreover, the managers of e-commerce
sites can acquire valuable business intelligence, creating consumer
profiles and achieving market segmentation. Various methods exist for
analyzing the web log data. Some research studies use well-known data
mining techniques such as association rules discovery, sequential
pattern analysis, clustering, probabilistic models, or a combination
of them. Since web usage mining analysis was initially strongly
correlated with data warehousing, there also exist some research
studies based on OLAP cube models. Finally, some proposed web usage
mining approaches require registered user profiles, or combine the
usage data with semantic meta-tags incorporated in the web site's
content. Furthermore, this knowledge can be used to automatically or
semi-automatically adjust the content of the site to the needs of
specific groups of users, i.e. to personalize the site. As already
mentioned, web personalization may include the provision of
recommendations to the users, the creation of new index pages, or the
generation of targeted advertisements or product promotions.
Usage-based personalization systems use association rules and
sequential pattern discovery, clustering, Markov models, machine
learning algorithms, or collaborative filtering in order to generate
recommendations. Some research studies also combine two or more of
the aforementioned techniques [2] and [4].
B. Integrating Content Semantics in Web Personalization:
Several frameworks supporting the claim that the incorporation of
information related to the web site's content enhances the web
personalization process have been proposed prior or subsequent to our
work. In this section we overview in detail the ones that are most
similar to ours, in terms of using a domain ontology to represent the
web site's content.
Dai and Mobasher proposed a web personalization framework that uses
ontologies to characterize the usage profiles used by a collaborative
filtering system. These profiles are transformed to domain-level
aggregate profiles by representing each page with a set of related
ontology objects. In this work, the mapping of content features to
ontology terms is assumed to be performed either manually, or using
supervised learning methods. The defined ontology includes classes
and their instances, therefore the aggregation is performed by
grouping together different instances that belong to the same class.
The recommendations generated by the proposed collaborative system
are in turn derived by binary matching of the current user visit,
expressed as ontology instances, against the derived domain-level
aggregate profiles, and no semantic similarity measure is used. The
idea of semantically enhancing the web logs using ontology concepts
is independently described in recent work. That framework is based on
a semantic web site built on an underlying ontology, and the authors
present a general framework where data mining can then be performed
on these semantic web logs to extract knowledge about groups of
users, users' preferences, and rules. Since the proposed framework is
built on a semantic web knowledge portal, the web content is already
semantically annotated; the work focuses solely on web mining and
thus does not perform any further processing in order to support web
personalization.
In that work the semantic annotation is obtained through the existing
RDF annotations, and no further automation is provided. Another work
proposes a general personalization framework based on the conceptual
modeling of the users' navigational behavior. The proposed
methodology involves mapping each visited page to a topic or concept,
imposing a concept hierarchy (taxonomy) on these topics, and then
estimating the parameters of a semi-Markov process defined on this
tree, based on the observed user paths. In this Markov-model-based
work, the semantic characterization of the content is performed
manually, and no semantic similarity measure is exploited for
enhancing the prediction process, except for
generalizations/specializations of the ontology terms. Finally, a
subsequent work explores the use of ontologies in the user profiling
process within collaborative filtering systems. This work focuses on
recommending academic research papers to the academic staff of a
university. The authors represent the acquired user profiles using
terms of a research paper ontology (an is-a hierarchy), and research
papers are also classified using ontological classes. In this hybrid
recommender system, which is based on collaborative and content-based
recommendation techniques, the content is characterized with ontology
terms using document classifiers (therefore a manual labeling of the
training set is needed), and the ontology is again used for making
generalizations/specializations of the user profiles [7] and [8] and
[9].
C. Integrating Structure in Web Personalization:
Although the connectivity features of the web graph have been
extensively used for personalizing web search results, only a few
approaches take them into consideration in the web site
personalization process. One approach uses citation and coupling
network analysis techniques in order to conceptually cluster the
pages of a web site; the proposed recommendation system is based on
Markov models. Previous work uses the degree of connectivity between
the pages of a web site as the determinant factor for switching among
recommendation models based on either frequent itemset mining or
sequential pattern discovery. Nevertheless, none of the
aforementioned approaches fully integrates link analysis techniques
into the web personalization process by exploiting the notion of the
authority or importance of a web page in the web graph.
A very recent work addresses the data sparsity problem of
collaborative filtering systems by creating a bipartite graph and
calculating linkage measures between unconnected pairs in order to
select candidates and make recommendations; in this study the graph
nodes represent both users and rated/purchased items. Finally, a
subsequent work independently proposed two link analysis ranking
methods, Site Rank and Popularity Rank, which are in essence very
much like the proposed variations of our UPR algorithm (PR and SUPR
respectively). That work focuses on the comparison of the
distributions and the rankings of the two methods rather than
proposing a web personalization algorithm [9] and [10].
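To make the notion of a usage-based, PageRank-style ranking concrete
before it is used in the sequel, the sketch below runs a damped power
iteration in which each page's rank is distributed to its successors
in proportion to recorded click counts. This is a minimal illustration
of the general idea only, not the exact UPR formulation; the damping
factor 0.85 is the conventional PageRank choice.

public class UsagePageRank {
    /**
     * usage[i][j] = recorded click count from page i to page j.
     * Returns an importance score per page after a fixed number of
     * damped power iterations.
     */
    public static double[] rank(double[][] usage, int iterations) {
        int n = usage.length;
        double d = 0.85;
        double[] rank = new double[n];
        java.util.Arrays.fill(rank, 1.0 / n);
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            java.util.Arrays.fill(next, (1 - d) / n);
            for (int i = 0; i < n; i++) {
                double out = 0;
                for (int j = 0; j < n; j++) out += usage[i][j];
                if (out == 0) continue;          // dangling page: no outgoing clicks
                for (int j = 0; j < n; j++) {
                    // distribute rank proportionally to observed click traffic
                    next[j] += d * rank[i] * (usage[i][j] / out);
                }
            }
            rank = next;
        }
        return rank;
    }
}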
III. PROPOSED TECHNIQUE
In this paper, we present Semantic Enhancement for Web
Personalization, a web personalization framework that integrates
content semantics with the users' navigational patterns, using
ontologies to represent both the content and the usage of the web
site. In our proposed framework we employ web content mining
techniques to derive semantics from the web site's pages. These
semantics, expressed in ontology terms, are used to create
semantically enhanced web logs, called C-logs (concept logs).
Additionally, the site is organized into thematic document clusters.
The C-logs and the document clusters are in turn used as input to the
web mining process, resulting in the creation of a broader,
semantically enhanced set of recommendations. The whole process
bridges the gap between the Semantic Web and Web Personalization
areas, to create a Semantic Web Personalization system.
A. Semantic Enhancement for Web Personalization System Architecture:
Semantic Enhancement for Web Personalization uses a combination of
web mining techniques to personalize a web site. In short, the web
site's content is processed and characterized by a set of ontology
terms (categories). The web personalization process includes (a) the
collection of Web data, (b) the modeling and categorization of these
data (preprocessing phase), (c) the analysis of the collected data,
and (d) the determination of the actions that should be performed.
When a user sends a query to a search engine, the search engine
returns the URLs of documents matching all or one of the terms,
depending on both the query operator and the algorithm used by the
search engine. Ranking is the process of ordering the returned
documents in decreasing order of relevance, that is, so that the best
answers are at the top. When the user enters a query, the query is
first analyzed. The query is given as input to the semantic search
algorithm for separation of nouns, verbs, adjectives and negations,
and for assigning weights respectively. The processed data is then
given to the personalized URL-rank algorithm for personalizing the
results according to the user's domain, interest and need. The sorted
results are those in which the user is interested. The
personalization can be enhanced by categorizing the results according
to their types. Thus, after building the knowledge base, the system
can give the user recommendations based on the similarity of the
user's interested domain and the user's query. The recommendation
procedure of the system has two steps:
- The system gives the user a list of interested domains and detects
  the user's current interested domain.
- Based on the user's current interested domain, combined with his or
  her profile, the system gives him or her a set of URLs with ranking
  scores.
In this way, the system can help the user to retrieve his or her
potentially interesting domains. Besides, a user can change his or
her current interested domain by clicking the interested domain list
on the same page, with more convenience. In the beginning, if the
user does not have a profile in the database, the system displays the
available domains to the user, and then keeps track of the user's
selections. The user's selections are used to construct a table that
is used in the URL weight calculation. The current interested domains
recommendation is based on the last selections. Figure 2 shows the
complete process.
Figure 2: Web Personalization architecture
B. Recommendation process:
The learner's implicit query, defined previously under both of its
shapes, constitutes the input of the recommendation phase. The
recommendation task is accomplished using two basic approaches:
content-based filtering (CBF) and collaborative filtering (CF)
(Figure 3). First, we apply the CBF approach alone, using the search
functionalities of the search engine: we submit the term vector to
the search engine in order to compute recommendation links, and
results are ranked according to the cosine similarity of their
content (vector of TF-IDF weighted terms) with the submitted term
vector. Second, we apply the collaborative approach (CF) alone by
first comparing the sliding-window pages to clusters (groups of
learners obtained in the offline phase by applying a two-level
model-based collaborative filtering approach) in order to classify
the active learner into one of the learner groups. Then, we use the
association rules (ARs) of the corresponding group to give
personalized recommendations; the current session window is matched
against the "condition", or left side, of each rule.
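The ranking step above amounts to the standard cosine measure over
sparse TF-IDF term vectors, sketched below as an illustrative helper
(not code from the system):

import java.util.Map;

public class CosineSimilarity {
    /** Cosine similarity between two sparse TF-IDF weighted term vectors. */
    public static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double w = b.get(e.getKey());
            if (w != null) dot += e.getValue() * w;   // shared terms only
            normA += e.getValue() * e.getValue();
        }
        for (double w : b.values()) normB += w * w;
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}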
It is worth noting that several recommendation strategies
using these approaches have been investigated in our work.
After applying a CF and CBF approaches alone, we included
next the possibility to combine both of the recommendation
approaches (CBF and CF) in order to improve the
recommendation quality and generate the most relevant
learning objects to learners. Hence, two approaches are to be
considered: Hybrid content via profile based collaborative
filtering with cascaded/feature augmentation combination,
which performs collaborative recommendation followed by
content recommendation (the reverse order could also be
considered); and Hybrid content and profile based collaborative
filtering with weighted combination, where the collaborative
filtering and content based filtering recommendations are
performed simultaneously, then the results of both techniques
are combined together to produce a single recommendation set.
In the Hybrid content via profile-based collaborative filtering with cascaded/feature augmentation combination approach, we first apply the CF approach, giving as output a set of recommended links; then we apply the CBF approach to these links. In fact,
recommended links are mapped to a set of content terms in
order to compose a term vector (top k frequent terms); a parser tool must be used for this task. Finally, these terms are submitted to the search engine, which returns the final
recommended links.
Figure 3: Recommendation process
In the Hybrid content and profile-based collaborative filtering with weighted combination approach, the collaborative filtering and content-based filtering are performed separately; then the results of both techniques are combined to produce a single recommendation set.
C. The weighted combination process uses the following steps:
I. Step 1 is performed in the same way as in the CF approach; the result is called Recommended Set 1;
II. Step 2 maps each LO reference in the sliding window to a set of content terms (top k frequent terms). These terms are then submitted to the search engine, which returns recommended links. This result is called Recommended Set 2;
III. Final collaborative and content-based filtering recommendation combination: both recommended sets obtained previously are combined to form a coherent list of related recommendation links, ranked by their overlap ratio (a minimal sketch follows).
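A minimal Python sketch of this final combination step: links recommended by both techniques float to the top. Scoring by simple overlap count is the plainest reading of "ranked based on their overlap ratio" and is our assumption.

def combine_recommendations(set1, set2):
    """Merge two recommended link sets; links found by both methods rank first."""
    def overlap(link):
        return (link in set1) + (link in set2)   # 2 if both methods agree
    return sorted(set(set1) | set(set2), key=overlap, reverse=True)

rec_cf  = ["lo/12", "lo/7", "lo/3"]   # Recommended Set 1 (collaborative)
rec_cbf = ["lo/7", "lo/9"]            # Recommended Set 2 (content based)
print(combine_recommendations(rec_cf, rec_cbf))   # 'lo/7' comes first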
IV. METHODOLOGY
Data Set: The two key advantages of using this data set are that the web site contains web pages in several formats (such as pdf, html, ppt, doc, etc.), written both in Greek and English, and that a domain-specific concept hierarchy is available (the web administrator created a concept hierarchy of 150 categories that describe the site's content). On the other hand, its context is rather narrow, as opposed to web portals, and its visitors are divided into two main groups: students and researchers. Therefore, the subsequent analysis (e.g. association rules) uncovers these trends: visits to course material, or visits to publications and researcher details. It is essential to point out that the need for processing online (up-to-date) content made it impossible for us to use other publicly available web log sets, since all of them were collected many years ago and the relevant sites' content is no longer available. Moreover, the web logs of popular web sites or portals, which would be ideal for our experiments, are considered to be personal data and are not disclosed by their owners. To overcome these problems, we collected web logs over a 1-year period (01/01/10 to 31/12/10). After preprocessing, the total web log size was approximately 10^5 hits, including a set of over 67,700 distinct anonymous user sessions on a total of 360 web pages. The sessionizing was performed using distinct IP and time-limit considerations (setting 20 minutes as the maximum time between consecutive hits from the same user), as sketched below.
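The sessionizing rule lends itself to a short sketch. The Python code below splits log hits into sessions per IP with the 20-minute timeout described above; the (ip, timestamp, url) tuple format is a simplifying assumption about the log layout.

from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=20)     # maximum gap between hits of a session

def sessionize(hits):
    """hits: (ip, timestamp, url) tuples sorted by timestamp.
    Returns a dict mapping session id -> list of urls."""
    sessions, last_seen, current = {}, {}, {}
    counter = 0
    for ip, ts, url in hits:
        if ip not in last_seen or ts - last_seen[ip] > SESSION_GAP:
            counter += 1
            current[ip] = counter
            sessions[counter] = []
        sessions[current[ip]].append(url)
        last_seen[ip] = ts
    return sessions

log = [("1.2.3.4", datetime(2010, 5, 1, 10, 0), "/index"),
       ("1.2.3.4", datetime(2010, 5, 1, 10, 5), "/pubs"),
       ("1.2.3.4", datetime(2010, 5, 1, 11, 0), "/index")]   # > 20 min: new session
print(sessionize(log))    # two sessions for the same IP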
Keyword Extraction and Category Mapping: We extracted up to 7 keywords from each web page using a combination of all three methods (raw term frequency, inlinks, outlinks). We then mapped these keywords to ontology categories and kept at most 5 for each page.
Document Clustering: We used the clustering scheme described in recent work, i.e. the DBSCAN clustering algorithm and the similarity measure for sets of keywords; however, other web document clustering schemes (algorithm and similarity measure) may be employed as well.
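The clustering step could be approximated as follows, assuming scikit-learn is available; since the exact keyword-set similarity measure is only cited, Jaccard distance over the per-page keyword sets is used here as a stand-in.

import numpy as np
from sklearn.cluster import DBSCAN

pages = {
    "p1": {"course", "exam", "lecture"},
    "p2": {"course", "lecture", "slides"},
    "p3": {"publication", "journal", "research"},
}
names = list(pages)

def jaccard_distance(a, b):
    return 1.0 - len(a & b) / len(a | b)

# DBSCAN accepts a precomputed pairwise distance matrix.
dist = np.array([[jaccard_distance(pages[i], pages[j]) for j in names]
                 for i in names])
labels = DBSCAN(eps=0.6, min_samples=2, metric="precomputed").fit(dist).labels_
print(dict(zip(names, labels)))   # cluster label per page (-1 = noise)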
Association Rules Mining: We created both URI-based
and category-based frequent item sets and association
rules. We subsequently used the ones over a 40%
confidence threshold.
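For illustration, a toy miner for single-antecedent rules with the 40% confidence threshold might look like the sketch below; a production system would run Apriori or FP-growth over the URI- and category-based item sets.

from itertools import combinations
from collections import Counter

MIN_CONFIDENCE = 0.40       # the 40% threshold used above

def mine_rules(sessions):
    """Toy rule miner: A -> B with confidence = support(A, B) / support(A)."""
    item_count, pair_count = Counter(), Counter()
    for s in sessions:
        items = set(s)
        item_count.update(items)
        pair_count.update(combinations(sorted(items), 2))
    rules = []
    for (a, b), n_ab in pair_count.items():
        for ante, cons in ((a, b), (b, a)):
            conf = n_ab / item_count[ante]
            if conf >= MIN_CONFIDENCE:
                rules.append((ante, cons, conf))
    return rules

sessions = [["/courses", "/exams"], ["/courses", "/exams", "/staff"],
            ["/pubs", "/staff"]]
for ante, cons, conf in mine_rules(sessions):
    print(f"{ante} -> {cons} (confidence {conf:.2f})")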
V. RESULTS
In this work we compare the performance of three ranking methods, based on pure similarity, plain PageRank and weighted (personalized) URL Rank.
The personalization accuracy was found to be 75%, while the random-search accuracy is 74.6% and the average personalization accuracy is 74.7%. Because the interested-domains personalization is done considering the user-selected domain, the accuracy is higher than that of the random recommendation in our experiment. Figure 4 compares the interested-domains personalization accuracy based on random selection and based on our personalization method (relevance of query results vs. random and personalized selection).
Figure 4: Random vs. personalized selection accuracy (y-axis: percentage accuracy) for pure_similarity, plain_PageRank and weighted_URL_Rank
The URL personalization accuracy based on the interested-domains selection is 71.3%, while the URL personalization accuracy without the interested-domains selection assistance is 31.9% (Fig. 5). From this result, we can see that the interested-domains recommendation helps the system filter out many URLs that the user might not be interested in. Moreover, the system can focus on the domains that the user is interested in when selecting the relevant URLs.
Figure 5: Personalization accuracy in the interested domain
VI. CONCLUSION
In this paper, our contribution is a core technology and reusable software engine for the rapid design of a broad range of applications in the field of personalized recommendation systems and beyond. We present a web personalization system for web search which not only gives the user a set of personalized pages, but also gives the user a list of domains he or she may be interested in. Thus, the user can switch between different interests while surfing the web for information. Besides, the system focuses on the domains that the user is interested in and won't waste time searching for information in irrelevant domains. Moreover, the recommendation won't be affected by the irrelevant domains, and the accuracy of the recommendation is increased.
REFERENCES
[1] Changqin Huang, Ying Ji, Rulin Duan, A semantic web-based
personalized learning service supported by on-line course resources, 6th
IEEE International Conference on Networked Computing (INC), 2010.
[2] V. Gorodetsky, V. Samoylov, S. Serebryakov, Ontology-based context-dependent personalization technology, IEEE/WIC/ACM
International Conference on Web Intelligence and Intelligent Agent
Technology, vol. 3, pp. 278-283, 2010.
[3] Pasi Gabriella, Issues on preference-modelling and personalization in
information retrieval, IEEE/WIC/ACM International Conference on
Web Intelligence and Intelligent Agent Technology, pp. 4, 2010.
[4] Wei Wenshan and Li Haihua, Base on rough set of clustering algorithm
in network education application, IEEE International Conference on
Computer Application and System Modeling (ICCASM 2010), vol. 3,
pp. V3-481 - V3-483, 2010.
[5] Shuchih Ernest Chang, and Chia-Wei Wang, Effectively generating and
delivering personalized product information: Adopting the web 2.0
approach, 24th IEEE International Conference on Advanced
Information Networking and Applications Workshops (WAINA), pp.
401-406, 2010.
[6] Xiangwei Mu, Van Chen, and Shuyong Liu, Improvement of similarity
algorithm in collaborative filtering based on stability degree, 3rd
International Conference on Advanced Computer Theory and
Engineering (ICACTE), vol .4, pp. V4-106 - V4-110, 2010
[7] Dario Vuljanić, Lidia Rovan, and Mirta Baranović, Semantically enhanced web personalization approaches and techniques, 32nd IEEE International Conference on Information Technology Interfaces (ITI),
pp. 217-222, 2010.
[8] Raymond Y. K. Lau, Inferential language modeling for selective web
search personalization and contextualization, 3rd IEEE International
Conference on Advanced Computer Theory and Engineering (ICACTE),
vol. 1, pp. V1-540 - V1-544, 2010.
[9] Esteban Robles Luna, Irene Garrigos, and Gustavo Rossi, Capturing
and validating personalization requirements in web applications, 1st
IEEE International Workshop on Web and Requirements Engineering
(WeRE), pp. 13-20, 2010.
[10] B. Annappa, K. Chandrasekaran, K. C. Shet, Meta-Level constructs in
content personalization of a web application, IEEE International
conference on Computer & Communication Technology-ICCCT10, pp.
569 574, 2010.
[11] F. Murtagh, A survey of recent advances in hierarchical clustering
algorithms, Computer Journal, vol. 26, no. 4, pp. 354 -359, 1983.
[12] Wang Jicheng, Huang Yuan, Wu Gangshan, and Zhang Fuyan, Web mining: Knowledge discovery on the web, IEEE International Conference on Systems, Man and Cybernetics, vol. 2, pp. 137-141, 1999.
[13] B. Mobasher, Web usage mining and personalization in practical
handbook of internet computing, M.P. Singh, Editor. 2004, CRC Press,
pp. 15.1-37.
[14] T. Maier, A formal model of the ETL process for OLAP-based web
usage analysis, 6th WEBKDD- workshop on Web Mining and Web
Usage Analysis, part of the ACM KDD: Knowledge Discovery and
Data Mining Conference, pp. 23-34, Aug. 2004
[15] R. Meo, P. Lanzi, M. Matera, R. Esposito, Integrating web conceptual
modeling, WebKDD, vol. 3932, pp. 135-148, 2006.
[16] B. Mobasher, R. Cooley, and J. Srivastava, Automatic personalization
based on web usage mining, Communications of the ACM, vol. 43, no.
8, pp. 142151, Aug 2000.
[17] B. Mobasher, H. Dai, T. Luo, and M. Nakagawa, Effective
personalization based on association rule discovery from web usage
data, 3rd international ACM Workshop on Web information and data
management, 2001.
[18] O. Nasraoui, R. Krishnapuram, and A. Joshi, Mining web access logs
using a relational clustering algorithm based on a robust estimator, 8th
International World Wide Web Conference, pp. 40-41, 1999.
[19] D. Pierrakos, G. Paliouras, C. Papatheodorou, V. Karkaletsis, and M.
Dikaiakos, Web community directories: A new approach to web
personalization, 1st European Web Mining Forum (EWMF'03),
vol. 3209, pp. 113-129, 2003.
[20] Schafer J. B., Konstan J., and Reidel J., Recommender systems in e-
commerce, 1st ACM Conference on Electronic commerce, pp. 158-166.
1999
AUTHORS PROFILE
Dr. Yogendra Kumar Jain is presently working as Head of the Department of Computer Science & Engineering at Samrat Ashok Technological Institute, Vidisha, M.P., India. He secured the B.E. (Hons) degree in E&I from SATI Vidisha in 1991 and the M.E. (Hons) in Digital Tech. & Instrumentation from SGSITS, DAVV Indore (M.P.), India in 1999. The Ph.D. degree was awarded by Rajiv Gandhi Technical University, Bhopal (M.P.), India in 2010.
His research interests include Image Processing, Image Compression, Network Security, Watermarking and Data Mining. He has published more than 40 research papers in various journals and conferences, including 10 in international journals. Tel: +91-7592-250408, E-mail: [email protected].
Geetika Silakari is presently working as Asst. Professor in Computer Science & Engineering at Samrat Ashok Technological Institute, Vidisha, M.P., India. She secured the B.E. (Hons) degree in Computer Science & Engineering and the M.Tech in Computer Science and Engineering from Vanasthali University. She is currently pursuing a Ph.D. in Computer Science and Engineering. E-mail: [email protected]
Mr. Mahendra Thakur is a research scholar pursuing the M.Tech in Computer Science & Engineering at Samrat Ashok Technological Institute, Vidisha, M.P., India. He secured the B.E. degree in IT from Rajiv Gandhi Technical University, Bhopal (M.P.), India in 2007.
[email protected]
A Short Description of Social Networking Websites
And Its Uses
Ateeq Ahmad
Department of Computer science & Engineering
Singhania University
Pacheri Bari, Disst. Jhunjhunu (Rajasthan)-333515, India
Email:[email protected]
Abstract- Nowadays the use of the Internet for social networking is popular among youngsters. The use of collaborative technologies and social networking sites leads to instant online communities in which people communicate rapidly and conveniently with each other. The basic aim of this research paper is to find out which kinds of social networks are commonly used by people.
Keywords-Social Network, kinds, Definition, Social Networking
web sites, Growth.
I. INTRODUCTION
A web site that provides a social community for people interested in a particular subject brings them together. Members create their own online profiles with data, pictures, and any other information. They communicate with each other by voice, chat, instant message and videoconferencing, and the service typically provides a way for members to connect by making connections through individuals; this is known as social networking. Nowadays there are many web sites dedicated to social networking; some popular websites, such as Facebook, Orkut, Twitter, Bebo, MySpace, Friendster, hi5 and Bharatstudent, are very commonly used. These websites are also known as community network sites. Social networking websites function like an online community of internet users. Depending on the website in question, many of these online community members share common interests in hobbies and discussion. Once you have access to a social networking website you can begin to socialize. This socialization may include reading the profile pages of other members and possibly even contacting them.
II. DEFINITION
Boyd and Ellison (2007) define social network services as web-based services which allow individuals to construct a public or semi-public profile within a bounded system, communicate with other users, and view the pages and details provided by other users within the system. The social networking websites have evolved as a combination of a personalized media experience within a social context of participation. The practices that differentiate social networking sites from other types of computer-mediated communication are the uses of profiles, friends and comments or testimonials: profiles are publicly viewed, friends are publicly articulated, and comments are publicly visible.
Users who join social networking websites are required to make a profile of themselves by filling out a form. After filling out the form, users are supposed to give out information about their personality attributes and personal appearance. Some social networking websites require photos, but most of them will give details about one's age, preferences, likes and dislikes. Some social networking websites, like Facebook, allow users to customize their profiles by adding multimedia content. (Geroimenko & Chen, 2007)
III. CHARACTERISTICS OF SOCIAL NETWORKING SITES
Social networking websites provide rich information about
the person and his network, which can be utilized for various
business purposes. Some of the main characteristics of social
networking sites are:
They act as a resource for advertisers to promote their
brands through word-of-mouth to targeted customers.
They provide a base for a new teacher-student
relationship with more interactive sessions online.
They promote the use of embedded advertisements in
online videos.
They provide a platform for new artists to show their
profile.
IV. OBJECTIVE
The basic objective of this research is to analyze the awareness and frequency of use of social networking websites.
V. HISTORY OF SOCIAL NETWORKING WEBSITES
The first social networking website, SixDegrees.com, was launched in 1997. This company was the first of its kind; it allowed users to list their profiles, provide a list of friends and then contact them. However, the company did not do very well, and it eventually closed three years later. The reason for this was that many people using the internet at that time had not formed many social networks, hence there was little room for manoeuvre. It should be noted that there were also other elements that hinted at social network websites. For instance, dating sites required users to give their profiles, but they could not share other people's profiles. Additionally,
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 2, No.2, February 2011
125 | P a g e
http://ijacsa.thesai.org/
there were some websites that would link former schoolmates, but the lists could not be shared with others. (Cassidy, 2006)
After this came the creation of LiveJournal in 1999, created in order to facilitate one-way exchanges of journals between friends. A company in Korea called Cyworld added some social networking features in 2001. This was then followed by LunarStorm in Sweden during the same year; these included things like diary pages and friends lists. Additionally, Ryze.com also established itself in the market. It was created with the purpose of linking businessmen within San Francisco, and was closely tied to Friendster, LinkedIn and Tribe.net; of these, Ryze was the least successful. Tribe.net specialized in the business world, while Friendster initially did well, though this did not last for long. (Cohen, 2003)
VI. SOCIAL NETWORKING WEBSITES THAT ARE COMMONLY
USED BY THE PEOPLE
The most significant social networking websites commonly used by people, especially by youngsters, are Friendster, MySpace, Facebook, Downelink, Ryze, SixDegrees, Hi5, LinkedIn, Orkut, Flickr, YouTube, Reddit, Twitter, FriendFeed, BharatStudent and Fropper.
A. Friendster
Friendster began its operations in 2002. It was a sister company to Ryze but was designed to deal with the social aspect of their market. The company was like a dating service; however, matchmaking was not done in the typical way where strangers met. Instead, friends would propose which individuals were most compatible with one another. At first, there was exponential growth of the company, especially after the introduction of a network for gay men and an increase in the number of bloggers. The latter would usually tell their friends about the advantages of social networking through Friendster, and this led to further expansion. However, Friendster had established a market base in one small community. After their subscribers reached overwhelming numbers, the company could no longer cope with the demand. There were numerous complaints about the way their servers were handled, because subscribers would experience communication breakdowns. As if this was not enough, social networks in the real world were not doing well; some people would find themselves dating their bosses or former classmates, since the virtual community created by the company was rather small. The company also started limiting the level of connection between enthusiastic users. (Boyd, 2004)
B. MySpace
By 2003, there were numerous companies formed with the purpose of providing a social networking service. However, most of them did not attract much attention, especially in the US market. For instance, LinkedIn and Xing were formed for business persons, while services like MyChurch, Dogster and Couchsurfing were formed for social services. Other companies that had been engaging in other services also started offering social networking. For instance, YouTube and Last.FM were initially formed to facilitate video and music sharing respectively; however, they later adopted social networking services. (Backstrom et al, 2006)
C. Facebook
This social networking service was introduced with the purpose of linking friends at Harvard University in 2004. Thereafter, the company expanded to other universities and then colleges. Eventually, they invited corporate communities. But this does not mean that profiles could be interchanged at will: there are lots of restrictions, as friends who join a university's social network have to have a .edu address, and those joining a corporate network must have a .com address. This company prides itself on its ability to maintain privacy and niche communities and has been instrumental in learning institutions. (Charnigo & Barnett-Ellis, 2007)
D. Downelink
This website was founded in 2004 for the lesbian, gay,
bisexual, and transgender community. Some features include
social networking, weblogs, internal emails, a bulletin board,
DowneLife and in the future, a chat.
E. Ryze
The first of the online social networking sites, Ryze was founded by Adrian Scott as a business-oriented online community in 2001. Business people can expand their business networks by meeting new people and joining business groups, called Networks, organized by industry, interest, and geographic area.
F. SixDegrees
SixDegrees was launched in 1997 and was the first modern social network. It allowed users to create a profile and to become friends with other users. While the site is no longer functional, at one time it was actually quite popular and had around a million members.
G. Hi5
Hi5 was established in 2003 and currently boasts more than 60 million active members, according to its own claims. Users can set their profiles to be seen only by their network members. While Hi5 is not particularly popular in the U.S., it has a large user base in parts of Asia, Latin America and Central Africa.
H. LinkedIn
LinkedIn was founded in 2003 and was one of the first
mainstream social networks devoted to business. Originally,
LinkedIn allowed users to post a profile and to interact through
private messaging.
I. Orkut
Launched in January 2004, Orkut is Google's social network, and while it's not particularly popular in the U.S., it's very popular in Brazil and India, with more than 65 million users. Orkut lets users share media and status updates, and communicate through IM.
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 2, No.2, February 2011
126 | P a g e
http://ijacsa.thesai.org/
J. Flickr
Flickr has become a social network in its own right in
recent years. They claim to host more than 3.6 billion images
as of June 2009. Flickr also has groups, photo pools, and allows
users to create profiles, add friends, and organize images and
video.
K. YouTube
YouTube was the first major video hosting and sharing site,
launched in 2005. YouTube now allows users to upload HD
videos and recently launched a service to provide TV shows
and movies under license from their copyright holders.
L. Reddit
Reddit is another social news site, founded in 2005. Reddit operates in a similar fashion to other social news sites.
M. Twitter
Twitter was founded in 2006 and gained a lot of popularity during 2007. Status updates have become the new norm in
social networking.
N. FriendFeed
FriendFeed launched in 2007 and was recently purchased by Facebook; it allows you to integrate most of your online activities in one place. It's also a social network in its own right, with the ability to create friends lists and post updates.
O. BharatStudent
Bharatstudent is a social utility that brings together young Indians living across the globe. It is for every young Indian, whether a student, a fresh graduate, a working professional or an entrepreneur, and is focused on providing comprehensive solutions for personal and professional issues.
P. Fropper
Fropper is all about meeting people, making new friends and having fun with photos, videos, games and blogs. Come and become a part of the 4-million-strong Fropper community.
VII. GROWTH OF SOCIAL NETWORKING WEBSITES.
Nowadays the popularity of social networking is increasing rapidly around the world. Social networking behemoth MySpace.com attracted more than 114 million global visitors age 15 and older in June 2007, representing a 72-percent increase versus a year earlier. Facebook.com experienced even stronger growth during that same time frame, jumping 270 percent to 52.2 million visitors. Bebo.com (up 172 percent to 18.2 million visitors) and Tagged.com (up 774 percent to 13.2 million visitors) also grew substantially. (ComScore)
A. Worldwide Growth of Selected social Networking Sites
between June 2006 and June 2007
During the past year, social networking has really taken off globally. Literally hundreds of millions of people around the world are visiting social networking sites each month, and many are doing so on a daily basis (Bob Ivins); see Table I.
TABLE I. ANALYSIS OF SOCIAL NETWORKING SITES: WORLDWIDE GROWTH (VISITORS IN THOUSANDS)

Site          June 2006   June 2007   Percent Change
MySpace          66,401     114,147        72
Facebook         14,083      52,167       270
Hi5              18,098      28,174        56
Friendster       14,917      24,675        65
Orkut            13,588      24,120        78
Bebo              6,694      18,200       172
Tagged            1,506      13,167       774
B. Worldwide Growth of Selected social Networking Sites
between June 2007 and June 2008
During the past year, many of the top social networking
sites have demonstrated rapid growth in their global user bases.
Facebook.com, which took over the global lead among social
networking sites in April 2008, has made a concerted effort to
become more culturally relevant in markets outside the U.S. Its
introduction of natural language interfaces in several markets
has helped propel the site to 153 percent growth during the past
year. Meanwhile, the emphasis Hi5.com has put on its full-
scale localization strategy has helped the site double its visitor
base to more than 56 million. Other social networking sites,
including Friendster.com (up 50 percent), Orkut (up 41
percent), and Bebo.com (up 32 percent) have demonstrated
particularly strong growth on a global basis. See table II.
TABLE II. ANALYSIS OF SOCIAL NETWORKING SITES: WORLDWIDE GROWTH BY REGION (VISITORS IN THOUSANDS)

Region               June 2007   June 2008   Percent Change
Asia Pacific           162,738     200,555        23
Europe                 122,527     165,256        35
North America          120,848     131,255         9
Latin America           40,098      53,248        33
Middle East-Africa      18,226      30,197        66
C. Worldwide Growth of Selected social Networking Sites
between July 2009 and July 2010
Among social networking sites in India, Facebook.com grabbed the number one ranking in the category for the first time in July, with 20.9 million visitors, up 179 percent versus a year earlier. The social networking phenomenon continues to gain steam worldwide, and India represents one of the fastest growing markets at the moment; though Facebook has tripled its audience in the past year to pace the growth for the category, several other social networking sites have posted their own sizeable gains. (Will Hodgman) See Table III.
More than 33 million Internet users age 15 and older in
India visited social networking sites in July, representing 84
percent of the total Internet audience. India now ranks as the
seventh largest market worldwide for social networking, after
the U.S., China, Germany, Russian Federation, Brazil and the
U.K. The total Indian social networking audience grew 43
percent in the past year, more than tripling the rate of growth of
the total Internet audience in India.
TABLE III. ANALYSIS OF SOCIAL NETWORKING SITES: GROWTH BY COUNTRY (VISITORS IN THOUSANDS)

Country              July 2009   July 2010   Percent Change
United States          131,088     174,429        33
China                      N/A      97,151       N/A
Germany                 25,743      37,938        47
Russian Federation      20,245      35,306        74
Brazil                  23,966      35,221        47
United Kingdom          30,587      35,153        15
India                   23,255      33,158        43
France                  25,121      32,744        30
Japan                   23,691      31,957        35
South Korea             15,910      24,962        57
VIII. CONCLUSION
Social networking websites are one of the social media tools which can be used in the education industry to generate online traffic and a pipeline for new entrants. The use of these websites is growing rapidly, while other traditional online media are on the decrease. Social network user numbers are staggering, vastly increasing the exposure potential of the education industry through advertising. Social networks offer people great convenience for social networking: they allow people to keep in touch with friends and old friends, meet new people, and even conduct business meetings online. You can find people with interests similar to yours and get to know them better, even if they are in a different country. Every day people are joining social networks, and the growth and uses of social networking are increasing all over the world.
REFERENCES
[1] Budden C B, Anthony J F, Budden M C and Jones M A (2007),
"Managing the Evolution of a Revolution: Marketing Implications of
Internet Media Usage Among College Students", College Teaching
Methods and Styles Journal, Vol. 3, No. 3, pp. 5-10.
[2] Danielle De Lange, Filip Agneessens and Hans Waege (2004), "Asking
Social Network Questions: A Quality Assessment of Different
Measures", Metodoloki zvezki, Vol. 1, No. 2, pp. 351-378.
[3] David Marmaros and Bruce Sacerdote (2002), "Peer and Social
Networks in Job Search", European Economic Review, Vol. 46, Nos. 4
and 5, pp. 870-879.
[4] Feng Fu, Lianghuan Liu and Long Wang (2007), "Empirical Analysis of
Online Social Networks in the Age of Web 2.0", Physica A: Statistical
Mechanics and its Applications, Vol. 387, Nos. 2 and 3, pp. 675-684.
[5] Kautz H, Selman B and Shah M (1997), "Referral Web: Combining
Social Net Works and Collaboration Filtering", Communication of the
ACM, Vol. 40, No. 3, pp. 63-65.
[6] Mark Pendergast and Stephen C Hayne (1999), "Groupware and Social
Networks: Will Life Ever Be the Same Again?", Information and
Software Technology, Vol. 41, No. 6, pp. 311-318.
[7] Mayer Adalbert and Puller Steven L (2008), "The Old Boy (and Girl)
Network: Social Network Formation on University Campuses", Journal
of Public Economics, Vol. 92, Nos. 1 and 2, pp. 329-347.
[8] Reid Bate and Samer Khasawneh (2007), "Self-Efficacy and College
Student's Perceptions and Use of Online Learning Systems", Computers
in Human Behavior, Vol. 23, No. 1, pp. 175-191.
[9] Sorokou C F and Weissbrod C S (2005), "Men and Women's Attachment
and Contact Patterns with Parents During the First Year of College",
Journal of Youth and Adolescence, Vol. 34, No. 3, pp. 221-228.
[10] Boyd Danah and Ellison Nicole (2007), "Social Network Sites:
Definition, History and Scholarship", Journal of Computer-Mediated
Communication, Vol. 13, No. 1.
[11] Boyd Danah (2007), "Why Youth (Heart) Social Network Sites: The
Role of Networked Publics in Teenage Social Life", MacArthur
Foundation Series on Digital Learning-Youth, Identity and Digital
Media Volume, David Buckingham (Ed.), MIT Press, Cambridge, MA.
[12] Madhavan N (2007), "India Gets More Net Cool", Hindustan Times,
July 6, http://www.hindustantimes.com
[13] Cotriss, David (2008-05-29). Where are they now: Theglobe.com. The
Industry Standard.
[14] Romm-Livermore, C., & Setzekorn, K. (2008). Social Networking Communities and E-Dating Services: Concepts and Implications. IGI Global, p. 271.
[15] Knapp, E. (2006). A Parent's Guide to MySpace. DayDream Publishers.
ISBN 1-4196-4146-8.
[16] Acquisti, A, & Gross, R (2006): Imagined communities: Awareness,
information sharing, and privacy on the Facebook: Cambridge, UK:
Robinson College
[17] Backstrom, L et al (2006): Group formation in large social networks:
Membership, growth, and evolution, pp. 44-54, New York ACM Press.
[18] Boyd, D. (2004): Friendster and publicly articulated social networks.
Proceeding of ACM
[19] Cassidy, J, (2006): Me media: How hanging out on the Internet Became
big Business, The new Yorker, 82,13,50
[20] Charnigo, L & Barnett-Ellis, p. (2007): Checking out Facebook.com:
The impact of a digital trend on academic libraries; Information
Technology and Libraries, 26,1,23.
[21] Choi, H. (2006): Living in Cyworld: Contextualising CyTies in South Korea, pp. 173-186, New York: Peter Lang.
[22] Cohen, R. (2003): Livewire: Web sites try to make Internet dating less
creepy, Reuters, retrieved from http://asia.reuters.com Accessed 01 Feb
2011
[23] ComScore (2007): Social networking goes global. Reston,
http://www.comscore.com Accessed 01 Feb 2011
[24] Cameron Chapman (2009), History and evolution of social media. Retrieved from http://www.webdesignerdepot.com
[25] http://wikipedia.com Accessed 01 Feb 2011.
[26] Worldwide social networking websites(ComScore)
http://www.comscore.com Accessed 01 Feb 2011
[27] Geroimenko, V. & Chen, C. (2007): Visualizing the Semantic Web,
pp. 229-242, Berlin: Springer.
[28] Shahjahan S. (2004), Research Methods for Management, Jacco
Publishing House
[29] Cooper Donald R. and Shindler Panda S (2003), Business Research
Methods, Tata McGraw Hill Co. Ltd., New Delhi.
[30] Shah Kruti and D'Souza Alan (2009), Advertising & Promotions: An
IMC perspective, Tata McGraw Hill Publishing Company Limited,
New Delhi.
[31] Panneerselvam R. (2004), Research Methodology, Prentice Hall of
India Pvt. Ltd., New Delhi.
[32] Opie Clive, (June 2004), Doing Educational Research- A Guide for
First Time Researchers, Sage Publications, New Delhi,
[33] Kothari C.R. (2005), Research Methodology: Methods and
Techniques, India, and New Age International Publisher.
AUTHOR PROFILE
Ateeq Ahmad received the Master's degree in computer science in 2003. He is a PhD student in the Department of Computer Science & Engineering, Singhania University, Rajasthan, India. His research interests include computer networks, social networks, network security, and web development.
Multilevel Security Protocol using RFID

Syed Faiazuddin (1), S. Venkat Rao (2), S.C.V. Ramana Rao (3), M.V. Sainatha Rao (4), P. Sathish Kumar (5)
(1) Asst. Professor, SKTRM College of Engg & Tech, Dept. of CSE, Kondair, A.P., [email protected]
(2) Asst. Professor, S.R.K. P.G College, Dept. of MCA, Nandyal, Kurnool (Dist.), A.P., [email protected]
(3) Asst. Professor, RGM Engg & Tech, Dept. of CSE, Nandyal, Kurnool (Dist.), A.P., [email protected]
(4) Asst. Professor, RGM Engg & Tech, Dept. of IT, Nandyal, Kurnool (Dist.), A.P., [email protected]
(5) Asst. Professor, SVIST, Dept. of CSE, Madanapalle, Chittoor (Dist.), A.P., [email protected]
Abstract- Though RFID provides automatic object identification, it is vulnerable to various security threats that put consumer and organization privacy at stake. In this work, we have
considered some existing security protocols of RFID system and
analyzed the possible security threats at each level. We have
modified those parts of protocol that have security loopholes and
thus finally proposed a modified four-level security model that
has the potential to provide fortification against security threats.
Keywords- RFID, Eavesdropping, Slotted ID, Spoofing, Tracking.
I. INTRODUCTION
Radio Frequency Identification is a generic term for
identifying living beings or objects using Radio Frequency.
The benefit of RFID technology is that it scans and identifies
objects accurately and efficiently without visual or physical
contact with the object [1], [3].
A typical RFID system consists of:
An RFID tag
A tag reader
A host system with a back-end database[2]
Each object contains a tag that carries a unique ID [3]. The
tags are tamper resistant and can be read even in visually and
environmentally challenging conditions [3] such as snow, ice,
fog, inside containers and vehicles etc [2]. It can be used in
animal tracking, toxic and medical waste management, postal
tracking, airline baggage management, anti-counterfeiting in
the drug industry, access control etc. It can directly benefit the
customer by reducing waiting time and checkout lines [3] due
to its very fast response time. Hence, it should be adopted
pervasively.
For low-cost RFID implementation, inexpensive passive tags are used; these do not contain a battery [5] and get activated only by drawing power from the transmission of the reader [4] through inductive coupling. Tags don't contain any microprocessor [6], but incorporate a ROM (to store security data, the unique ID and OS instructions) and a RAM (to store data during reader interrogation and response) [2], [6].
Fig. 1
In the simplest case, on reader interrogation the tag sends back its secret ID (Fig. 1). The universally unique ID makes the tag vulnerable to tracking as it moves from one place to another; this violates "location privacy". Unprotected tags could be monitored and tracked by business rivals. An ID, if known to an illegal reader, could be used to produce fake tags that would successfully pass through security checks in the future. Hence, the security of RFID tags and the stored ID is of extreme importance; sensing the probable security loopholes, we have proposed a protocol that would reduce the security threats due to eavesdropping and tracking.
II. SECURITY THREATS
A. Eavesdropping Scenario:
Eavesdropping normally occurs when the attacker
intercepts the communication between an RFID token and
authorized reader. The attacker does not need to power or
communicate with the token, so it has the ability to execute the
attack from a greater distance than is possible for skimming. It
is, however, limited in terms of location and time window, since it has to be in the vicinity of an authorized reader when the transaction that it is interested in is conducted. The attacker
needs to capture the transmitted signals using suitable RF
equipment before recovering and storing data of interest [4],
[8].
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 2, No.2, February 2011
130 | P a g e
http://ijacsa.thesai.org/
Fig. 2
B. Forward privacy:
Forward privacy ensures that messages transmitted today
will be secure in the future even after compromising the tag.
Privacy also includes the fact that a tag must not reveal any
information about the kind of item it is attached to [9], [10].
C. Spoofing:
It is possible to fool an RFID reader into believing it is receiving data from an RFID tag; this is called "spoofing". In spoofing, someone with a suitably programmed portable reader covertly reads and records a data transmission from a tag that could contain the tag's ID. When this data transmission is retransmitted, it appears to be a valid tag, and thus the reader system cannot determine that the data transmission is not authentic [11], [12].
D. Tracking:
A primary security concern is the illicit tracking of RFID tags. Tags which are world-readable pose a risk to both personal location privacy and corporate security, since a tag can be read from inside wallets, suitcases, etc. Even in places where items are not expected to move often, it can be tempting to find ways to track them. Current RFID deployments can be used to track people via the tags they carry. To solve this problem, we cannot use a fixed identifier [7], [12].
III. RELATED WORK
To resolve the security concerns raised in the previous section, many protocols have been proposed in various research papers.
In the work [4], the authors proposed a 'Hash Lock Scheme'. In this scheme, the tag carries a key and a meta ID that is nothing but the hashed key. Upon request from a reader, the tag sends its stored meta ID back to the reader. The reader then forwards this meta ID to the back-end database, where the key of the tag is found by looking up the database using the meta ID as the search key. The reader forwards the key found in the database to the tag, which hashes this key value and matches the calculated hash with the stored meta ID. On a successful match the tag is unlocked for further information fetch.
Fig. 3
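The hash-lock exchange can be summarized in a few lines of Python. SHA-256 stands in for the unspecified hash function; this is a sketch of the message flow, not the authors' implementation.

import hashlib

def h(value: bytes) -> bytes:
    return hashlib.sha256(value).digest()

key = b"tag-secret-key"
meta_id = h(key)                    # meta ID = hashed key
database = {meta_id: key}           # back-end database: meta ID -> key

# 1. Reader queries the tag; the tag answers with its stored meta ID.
tag_response = meta_id
# 2. Reader looks up the key in the back-end database and forwards it.
forwarded_key = database[tag_response]
# 3. Tag hashes the received key and compares with its stored meta ID.
print("tag unlocked:", h(forwarded_key) == meta_id)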
The drawback of this protocol is that the meta ID is still unique. A tag can still be tracked using this meta ID, even without knowing the original ID, so "location privacy" is still under threat. Again, during transmission of the key from the back-end database through the reader, it can easily be captured by an eavesdropper even though the connection between the reader and the tag is an authenticated one. Hence, eavesdropping is still a major problem. From this, it is inferred that no unique and 'static' value can ever be sent back to the reader.
To overcome this problem, a new protocol was proposed [4] in which tag responses change with every query. To realize this, the tag sends, upon request, a pair <r, h(ID, r)>, where r is a random number. The database searches exhaustively through its list of known IDs until it finds the one that matches h(ID, r) for the given r. Though this technique resolves the tracking problem, it increases the overhead of the database, and the search complexity grows with the number of stored IDs. This is handled by the protocol discussed in the next section.
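A sketch of this randomized variant shows why the database cost grows with the number of known IDs (SHA-256 again stands in for h):

import hashlib, os

def h(tag_id: bytes, r: bytes) -> bytes:
    return hashlib.sha256(tag_id + r).digest()

known_ids = [b"ID-001", b"ID-002", b"ID-003"]     # back-end database

# Tag side: a fresh random r for every query, so responses never repeat.
tag_id = b"ID-002"
r = os.urandom(16)
response = (r, h(tag_id, r))

# Database side: brute-force search over every known ID.
r_recv, digest = response
match = next((i for i in known_ids if h(i, r_recv) == digest), None)
print("identified tag:", match)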
Our tag contains a unique meta ID. As we cannot send the unique meta ID, we generate a random number in the tag. This random number is fed to a down counter. The down counter counts down to zero, sending a clock pulse to a sequence generator with each down count; on receiving each pulse, the sequence generator generates a new state, starting from the state equivalent to the meta ID. When the down counter reaches zero, the state of the sequence generator is recorded. The tag then sends a pair <r, q>, where r is the random number and q is the final state generated by the sequence generator. At the reader end, a reverse sequence generator is implemented, through which the state equal to the original meta ID is found.
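To make the exchange concrete, the sketch below models the sequence generator as a 16-bit Fibonacci LFSR; the paper does not fix a concrete generator, so this choice is our assumption. The reader recovers the meta ID by stepping the inverse generator r times.

import random

def lfsr_step(state):
    """One forward step of a 16-bit Fibonacci LFSR (taps at bits 0, 2, 3, 5)."""
    fb = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (fb << 15)

def lfsr_back(state):
    """Exact inverse of lfsr_step: recover the previous state."""
    fb = (state >> 15) & 1                 # feedback bit that was shifted in
    partial = (state << 1) & 0xFFFE        # previous bits 1..15 restored
    b0 = fb ^ ((partial >> 2) ^ (partial >> 3) ^ (partial >> 5)) & 1
    return partial | b0

META_ID = 0xBEEF                           # tag's stored meta ID

def tag_response():
    r = random.randint(1, 255)             # down counter start value
    q = META_ID
    for _ in range(r):                     # one generator step per down count
        q = lfsr_step(q)
    return r, q

def reader_recover(r, q):
    for _ in range(r):                     # reverse sequence generator
        q = lfsr_back(q)
    return q

r, q = tag_response()
print("recovered meta ID:", hex(reader_recover(r, q)))   # 0xbeef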
Fig. 4
Fig. 5
A. Reader Identification:
Since the reader plays an important role in the RFID system, the tag must identify its authenticated reader. An authenticated reader has the capability to modify, change, insert or delete the tag's data. As an extension to the previous section, after generating the original meta ID the system looks into the back-end database and retrieves the corresponding key. Now, before sending the key to the tag, the logic circuit effaces some of the bits from the key and sends the modified key to the tag. Which bits are to be deleted is determined by the random number r.
IV. SECURITY PROPOSALS
A. Mitigating Eavesdropping:
In the first part of our work, we came up with a novel idea
to alleviate eavesdropping introducing meta ID concept in a
new light.
Fig. 6
At the tag end, the missing bits of the covert key are copied from the original key stored in the tag. Then the stored key and the reconstructed key are compared. On a successful match, the tag considers the reader to be valid and unlocks itself for further access by the reader. Otherwise, it rejects the query request, sensing the reader to be a false one.
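A minimal model of this reader-authentication step follows. How the effaced bit positions are derived from r is not specified in the paper, so a PRNG seeded with r and shared by both sides is assumed here.

import random

KEY_BITS = 16
STORED_KEY = 0xA5C3        # key shared by the tag and the back-end database

def efface(key, r, n_bits=4):
    """Reader side: zero out n_bits key bits at positions derived from r."""
    rng = random.Random(r)                 # both sides derive positions from r
    modified = key
    for p in rng.sample(range(KEY_BITS), n_bits):
        modified &= ~(1 << p)
    return modified

def tag_accepts(modified, r, n_bits=4):
    """Tag side: refill the effaced bits from the stored key and compare."""
    rng = random.Random(r)
    rebuilt = modified
    for p in rng.sample(range(KEY_BITS), n_bits):
        rebuilt |= STORED_KEY & (1 << p)
    return rebuilt == STORED_KEY

r = 37                     # random number from the earlier exchange
print("reader accepted:", tag_accepts(efface(STORED_KEY, r), r))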
B. Slotted ID Read:
Up to this stage, only a valid reader has been given the privilege to gain access to the next level of the tag. Still, the unique ID of the tag cannot be sent openly to the reader, as it could readily get skimmed and tracked by an eavesdropper. To deal with this, the ID is divided into a number of slots of varying length. Some additional bits are added at the beginning of each slot that hold the length of the ID fragment belonging to that slot. Then the entire data packet is encrypted. As only the authenticated reader knows the number of bits used to specify the length of a slot, this provides extra security. The transmission of data packets in several slots is continued until the end of the ID.
Fig. 7. Typical packet: [Length of Data | Data + Padding Bits]
C. Tag Identification:
At the reader end, after receiving each packet, the reader first decrypts the data and then eliminates the bits used to specify the length of that slot, recovering a part of the original ID. This continues for each packet, and the decrypted ID fragments are then combined to reform the entire unique ID. Thus, the unique ID is transmitted to the authenticated reader while false readers are at the same time stymied from reading it.
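The slotting and reassembly can be sketched as follows. The 4-bit length field and the bit-flip 'cipher' are placeholders for the unspecified length-field width and encryption scheme.

LEN_BITS = 4               # width of each slot's length field (assumption)

def flip(bits):            # toy symmetric "cipher": invert every bit
    return "".join("1" if b == "0" else "0" for b in bits)

def make_packets(id_bits, slot_lengths):
    """Tag side: split the ID into slots, prefix each with its length, encrypt."""
    packets, pos = [], 0
    for length in slot_lengths:
        payload = format(length, f"0{LEN_BITS}b") + id_bits[pos:pos + length]
        packets.append(flip(payload))
        pos += length
    return packets

def read_packets(packets):
    """Reader side: decrypt, strip the length field, reassemble the ID."""
    id_bits = ""
    for pkt in packets:
        payload = flip(pkt)
        length = int(payload[:LEN_BITS], 2)
        id_bits += payload[LEN_BITS:LEN_BITS + length]
    return id_bits

tag_id = "1011001110001111"                 # 16-bit unique ID
packets = make_packets(tag_id, [5, 7, 4])   # slots of varying length
print(read_packets(packets) == tag_id)      # True: ID reassembled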
Fig. 8
V. SECURITY ANALYSIS
In our protocol, we have provided a four-step security scheme for the ID that prevents the tag from getting cloned and reduces the risk of spoofing and eavesdropping many fold.
As the <r, q> pair sent to the reader from the tag changes every time, an eavesdropper can never track a tag through its meta ID. Though this was achieved in work [4], it increased the database overhead and the complexity of the brute-force search; in our method, the same goal is met while the problem of work [4] is also resolved.
The key retrieved from the back-end database of the reader is not sent directly to the tag, as any false reader could catch this key on its way to the tag and could prove itself to be a valid reader at any moment. Hence, the key is modified by a special method, and as the same key is modified in a different manner each time, a false reader or eavesdropper cannot discover the key.
The received modified key is reconstructed and matched with the key stored in the tag to authenticate a reader. This feature bars all readers apart from the valid one from gaining further access to the tag contents.
The entire ID is slotted, and each slot is of different length. The first few bits of each slot represent the number of bits of the ID belonging to that slot. The data of the entire slot is then encrypted and sent to the reader. The ID is sent in several steps, and the unique ID is never sent in its original form. This entire method allows only an authenticated reader to find the original ID.
Thus, we have beefed up the security of the ID through our protocol and provided secure tag-to-reader transactions.
The proposed protocol can be combined with other research work to make it more beneficial for practical implementation towards the goal of manufacturing low-cost RFID. With the passage of time, new ideas along with new technology will spring up, which will definitely make RFID technology more preferable and cost effective. As the radiation from RFID is not good for human exposure and may inadvertently damage human cells and tissues, there is also wide scope for work on minimizing its effect on human beings. Hence there is a plethora of fields in which we can work.
VI. CONCLUSION
As our work revolves around security, we have provided a multi-level security scheme in our proposal. With our limited resources, we have tried our best to give tag-reader identification a higher priority, since both have their own importance in security analysis. By combining the random-variable concept with tag-reader identification, we have provided additional security. The most important characteristic of our protocol is that at no point in time do we leave our IDs/keys in their original form; even if a false reader reads any information, it is of no use to that reader. That said, our proposed security definitions are just a starting point. They certainly do not capture the full spectrum of real-world needs. We have proposed important areas for further work.
REFERENCES
[1] Gildas Avoine and Philippe Oechslin, "A Scalable and Provably Secure Hash-Based RFID Protocol", The 2nd IEEE International Workshop on Pervasive Computing and Communication Security, 2005.
[2] C. M. Roberts, "Radio frequency identification", Computers & Security, vol. 25, pp. 18-26, Elsevier, 2006.
[3] Tassos Dimitriou, "A Secure and Efficient RFID Protocol that could make Big Brother (partially) Obsolete", Proceedings of the Fourth Annual IEEE International Conference on Pervasive Computing and Communications, 2006.
[4] Stephen A. Weis, Sanjay E. Sarma, Ronald L. Rivest and Daniel W. Engels, "Security and Privacy Aspects of Low-Cost Radio Frequency Identification Systems", 1st International Conference on Security in Pervasive Computing (SPC), 2003.
[5] Mike Burmester, Breno de Medeiros and Rossana Motta, "Provably Secure Grouping-Proofs for RFID Tags".
[6] Gildas Avoine and Philippe Oechslin, "RFID Traceability: A Multilayer Problem".
[7] Mike Burmester and Breno de Medeiros, "RFID Security: Attacks, Countermeasures and Challenges".
[8] G. P. Hancke, "Eavesdropping Attacks on High-Frequency RFID Tokens".
[9] Raphael C.-W. Phan, Jiang Wu, Khaled Ouafi and Douglas R. Stinson, "Privacy Analysis of Forward and Backward Untraceable RFID Authentication Schemes".
[10] Mayla Brusò, Konstantinos Chatzikokolakis, and Jerry den Hartog, "Formal Verification of Privacy for RFID Systems".
[11] Dale R. Thompson, Neeraj Chaudhry, Craig W. Thompson, "RFID Security Threat Model".
[12] Dong-Her Shih, "Privacy and Security Aspects of RFID Tags".
AUTHORS PROFILE
Syed Faiazuddin received the M.Tech degree in Computer Science and Engineering from JNTU Kakinada University. He is a faculty member in the Department of M.Tech at KTRMCE, Kondair. His research interests are in the areas of Computer Networks, Mobile Computing, DSP, MATLAB, Data Mining and Sensor Networks. He worked for 2 years as a Research and Development Engineer at ELICO Ltd. He is a member of the National Service
Scheme (NSS).
Mr. S. Venkat Rao is currently working as
Assistant Professor in the Department of Computer
Science and Applications in Sri Ramakrishna
Degree & P.G. College, Nandyal, Andhra Pradesh,
India. He completed his B.Sc., M.Sc., and M.Phil.
with Computer Science as specialization in
1995,1997 and 2006 respectively from Sri
Krishnadevaraya University, Anantapur. He is
currently a part-time Research Scholar pursuing
the Ph.D. degree. His areas of interest include Computer Networks and WDM Networks. He has participated in a number of national seminars and presented papers at national conferences.
S.C.V. Ramana Rao received the M.Tech (CSE) degree in Computer Science and Engineering from JNTU Anantapur University. He is a faculty member in the Department of C.S.E. His research interests are in the areas of Computer Networks, Mobile Computing, DSP, MATLAB, Data Mining and Sensor Networks.
M.V. Sainatha Rao is currently working as Assistant Professor in the Department of Information Technology. He has 8 years of experience in Computer Science. He has attended 2 international conferences, 6 national conferences and 4 workshops. His research interests are in the areas of Computer Networks, Mobile Computing, DSP, MATLAB and Sensor Networks.
P. Sathish Kumar received the M.Tech degree in Computer Science and Engineering from JNTU Anantapur University. He is an Asst. Professor in the Department of C.S.E at Sri Vishveshwaraiah Institute of Science & Technology (approved by AICTE, affiliated to JNTU Anantapur), Madanapalle, Chittoor Dist., A.P. His research interests are in the areas of Computer Networks, Mobile Computing, DSP, MATLAB, Data Mining and Sensor Networks.