K_q = \sum_{q=1}^{n} W_q C_q            ... (1)

where:
K_q = the q-th main criterion,
n = the number of sub-criteria in the q-th criterion,
C_q = the fuzzy value of the q-th parameter,
W_q = the weight of the relative parameter.
To calculate the overall score of the decision hierarchy, the equation is re-defined as:

Total Score = K_1 + K_2 + K_3 + ... + K_11            ... (2)

It implies that:

Total Score = \sum_{i=1}^{11} K_i            ... (3)
where:
W_i = the expert weight of the i-th main criterion,
K_i = the value of the i-th main criterion obtained through equation (1).
From equations (1) and (3), we derive equation (4) as below:

Total Score = \sum_{i=1}^{11} \sum_{q=1}^{n} (W_q C_q)            ... (4)
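As a minimal illustration of equations (1)-(4), the following Python sketch aggregates weighted fuzzy values per criterion and sums across criteria; the criteria, weights and fuzzy values here are invented for the example and are not taken from the paper.

    def criterion_score(weights, fuzzy_values):
        # Equation (1): K_q = sum of W_q * C_q over the sub-criteria
        return sum(w * c for w, c in zip(weights, fuzzy_values))

    def total_score(criteria):
        # Equations (2)-(4): sum the criterion scores K_i of all main criteria
        return sum(criterion_score(w, c) for w, c in criteria)

    # Two hypothetical main criteria, each as (weights, fuzzy membership values)
    criteria = [
        ([0.046, 0.030], [1.0, 0.75]),
        ([0.025, 0.040], [0.5, 0.25]),
    ]
    print(total_score(criteria))  # 0.046 + 0.0225 + 0.0125 + 0.01 = 0.091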
A process evaluation score cannot be obtained efficiently using a simple yes/no (i.e., 1 or 0), because the values of the parameters are qualitative in nature. For these qualitative parameters we use fuzzy logic. Zadeh [22] used fuzzy logic to measure continuous values.
Since the parameter weights assigned from expert opinion remain constant throughout the decision-making process, the final ranking score computed from the overall decision hierarchy varies with the input parameter values entered by users. The value of a parameter may be 0 or 1 [22]: if a particular parameter is present, its weight is multiplied by 1; otherwise it is multiplied by 0.

[Figure: Architecture of the system - system users and the knowledge engineer interact through dialogs with the user interface; a DDE facility connects to an external database (SQL Server or MS Access); the knowledge base (a collection of facts) is driven by the control mechanism (inference engine).]
According to Zadeh [22, 24], variables whose values are words or sentences are called linguistic variables; variables that represent a gradual transition, from high to low or true to false, are called fuzzy variables, and a set containing these variables is a fuzzy set. The degree of membership lies in [0, 1], where 1 represents the highest membership and 0 represents no membership.
We defined a fuzzy variable set for the conceptual framework model as:
Fuzzy Set = {Extremely Strong, Strong, Moderate, Weak,
Extremely Weak}
The corresponding fuzzy membership values are: Fuzzy Membership Value = {1.0, 0.75, 0.5, 0.25, 0}
Table I depicts the fuzzy variables with their respective degrees of membership. In the table, from top to bottom, a gradual transition is represented from extremely strong to extremely weak, together with the respective degree of membership values.
TABLE I. PARAMETERS OF FUZZY VALUES
Fuzzy variable Degree of Membership
Extremely Strong 1.0
Strong 0.75
Moderate 0.50
Weak 0.25
Extremely Weak 0
These fuzzy values are the input parameter values provided by the users during consultation, and the final ranking score of a particular process evaluation is calculated by the system at run time; an example of the computation of the decision-making score is given here. If the qualitative value of a parameter is extremely strong, the corresponding numeric value 1.0 is multiplied by the parameter's weight.

Suppose a parameter, risk analysis, is assigned a weight of 0.046 by the experts; the fuzzy decision score can then be calculated as shown in Table II.
The overall weights of all the parameters are calculated by
Equation (4), and the resultant score will be the final score for
decision making.
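A small sketch of the membership lookup and the single-parameter score from Tables I and II; the 0.046 weight is the risk-analysis example above, while the code itself is only illustrative.

    MEMBERSHIP = {"Extremely Strong": 1.0, "Strong": 0.75,
                  "Moderate": 0.5, "Weak": 0.25, "Extremely Weak": 0.0}

    def fuzzy_score(linguistic_value, weight):
        # Multiply the parameter's weight by its degree of membership (Table II)
        return MEMBERSHIP[linguistic_value] * weight

    print(fuzzy_score("Strong", 0.046))  # 0.0345, reported as 0.034 in Table II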
The decision score is mapped to a linguistic description, which is the output of the intelligent system; this mapping, used to rank the selected software process model, is depicted in Table III.
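A sketch of the Table III mapping from the overall score to a recommended process model; the boundary handling is our assumption, since the original inequality symbols were lost in extraction.

    def recommend_model(x):
        # Linguistic mapping of the overall decision score (Table III)
        if x < 0.20: return "Prototyping Process Model"
        if x < 0.40: return "RAD Process Model"
        if x < 0.60: return "Evolutionary Process Models"
        if x < 0.80: return "Waterfall Process Model"
        return "Component-Based Development"

    print(recommend_model(0.091))  # Prototyping Process Model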
IV. CONCLUSION
This research is a promising step toward solving the problems associated with the existing approaches to intelligent framework modeling. These models will become a base for the selection of an appropriate process model for expert-systems development (i.e., ESPMS). There exist neither strict rules for selecting a software process model nor any consultative system to guide a novice user. This is an attempt to integrate various technologies, such as expert systems, AHP, fuzzy logic and decision making, to solve real-world problems.
TABLE II. FUZZY SCORE CALCULATION

Parameter's fuzzy value | Membership value | Parameter's weight | Fuzzy score
Extremely Strong | 1.0 | 0.046 | 0.046
Strong | 0.75 | 0.046 | 0.034
Moderate | 0.5 | 0.046 | 0.023
Weak | 0.25 | 0.046 | 0.011
Extremely Weak | 0 | 0.046 | 0

TABLE III. LINGUISTIC DESCRIPTION OF PROPOSED SYSTEM WITH OUTPUT (X)

System Output | Linguistic Description
X < 0.20 | Prototyping Process Model
0.20 ≤ X < 0.40 | RAD Process Model
0.40 ≤ X < 0.60 | Evolutionary Process Models (Incremental, Spiral, Win-Win Spiral & Concurrent Development Model)
0.60 ≤ X < 0.80 | Waterfall Process Model
X ≥ 0.80 | Component-Based Development
V. FUTURE SCOPE
Following the decision issues and the accompanying models presented in this paper, a prototype ESPMS can easily be developed. This prototype ESPMS can be linked with an external database and other software to develop a full-fledged expert system for final decision making in the selection of a process model for a particular software project. This work may become a base for solving other similar problems.
REFERENCES
[1] Pressman R. S., Software Engineering: A Practitioner's Approach, Fifth Edition, McGraw Hill, 2001.
[2] Reddy A. R. M., Govindarajulu P., Naidu M., A Process Model for Software Architecture, IJCSNS, Vol. 7, No. 4, April 2007.
[3] Jorge L. Díaz-Herrera, Artificial Intelligence (AI) and Ada: Integrating AI with Mainstream Software Engineering, 1994.
[4] Rech J., Althoff K. D., Artificial Intelligence and Software
Engineering: Status and Future Trends, 2004.
[5] Durkin, J., Application of Expert Systems in the Sciences, Ohio J. Sci., Vol. 90 (5), pp. 171-179, 1990.
[6] Kazaz, A., Application of an Expert System on the Fracture Mechanics of Concrete, Artificial Intelligence Review, 19, 177-190, 2003.
[7] Raza F. N., Artificial Intelligence Techniques in Software
Engineering (AITSE), IMECS 2009, Vol-1, Hong Kong, 2009.
[8] Ruhe G., Learning Software Organizations, Fraunhofer Institute for Experimental Software Engineering (IESE), Volume 2, Issue 3-4 (October-November 2000), pp. 349-367, ISSN 1387-3326, 2000.
[9] Armenise P., Bandinelli S, Ghezzi C. & Morzenti A, A Survey and
Assessment of Software Process Representation Formalisms, CEFRIEL
Politecnico di Milano, 1993.
[10] Lonchamp J. (1993), A Structured Conceptual and Terminological
Framework for Software Process Engineering, (CRIN), France, 1993.
[11] Yu E. S. K. & Mylopoulos J., Understanding Why in Software
Process Modelling, Analysis, and Design, Proc. 16th Int. Conf. Software
Engineering, 1994.
[12] Grobelny P., The Expert System Approach in Development of Loosely Coupled Software with Use of Domain Specific Language, Proceedings of the International Multiconference on Computer Science and Information Technology, pp. 119-123, ISSN 1896-7094, 2008.
[13] Canfora G., García F., Piattini M., Ruiz F. & Visaggio C. A., Applying a framework for the improvement of software process maturity, Softw. Pract. Exper., 36:283-304, 2006.
[14] Atan R., Ghani A. A. A, Selamat M. H., & Mahmod R, Software
Process Modelling using Attribute Grammar, IJCSNS VOL.7 No.8,
2007.
[15] Kim J., & Gil Y., Knowledge Analysis on Process Models, Information
Sciences Institute University of Southern California, (IJCAI-2001),
Seattle, Washington, USA. 2001
[16] Liao L., Yuzhong Qu Y. & Leung H. K. N., A Software Process
Ontology and Its Application. 2005.
[17] Turban, E., Expert Systems and Applied Artificial Intelligence. New York:
Macmillan Publishing Company, 1992.
[18] Awad, E.M. Building Experts Systems: Principals, Procedures, and
Applications, New York: West Publishing Company, 1996.
[19] Abdur Rashid Khan, Zia Ur Rehman, Factors Identification of Software Process Model Using a Questionnaire, (IJCNS) International Journal of Computer and Network Security, Vol. 2, No. 5, May 2010.
[20] Prolog Development Center, Expert System Shell for Text Animation (ESTA), version 4.5, A/S. H. J. Volst Vej 5A, DK-2605 Broendby, Denmark, copyright 1992-1998.
[21] Saaty T. L., The analytic hierarchy process: planning, priority setting
and resource allocation. New York: McGraw-Hill, 1980.
[22] Zadeh, L.A., Fuzzy sets, Information and Control, 8, 338-353, 1965.
[23] Khan, A.R., Expert System for Investment Analysis of Agro-based
Industrial Sector, Bishkek 720001, Kyrgyz Republic, 2005
[24] Zadeh, L.A., The concept of a linguistic variable and its application to approximate reasoning, Information Sciences, 8, 43-80, 1975.
[25] Futrell R. T., Shafer L. I, Shafer D. F, Quality Software Project
Management, Low Price Edition, 2004.
[26] Anderson S., & Felici M., Requirement Engineering Questionnaire,
Version 1.0, Laboratory for Foundation of Computer Science, Edinburgh
EH9 3JZ, Scotland, UK 2001.
[27] Skonicki M., QA/QC Questionnaire for Software Suppliers, January
2006.
[28] Energy U. S. D., Project Planning Questionnaire,
www.cio.energy.gov/Plnquest.pdf (last access, 04 August 2009)
[29] Liu & Perry (2004), On the Meaning of Software Architecture,
Interview Questionnaire, Version 1.2, July, 2004.
AUTHORS PROFILE
Abdur Rashid Khan is presently working as an Associate Professor at ICIT, Gomal University, D.I.Khan, Pakistan. He received his PhD degree from Kyrgyz Technical University, Kyrgyz Republic, in 2004. He has published more than 23 research papers in national and international journals and conferences. His research interests include ES, DSS, MIS and software engineering.
Zia Ur Rehman received his MCS in Computer Science from the Institute of Information Technology, Kohat University of Science & Technology (KUST), Kohat, Pakistan, in 2005. He is currently pursuing his MS degree in Computer Science at the same institute. His areas of interest include software engineering, AI, knowledge engineering, expert systems, and applications of fuzzy logic.

Hafeez Ullah Amin is a research student at the Institute of Information Technology, Kohat University of Science & Technology, Kohat 26000, KPK, Pakistan. He completed a BS (Hons) in Information Technology and an MS in Computer Science in 2006 and 2009, respectively, from the above-cited institution. His current research interests include artificial intelligence, information systems, and databases.
Modelling & Designing Land Record Information
System Using Unified Modelling Language
Kanwalvir Singh Dhindsa
CSE & IT Department,
B.B.S.B.Engg.College,
Fatehgarh Sahib,Punjab,India
[email protected]
Himanshu Aggarwal
Department of Computer Engg.,
Punjabi University,
Patiala, Punjab,India
[email protected]
Abstract - Automation of Land Records is one of the most
important initiatives undertaken by the revenue department to
facilitate the landowners of the state of Punjab. A number of
such initiatives have been taken in different States of the
country. Recently, there has been a growing tendency to adopt UML (Unified Modeling Language) for different modeling needs and domains, and it is widely used for designing and modelling information systems. UML diagramming practices have been applied to the design and modeling of the land record information system so as to improve technical accuracy and the understanding of the requirements related to this information system. We have
applied a subset of UML diagrams for modeling the land record
information system. The case study of Punjab state has been
taken up for modelling the current scenario of land record
information system in the state. Unified Modeling Language
(UML) has been used as the specification technique. This paper
proposes a refined software development process combined
with modeled process of UML and presents the comparison
study of the various tools used with UML.
Keywords - Information system; Unified Modeling Language (UML); software modelling; software development process; UML tools.
I. INTRODUCTION
Computerization of Land Records is one of the most
important initiatives undertaken by the Revenue Department
to facilitate the landowners of the State. A number of such
initiatives have been taken in different States of India. The
paper proposes a UML-based approach in which non-functional requirements are defined as reusable aspects for design and analysis. UML offers vocabulary and rules for communication and focuses on conceptual and physical representations of a system. UML uses an object-oriented approach to model systems, unifying data and functions (methods) into software components called objects. Various diagrams are used to show objects and their relationships as well as objects and their responsibilities (behaviors). UML is a standard for object-oriented modeling notations endorsed by the Object Management Group (OMG), an industrial consortium on object technologies. UML became a standard after combining and taking advantage of a number of object-oriented design methodologies (Kobryn, 1999) and is currently positioned as a modeling language rather than a design process.
A. Process of Data Digitisation
The automation of projects related to information systems is underway in many government sectors. With the use of the funds, 153 Fard Kendras will be established in the tehsils of the State to provide certified copies of the revenue records to the general public. Some fard centres have already been opened in a few tehsils and sub-tehsils for public use. The land records (Jamabandi etc.) are generally updated every 5 years. The legacy land records to be digitized are: Jamabandi, Mutation, Roznamcha Waqiati, Khasra Girdawari and Field Book. At present there is a lack of faith and there are undefined procedures regarding the services being provided to citizens. The change from paper records to digital records will facilitate the farmers, maintain better transparency of the revenue records, lead to a drastic reduction in fraudulent practices, corruption and procedural hassles relating to the management of the land records, reduce time delays, and also work as a faith-building measure in providing service to citizens.
II. UNIFIED MODELLING LANGUAGE
UML (Unified Modelling Language) is a complete language for capturing knowledge (semantics) about a subject and expressing knowledge (syntax) regarding the subject for the purpose of communication. It applies to modeling and systems. Modeling involves a focus on understanding a subject (system) and being able to communicate this knowledge. UML is the result of unifying the information systems and technology industry's best engineering practices (principles, techniques, methods and tools). It is used for both database and software modeling.
UML attempts to combine the best of the best from: Data
Modeling concepts (Entity Relationship Diagrams),
Business Modeling (work flow), Object Modeling and
Component Modeling. UML is defined as "a graphical language for visualizing, specifying, constructing, and documenting the artifacts of a software-intensive system" [Booch]. Software architecture is an area of software engineering directed at developing large, complex applications in a manner that reduces development costs, increases quality and facilitates evolution [8]. A central and critical problem software architects face is
how to efficiently design and analyze software architecture to meet non-functional requirements.
The structural things in UML are Class, Interface, Collaboration and Use Case; the behavioral things comprise Interaction and State Machine; the grouping things comprise Packages and Notes.
a) Things: important modeling concepts.
b) Relationships: ties between individual things (i.e., their concepts).
c) Diagrams: groupings of interrelated collections of things and relationships.
The artifacts included in standard UML consist of: Use case diagram, Class diagram, Collaboration diagram, Sequence diagram, State diagram, Activity diagram, Component diagram and Deployment diagram (OMG, 1999). There are different ways of using UML, in terms of design methodologies, to accomplish different project objectives.
III. SYSTEM ANALYSIS & DESIGN
Unified Modeling Language (UML) is used as a
specification technique for the system analysis and design
process involved in the software development life cycle.
A. Modelling & Designing Using UML
1) Case Scenario : Land Record Information System
UML is built upon the MOF metamodel for object-oriented modeling. A modeling method comprises a language and also a procedure for using the language to construct models; here the language is the Unified Modeling Language (UML). Modeling is the only way to visualize one's design and check it against requirements before developers start to code. The land record information system is modeled using the use-case, sequence, class, and component diagrams offered by the Unified Modeling Language.
a) Use-Case Diagram: Use case diagrams describe what a system does from the standpoint of an external observer [17]. Use case diagrams describe the functionality of a system and the users of the system, and contain the following elements:
- Actors, which represent users of a system, including human users and other systems.
- Use Cases, which represent functionality or services provided by a system to users.
{ *as modeled in StarUML }
b) Class Diagrams: Being the most important entity in modeling object-oriented software systems, the class diagram is used to depict the classes and the static relationships among them [3]. Class diagrams describe the
static structure of a system, or how it is structured rather
than how it behaves. These diagrams contain the following
elements:
- Classes, which represent entities with common characteristics or features. These features include attributes, operations and associations.
- Associations, which represent relationships that relate two or more other classes, where the relationships have common characteristics or features.
c) Object Diagrams: describe the static structure of a system
at a particular time. Whereas a class model describes all
possible situations, an object model describes a particular
situation. Object diagrams contain the following elements:
- Objects, which represent particular entities. These are instances of classes.
- Links, which represent particular relationships between objects. These are instances of associations.
{ *as modeled in StarUML }
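As a rough illustration of the class and object views above, the following Python sketch defines two classes with an association and then instantiates particular objects and links; the class and attribute names (Jamabandi, Mutation, khasra_no, etc.) are our simplification of the land record domain, not a model taken from the paper.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Mutation:                  # a change-of-ownership record
        mutation_id: int
        description: str

    @dataclass
    class Jamabandi:                 # record of rights for a land parcel
        khasra_no: str
        owner_name: str
        mutations: List[Mutation] = field(default_factory=list)  # association

    # Object-diagram view: particular instances (objects) and links
    j = Jamabandi("123/4", "A. Singh")
    j.mutations.append(Mutation(1, "Sale registered"))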
d) Collaboration Diagrams & Component Diagrams: The component diagram is one of UML's architectural diagrams, used to effectively describe complex architectures as a hierarchy of components (subsystems) communicating through defined interfaces [6]. Collaboration diagrams describe interactions among classes and associations. These interactions are modeled as exchanges of messages between classes through their associations. Collaboration diagrams are a type of interaction diagram and contain the following elements:
i) Class roles, which represent roles that objects
may play within the interaction.
ii) Association roles, which represent roles that
links may play within the interaction.
iii) Message flows, which represent messages sent
between objects via links. Links transport or
implement the delivery of the message.
{*as modeled in StarUML }
e) Deployment Diagrams: Deployment diagrams describe the configuration of processing resource elements and the mapping of software implementation components onto them. These diagrams contain components and nodes, which represent processing or computational resources, including computers, printers, etc. Each cube icon is a node representing a physical system. All the system requirements are shown in the architecture used for the land record information system. All the modules of the information system have been developed using Visual Basic with SQL Server at the backend. The web components are hosted on an Apache web server and use Java Servlets. The modeled components* are shown in the deployment diagram.
{ *as modeled in StarUML }
IV. MODELLING TOOLS USED IN UML
The various types of tools used for modelling in Unified Modeling Language (UML) are:
a) Modeling tools: Rational Rose, ArgoUML, Together, Umbrello
b) Drawing tools: Visio, Dia
c) Metamodels: Eclipse UML2, NSUML, OMF
d) Renderers: Graphviz, UMLDoc
e) IDEs: Visual Studio 2005, XCode 2, Rational XDE
A. Comparison of UML Tools
The Unified Modelling Language (UML) tools used for modeling the design of various information systems are compared on some vital parameters which distinguish them, fairly indicating the advantages of one tool over another.
TABLE I. COMPARISON OF UML TOOLS

Tool | Strength/Stability | Cost | Additional Features | Current Status
Rational Rose | Full-strength industrial modeling suite | Expensive | Office for UML, add-ons, plug-ins, scripting interface; plugs in to MS Visual Studio and Eclipse | Re-developed as Rational XDE
Together | Supports most UML diagrams | Mid-range cost | Can reverse engineer C++ and Java; generates source code for C++ and Java; exports to PNG | -
ArgoUML | Open-source UML modeling application written in Java | Free to download | Supports most diagram types, reverse engineering and code generation for Java | Forked into the commercial product Poseidon
Umbrello | Open-source modeling application for KDE, written in C++ | Free to download | Supports data modeling for SQL, reverse engineering and code generation | Under active development
MS Visio | Fairly compliant with the UML metamodel | Not interoperable | Used for creating 2D schematics and diagrams | -
Dia | Open-source graphics drawing program from GNOME | Free | Supports the creation of some UML diagram types | Used frequently by open-source developers
Graphviz and UMLDoc | AT&T Graphviz accepts graph-specification input and generates PNG/PDF layouts of graphs; UMLDoc parses Java comments to produce diagrams | Mid-range | Generates PNG and PDF layouts of graphs; UMLDoc actually uses Graphviz to create diagrams | -
MS Visual Studio | Supports UML-like diagrams for .NET languages (e.g., C#) | New part of VS 2005 | Provides support for roundtrip engineering and documentation generation | New part of VS 2005
XCode 2 | Claims support for C, C++, and Java | - | Provides UML-like class diagrams for Objective-C; can be used for roundtrip engineering | Provides UML-like class diagrams for Objective-C
Rational XDE | Visual modeling suite for UML | Costly | Plugs in to many different IDEs (Visual Studio .NET, Eclipse, IBM WebSphere); supports roundtrip engineering | Provides features of Rational Rose
V. UML IN INFORMATION SYSTEMS: ITS
APPLICATIONS
- Any type of application, running on any type and combination of hardware, operating system, programming language, and network, can be modeled in UML.
- UML Profiles (that is, subsets of UML tailored for specific purposes) help to model transactional, real-time, and fault-tolerant systems in a natural way.
- UML is effective for modeling large, complex software systems.
- It is simple to learn for most developers, but provides advanced features for expert analysts, designers and architects.
- It can specify systems in an implementation-independent manner.
- Structural modeling specifies a skeleton that can be refined and extended with additional structure and behavior.
- Use case modeling specifies the functional requirements of a system in an object-oriented manner. Existing source code can be analyzed and reverse-engineered into a set of UML diagrams.
UML is currently used for applications other than drawing
designs in the fields of Forward engineering, Reverse
engineering, Roundtrip engineering and Model-Driven
Architecture (MDA). A number of tools on the market
generate Test and Verification Suites from UML models.
VI. CONCLUSION & FUTURE SCOPE
UML tools provide support for working with the UML language in the development of various types of information systems. From the paper, it is concluded that each UML tool has its own functionality and can be used according to the needs of the software development cycle for the development of information systems. The three different views of using UML are: documenting design up front, maintaining design documentation after the fact, and generating refinements or source code from models. This paper concludes that information systems can be modeled using UML due to its flexibility and inherent nature, and that the tools add to the ever-increasing demand for its use in the development of information systems. UML can further be considered as part of a mobile development strategy, and further planning can be done to apply the unified modeling principles to later stages of enhancement of the land record information system.

Future work that could be pursued includes applying the software process to large-scale m-commerce application systems and generating their model diagrams with UML, specially tailored for the software development process, thus providing a backbone to the analysis and design phases of the SDLC.
REFERENCES
[1] A. Gurd, Using UML 2.0 to Solve Systems Engineering Problems,
White Paper, Telelogic,2003.
[2] Blaha, M. & Premerlani, W., Object-Oriented Modeling and Design for
Database Applications, Prentice Hall, New Jersey,1998.
[3] R. Miller, Practical UML: A Hands-On Introduction for
Developers, White Paper, Object Mentor Publications,1997.
[4] B. Graham, Developing embedded and mobile Java technology-based
applications using UML, White Paper, IBM Developerworks,2003.
[5] M. Fowler, UML Distilled: A Brief Guide to the Standard Object Modeling Language, 3rd ed., Addison-Wesley, 2004.
[6] P. Jalote, A. Palit, P. Kurien, V. T. Peethamber, Timeboxing: A Process Model for Iterative Software Development, Journal of Systems and Software (JSS), Volume 70, Number 1-2, pp. 117-127, 2004.
[7] Nikolaidou M., Anagnostopoulos D., A Systematic Approach for
Configuring Web-Based Information Systems, Distributed and Parallel
Database Journal, Vol 17, pp 267-290, Springer Science, 2005.
[8] M. Shaw, and D. Garlan, Software Architecture: Perspectives on an
Emerging Discipline, Prentice Hall, 1996.
[9] Kobryn, C., UML 2001: a standardization odyssey, Comm. of the ACM, Vol. 42, No. 10, October, pp. 29-37, 1999.
[10] OMG UML Revision Task Force,OMG-Unified
Modeling Language Specification, http://uml.systemhouse.mci.com/
[11] Jeusfeld, M.A. et al.: ConceptBase: Managing conceptual models
about information systems. Handbook of Information Systems,
Springer-Verlag ,pp. 265-285,1998.
[12] Berardi D., Calvanese D., and De Giacomo G.: Reasoning on
UML class diagrams, Artificial Intelligence, 168, 70-118,2005.
AUTHORS PROFILE
Er. Kanwalvir Singh Dhindsa is currently an Assistant Professor in the CSE & IT department of B.B.S.B.Engg.College, Fatehgarh Sahib (Punjab), India. He received his M.Tech. from Punjabi University, Patiala (Punjab) and is currently pursuing a Ph.D. degree in Computer Engineering from the same university. His research interests are information systems, relational database systems and modelling languages. He is a member of IEI, ISTE and ACEEE.

Prof. (Dr.) Himanshu Aggarwal is currently a Reader in the department of Computer Engg. of Punjabi University, Patiala (Punjab). He received his Ph.D. degree in Computer Engineering from Punjabi University in 2007. His research interests are information systems, parallel computing and software engineering. He has contributed 14 papers to reputed journals and 35 papers to national and international conferences. He is also on the editorial board of some international journals.
An Algorithm to Reduce the Time Complexity of
Earliest Deadline First Scheduling Algorithm in
Real-Time System
Jagbeer Singh
Dept. of Computer Science and
Engineering
Gandhi Institute of Engg. & Tech.
Gunupur, Rayagada, India-765022
[email protected]
Bichitrananda Patra
Dept. of Information Technology
Gandhi Institute of Engg. & Tech.
Gunupur, Rayagada, India-765022
[email protected]
Satyendra Prasad Singh
Dept. of Master of Computer
Application
Gandhi Institute of Compt. Studies
Gunupur, Rayagada, India-765022
[email protected]
Abstract - In this paper we study how to reduce the time complexity of Earliest Deadline First (EDF), a global scheme for scheduling real-time tasks on a multiprocessor system. Several admission control algorithms for earliest deadline first are presented, both for hard and soft real-time tasks. The average performance of these admission control algorithms is compared with the performance of known partitioning schemes. We have applied some modifications to the global earliest deadline first algorithm to decrease the number of task migrations and also to add predictability to its behavior. The aim of this work is to provide a sensitivity analysis for task deadlines in the context of a multiprocessor system by using a new approach, the EFDF (Earliest Feasible Deadline First) algorithm. In order to decrease the number of migrations we prevent a job from moving from one processor to another if it is among the m highest-priority jobs; a job will therefore continue its execution on the same processor if possible (processor affinity). The results of these comparisons outline some situations where one scheme is preferable over the other: partitioning schemes are better suited for hard real-time systems, while a global scheme is preferable for soft real-time systems.
Keywords - Real-time system; task migration; earliest deadline first; earliest feasible deadline first.
I. INTRODUCTION
Real-time systems are those whose correct operation depends not only on the logical results, but also on the time at which these results are produced. These are high-complexity systems that are executed in environments such as military process control, robotics, avionics systems, distributed systems and multimedia.
Real-time systems use scheduling algorithms to decide an
order of execution of the tasks and an amount of time assigned
for each task in the system so that no task (for hard real-time
systems) or a minimum number of tasks (for soft real-time
systems) misses their deadlines. In order to verify the
fulfillment of the temporal constraints, real-time systems use
different exact or inexact schedulability tests. The
schedulability test decides if a given task set can be scheduled
such that no tasks in the set miss their deadlines. Exact
schedulability tests usually have high time complexities and
may not be adequate for online admission control where the
system has a large number of tasks or a dynamic workload. In
contrast, inexact schedulability tests provide low complexity
sufficient schedulability tests.
The first schedulability test known was introduced by Liu
and Layland with the Rate Monotonic Scheduling Algorithm
[Liu, 1973] (RM). Liu and Layland introduced the concept of
achievable utilization factor to provide a low complexity test
for deciding the schedulability of independent periodic and
preemptable task sets executing on one processor.
In Earliest Deadline First scheduling, at every scheduling
point the task having the shortest deadline is taken up for
scheduling. The basic principle of this algorithm is very
intuitive and simple to understand. The schedulability test for
EDF is also simple. A task set is schedulable under EDF if and only if the total processor utilization (U) due to the task set does not exceed 1. When scheduling periodic processes that have deadlines equal to their periods, EDF has a utilization bound of 100%. Thus, the schedulability test for EDF is:

U = \sum_{i=1}^{n} C_i / T_i \le 1,

where the C_i are the worst-case computation times of the n processes and the T_i are their respective inter-arrival periods (assumed to be equal to the relative deadlines).
The schedulability test introduced by Liu and Layland for RM states that a task set will not miss any deadline if it meets the following condition: U \le n(2^{1/n} - 1). Liu and Layland provided a schedulability test that fails to identify many schedulable task sets when the system is heavily loaded.
After the work of Liu and Layland, many researchers have
introduced improvements on the schedulability condition for
RM for one and multi processors. These improvements include
the introduction of additional timing parameters in the
schedulability tests and transformations on the task sets. It is a
well-known fact that when more timing parameters are
introduced in the schedulability condition better performance
can be achieved.
For example, consider 3 periodic processes scheduled using EDF; the following acceptance test shows that all deadlines will be met.
Table 1: Task Parameters

Process | Execution Time = C | Period = T
P1 | 1 | 8
P2 | 2 | 5
P3 | 4 | 10
The utilization will be:

U = 1/8 + 2/5 + 4/10 = 0.125 + 0.4 + 0.4 = 0.925.

The theoretical limit for any number of processes is 100%, and so the system is schedulable.
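For concreteness, a tiny Python check of the utilization-based EDF test for the task set of Table 1 (our own illustration, not code from the paper):

    # (C_i, T_i) pairs for P1..P3 from Table 1
    tasks = [(1, 8), (2, 5), (4, 10)]
    U = sum(c / t for c, t in tasks)
    print(U, U <= 1.0)   # 0.925 True -> schedulable under EDF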
EDF has been proven to be an optimal uniprocessor scheduling algorithm [8]. This means that if a set of tasks is unschedulable under EDF, then no other scheduling algorithm can feasibly schedule this task set. The EDF algorithm chooses for execution, at each instant in time, the currently active job(s) with the nearest deadlines. The EDF implementation upon uniform parallel machines follows these rules [2]: no processor is idled while there are active jobs waiting for execution; when fewer than m jobs are active, they are required to execute on the fastest processors while the slowest are idled; and higher-priority jobs are executed on faster processors.
A formal verification which guarantees all deadlines in a real-time system would be best. This verification is called a feasibility test. Three different kinds of tests are available:
- Exact tests with long execution times or simple models [11], [12], [13].
- Fast sufficient tests, which fail to accept feasible task sets, especially those with high utilizations [14], [15].
- Approximations, which allow an adjustment of performance and acceptance rate [1], [8].
For many applications an exact test or an approximation with a high acceptance rate must be used. For many task sets a fast sufficient test is adequate.
EDF is an appropriate algorithm for online scheduling on uniform multiprocessors. However, its implementation suffers from a great number of migrations due to the vast fluctuations caused by the finishing or arrival of jobs with relatively nearer deadlines. Task migration cost might be very high; for example, in a loosely coupled system such as a cluster of workstations, a migration is performed so slowly that the overhead resulting from excessive migration may prove unacceptable [3]. Another disadvantage of EDF is that its behavior becomes unpredictable in overloaded situations; the performance of EDF drops in overloaded conditions to the point that it cannot be considered for use. In this paper we present a new approach, called Earliest Feasible Deadline First (EFDF), which is used to reduce the time complexity of the earliest deadline first algorithm under some assumptions.
II. BACKGROUND AND REVIEW OF RELATED WORKS
Each processor in a uniform multiprocessor machine is characterized by a speed or computing capacity, with the interpretation that a job executing on a processor with speed s for t time units completes (s * t) units of execution. The Earliest Deadline First scheduling of real-time systems upon uniform multiprocessor machines is considered. It is known that online algorithms tend to perform very poorly in scheduling such real-time systems on multiprocessors; resource-augmentation techniques are presented here that permit online algorithms in general (EDF in particular) to perform better than may be expected given these inherent limitations.

Generalizing the definition of utilization from periodic tasks to nonperiodic tasks has been studied in [23] and [24]. In deriving the utilization bound for the rate monotonic scheduler with multiframe and general real-time task models, Mok and Chen in [25] and [26] proposed a maximum average utilization which measures utilization in an infinite measuring window. To derive the utilization bound for nonperiodic tasks and multiprocessor systems, the authors in [23] and [24] proposed a utilization definition based on the relative deadlines of tasks instead of their periods. It is shown that EDF scheduling upon uniform multiprocessors is robust with respect to both job execution requirements and processor computing capacity.
III. SCHEDULING ON MULTIPROCESSOR SYSTEM
Meeting the deadlines of a real-time task set in a multiprocessor system requires a scheduling algorithm that determines, for each task in the system, on which processor it must be executed (the allocation problem), and when and in which order, with respect to other tasks, it must start its execution (the scheduling problem). This problem has a difficult solution because (i) some research results for a single processor cannot always be applied to multiple processors [17], [18], (ii) on multiple processors different scheduling anomalies appear [19], [21], [20], and (iii) the solution to the allocation problem requires algorithms with a high computational complexity.
The scheduling of real-time tasks on multiprocessors can be
carried out under the partitioning scheme or under the global
scheme. In the partitioning scheme (Figure 1.a) all the instances
(or jobs) of a task are executed on the same processor. In
contrast, in the global scheme (Figure 1.b), a task can migrate
from one processor to another during the execution of different
instances. Also, an individual job of a task that is preempted
from some processor, may resume execution in a different
processor. Nevertheless, in both schemes parallelism is
prohibited, that is, no job of any task can be executed at the
same time on more than one processor.
On both schemes, the admission control mechanism not
only decides which tasks must be accepted, but also it must
create a feasible allocation of tasks to processors (i.e., on each
processor, all allocated tasks must meet their deadlines). For the partitioning and global schemes, task sets can be scheduled using static or dynamic schedulers. In any case, the computational complexity associated with the admission control must remain as low as possible, especially for the dynamic case.
The partitioning scheme has received greater attention than the global scheme, mainly because the scheduling problem can be reduced to scheduling on single processors, for which a great variety of scheduling algorithms exist.
been proved by Leung and Whitehead [18] that the partitioned
and global approaches to static-priority scheduling on identical
multiprocessors are incomparable in the sense that (i) there are
task sets that are feasible on identical processors under the
partitioned approach but for which no priority assignment
exists which would cause all jobs of all tasks to meet their
deadlines under global scheduling on the same processors,
and (ii) there are task sets that are feasible on identical
processors under the global approach, which cannot be
partitioned into distinct subsets such that each individual
partition is feasible on a single static-priority uniprocessor.
Fig. 1. (a). Partitioning and (b). Global Scheduling Schemes
IV. OUR PROPOSED STRATEGY
We have applied some modifications to the global Earliest Deadline First algorithm to decrease the number of task migrations and also to add predictability to its behavior. In order to decrease the number of migrations we prevent a job from moving to another processor if it is among the m highest-priority jobs. Scheduling algorithms can be classified as static or dynamic. In a static scheduling algorithm, all scheduling decisions are provided a priori. Given a set of timing constraints and a schedulability test, a table is constructed, using one of many possible techniques (e.g., various search techniques), to identify the start and completion times of each task, such that no task misses its deadline. This is a highly predictable approach, but it is static in the sense that when the characteristics of the task set change, the system must be re-started and its scheduling table re-computed.
In a dynamic scheduling algorithm, the scheduling decision
is executed at run-time based on task's priorities. The dynamic
scheduling algorithms can be classified in algorithms with fixed
priorities and algorithms with variable priorities. In the
scheduling algorithms with fixed priorities, the priority of each
task of the system remains static during the complete execution
of the system, whereas in an algorithm with variable priorities
the priority of a task is allowed to change at any moment.
The schedulability test in static scheduling algorithms can only be performed off-line, but in dynamic scheduling algorithms it can be performed off-line or on-line. In the off-line schedulability test, there is complete knowledge of the set of tasks executing in the system, as well as the restrictions imposed on each of the tasks (deadlines, precedence restrictions, execution times), before the start of their execution; no new tasks are allowed to arrive in the system. In our approach, a job will continue its execution on the same processor if possible (processor affinity).
A. The Strategy
In Earliest Deadline First scheduling, at every scheduling
point the task having the shortest deadline is taken up for
scheduling. The basic principle of this algorithm is very
intuitive and simple to understand. The schedulability test for
Earliest Deadline First is also simple. A task set is schedulable under EDF if and only if the total processor utilization due to the task set does not exceed 1. For a set of periodic real-time tasks {T_1, T_2, ..., T_n}, the EDF schedulability criterion can be expressed as:

\sum_{i=1}^{n} u_i = \sum_{i=1}^{n} e_i / p_i \le 1,

where e_i is the execution time and p_i the period of task T_i, u_i = e_i / p_i is the average utilization due to task T_i, and n is the total number of tasks in the set.
EDF has been proven to be an optimal uniprocessor scheduling algorithm [8]. This means that if a set of tasks is unschedulable under EDF, then no other scheduling algorithm can feasibly schedule this task set. In the simple schedulability test for EDF we assumed that the period of each task is the same as its deadline. In practical problems, however, the period of a task may at times be different from its deadline. In such cases, the schedulability test needs to be changed. If p_i > d_i, then each task needs e_i amount of computing time every min(p_i, d_i) duration. Therefore we can write:

\sum_{i=1}^{n} e_i / min(p_i, d_i) \le 1.

However, if p_i < d_i, it is possible that a set of tasks is EDF-schedulable even when it fails to meet this expression.
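A sketch of this extended sufficient test in Python (the tuple layout and function name are ours; the min(p_i, d_i) denominator follows the expression above):

    def edf_sufficient_test(tasks):
        # tasks: iterable of (e_i, p_i, d_i); sufficient condition when
        # periods may differ from relative deadlines
        return sum(e / min(p, d) for e, p, d in tasks) <= 1.0

    print(edf_sufficient_test([(1, 8, 6), (2, 5, 5), (4, 12, 10)]))  # True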
B. Mathematical Representation
Our motivation for exploiting processor affinity derives from the observation that, for many parallel applications, the time spent bringing data into the local memory or cache is a significant source of overhead, ranging between 30% and 60% of the total execution time [3]. While migration is unavoidable in the
global schemes, it is possible to minimize the migration caused by a poor assignment of tasks to processors.

By scheduling a task on the processor whose local memory or cache already contains the necessary data, we can significantly reduce the execution time and thus the overhead of the system. It is worth mentioning that a job might still migrate to another processor when there are two or more jobs that were last executed on the same processor. A migration might also happen when the number of ready jobs becomes less than the number of processors. This means that our proposed algorithm is work conserving.
In order to give the scheduler more predictable behavior, we first perform a feasibility check to see whether a job has a chance to meet its deadline, using some existing algorithm such as Yao's [16]. If so, the job is allowed to execute. Knowing the deadline of a task and its remaining execution time, it is possible to verify whether it has the opportunity to meet its deadline. More precisely, this verification can be done by examining a task's laxity. The laxity of a real-time task T_i at time t, L_i(t), is defined as:

L_i(t) = D_i(t) - E_i(t),

where D_i(t) is the deadline by which the task T_i must be completed and E_i(t) is the amount of computation remaining to be performed. In other words, laxity is a measure of the available flexibility for scheduling a task. A laxity of L_i(t) means that if task T_i is delayed by at most L_i(t) time units, it will still have the opportunity to meet its deadline.
A task with zero laxity must be scheduled right away and executed without preemption, or it will fail to meet its deadline. A negative laxity indicates that the task will miss its deadline no matter when it is picked up for execution. We call this novel approach the Earliest Feasible Deadline First (EFDF).
C. EFDF Scheduling Algorithm
Let m denote the number of processing nodes and n (n ≥ m) the number of available tasks in a uniform parallel real-time system. Let s_1, s_2, ..., s_m denote the computing capacities of the available processing nodes, indexed in a non-increasing manner: s_j ≥ s_{j+1} for all j, 1 ≤ j < m. We assume that all speeds are positive, i.e. s_j > 0 for all j. In this section we present the five steps of the EFDF algorithm. Obviously, each task which is picked up for execution is not considered for execution by other processors. The steps of our new approach are as follows:
1. Perform a feasibility check to identify the tasks which have a chance to meet their deadline and put them into a set A; put the remaining tasks into a set B. The task set can be partitioned by any existing approach.
2. Sort both task sets A and B according to their deadlines in non-descending order, using any existing sorting algorithm. Let k denote the number of tasks in set A, i.e. the number of tasks that have the opportunity to meet their deadline.
3. For each processor j (j ≤ min(k, m)), check whether a task which was last running on the j-th processor is among the first min(k, m) tasks of set A. If so, assign it to the j-th processor. At this point there might be some processors to which no task has been assigned yet.
4. For all j (j ≤ min(k, m)), if no task is assigned to the j-th processor, select the task with the earliest deadline from the remaining tasks of set A and assign it to the j-th processor. If k ≥ m, each processor now has a task to process and the algorithm is finished.
5. If k < m, for all j (k < j ≤ m), assign the task with the smallest deadline from B to the j-th processor. This last step is optional, as all the tasks from B will miss their deadlines.
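The following Python sketch puts the five steps together under simplifying assumptions (unit-speed processors, known remaining execution times); the data layout and helper names are ours, not from the paper.

    # tasks: list of (task_id, deadline, remaining_exec)
    # last_cpu: task_id -> processor index the task last ran on (affinity)
    def efdf_schedule(tasks, m, now, last_cpu):
        # Step 1: feasibility check via laxity L_i(t) = D_i(t) - E_i(t)
        A = [t for t in tasks if t[1] - now - t[2] >= 0]  # can still finish
        B = [t for t in tasks if t[1] - now - t[2] < 0]   # will miss regardless
        # Step 2: sort both sets by deadline, non-descending
        A.sort(key=lambda t: t[1]); B.sort(key=lambda t: t[1])
        k = len(A)
        top = A[:min(k, m)]
        assign = {}
        # Step 3: keep a top-priority job on its previous processor if possible
        for t in list(top):
            j = last_cpu.get(t[0])
            if j is not None and j < min(k, m) and j not in assign:
                assign[j] = t[0]; top.remove(t)
        # Step 4: fill remaining processors with earliest deadlines from A
        for j in range(min(k, m)):
            if j not in assign and top:
                assign[j] = top.pop(0)[0]
        # Step 5 (optional): if k < m, give idle processors tasks from B
        for j in range(k, m):
            if B: assign[j] = B.pop(0)[0]
        return assign  # processor index -> task_id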
D. Experimental Evaluation
We conducted simulation-based experimental studies to
validate our analytical results on EFDF overhead. We consider
an SMP machine with four processors. We consider four tasks
running on the system. Their execution times and periods are
given in Table 2. The total utilization is approximately 1.5,
which is less than 4, the capacity of processors. Therefore,
LLREF can schedule all tasks to meet their deadlines. Note that the maximum utilization in this task set (i.e., max_i {u_i}) is 0.818, but it does not affect the performance of EFDF, as opposed to that of global EDF [22].
Table 2: Task Parameters (4 Task Set)

Process P_i | Execution Time C_i | Period T_i | U_i
P1 | 9 | 11 | 0.818
P2 | 5 | 25 | 0.2
P3 | 3 | 30 | 0.1
P4 | 5 | 14 | 0.357
Figure 1: Scheduler Invocation Frequency with 4 Tasks
In Figure 1, the upper-bound on the scheduler invocation
frequency and the measured frequency are shown as a dotted
line and a fluctuating line, respectively. We observe that the
actual measured frequency respects the upper bound.
Table 3: Task Parameters (8 Task Set)

Process P_i | Execution Time C_i | Period T_i | U_i
P1 | 3 | 7 | 0.429
P2 | 1 | 16 | 0.063
P3 | 5 | 19 | 0.263
P4 | 4 | 5 | 0.8
P5 | 2 | 26 | 0.077
P6 | 15 | 26 | 0.577
P7 | 20 | 29 | 0.69
P8 | 14 | 17 | 0.824
Figure 2: Scheduler Invocation Frequency with 8 Tasks
Figure 2 shows the upper-bound on the invocation
frequency and the actual frequency for the 8-task set.
Consistently with the previous case, the actual frequency never
moves beyond the upper-bound. We also observe that the
average invocation frequencies of the two cases are
approximately 1.0 and 4.0, respectively. As expected the
number of tasks proportionally affects EFDF overhead.
E. Complexity and Performance of the Partitioning
Algorithms
In Table 2 below we compare the standard and simulated complexities of different algorithms against that of our proposed algorithm; the complexity and performance of the partitioning algorithms are introduced. Note that the algorithms with the lowest complexity are RMNF-L&L, RMGT/M, and EDF-NF, while the algorithm with the highest complexity is RBOUND-MP. The rest of the algorithms have complexity O(n log n). The algorithms with the best theoretical performance are RM-FFDU, RMST, RMGT, RMGT/M, EDF-FF and EDF-BF [16].

TABLE 2: COMPLEXITY AND PERFORMANCE OF THE MULTIPROCESSOR PARTITIONING ALGORITHMS
F. Complexity Analysis
The Earliest Deadline First algorithm maintains all tasks that are ready for execution in a queue. Any freshly arriving task is inserted at the end of the queue. Each task insertion is achieved in O(1) (constant) time, but task selection (to run next) and its deletion require O(n) time, where n is the number of tasks in the queue. Alternatively, EDF can maintain all ready tasks in a sorted priority queue implemented as a heap data structure. When a task arrives, a record for it can be inserted into the heap in O(log_2 n) time, where n is the total number of tasks in the priority queue. Therefore, the time complexity of Earliest Deadline First is equal to that of a typical sorting algorithm, which is O(n log_2 n). In EFDF, by contrast, the number of distinct deadlines that tasks in an application can have is restricted.
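A tiny illustration of the heap-backed ready queue described above (our example, using Python's heapq):

    import heapq

    ready = []                            # min-heap ordered by absolute deadline
    heapq.heappush(ready, (25, "T2"))     # O(log n) insertion
    heapq.heappush(ready, (11, "T1"))
    heapq.heappush(ready, (30, "T3"))
    print(heapq.heappop(ready))           # (11, 'T1'): earliest deadline first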
In our approach, whenever a task arrives, its absolute deadline is computed from its release time and its relative deadline. A separate first-in first-out (FIFO) queue is maintained for each distinct relative deadline that a task can have. The scheduler inserts a newly arrived task at the end of the corresponding relative-deadline queue, so the tasks in each queue are ordered according to their absolute deadlines. To find the task with the earliest absolute deadline, the scheduler only needs to search among the heads of the FIFO queues; since the number of queues maintained by the scheduler is restricted, the order of searching is O(1). The time to insert a task is also O(1). So finally the time complexities of the five steps of Earliest Feasible Deadline First (EFDF) are O(n), O(n log_2 n), O(m), O(m), O(m), respectively.
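A minimal sketch of this per-relative-deadline queue structure (the queue deadlines and names are invented for illustration):

    from collections import deque

    # One FIFO queue per distinct relative deadline (a small, fixed set)
    queues = {5: deque(), 10: deque(), 20: deque()}

    def add_task(task_id, release_time, rel_deadline):
        # O(1): append with the precomputed absolute deadline
        queues[rel_deadline].append((release_time + rel_deadline, task_id))

    def pick_earliest():
        # Search only the queue heads; O(1) for a fixed number of queues
        heads = [q[0] for q in queues.values() if q]
        return min(heads) if heads else None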
V. CONCLUSION AND FUTURE WORK
This work focused on some modifications to the global Earliest Deadline First algorithm to decrease the number of task migrations and also to add predictability to its behavior. The Earliest Feasible Deadline First algorithm presented has the least complexity according to the performance analysis. Experimental results show that the Earliest Feasible Deadline First (EFDF) algorithm reduces the time complexity in comparison with the Earliest Deadline First algorithm for real-time scheduling on a multiprocessor system, while performing feasibility checks to identify the tasks which have a chance to meet their deadline.
When Earliest Feasible Deadline First is used to schedule a set of real-time tasks, unacceptably high overheads might have to be incurred to support resource sharing among the tasks without making tasks miss their respective deadlines, which again costs time. Our future research will investigate other lower-complexity algorithms and also reduce the overhead for different priority assignments for global scheduling, which will, consequently, lead to different bounds.
As an advance on this work, in future we intend to work on different deployment approaches by developing stronger and more innovative algorithms to address the time complexity of Earliest Deadline First. Moreover, as our proposed algorithm is a generalized one, we plan to extend our idea to the existing Rate Monotonic algorithm in real-time systems for calculating the minimum time complexity. We also aim to explore further methodologies to implement the concept of this paper in the real world, and to investigate fault-tolerant task scheduling algorithms for finding task dependencies in single-processor or multiprocessor systems, reducing the time taken by faults as well as the risk of fault and damage.
ACKNOWLEDGMENTS
The authors thank the reviewers of drafts of this paper. It is with profound gratitude and immense regard that we acknowledge Dr. S.P. Panda, Chairman, GGI, and Prof. N.V.J. Rao, Dean (Admin), GGI, for their confidence, support and blessing, without which none of this would have been possible. Also a note of thanks to all the professors here in GIET for the wisdom and knowledge they have given us, all of which came together in the making of this paper. We express our gratitude to all our friends and colleagues as well for all their help and guidance.
REFERENCES
[1] S. Baruah, S. Funk, and J. Goossens, Robustness Results Concerning EDF Scheduling upon Uniform Multiprocessors, IEEE Transactions on Computers, Vol. 52, No. 9, pp. 1185-1195, September 2003.
[2] E. P. Markatos and T. J. LeBlanc, Load Balancing versus Locality Management in Shared-Memory Multiprocessors, The 1992 International Conference on Parallel Processing, August 1992.
[3] S. Lauzac, R. Melhem, and D. Mossé, Comparison of Global and Partitioning Schemes for Scheduling Rate Monotonic Tasks on a Multiprocessor, The 10th EUROMICRO Workshop on Real-Time Systems, Berlin, pp. 188-195, June 17-18, 1998.
[4] Vahid Salmani and Mohsen Kahani, Deadline Scheduling with Processor Affinity and Feasibility Check on Uniform Parallel Machines, Seventh International Conference on Computer and Information Technology (CIT), IEEE, 2007.
[5] S. K. Dhall and C. L. Liu, On a real-time scheduling problem, Operations Research, 26(1):127-140, 1978.
[6] Y. Oh and S. Son, Allocating fixed-priority periodic tasks on multiprocessor systems, Real-Time Systems Journal, 9:207-239, 1995.
[7] J. Lehoczky, L. Sha, and Y. Ding, The rate monotonic scheduling algorithm: exact characterization and average case behavior, IEEE Real-Time Systems Symposium, pp. 166-171, 1989.
[8] C. M. Krishna and K. G. Shin, Real-Time Systems, Tata McGraw-Hill, 1997.
[9] S. Chakraborty, S. Künzli, L. Thiele, Approximate Schedulability Analysis, 23rd IEEE Real-Time Systems Symposium (RTSS), IEEE Press, 159-168, 2002.
[10] J.A. Stankovic, M. Spuri, K. Ramamritham, G.C. Buttazzo. Deadline
Scheduling for Real-Time Systems EDF and Related Algorithms. Kluwer
Academic Publishers, 1998.
[11] S. Baruah, D. Chen, S. Gorinsky, A. Mok. Generalized Multiframe
Tasks. The International Journal of Time-Critical Computing Systems,
17, 5-22, 1999.
[12] S. Baruah, A. Mok, L. Rosier. Preemptive Scheduling Hard-Real-Time
Sporadic Tasks on One Processor. Proceedings of the Real- Time
Systems Symposium, 182-190, 1990.
[13] K. Gresser. Echtzeitnachweis Ereignisgesteuerter Realzeitsysteme.
Dissertation (in german), VDI Verlag, Dsseldorf, 10(286), 1993.
[14] M. Devi. An Improved Schedulability Test for Uniprocessor Periodic
Task Systems. Proceedings of the 15th Euromicro Conference on Real-
Time Systems, 2003.
[15] C. Liu, J. Layland. Scheduling Algorithms for Multiprogramming in
Hard Real-Time Environments. Journal of the ACM, 20(1), 46-61, 1973
[16] Omar U. Pereira Zapata, Pedro Meja Alvarez EDF and RM
Multiprocessor Scheduling Algorithms: Survey and Performance
Evaluation Report No. CINVESTAV-CS-RTG-02. CINVESTAV-IPN,
Seccin de Computacin.
[17] S. K. Dhall and C. L. Liu, On a Real-Time Scheduling Problem,
Operation Research, vol. 26, number 1, pp. 127-140, 1978.
[18] J. Y.-T. Leung and J. Whitehead, On the Complexity of Fixed-Priority
Scheduling of Periodic Real-Time Tasks, Performance Evaluation,
number 2, pp. 237-250,1982.
[19] R. L. Graham, Bounds on Multiprocessing Timing Anomalies,SLAM
Journal of Applied Mathematics, 416-429, 1969.
[20] R. Ha and J. Liu, Validating Timing Constraints in Multiprocessor and
Distributed Real-Time Systems, Intl Conf. on Distributed Computing
system, pp. 162-171, June 21-24, 1994.
[21] B. Andersson, Static Priority Scheduling in Multiprocessors, PhD
Thesis, Department of Comp.Eng., Chalmers University, 2003.
[22] J. Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen,An
Optimal Real-Time Scheduling Algorithm for Multiprocessors, IEEE
Conference Proceedings, SIES 2007: 9-16.
[23] B. Anderson,Static-priority scheduling on multiprocessors, PhD
dissertation, Dept. of Computer eng., Chalmers Univ. of
Technology,2003.
[24] T.Abdelzaher and C.Lu,Schedubility Analysis and Utilization Bound of
highly Scalable Real-Time Services, Proc. 15
th
Euro-micro Conf. Real
Time Systems,pp.141-150,july 2003.
[25] A.K.Moc and D.Chen, A General Model for Real Time Tasks,
Technical Report TR-96-24,Dept. of Computer Sciences, Univ.of Texas
at Austin, Oct.1996.
[26] A.K.Moc and D.Chen, A multiframe Model for Real Time
Tasks,IEEE Trans. Software Eng., vol. 23 ,no.10,pp.635-645,Oct 1997.
AUTHORS PROFILE
Jagbeer Singh received a bachelor's degree in Computer Science and Engineering from Dr. B.R.A. University, Agra, Uttar Pradesh (India), in 2000. In 2006 he received a master's degree in Computer Science from the Gandhi Institute of Engineering and Technology, Gunupur, under Biju Patnaik University of Technology, Rourkela, Orissa (India). He has been an Asst. Professor in the Department of Computer Science at the Gandhi Institute of Engineering and Technology, Gunupur, since 2004. His research interests are in real-time systems, on the topic of fault-tolerant task scheduling in single-processor and multiprocessor systems. He has published 3 peer-reviewed and 6 scientific papers, presented 5 research papers at international/national conferences, organized national conferences/workshops, serves as a reviewer for 3 journals, conferences and workshops, and holds membership of professional bodies such as ISTE, CSI and IAENG.
Bichitrananda Patra is an assistant professor in the Department of Information Technology Engineering, Gandhi Institute of Engineering and Technology, Gunupur, Orissa, India. He received his master's degrees in Physics and Computer Science from Utkal University, Bhubaneswar, Orissa, India. His research interests are in soft computing, algorithm analysis, statistics and neural networks. He has published 8 research papers in international journals and conferences, has organized national workshops and conferences, and holds membership of professional bodies such as ISTE and CSI.
Satyendra Prasad Singh holds M.Sc., MCA and Ph.D. degrees in Statistics and has been working as Professor and Head of the Department of MCA, Gandhi Institute of Computer Studies, Gunupur, Rayagada, Orissa, India, since 2007. He worked as a Research Associate in the Defence Research and Development Organisation, Ministry of Defence, Government of India, New Delhi, for 2 years and has also worked in different universities. He received a Young Scientist Award in 2001 from the International Academy of Physical Sciences for the best research paper in CONIAPS-IV, 2001. He has published more than 10 papers in reputed international/national journals and presented 15 papers at international/national conferences in the fields of reliability engineering, cryptology and pattern recognition. He has guided many M.Tech and MCA project theses.
Analysis of Software Reliability Data using
Exponential Power Model
Ashwini Kumar Srivastava
Department of Computer Application,
S.K.P.G. College, Basti, U.P., India
[email protected]
Vijay Kumar
Departments of Mathematics & Statistics,
D.D.U. Gorakhpur University, Gorakhpur, U.P., India
[email protected]
Abstract—In this paper, the Exponential Power (EP) model is proposed for analyzing software reliability data, and the present work attempts to show that it can serve as a software reliability model. The approximate MLE using the Artificial Neural Network (ANN) method and Markov chain Monte Carlo (MCMC) methods are used to estimate the parameters of the EP model. A procedure is developed to estimate the parameters of the EP model using the MCMC simulation method in OpenBUGS by incorporating a module into OpenBUGS. R functions are developed to study the various statistical properties of the proposed model and to analyze the output of the MCMC samples generated from OpenBUGS. A real software reliability data set is considered for illustration of the proposed methodology under an informative set of priors.

Keywords- EP model, Probability density function, Cumulative distribution function, Hazard rate function, Reliability function, Parameter estimation, MLE, Bayesian estimation.
I. INTRODUCTION
Exponential models play a central role in analyses of
lifetime or survival data, in part because of their convenient
statistical theory, their important 'lack of memory' property and
their constant hazard rates. In circumstances where the one-
parameter family of exponential distributions is not sufficiently
broad, a number of wider families such as the gamma, Weibull
and lognormal models are in common use. Adding parameters
to a well-established family of models is a time honoured
device for obtaining more flexible new families of models. The
Exponential Power model is introduced by [14] as a lifetime
model. This model has been discussed by many authors [4], [9]
and [12].
A model is said to be an Exponential Power model with shape parameter α > 0 and scale parameter λ > 0 if the survival function of the model is given by

R(x; α, λ) = exp{1 − e^{(λx)^α}};  (α, λ) > 0, x ∈ (0, ∞).
A. Model Analysis
For α > 0 and λ > 0, the two-parameter Exponential Power model has the distribution function

F(x; α, λ) = 1 − exp{1 − e^{(λx)^α}};  (α, λ) > 0, x ≥ 0.  (1)

The probability density function (pdf) associated with eq. (1) is given by

f(x; α, λ) = α λ^α x^{α−1} e^{(λx)^α} exp{1 − e^{(λx)^α}};  (α, λ) > 0, x ≥ 0.  (2)

We shall write EP(α, λ) to denote the Exponential Power model with parameters α and λ, where α is treated as the shape parameter by [4] and [14]. The R functions dexp.power( ) and pexp.power( ) given in the SoftreliaR package can be used for the computation of the pdf and cdf, respectively.
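For readers without the SoftreliaR package at hand, a minimal R sketch of eqs. (1) and (2) follows; the names dep( ) and pep( ) are illustrative stand-ins for dexp.power( ) and pexp.power( ), and x is assumed positive.

# Sketch of the EP(alpha, lambda) pdf and cdf of eqs. (1)-(2);
# dep()/pep() are stand-ins for dexp.power()/pexp.power().
dep <- function(x, alpha, lambda) {
  z <- (lambda * x)^alpha
  alpha * lambda^alpha * x^(alpha - 1) * exp(z) * exp(1 - exp(z))
}
pep <- function(x, alpha, lambda) {
  1 - exp(1 - exp((lambda * x)^alpha))   # F(x) = 1 - exp{1 - e^[(lambda*x)^alpha]}
}

For example, curve(dep(x, alpha = 2, lambda = 1), 0, 2) reproduces the general shape of one of the densities in Figure 1.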
Some typical EP density functions for different values of α and for λ = 1 are depicted in Figure 1. It is clear from Figure 1 that the density function of the Exponential Power model can take different shapes.

Figure 1. Plots of the probability density function of the Exponential Power model for λ = 1 and different values of α
1) Mode
The mode can be obtained by solving the non-linear equation

(α − 1) + α(λx)^α (1 − e^{(λx)^α}) = 0.  (3)
2) The quantile function
For a continuous distribution F(x), the p-th percentile (also referred to as fractile or quantile) x_p, for a given p, 0 < p < 1, is a number such that

P(X ≤ x_p) = F(x_p) = p.  (4)
The quantiles for p = 0.25 and p = 0.75 are called the first and third quartiles, and the p = 0.50 quantile is called the median (Q2). The five parameters Minimum(x), Q1, Q2, Q3, Maximum(x) are often referred to as the five-number summary of exploratory data analysis. Together, these parameters give a great deal of information about the model in terms of its centre, spread, and skewness. Graphically, the five numbers are often displayed as a boxplot. The quantile function of the Exponential Power model can be obtained by solving

1 − exp{1 − e^{(λ x_p)^α}} = p,

which gives

x_p = (1/λ) [log{1 − log(1 − p)}]^{1/α};  0 < p < 1.  (5)
For the computation of quantiles, the R function qexp.power( ), given in the SoftreliaR package, can be used. In particular, for p = 0.5 we get the median

Median(x_{0.5}) = (1/λ) [log{1 − log(0.5)}]^{1/α}.  (6)
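A direct R transcription of eq. (5) can serve in place of qexp.power( ) when the package is unavailable (the name qep below is hypothetical):

# Quantile function of EP(alpha, lambda), eq. (5); qep() is a
# hypothetical stand-in for qexp.power() from SoftreliaR.
qep <- function(p, alpha, lambda) {
  (log(1 - log(1 - p)))^(1 / alpha) / lambda
}
qep(0.5, alpha = 0.9, lambda = 0.06)   # the median of eq. (6)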
3) The random deviate generation
Let U be a uniform(0, 1) random variable and F(·) a cdf for which F^{−1}(·) exists. Then F^{−1}(u) is a draw from the distribution F(·). Therefore, a random deviate can be generated from EP(α, λ) by

x = (1/λ) [log{1 − log(1 − u)}]^{1/α};  0 < u < 1,  (7)

where u has the U(0, 1) distribution. The R function rexp.power( ), given in the SoftreliaR package, generates random deviates from EP(α, λ).
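Equation (7) is simply the inverse-cdf method; a sketch (the name rep.ep is hypothetical, with runif( ) supplying the U(0, 1) draws, and the parameter values merely illustrative):

# Random deviates from EP(alpha, lambda) by inversion, eq. (7);
# rep.ep() is a hypothetical stand-in for rexp.power().
rep.ep <- function(n, alpha, lambda) {
  u <- runif(n)
  (log(1 - log(1 - u)))^(1 / alpha) / lambda
}
set.seed(1)
x <- rep.ep(86, alpha = 0.9, lambda = 0.06)   # illustrative sample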
4) Reliability function/survival function
The reliability/survival function is

S(x; α, λ) = exp{1 − exp((λx)^α)};  (α, λ) > 0, x ≥ 0.  (8)

The R function sexp.power( ) given in the SoftreliaR package computes the reliability/survival function.
5) The hazard function
The hazard function of the Exponential Power model is given by

h(x; α, λ) = α λ^α x^{α−1} exp{(λx)^α};  (α, λ) > 0, x ≥ 0,  (9)

and the allied R function hexp.power( ) is given in the SoftreliaR package. The shape of h(x) depends on the value of the shape parameter α: when α ≥ 1, the failure rate function is increasing; when α < 1, the failure rate function is of bathtub shape. Thus the shape parameter α plays an important role for the model.

Differentiating equation (9) with respect to x, we have

h′(x) = (h(x)/x) {(α − 1) + α(λx)^α}.  (10)
Setting h′(x) = 0 and simplifying, we obtain the change point

x₀ = (1/λ) ((1 − α)/α)^{1/α}.  (11)

It easily follows that the sign of h′(x) is determined by (α − 1) + α(λx)^α, which is negative for all x < x₀ and positive for all x > x₀.
Figure 2. Plots of the hazard function of the Exponential Power model for λ = 1 and different values of α

Some typical Exponential Power hazard functions for different values of α and for λ = 1 are depicted in Figure 2. It is clear from Figure 2 that the hazard function of the Exponential Power model can take different shapes.
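The hazard shapes of Figure 2 are easy to reproduce; a sketch of eq. (9) and the change point of eq. (11) (hep( ) is a hypothetical stand-in for hexp.power( ), and the parameter values are illustrative):

# Hazard of EP(alpha, lambda), eq. (9), and change point, eq. (11).
hep <- function(x, alpha, lambda) {
  alpha * lambda^alpha * x^(alpha - 1) * exp((lambda * x)^alpha)
}
x0 <- function(alpha, lambda) ((1 - alpha) / alpha)^(1 / alpha) / lambda
curve(hep(x, alpha = 0.5, lambda = 1), from = 0.01, to = 2)  # bathtub shape for alpha < 1
abline(v = x0(0.5, 1), lty = 2)                              # minimum of the bathtub at x0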
6) The cumulative hazard function
The cumulative hazard function, defined as

H(x) = −log{1 − F(x)},  (12)

can be obtained with the help of the pexp.power( ) function given in the SoftreliaR package by choosing the arguments lower.tail = FALSE and log.p = TRUE, i.e.

−pexp.power(x, alpha, lambda, lower.tail = FALSE, log.p = TRUE)
7) Failure rate average (fra) and conditional survival function (crf)
Two other relevant functions useful in reliability analysis are the failure rate average (fra) and the conditional survival function (crf). The failure rate average of X is given by

FRA(x) = H(x)/x,  x > 0,  (13)

where H(x) is the cumulative hazard function.
The survival function (s.f.) and the conditional survival function of X are defined by

R(x) = 1 − F(x)

and

R(x | t) = R(x + t)/R(x),  t > 0, x > 0, R(·) > 0,  (14)

respectively, where F(·) is the cdf of X. Similarly to h(x) and FRA(x), the distribution of X belongs to the new better than used (NBU), exponential, or new worse than used (NWU) class when R(x | t) < R(x), R(x | t) = R(x), or R(x | t) > R(x), respectively.
The R functions hra.exp.power( ) and crf.exp.power( ) given in the SoftreliaR package can be used for the failure rate average (fra) and the conditional survival function (crf), respectively.
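With the pep( ) stand-in defined earlier, eqs. (12)-(14) translate directly (a sketch; SoftreliaR users would call pexp.power( ), hra.exp.power( ) and crf.exp.power( ) instead):

# Cumulative hazard (12), failure rate average (13) and
# conditional survival (14) for EP(alpha, lambda).
Hep <- function(x, alpha, lambda) -log(1 - pep(x, alpha, lambda))
fra <- function(x, alpha, lambda) Hep(x, alpha, lambda) / x
crf <- function(x, t, alpha, lambda) {
  (1 - pep(x + t, alpha, lambda)) / (1 - pep(x, alpha, lambda))  # R(x+t)/R(x)
}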
II. MAXIMUM LIKELIHOOD ESTIMATION AND INFORMATION MATRIX
Let x = (x₁, …, x_n) be a sample from a distribution with cumulative distribution function (1). The log-likelihood function log L(α, λ) is given by

log L(α, λ) = n log α + nα log λ + (α − 1) Σᵢ log xᵢ + Σᵢ (λxᵢ)^α + n − Σᵢ exp{(λxᵢ)^α},  (15)

where all sums run over i = 1, …, n.
Therefore, to obtain the MLEs of α and λ we can maximize eq. (15) directly with respect to α and λ, or we can solve the following two non-linear equations using an iterative procedure [2], [4]:

∂log L/∂α = n/α + n log λ + Σᵢ log xᵢ + Σᵢ (λxᵢ)^α log(λxᵢ) [1 − exp{(λxᵢ)^α}] = 0,  (16)

∂log L/∂λ = nα/λ + (α/λ) Σᵢ (λxᵢ)^α [1 − exp{(λxᵢ)^α}] = 0.  (17)
Let θ̂ = (α̂, λ̂) denote the MLE of θ = (α, λ). It is not possible to obtain the exact variances of θ̂, but the asymptotic variances of θ̂ can be obtained from the following asymptotic property of the MLE:

(θ̂ − θ) → N₂(0, I(θ)⁻¹),  (18)

where I(θ) is Fisher's information matrix, given by

I(θ) = − [ E(∂²ln L/∂α²)    E(∂²ln L/∂α∂λ)
           E(∂²ln L/∂λ∂α)   E(∂²ln L/∂λ²) ].  (19)
In practice, the asymptotic variance I(θ)⁻¹ of the MLE cannot be used directly because we do not know θ. Hence, we approximate the asymptotic variance by plugging in the estimated values of the parameters. The common procedure is to use the observed Fisher information matrix O(θ̂) (as an estimate of the information matrix I(θ)), given by

O(θ̂) = − [ ∂²ln L/∂α²    ∂²ln L/∂α∂λ
            ∂²ln L/∂λ∂α   ∂²ln L/∂λ² ] |_{θ = θ̂} = −H(θ)|_{θ = θ̂},  (20)

where H is the Hessian matrix, θ = (α, λ) and θ̂ = (α̂, λ̂). The observed Fisher information is evaluated at the MLE rather than by taking the expectation of the Hessian over the data; it is simply the negative of the Hessian of the log-likelihood at the MLE. If the Newton-Raphson algorithm is used to maximize the likelihood, then the observed information matrix can easily be calculated. Therefore, the variance-covariance matrix is given by

[ Var(α̂)       cov(α̂, λ̂)
  cov(α̂, λ̂)   Var(λ̂) ] = (−H(θ)|_{θ = θ̂})⁻¹.  (21)
Hence, from the asymptotic normality of MLEs, approximate 100(1 − γ)% confidence intervals for α and λ can be constructed as

α̂ ± z_{γ/2} √Var(α̂)   and   λ̂ ± z_{γ/2} √Var(λ̂),  (22)

where z_{γ/2} is the upper (γ/2)-th percentile of the standard normal variate.
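The whole of Section II can be carried out numerically with optim( ); the sketch below maximizes eq. (15), inverts the observed information for eq. (21), and forms approximate 95% intervals per eq. (22). The data vector x is assumed to be already loaded, and the initial values follow the ones used later in the paper.

# MLE of EP(alpha, lambda): minimize the negative log-likelihood of eq. (15).
negll <- function(par, x) {
  a <- par[1]; l <- par[2]
  if (a <= 0 || l <= 0) return(Inf)   # keep the search inside the parameter space
  z <- (l * x)^a
  n <- length(x)
  -(n * log(a) + n * a * log(l) + (a - 1) * sum(log(x)) + sum(z) + n - sum(exp(z)))
}
fit <- optim(c(0.5, 0.06), negll, x = x, hessian = TRUE)
vc  <- solve(fit$hessian)             # variance-covariance matrix, eq. (21)
se  <- sqrt(diag(vc))
cbind(est = fit$par, lower = fit$par - 1.96 * se, upper = fit$par + 1.96 * se)  # eq. (22)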
III. BAYESIAN ESTIMATION IN OPENBUGS
The most widely used piece of software for applied Bayesian inference is OpenBUGS, a fully extensible modular framework for constructing and analyzing Bayesian full probability models. This open-source software requires the incorporation of a module (code) to estimate the parameters of the Exponential Power model. A module, dexp.power_T(alpha, lambda), written in Component Pascal, enables full Bayesian analysis of the Exponential Power model in OpenBUGS using the method described in [15] and [16].
A. Implementation of the Module dexp.power_T(alpha, lambda)
The developed module is implemented to obtain the Bayes estimates of the Exponential Power model using the MCMC method. The main function of the module is to generate an MCMC sample from the posterior distribution under an informative set of priors, i.e. Gamma priors.
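The module itself is written in Component Pascal and is not reproduced here; purely to illustrate the idea of MCMC sampling from the EP posterior under Gamma priors, a random-walk Metropolis sketch in R follows (it reuses the negll( ) function above; the hyperparameters and step sizes are arbitrary choices, not the authors' settings).

# Illustrative random-walk Metropolis sampler; NOT the OpenBUGS module.
logpost <- function(th, x, a0 = 1, b0 = 1) {
  if (any(th <= 0)) return(-Inf)
  -negll(th, x) + sum(dgamma(th, shape = a0, rate = b0, log = TRUE))
}
mcmc <- function(x, n.iter = 5000, th = c(0.5, 0.06), step = c(0.05, 0.005)) {
  out <- matrix(NA, n.iter, 2, dimnames = list(NULL, c("alpha", "lambda")))
  lp  <- logpost(th, x)
  for (i in seq_len(n.iter)) {
    prop <- th + rnorm(2, 0, step)    # random-walk proposal
    lpp  <- logpost(prop, x)
    if (log(runif(1)) < lpp - lp) { th <- prop; lp <- lpp }
    out[i, ] <- th
  }
  out
}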
1) Data Analysis
The software reliability data set SYS2.DAT, consisting of 86 time-between-failures observations [10], is considered for illustration of the proposed methodology. In this real data set, time-between-failures is converted to time-to-failures and scaled.
B. Computation of the MLE and Approximate ML Estimates using ANN
The Exponential Power model is used to fit this data set. We started the iterative procedure by maximizing the log-likelihood function given in eq. (15) directly, with initial guesses α = 0.5 and λ = 0.06, far away from the solution. We used the optim( ) function in R with the Newton-Raphson option. The iterative process stopped after only 7 iterations. We obtain α̂ = 0.905868898,
Cᵢ = Σ_{n=1}^{m} Pₙ Wₙ,  (1)

where Cᵢ = i-th main attribute, m = number of sub-factors in the i-th attribute, Pₙ = fuzzy value of the n-th input parameter, and Wₙ = expert weight of the relative input parameter.
TABLE IV. EXAMPLES OF INPUT CASES
Three different examples as input cases to the Fuzzy Expert System.

All sub-factors (weight) | Case A | Case B | Case C
Proficiency in teaching (0.0054) | H | M | L
Personal interest in teaching (0.0075) | H | M | L
Presentation & comm. skills (0.0063) | VH | H | L
Speaking style & body lang. (0.0050) | M | L | M
Content knowledge (0.0059) | M | L | M
Lecture preparation (0.0067) | H | M | H
Language command (0.0059) | VH | H | VH
Response to student queries (0.0067) | VH | VH | H
Question tackling (0.0050) | M | L | M
Courses taught (nature) (0.0038) | M | M | L
Students' performance (0.0029) | VH | M | M
Work load (0.0046) | H | H | H
Fairness in marking (0.0071) | M | M | M
The above input data (Table IV) are entered into the Fuzzy Expert System through a built-in interface for computing the decision score, as shown in Figure 3. The numbers 5, 4, 3, 2, 1 are entered as input, representing 5 = Very High, 4 = High, 3 = Medium, 2 = Low, 1 = Very Low. Figure 3 also shows two interface buttons, Explain and Why, which provide explanation of inputs and reasoning capabilities, respectively. After completion of the input data, the Fuzzy Expert System uses the scale in Table V to rank the three cases A, B and C.
TABLE V. DECISION-MAKING SCALE TO A LINGUISTIC DESCRIPTION (MAX WEIGHT 0.0729)

Fuzzy Expert System output | Linguistic description
X < 0.0109 | Poor
0.0109 ≤ X < 0.0218 | Satisfied
0.0218 ≤ X < 0.0328 | Good
0.0328 ≤ X < 0.0437 | Very Good
0.0437 ≤ X < 0.0546 | Excellent
X ≥ 0.0546 | Outstanding
According to the scale developed in Table V, the Fuzzy Expert System mapped the calculated numeric results of the three cases from qualitative input data into linguistic output descriptions, as shown in Table VI.
Fig. 2 Fuzzy Expert System Model
Fig. 3 User Interface
TABLE VI
THREE CASES RANKING
Case System Calculation Description
A 0.0573 Outstanding
B 0.0465 Excellent
C 0.0451 Excellent
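The scoring mechanics of eq. (1) and the mapping of Table V are easy to replicate; the R sketch below uses the Table IV weights together with an assumed VH/H/M/L/VL membership scale of 1/0.75/0.5/0.25/0. Because the system's internal membership scale is not fully specified in this fragment, the sketch need not reproduce the exact figures of Table VI.

# Weighted fuzzy score, eq. (1), and linguistic mapping per Table V.
fuzzy <- c(VH = 1, H = 0.75, M = 0.5, L = 0.25, VL = 0)   # assumed memberships
w     <- c(0.0054, 0.0075, 0.0063, 0.0050, 0.0059, 0.0067, 0.0059,
           0.0067, 0.0050, 0.0038, 0.0029, 0.0046, 0.0071)  # Table IV weights
caseA <- c("H","H","VH","M","M","H","VH","VH","M","M","VH","H","M")
score <- sum(w * fuzzy[caseA])                               # eq. (1)
cut(score, c(-Inf, 0.0109, 0.0218, 0.0328, 0.0437, 0.0546, Inf),
    labels = c("Poor","Satisfied","Good","Very Good","Excellent","Outstanding"))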
(Fig. 2, referenced above, depicts the fuzzy expert system model: knowledge engineers elicit a set of attributes from a pool of highly qualified and experienced subject experts, using a questionnaire as the knowledge-collection tool; the extracted knowledge is represented as fuzzy rules; fuzzy logic handles the linguistic mapping; and an inference engine with an explanation facility draws conclusions, while users supply inputs and receive results through the user interface. Input sources include ACR, students, HoD, colleagues, teachers and personal data.)
V. CONCLUSION & FUTURE DIRECTION
Regular teachers' assessment is suggested to maintain quality in higher education, and the literature clearly depicts that there is vast potential for applications of fuzzy logic and expert systems in teachers' assessment. Expert system technology using fuzzy logic is very well suited to the evaluation of qualitative facts. A model of a fuzzy expert system is proposed to evaluate teachers' performance on the basis of various key performance attributes that have been validated previously through subject experts. The fuzzy scale has been designed to map and control the input data values from absolute truth to absolute falsehood. The qualitative variables are mapped into numeric results by implementing the fuzzy expert system model through various input examples, providing a basis for using the system's ranking in further decision making. Thus, the uncertain and qualitative knowledge of the problem domain has been handled effectively through the integration of expert system technology with the fuzzy logic concept.

The proposed model produced a significant basis for performance assessment and adequate support in decision making, so research on this issue can be continued. An important aspect that future work could focus on is extending the fuzzy expert system model to the assessment of all types of employees, in universities as well as in other government and private organizations.
AUTHORS PROFILE
Dr. Abdur Rashid Khan is presently working as Associate Professor at the Institute of Computing & Information Technology, Gomal University, Dera Ismail Khan, Pakistan. He completed his PhD in Computer Science in the Kyrgyz Republic in 2004 and has published a number of articles in national and international journals. His current interests include Expert Systems, Software Engineering, Management Information Systems, and Decision Support Systems.

Hafeez Ullah Amin
Mr. Hafeez is a research student at the Institute of Information Technology, Kohat University of Science & Technology, Kohat 26000, KPK, Pakistan. He completed a BS (Hons) in Information Technology and an MS in Computer Science in 2006 and 2009, respectively, from the above-cited institution. His current research interests include Artificial Intelligence, Information Systems, and Databases.

Zia ur Rehman
Mr. Zia ur Rehman is currently working as a Lecturer in Computer Science at Fauji Foundation School & College, Kohat, Pakistan. He completed his MS in Computer Science at the Institute of Information Technology, Kohat University of Science & Technology, Kohat, KPK, Pakistan. His current research interests include Software Engineering, Expert System Development, and Information Systems.
Dynamic Approach to Enhance Performance of
Orthogonal Frequency Division Multiplexing
(OFDM) In a Wireless Communication Network
James Agajo (M.Eng.)¹, Isaac O. Avazi Omeiza (Ph.D.)², Idigo Victor Eze (Ph.D.)³, Okhaifoh Joseph (M.Eng.)⁴
¹ Dept. of Electrical and Electronic Engineering, Federal Polytechnic, Auchi, Edo State, Nigeria
² Dept. of Electrical and Electronics, University of Abuja, Nigeria
³ Dept. of Electronics/Computer Engineering, Nnamdi Azikiwe University, Awka, Anambra State, Nigeria
⁴ Dept. of Electrical and Electronic Engineering, Federal University of Petroleum Resources, Warri, Delta State, Nigeria
Email: [email protected]
Abstract—In the mobile radio environment, signals are usually impaired by fading and multipath delay. In such channels, severe fading of the signal amplitude and inter-symbol interference (ISI) due to the frequency selectivity of the channel cause an unacceptable degradation of error performance. Orthogonal frequency division multiplexing (OFDM) is an efficient scheme to mitigate the effect of the multipath channel. This work models and simulates OFDM in a wireless environment and illustrates adaptive modulation and coding over a dispersive multipath fading channel, with the simulation varying the result dynamically. The dynamic approach entails adopting a probabilistic approach to determining channel allocation. First, an OFDM network environment is modeled to give a clear picture of the OFDM concept. Next, disturbances such as noise are deliberately introduced into systems that are both OFDM-modulated and non-OFDM-modulated to see how the system reacts; this enables comparison of the effect of noise on OFDM and non-OFDM-modulated signals. Finally, efforts are made, using digital encoding schemes such as QAM and DPSK, to reduce the effects of such disturbances on the transmitted signals.

Keywords- OFDM, Inter-Carrier Interference, IFFT, multipath, Signal.
I. INTRODUCTION
Mobile radio communication systems are increasingly demanded to provide a variety of high-quality services to mobile users. To meet this demand, a modern mobile radio transceiver system must be able to support high capacity, variable-bit-rate information transmission and high bandwidth efficiency. In the mobile radio environment, signals are usually impaired by fading and multipath delay. In such channels, severe fading of the signal amplitude and inter-symbol interference (ISI) due to the frequency selectivity of the channel cause an unacceptable degradation of error performance. Orthogonal frequency division multiplexing (OFDM) is an efficient scheme to mitigate the effect of the multipath channel, since it eliminates ISI by inserting a guard interval (GI) longer than the delay spread of the channel [1], [2]. Therefore, OFDM is generally known as an effective technique for high-data-rate services. Moreover, OFDM has been chosen for several broadband WLAN standards such as IEEE 802.11a, IEEE 802.11g and European HIPERLAN/2, and for terrestrial digital audio broadcasting (DAB) and digital video broadcasting (DVB); it was also proposed for broadband wireless multiple-access systems such as the IEEE 802.16 wireless MAN standard and interactive DVB-T [3]. In OFDM systems, pilot-signal-averaging channel estimation is generally used to identify the channel state information (CSI) [5]. In this case, many pilot symbols are required to obtain an accurate CSI; as a result, the total transmission rate is degraded due to the transmission of a large number of pilot symbols. Recently, carrier interferometry (CI) has been proposed to identify the CSI of multiple-input multiple-output (MIMO) systems. However, CI uses only one phase-shifted pilot signal to distinguish all the CSI for the combinations of transmitter and receiver antenna elements [3], [4]. In this case, without noise whitening, each detected channel impulse response is affected by noise [6]. Therefore, the pilot-signal-averaging process is necessary for improving the accuracy of the CSI [7]. To reduce this problem, time-frequency interferometry (TFI) for OFDM has been proposed [8]-[10].
The main problem with the reception of radio signals is fading caused by multipath propagation; inter-symbol interference (ISI), shadowing, etc. also make link quality vary. As a result of multipath propagation, there are many reflected signals, which arrive at the receiver at different times. Some of these reflections can be avoided by using a directional antenna, but that is impractical for a mobile user. A solution could be the use of antenna arrays, but this technology is still being developed. This is why research and development of OFDM have received considerable attention and have made a great deal of progress in all parts of the world. OFDM is a wideband modulation scheme that is specifically able to cope with the problems of multipath reception; this is achieved by transmitting many narrowband overlapping digital signals in parallel inside one wide band. [5]
A. Objective of Project
The aim of this project is to simulate the physical layer of
an OFDM system. It investigates the OFDM system as a whole
and provides a simple working model on which subsequent
research can be built. Hence the successful completion of this
work shall involve;
1) Practical description of the OFDM system.
2) Algorithm development based on mathematical analysis
of the OFDM scheme.
3) Modeling of the algorithm and a software test based on
the MATLAB/Simulink environment.
Thus a typical OFDM system modeling the data source, the transmitter, the air channel and the receiver side of the system is simulated. This project is intended to model and simulate an OFDM network environment. A simple data source is provided to serve as the input; likewise, the transmitter, channel and receiver are modeled using appropriate block-sets in Simulink. A simple representation of the OFDM system is modeled, though with slight deviations from a real implementation; every effort has been made in this work to reduce the effects of such deviations. [6]
B. OVERVIEW
Orthogonal frequency division is a scheme in which the spacing between carriers is equal to the speed (bit rate) of the message. In earlier multiplexing literature, a multiplexer was primarily used to allow many users to share a communications medium, such as a phone trunk between two telephone central offices. In OFDM it is typical to assign all carriers to a single user; hence multiplexing is not used in its generic meaning. Orthogonal frequency division multiplexing is then the concept of establishing a communications link using a multitude of carriers, each carrying an amount of information identical to the separation between the carriers. In comparing OFDM and single-carrier communication systems (SCCM), the total speed in bits per second is the same for both, 1 Mbit/s (Mbps) in this example. For single-carrier systems, there is one carrier frequency, and the 1 Mbps message is modulated onto this carrier, resulting in a 1 MHz bandwidth spread on both sides of the carrier. For OFDM, the 1,000,000 bps message is split into 10 separate messages of 100,000 bit/s each, each with a 100 kHz bandwidth spread on both sides of its carrier. [7]
To illustrate how frequencies change with time, we can use the analogy of the sounds of an orchestra or band. One carrier wave is analogous to one instrument playing one note, whereas many carriers are analogous to many instruments playing at once. A single-carrier system using a high-speed message is analogous to a drum roll where the sticks are moving fast. A more detailed understanding of orthogonality arises when we observe that the bandwidth of a modulated carrier has a so-called sinc shape (sin x / x) with nulls spaced by the bit rate. In OFDM, the carriers are spaced at the bit rate, so that each carrier fits in the nulls of the other carriers.
II. CHOICE OF APPROACH
The bottom-up design approach is chosen for this work
because of its concise form and ease of explanation.
A. Modeling the OFDM system
For the simulation, the Signal Processing and the
Communication Block-sets are used. The OFDM network can
be divided into three parts i.e. the transmitter, receiver and
channel. A data source is also provided which supplies the
signal to be transmitted in the network. Thereafter, the bit error
rate can be calculated by comparing the original signal at the
input of the transmitter and the signal at the output of the
receiver.
Transmitter
Convolutional encoder. In order to decrease the error rate of the system, a simple convolutional encoder of rate 1/2 is used for channel coding.
Interleaver.
The interleaver rearranges input data such that consecutive
data are split among different blocks. This is done to avoid
bursts of errors. An interleaver is presented as a matrix. The
stream of bits fills the matrix row by row. Then, the bits leave
the matrix column by column. The depth of interleaver can be
adjusted.
Modulation.
A modulator transforms a set of bits into a complex number corresponding to a signal constellation. The modulation order depends on the subcarrier: a subchannel with high SNR will be assigned more bits than a subchannel with low SNR. The modulations implemented here are QPSK, 16-QAM and 64-QAM.
Symmetrical IFFT. Data are transformed into time-
domain using IFFT. The total number of subcarriers
translates into the number of points of the IFFT/FFT. A
mirror operation is performed before IFFT in order to get
real symbols as output
Cyclic Prefix (CP). To preserve the orthogonality
property over the duration of the useful part of signal, a
cyclic prefix is then added. The cyclic prefix is a copy of
the last elements of the frame.
D/A. Converts digital symbols to analog signals. This operation is done using the AIC codec inside the DSP.
Channel [8]
The channel must have the same characteristics as the pair of twisted wires found in the telephone network. In order to achieve this, we use telephone line emulation hardware. We also have the possibility of using the adjustable filter ZePo and the noise generator, which can be very useful for testing system performance.
Fig. 2.0 OFDM system
B. Receiver
A/D. Convert analog signals to digital symbols for
processing.
Synchronization. Due to the clock difference between
transmitter and receiver, a synchronization algorithm is
needed to find the first sample in the OFDM frame.
Remove cyclic prefix. This block simply removes the
cyclic prefix added in the transmitter.
Symmetrical FFT. Data are transformed back to
frequency-domain using FFT. Then the
complex conjugate mirror added in the transmitter is
removed.
Channel estimation. The estimation is achieved by pilot
frames.
Channel compensation. The channel estimation is used
to compensate for channel distortion.
Bit loading. The receiver computes the bit allocation and sends it to the transmitter. [9]
Demodulation. Symbols are transformed back into bits. The inverse of the estimated channel response is used to compensate the channel gain.
Deinterleaver (interleaving inverse operation). The stream of bits fills the matrix column by column; then the bits leave the matrix row by row.
Convolutional decoder. The decoder performs the Viterbi decoding algorithm to recover the transmitted bits from the coded bits.
We assume that the propagation channel consists of L discrete paths with different time delays. The impulse response h(τ, t) is represented as

h(τ, t) = Σ_{l=0}^{L−1} h_l(t) δ(τ − τ_l),  (2.1)

where h_l and τ_l are the complex channel gain and the time delay of the l-th propagation path, respectively. The channel transfer function H(f, t) is the Fourier transform of h(τ, t) and is given by

H(f, t) = ∫₀^∞ h(τ, t) exp(−j2πfτ) dτ  (2.2)
        = Σ_{l=0}^{L−1} h_l(t) exp(−j2πf τ_l).  (2.3)

The transmission pulse g(t) is given by

g(t) = 1 for −T_g < t < T_s, and 0 otherwise.  (2.4)

The guard interval T_g is inserted in order to eliminate the ISI due to multipath fading, and hence we have

T = T_s + T_g.  (2.5)

In OFDM systems, T_g is generally taken as T_s/4 or T_s/5; thus we assume T_g = T_s/4 in this paper.
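Although the simulations in this paper are built in MATLAB/Simulink, the frequency selectivity implied by eqs. (2.1)-(2.3) is easy to visualize; a short R sketch, consistent with the other listings in this issue, with made-up path gains and delays:

# H(f) = sum_l h_l * exp(-j*2*pi*f*tau_l), eq. (2.3), for an
# illustrative 3-path channel (gains and delays are invented).
h   <- c(1 + 0i, 0.5 - 0.3i, 0.2 + 0.1i)   # complex path gains h_l
tau <- c(0, 1e-6, 2.5e-6)                  # path delays tau_l (seconds)
Hf  <- function(f) sapply(f, function(ff) sum(h * exp(-2i * pi * ff * tau)))
f   <- seq(0, 2e6, length.out = 512)
plot(f, Mod(Hf(f)), type = "l", xlab = "f (Hz)", ylab = "|H(f)|")  # fading dips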
C. The Simulation Process
The simulation process is carried out in stages using different digital modulators and considering different effects of the wireless interface. Hence, the wireless channel effects are varied, as is the type of digital modulator used for the OFDM modulation. The effect of additive white Gaussian noise (AWGN) is considered on a signal which is QAM-modulated. Finally, the combined effect of phase noise and AWGN on a QAM-modulated OFDM signal is modeled and simulated. This enables comparison of the effect of noise on OFDM signals using QAM modulation. [11]
D. Mathematical analysis of the OFDM system
This system compares the error rates of an OFDM-modulated signal with those of a non-OFDM-modulated signal; the error rates of the OFDM-modulated signal are expected to be lower. Each carrier can be written as

s_c(t) = A_c(t) cos[ω_c t + φ_c(t)],  (2.6)

so that an OFDM signal consisting of N carriers is

s_s(t) = (1/N) Σ_{n=0}^{N−1} A_n(t) cos[ω_n t + φ_n(t)],  where ω_n = ω_0 + nΔω.  (2.7)

This is of course a continuous signal. If we consider the waveforms of each component of the signal over one symbol period, then the variables A_c(t) and φ_c(t) take on fixed values, which depend on the frequency of that particular carrier; describing each carrier as a complex wave [3], the signal can be rewritten as

s_s(t) = (1/N) Σ_{n=0}^{N−1} A_n e^{j(ω_n t + φ_n)}.  (2.8)

Fig 2.1 Examples of OFDM spectrum: (a) a single sub-channel; (b) five carriers

If the signal is sampled using a sampling frequency of 1/T, then the resulting signal is represented by

s_s(kT) = (1/N) Σ_{n=0}^{N−1} A_n e^{j[(ω_0 + nΔω)kT + φ_n]}.  (2.9)

At this point, we have restricted the time over which we analyze the signal to N samples. It is convenient to sample over the period of one data symbol; thus we have the relationship τ = NT. If we now simplify eqn. (2.9), without loss of generality, by letting ω_0 = 0, the signal becomes

s_s(kT) = (1/N) Σ_{n=0}^{N−1} A_n e^{jφ_n} e^{j nΔω kT}.  (2.10)

Now eq. (2.10) can be compared with the general form of the inverse Fourier transform,

g(kT) = (1/N) Σ_{n=0}^{N−1} G(n/(NT)) e^{j2πnk/N}.  (2.11)

In eqs. (2.10) and (2.11), the function A_n e^{jφ_n} is no more than a definition of the signal in the sampled frequency domain, and s_s(kT) is the time-domain representation. Eqns. (2.10) and (2.11) are equivalent if

Δf = Δω/(2π) = 1/(NT) = 1/τ.  (2.12)
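The equivalence of eqs. (2.10) and (2.11) is precisely what allows OFDM modulation to be performed with an IFFT and undone with an FFT; a minimal round-trip sketch (in R, where fft(..., inverse = TRUE) must be divided by N to match eq. (2.11); the subcarrier symbols are arbitrary unit-amplitude phasors):

# OFDM modulation/demodulation via the (I)DFT, eqs. (2.10)-(2.11).
N  <- 8
X  <- exp(1i * 2 * pi * runif(N))      # A_n e^{j phi_n}: one symbol per subcarrier
x  <- fft(X, inverse = TRUE) / N       # time-domain OFDM symbol (IDFT)
X2 <- fft(x)                           # receiver FFT recovers the subcarrier symbols
max(Mod(X2 - X))                       # ~0, up to floating-point error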
E. Factors influencing the control system
1) Signal-to-noise ratio
AWGN is additive, which means that the noise signal adds to the existing signal, resulting in a distorted version of the original signal. It is possible to determine the quality of a digitally modulated signal influenced by AWGN using the probability density function and the standard deviation of the signal. The signal-to-noise ratio is defined as the ratio [4] of the signal power to the noise power,

SNR = (S/N)_power = (S_rms(t, f_c) / n_rms(t, f_c))²,  (2.13)

or, in decibels,

SNR_dB = 10 log₁₀(SNR).  (2.14)
F. Probability of error in QPSK modulation
Because of the randomness of AWGN, it is impossible to predict the exact locations of incorrectly decoded bits; it is, however, possible to theoretically predict the proportion of incorrectly decoded bits in the long run and, from that, to calculate error probabilities such as the symbol error rate and the bit error rate.

QPSK encodes two data bits into a sinusoidal carrier wave by altering the carrier wave's phase. The probability that a QPSK decoder will incorrectly decode a symbol as U₀₀, given that the correct transmitted symbol was S₁₀, is the Gaussian tail integral [5]

P(U₀₀ | S₁₀) = ∫_{A/√2}^{∞} (1/√(2π)) exp(−x²/2) dx.  (2.15)

The QPSK signal-to-noise ratio in decibels is

SNR_QPSK-dB = 10 log₁₀[(QPSK signal power)/(noise power)].  (2.16)
Unfortunately, this integral is not directly solvable, and look-up tables are used to determine the results. There exists a function, though, that is closely related to it: the Q-function [5]. The total symbol error rate of a QPSK decoder can finally be calculated as the average of the conditional symbol error probabilities P(U₀₀ | S₁₀), P(U₀₁ | S₀₁), P(U₁₀ | S₁₀) and P(U₁₁ | S₁₀).  (2.16.1)

The SER is given as

SER = 2Q(A/√2),  (2.17)

and the bit error rate of a QPSK decoder is given as

BER = Q(A/√2).  (2.18)

Expressing the SER and BER as functions of the SNR, we have

SER = 2Q(√SNR),  (2.19)

BER = Q(√SNR).  (2.20)

From the relationships in eqns. (2.19) and (2.20), the BER and SER of a QPSK-modulated signal are tabulated in Table 2.1 below.
Figure 2.2 shows the state transition diagram for the model of a cell operating under the proposed algorithm.

Since the IEEE 802.11a OFDM signal has N_st = 52 QPSK subcarriers (2.22), the signal has 52 times more power, so that

SNR_OFDM-dB = 10 log₁₀[52 × (QPSK signal power)/(noise power)]
            = 10 log₁₀[(QPSK signal power)/(noise power)] + 10 log₁₀(52).  (2.21)

TABLE 2.1: THE OFDM SYMBOL DECODING PROCESS
SNR | BER | SER
2 | .11 | .05
4 | .33 | .28
6 | .55 | .45
8 | .69 | .63
10 | .75 | .70
12 | .80 | .74
14 | .85 | .76
16 | .89 | .79
18 | .91 | .80
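Equations (2.19) and (2.20) can also be evaluated directly (a sketch; Q(x) is computed here as the upper tail of the standard normal distribution, SNR is taken as a linear power ratio, and the computed error rates decrease as the SNR grows):

# SER and BER of QPSK versus SNR via the Q-function, eqs. (2.19)-(2.20).
Q   <- function(x) pnorm(x, lower.tail = FALSE)   # Gaussian upper-tail probability
snr <- 10^(seq(0, 18, 2) / 10)                    # 0..18 dB as linear ratios
data.frame(snr.db = seq(0, 18, 2), ser = 2 * Q(sqrt(snr)), ber = Q(sqrt(snr)))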
SNR_OFDM-dB = SNR_QPSK-dB + 10 log₁₀(52) dB ≈ SNR_QPSK-dB + 17.2 dB.  (2.23)
G. STATE TRANSITION DIAGRAM
A state transition diagram is a very useful pictorial representation that clearly shows the operation of a protocol (rule). Thus, it is a tool for designing the electronics that implement the protocol and for troubleshooting communication problems. In a state diagram, all possible activity states of the system are shown as nodes. At each state node, the system must respond to some event occurring and then proceed to the appropriate next state. [9]
H. Probabilistic Channel Allocation in OFDM
For a cell having S channels, the model has 2S + 1 states, namely 0, 1, 2, ..., S, I₀, I₁, ..., I_{S−1}. A cell is defined as a cold cell if it is in state K, for 0 ≤ K ≤ n, whereas a cell is called a hot cell if it is in a state greater than n. If a data call with rate λ_d (respectively, a voice call with rate λ_v) arrives in a cold cell in state K, then the cell enters state I_K (respectively, state K + 1). On the other hand, if a data call with rate λ_m or a new voice call with rate λ_vn arrives in a hot cell in state j, for j > n, then the cell enters state I_j. A handoff voice call is always assigned a whole channel; a new voice call, however, is assigned a whole channel only if the cell is a cold cell.

Let P_j denote the steady-state probability that the process is in state j, for j = 0, 1, 2, ..., S, assuming that all channel holding times are exponentially distributed.

For j = 0 (i.e., states 0 and I₀):

(λ_d + λ_v) P₀ = (μ_d + μ_v) P₁,  (2.24)

λ_d P₀ = μ₀ P_{I₀}.  (2.25)

It follows from (2.24) that

P₁ = [(λ_d + λ_v)/(μ_v + μ_d)] P₀ = ρ P₀.  (2.26)

For j = 1, 2, ..., n:

(λ_d + λ_v) P_j + j (μ_v + μ_d) P_j = λ_v P_{j−1} + (j + 1)(μ_v + μ_d) P_{j+1} + μ_{j−1} P_{I_{j−1}},  (2.27)

λ_d P_j = μ_j P_{I_j}.  (2.28)

Eq. (2.28) implies that λ_d P_{j−1} can be substituted for μ_{j−1} P_{I_{j−1}} in (2.27). Hence, eq. (2.27) can be rewritten as

(λ_d + λ_v) P_j + j (μ_v + μ_d) P_j = (λ_d + λ_v) P_{j−1} + (j + 1)(μ_d + μ_v) P_{j+1}.

Solving this recursively for j = 1, 2, 3, ..., n − 1 yields P₂ = (1/2!) ρ² P₀, P₃ = (1/3!) ρ³ P₀, P₄ = (1/4!) ρ⁴ P₀, P₅ = (1/5!) ρ⁵ P₀ and P_n = (1/n!) ρⁿ P₀, respectively. Therefore, for 0 ≤ j ≤ n, we obtain

P_j = (1/j!) ρ^j P₀.  (2.29)

For j = n + 1, n + 2, ..., S − 1, we have the following balance equation in the equilibrium case:

(λ_m + λ_vh) P_j + j (μ_v + μ_d) P_j = λ_vh P_{j−1} + (j + 1)(μ_v + μ_d) P_{j+1} + λ_m P_{I_{j−1}},  (2.30)

where λ_d + λ_v = λ_m + λ_vh. Note that equation (2.30) then has the same form as equation (2.27); therefore eq. (2.29) also holds for j = n + 1, n + 2, ..., S − 1.

For j = S:

S (μ_v + μ_d) P_S = λ_vh P_{S−1} + μ_{S−1} P_{I_{S−1}} = (λ_vh + λ_m) P_{S−1} = (λ_d + λ_v) P_{S−1},  (2.31)

so that

P_S = (1/S!) ρ^S P₀.  (2.32)

Thus, for 0 ≤ j ≤ S, the steady-state probability P_j is

P_j = (1/j!) ρ^j P₀.  (2.33)

The P₀ values for the handoff dropping probability P_d and the new-call blocking probability P_b are equal to the same steady-state probability P_s, that is,

P₀ for P_d = P_b = P_s.  (2.34)

If a handoff call requests free packet slots of a channel and one is available, then, based on the proposed algorithm,

P_d = (1/3) (1/j!) ρ^j P₀.  (2.35)

When a new call is assigned a channel, the call is also assigned a channel holding time, which is generated by an exponential distribution function with a mean value of 15 time slots. Call arrival is modeled with a Markov chain as a Poisson process with different mean arrival rates, and the call duration is exponentially distributed with a mean value of 15 time slots. The traffic is characterized by the arrival rate of new calls and by the transition probabilities of handoff calls. It is assumed that the base station has a buffer of substantially large capacity to avoid significant packet loss.
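Under the stated assumptions, the distribution of eq. (2.33) is straightforward to evaluate; the sketch below normalizes P₀ over the states j = 0, ..., S only (the intermediate I-states are ignored for simplicity), with illustrative values of S and the offered load ρ:

# Steady-state probabilities P_j = (rho^j / j!) P_0 of eq. (2.33),
# normalized over j = 0..S; S and rho are illustrative values.
S   <- 8; rho <- 3
pj  <- rho^(0:S) / factorial(0:S)
pj  <- pj / sum(pj)        # normalization determines P_0
pj[S + 1]                  # probability that all S channels are busy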
III. SYSTEM IMPLEMENTATION
A. Software Subsystem Implementation
The OFDM system was modeled and simulated using MATLAB and Simulink to allow various parameters of the system to be varied and tested, including those established by the standard, as shown in Fig 5.1. The simulation includes all the stages of the transmitter, channel and receiver, according to the standard. Because of the MATLAB sampling time, the transmission was implemented in baseband to avoid long simulation runs. Considering additive white Gaussian noise (AWGN) and multipath Rayleigh fading effects, a good approximation to the real performance can be observed, above all in the degradation of the BER. At the transmitter, OFDM signals are generated by a Bernoulli binary source and mapped by one of the modulation techniques. Then, using sequence generation, the transmitter section converts the digital data to be transmitted into a mapping of subcarrier amplitude and phase. It then transforms this spectral representation of the data into the time domain using an Inverse Fast Fourier Transform (IFFT). The OFDM symbol is equal to the length of the IFFT size used to generate the signal (1024) and has an integer number of cycles. The cyclic prefix and multiple parameters were added before the signal conversion from parallel to serial mode. The addition of a guard period to the start of each symbol further mitigates the effect of ISI on an OFDM signal. To generate an OFDM signal, all model parameters were set to suitable values in order to obtain a smooth generated signal for transmission. The channel simulation allows us to examine the effects of noise and multipath on the OFDM scheme. Noise can be simulated by adding a small amount of random AWGN data to the transmitted signal. Random data are generated at a bit rate that varies during the simulation; the varying data rate is accomplished by enabling a source block periodically for a duration that depends on the desired data rate.
non-OFDM modulated signal. This result is to be compared
with the result of fig (4.3) so as to draw a comparison between
the effect of noise on a non-OFDM modulated data signals and
that of an OFDM modulated signal.
It is expected that OFDM performs better in noisy and
disturbed environment than any other modulation technique
compared with it here. It is also expected that the Bit Error
Rate (BER) and Symbol Error Rate (SER) of OFDM
modulated signals is always less than that of a non-OFDM
modulated signals. The theoretical symbol error probability of
PSK is Where erfc is the complementary error function,
O S
N E / is the ratio of energy in a symbol to noise power
spectral density, and M is the number of symbols.
..(3.1)
To determine the bit error probability, the symbol error
probability, PE, needs to be converted to its bit error
equivalent. There is no general formula for the symbol to bit
error conversion. Upper and lower limits are nevertheless easy
to establish. The actual bit error probability, Pb, can be shown
to be bounded by
..(3.2)
The lower limit corresponds to the case where the symbols
have undergone Gray coding. The upper limit corresponds to
the case of pure binary coding. Because increasing the value of
Eb/No lowers the number of errors produced, the length of
each simulation must be increased to ensure that the statistics
of the errors remain stable
Using the sim command to run a Simulink simulation from
the MATLAB Command Window, the following code
generates data for symbol error rate and bit error rate curves. It
considers Eb/No values in the range 0 dB to 12 dB, in steps of
2 dB.
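The MATLAB listing itself is not reproduced in the text; the same Eb/No sweep can be sketched against the theoretical bound of eq. (3.1) (here in R, consistent with the earlier sketches; erfc is expressed through the normal tail, and M = 4 corresponds to QPSK):

# Sweep Eb/No from 0 to 12 dB in 2 dB steps and tabulate the
# theoretical M-PSK symbol error rate of eq. (3.1).
erfc <- function(x) 2 * pnorm(x * sqrt(2), lower.tail = FALSE)
M    <- 4                               # QPSK
ebno <- seq(0, 12, 2)                   # Eb/No in dB
esno <- 10^(ebno / 10) * log2(M)        # Es/No = (Eb/No) * log2(M), linear
pe   <- erfc(sqrt(esno) * sin(pi / M))  # eq. (3.1)
cbind(ebno, pe, pb.gray = pe / log2(M)) # Gray-coded lower bound of eq. (3.2)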
IV. SIMULATION RESULT
The importance of modulation using OFDM can be seen in the above simulations, since the unmodulated signal always performs more poorly than the modulated signal. Hence, modulation makes a signal more conducive to transmission over the transmission medium (in this case, the wireless channel). It is also observed that an appropriate choice of modulation technique can either increase or decrease the error rates of the signals. Hence, a DPSK-modulated OFDM signal (Fig. 4.6) is much more conducive to transmission over the wireless channel than any other type of modulation tested. The effect of additive white Gaussian noise (AWGN) is observed by modeling a signal passing through a noisy channel without any form of modulation; afterwards, OFDM-modulated signals (using digital modulators such as QAM) are passed through the same channel, and the error rates are compared.

A MATLAB file (see Appendix A) is written to vary the signal-to-noise ratio (SNR) and plot the graph of the bit error rate (BER). The BER of the unmodulated signal is found to be constant at 0.4904. The effect of a noisy channel on a QAM signal is modeled as shown in Fig. 4.4. It is observed that an unmodulated signal has a BER of about 50%, whereas OFDM modulation reduces the BER significantly. It was also observed that the DPSK-modulated OFDM signal reduces the BER significantly. The graph in Fig. 4.7 shows a probabilistic approach to the comparison of the outage probability with the signal-to-interference ratio of CCI. Appendix 1 and Appendix 2 present classical representations of the OFDM and QPSK simulations.
Fig 4.1 Modeling an OFDM network environment
Fig 4.2 Graph of Transmission spectrum
Fig 4.3 Graph of Receiver constellation
Fig 4.4 Receiver constellation
Fig 4.5 BER probability graph of an OFDM-modulated signal
Fig 4.6 BER probability graph of an OFDM-modulated signal
Fig 4.7 Comparison of the outage probability with the signal to
interference ratio of CCI
A. Deployment
Modulation is a very important aspect of data transmission, since it makes a signal more conducive to transmission over the transmission medium (in this case, the wireless channel). Hence OFDM should be widely applied to broadband wireless access networks, most especially in situations where the effects of multipath fading and noise have to be eliminated.

V. CONCLUSION
This work was able to show that modulation using the OFDM technique is very important in broadband wireless access networks and noisy environments. The importance of modulation can be seen in the above simulations, since the unmodulated signal always performs more poorly than the modulated signal. Hence, modulation makes a signal more conducive to transmission over the transmission medium (in this case, the wireless channel). This work strongly recommends OFDM as a strong candidate for broadband wireless access networks.
APPENDIX 1
APPENDIX 2
AUTHORS PROFILE
Engr. James Agajo is enrolled in a Ph.D. programme in the field of Electronic and Computer Engineering. He has a Master's degree in Electronic and Telecommunication Engineering from Nnamdi Azikiwe University, Awka, Anambra State, and also possesses a Bachelor's degree in Electronics and Computer Engineering from the Federal University of Technology, Minna, Nigeria. His interest is in intelligent system development, with a strong flair for engineering and scientific research. He designed and implemented the most recent computer-controlled robotic arm with a working grip mechanism (2006), which was aired on national television, and he has carried out work on using Bluetooth technology to communicate with microcontrollers. He has also worked on thumbprint technology to develop high-tech security systems, among much more. He is presently on secondment with UNESCO TVE as a supervisor and a resource person. James is presently a member of the following associations: the Nigeria Society of Engineers (NSE), the International Association of Engineers (IAENG) UK, REAGON, MIRDA and MIJICT.
Dr. Isaac Avazi Omeiza holds B.Eng, M.Eng and Ph.D degrees in Electrical/Electronics Engineering. His lecturing career at the university level has spanned about two decades. He has lectured at the Nigerian Defence Academy, Kaduna (the Nigerian military university), the University of Ilorin, and the Capital-City-University of Nigeria, University of Abuja. He has also been a member of the Nigerian Society of Engineers (NSE) and a member of the Institute of Electrical and Electronics Engineers of America (IEEE). He has supervised several undergraduate final-year projects in electronic design and has done a number of research works in digital image processing, fingerprint processing and the processing of video signals.
Engr. Joseph Okhaifoh is enrolled in a Ph.D. programme. He holds a Master's degree in Electronics and Telecommunication Engineering and a Bachelor's degree in Electrical and Electronics Engineering. He is presently a member of the Nigeria Society of Engineers.
Dr. V.E. Idigo holds Ph.D, M.Eng and B.Eng degrees in Communication Engineering and is a member of IAENG, MNSE and COREN. He is presently the Head of the Department of Electrical/Electronics Engineering at Nnamdi Azikiwe University, Awka, Anambra State, Nigeria.
Sectorization of Full Kekre's Wavelet Transform for Feature Extraction of Color Images
H.B.Kekre
Sr. Professor
MPSTME, SVKM's NMIMS (Deemed-to-be University)
Vile Parle West, Mumbai -56,INDIA
[email protected]
Dhirendra Mishra
Associate Professor & PhD Research Scholar
MPSTME, SVKM's NMIMS (Deemed-to-be University)
Vile Parle West, Mumbai -56,INDIA
[email protected]
Abstract - An innovative idea of sectorization of Full Kekre's Wavelet transformed (KWT) [1] images for extracting features has been proposed. The paper discusses two planes, i.e. the forward plane (even plane) and the backward plane (odd plane). These two planes are sectored into 4, 8, 12 and 16 sectors. An innovative concept of the sum of absolute difference (AD) has been proposed as the similarity measuring parameter and compared with the well-known Euclidean distance (ED). The performances of sectorization of the two planes into different sector sizes, in combination with the two similarity measures, are checked. Class-wise retrieval performance of all sectors with respect to the similarity measures, i.e. ED and AD, is analyzed by means of class (randomly chosen 5 images) average precision-recall crossover points, overall average (average of class averages) precision-recall crossover points, and two new parameters, i.e. LIRS and LSRR.
Keywords - CBIR, Kekre's Wavelet Transform (KWT), Euclidean Distance, Sum of Absolute Difference, LIRS, LSRR, Precision and Recall.
I. INTRODUCTION
Content-based image retrieval (CBIR) [2-6] is a well-known technology being used, and researched, for the retrieval of images from large image databases. CBIR has proved to be a much-needed technology to research due to its applicability in various applications such as face recognition, fingerprint recognition, pattern matching [7][8][9], and verification/validation of images. The concept of CBIR can be easily understood from Figure 1 below. Every CBIR system needs functionality for the feature extraction of an image, viz. shape, color and texture, which can represent the uniqueness of the image for the purpose of the best match in the database to be searched. The features of the query image are compared with the features of all images in the feature database using various mathematical constructs known as similarity measures. These mathematical similarity measuring techniques check the similarity of the extracted features to classify the images into relevant and irrelevant classes. Research in CBIR needs to explore two aspects: first, better methods of feature extraction having the maximum components of uniqueness, and second, faster and more accurate mathematical models of similarity measures. Figure 1 shows the example of a query image of a horse being provided to a CBIR system, with images of the relevant class retrieved. Relevance feedback of the retrieval is used for machine learning to check the accuracy of the retrieval, which in turn helps one focus on modifications to the current approach to improve performance.
Figure 1. The CBIR System [2]
Many researchers are currently working in the very open and demanding field of CBIR. This research focuses on generating better methodologies for feature extraction in both the spatial domain and the frequency domain. Some methodologies, like block truncation coding [10-11], various transforms: FFT [12-14], the Walsh transform [15-21], DCT [22] and DST [23], and other approaches like hashing [24], vector quantization [25] and the contourlet transform [5], have already been developed.
In this paper we have introduced a novel concept of sectorization of Full Kekre's Wavelet transformed color images for feature extraction. Two different similarity measure parameters, i.e. the sum of absolute difference and the Euclidean distance, are used. Average precision, recall, LIRS and LSRR are used to study the performance of these approaches.
II. KEKRE'S WAVELET [1]
Kekre's Wavelet transform is derived from Kekre's transform. From an NxN Kekre's transform matrix, we can generate Kekre's Wavelet transform matrices of size (2N)x(2N), (3N)x(3N), ..., (N²)x(N²). For example, from a 5x5 Kekre's transform matrix, we can generate Kekre's Wavelet transform matrices of size 10x10, 15x15, 20x20 and
25x25. In general, an MxM Kekre's Wavelet transform matrix can be generated from an NxN Kekre's transform matrix, such that M = N * P, where P is any integer between 2 and N, that is, 2 ≤ P ≤ N. The Kekre's Wavelet transform matrix satisfies [K][K]^T = [D], where D is a diagonal matrix, and hence it is orthogonal. The diagonal matrix values of a Kekre's transform matrix of size NxN can be computed as
(1)
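Since Equation (1) is not reproduced above, the property can at least be checked numerically. The sketch below assumes the Kekre transform definition given in [1] (entries on and above the diagonal are 1, the entry just below the diagonal in row x is -N + x - 1 with 1-based indices, and all other entries are 0); it is illustrative, not the authors' code:

import numpy as np

def kekre_transform(n):
    # Build the NxN Kekre transform matrix under the assumed definition.
    k = np.zeros((n, n))
    for x in range(1, n + 1):          # 1-based row index
        for y in range(1, n + 1):      # 1-based column index
            if y >= x:
                k[x - 1, y - 1] = 1
            elif y == x - 1:
                k[x - 1, y - 1] = -n + x - 1
    return k

K = kekre_transform(5)
D = K @ K.T
# Off-diagonal entries vanish, confirming [K][K]^T = [D] and hence orthogonality.
assert np.allclose(D, np.diag(np.diag(D)))
print(np.diag(D))   # the diagonal values that Equation (1) expresses in closed form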
III. PLANE FORMATION AND ITS SECTORIZATION
[12-19],[22-23]
The components of the Full KWT transformed image shown in the red-bordered area (see Figure 2) are used to generate feature vectors. The averages of the zeroth row and column and the last row and column components are augmented to the feature vector generated. Color codes are used to differentiate the coefficients: those plotted on the forward (even) plane are shown with a light red background, and those belonging to the backward (odd) plane with a light blue background. The coefficients with a light red background, i.e. at positions (1,1),(2,2); (1,3),(2,4) etc., are taken as X1 and Y1 respectively and plotted on the even plane. The coefficients with a light blue background, i.e. at positions (2,1),(1,2); (2,3),(1,4) etc., are taken as X2 and Y2 respectively and plotted on the odd plane.
Figure 2: KWT component arrangement in a Transformed Image.
The even plane of the Full KWT is generated by taking all X(i,j), Y(i+1,j+1) components into consideration, and the odd plane by taking all X(i+1,j), Y(i,j+1) components, as shown in Figure 3. Henceforth, for convenience, we will refer to X(i,j) = X1, Y(i+1,j+1) = Y1, X(i+1,j) = X2 and Y(i,j+1) = Y2.
X(i,j) Y(i,j+1)
X(i+1,j) Y(i+1,j+1)
Figure 3: Snapshot of Components considered for Even/Odd Planes.
As shown in Figure 3, the even plane of the Full KWT considers X1, i.e. all light red background cells (1,1), (2,2), (1,3), (2,4) etc., on the X axis and Y1 on the Y axis. The odd plane of the Full KWT considers X2, i.e. all light blue background cells (2,1), (1,2), (2,3), (1,4) etc., on the X axis and Y2 on the Y axis.
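A minimal sketch of the sectorization step itself (assigning a coefficient pair plotted on either plane to one of 4, 8, 12 or 16 equal angular sectors) is shown below; the paper's full feature-vector construction is richer, and the function name is ours:

import math

def sector_index(x, y, n_sectors):
    # Angle of the plotted point in [0, 2*pi), then the equal-width sector it falls in.
    angle = math.atan2(y, x) % (2 * math.pi)
    return int(angle / (2 * math.pi / n_sectors))

# Example: the pair (1.0, 1.0) lies at 45 degrees, i.e. sector 2 of 16.
print(sector_index(1.0, 1.0, 16))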
IV. RESULTS AND DISCUSSION
The augmented Wang image database [4] has been used for the experiment. The database consists of 1055 images of 12 different classes, such as Flower, Sunset, Barbie, Tribal, Cartoon, Elephant, Dinosaur, Bus, Scenery, Monuments, Horses and Beach. The class-wise distribution of all images in the database is shown in Figure 7.
Figure 7: Class-wise distribution of images in the image database: Sunset: 51, Cartoon: 46, Flower: 100, Elephants: 100, Barbie: 59, Mountains: 100, Horse: 100, Bus: 100, Tribal: 100, Beaches: 99, Monuments: 100, Dinosaur: 100.
Figure 8. Query Image
A query image of the class dinosaur is shown in Figure 8. For this query image, the results of retrieval are given for both approaches, i.e. sectorization of the even and odd planes of the Full KWT transformed image. Figure 9 shows the first 20 retrieved images for sectorization of the Full KWT forward (even) plane (16 sectors) with the sum of absolute difference as the similarity measure. There are two retrievals from irrelevant classes: the first irrelevant image occurs at the 15th position and the second at the 20th position (shown with red boundaries) in the even plane sectorization. The result of odd plane sectorization is shown in Figure 10; the retrieval of the first 20 images contains 2 irrelevant retrievals, but the first irrelevant image occurs at the 17th position.
Figure 9: First 20 Retrievals of Full KWT Forward (Even) plane sectorization
into 16 Sectors with sum of absolute difference as similarity measure.
Figure 10: First 20 Retrievals of Full KWT Backward (Odd) plane sectorization into 16 Sectors with sum of absolute difference as similarity measure.
The feature database includes the feature vectors of all images in the database. Five random query images of each class were used to search the database. An image with an exact match gives the minimum sum of absolute difference and Euclidean distance. To check the effectiveness of the work and its retrieval performance, we have calculated the overall average precision and recall and their crossover values, plotted class-wise. Equations (2) and (3) are used for the precision and recall calculation, whilst two new parameters, i.e. LIRS (length of initial relevant string of images) and LSRR (length of string to recover all relevant images), are used as shown in Equations (4) and (5).
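The equations themselves are not reproduced above. Definitions consistent with the surrounding text, assuming the standard formulations used in the authors' earlier sectorization work, are:

Precision = (number of relevant images retrieved) / (total number of images retrieved)   (2)
Recall = (number of relevant images retrieved) / (total number of relevant images in the database)   (3)
LIRS = (length of the initial string of relevant images) / (total number of relevant images in the database)   (4)
LSRR = (length of the string needed to recover all relevant images) / (size of the database)   (5)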
All these parameters lie between 0 and 1, and hence can be expressed as percentages. The newly introduced parameters indicate better performance for a higher value of LIRS and a lower value of LSRR [8-13].
Figure 11: Class wise Average Precision and Recall cross over points of
Forward Plane (Even) sectorization of Full KWT Wavelet with sum of
Absolute Difference (AD) and Euclidean Distance (ED) as similarity measure.
Figure 12: Class wise Average Precision and Recall cross over points of
Backward Plane (Odd) sectorization of Full KWT Wavelet with Absolute
Difference (AD) and Euclidean Distance (ED) as similarity measure.
Figure 13: Comparison of Overall Precision and Recall cross over points of
sectorization of Full KWT Wavelet with sum of Absolute Difference (AD)
and Euclidean Distance (ED) as similarity measure.
Figure 14: The LIRS plot of sectorization of the forward plane of Full KWT transformed images. Overall average LIRS performances (shown with horizontal lines): 0.082 (4 Sectors ED), 0.052 (4 Sectors AD), 0.071 (8 Sectors ED), 0.051 (8 Sectors AD), 0.075 (12 Sectors ED), 0.069 (12 Sectors AD), 0.053 (16 Sectors ED), 0.053 (16 Sectors AD).
Figure 15: The LIRS plot of sectorization of the backward plane of Full KWT transformed images. Overall average LIRS performances (shown with horizontal lines): 0.081 (4 Sectors ED), 0.054 (4 Sectors AD), 0.073 (8 Sectors ED), 0.050 (8 Sectors AD), 0.064 (12 Sectors ED), 0.049 (12 Sectors AD), 0.056 (16 Sectors ED), 0.042 (16 Sectors AD).
(Chart for Figure 13. Y-axis: Overall Average Precision and Recall crossover point. X-axis: Methods (combination of plane and similarity measures). Title: Sectorization of Kekre's Wavelet (Full), with augmentation. Legend: Sector 4, Sector 8, Sector 12, Sector 16.)
Figure 16: The LSRR plot of sectorization of the forward plane of Full KWT transformed images. Overall average LSRR performances (shown with horizontal lines): 0.77 (4 Sectors ED), 0.71 (4 Sectors AD), 0.76 (8 Sectors ED), 0.71 (8 Sectors AD), 0.76 (12 Sectors ED), 0.73 (12 Sectors AD), 0.74 (16 Sectors ED), 0.71 (16 Sectors AD).
Figure 17: The LSRR plot of sectorization of the backward plane of Full KWT transformed images. Overall average LSRR performances (shown with horizontal lines): 0.77 (4 Sectors ED), 0.729 (4 Sectors AD), 0.76 (8 Sectors ED), 0.725 (8 Sectors AD), 0.756 (12 Sectors ED), 0.726 (12 Sectors AD), 0.759 (16 Sectors ED), 0.727 (16 Sectors AD).
V. CONCLUSION
This work, experimented on a 1055-image database of 12 different classes, discusses the performance of sectorization of Full KWT wavelet transformed color images for image retrieval. The work has been performed with both approaches, i.e. sectorization of the forward (even) plane and of the backward (odd) plane. The performance of the proposed methods is checked with respect to various sector sizes and similarity measuring approaches, viz. Euclidean distance and the sum of absolute difference. We calculated the average precision and recall crossover point of 5 randomly chosen images of each class; the overall average is the average of these class averages. The observation is that sectorization of both planes of Full KWT wavelet transformed images gives less than 30% overall average retrieval of relevant images, as shown in Figure 13. The class-wise plots of these average precision and recall crossover points, shown in Figure 11 and Figure 12 for both approaches, depict that the retrieval performance varies from class to class and from method to method, wherein the horses, flower and dinosaur classes have retrieval above 50%; they perform above the average of all methods, as shown by the horizontal lines. The new parameters LIRS and LSRR give a good platform for performance evaluation, judging how early all relevant images are retrieved (LSRR) and how many relevant images are retrieved as part of the first set of relevant retrievals (LIRS). The value of LIRS should be maximum and that of LSRR minimum for a particular class if the overall precision and recall crossover point of that class is maximum. This can be clearly seen in Figures 14 to 17. The observation is clearest for the dinosaur class, though the differences in LIRS and LSRR for the other classes vary. The sum of absolute difference is recommended as the similarity measure due to its lower complexity and better retrieval performance compared to the Euclidean distance.
REFERENCES
[1] H.B.Kekre, Archana Athawale and Dipali Sadavarti, Algorithm to generate Kekre's Wavelet transform from Kekre's Transform, International Journal of Engineering, Science and Technology, Vol.2, No.5, 2010, pp.756-767.
[2] Dr. Qi, Semantic based CBIR (content based image retrieval), http://cs.usu.edu/htm/REU-Current-Projects.
[3] Kato, T., Database architecture for content based image retrieval, in Image Storage and Retrieval Systems (Jambardino A and Niblack W eds), Proc SPIE 2185, pp 112-123, 1992.
[4] Ritendra Datta, Dhiraj Joshi, Jia Li and James Z. Wang, Image retrieval: Ideas, influences and trends of the new age, ACM Computing Surveys, Vol 40, No.2, Article 5, April 2008.
[5] Ch. Srinivasa Rao, S. Srinivas Kumar, B.N. Chatterji, Content based image retrieval using contourlet transform, ICGST-GVIP Journal, Vol.7, No. 3, Nov 2007.
[6] John Berry and David A. Stoney, The history and development of fingerprinting, in Advances in Fingerprint Technology, Henry C. Lee and R. E. Gaensslen, Eds., pp. 1-40. CRC Press Florida, 2nd edition, 2001.
[7] Arun Ross, Anil Jain, James Reisman, A hybrid fingerprint matcher,
Intl conference on Pattern Recognition (ICPR), Aug 2002.
[8] A. M. Bazen, G. T. B.Verwaaijen, S. H. Gerez, L. P. J. Veelenturf, and
B. J. van der Zwaag, A correlation-based fingerprint verification
system, Proceedings of the ProRISC2000 Workshop on Circuits,
Systems and Signal Processing, Veldhoven, Netherlands, Nov 2000.
[9] H.B.Kekre, Tanuja K. Sarode, Sudeep D. Thepade, DST Applied to
Column mean and Row Mean Vectors of Image for Fingerprint
Identification, International Conference on Computer Networks and
Security, ICCNS-2008, 27-28 Sept 2008, Vishwakarma Institute of
Technology, Pune.
[10] H.B.Kekre, Sudeep D. Thepade, Using YUV Color Space to Hoist the
Performance of Block Truncation Coding for Image Retrieval, IEEE
International Advanced Computing Conference 2009 (IACC09), Thapar
University, Patiala, INDIA, 6-7 March 2009.
[11] H.B.Kekre, Sudeep D. Thepade, Image Retrieval using Augmented
Block Truncation Coding Techniques, ACM International Conference
on Advances in Computing, Communication and Control (ICAC3-2009),
pp.: 384-390, 23-24 Jan 2009, Fr. Conceicao Rodrigous College of
Engg., Mumbai. Available online at ACM portal.
[12] H. B. Kekre, Dhirendra Mishra, Digital Image Search & Retrieval using FFT Sectors, published in proceedings of National/Asia pacific conference on Information communication and technology (NCICT 10), 5th & 6th March 2010, SVKM'S NMIMS, Mumbai.
[13] H.B.Kekre, Dhirendra Mishra,Digital Image Search & Retrieval using
FFT Sectors of Color Images published in International Journal of
Computer Science and Engineering (IJCSE) Vol.
02,No.02,2010,pp.368-372 ISSN 0975-3397 available online at
http://www.enggjournals.com/ijcse/doc/IJCSE10-02- 02-46.pdf
[14] H.B.Kekre, Dhirendra Mishra, CBIR using upper six FFT Sectors of
Color Images for feature vector generation published in International
Journal of Engineering and Technology(IJET) Vol. 02, No. 02, 2010,
49-54 ISSN 0975-4024 available online at
http://www.enggjournals.com/ijet/doc/IJET10-02- 02-06.pdf
[15] H.B.Kekre, Dhirendra Mishra, Four walsh transform sectors feature
vectors for image retrieval from image databases, published in
international journal of computer science and information technologies
(IJCSIT) Vol. 1 (2) 2010, 33-37 ISSN 0975-9646 available online at
http://www.ijcsit.com/docs/vol1issue2/ijcsit2010010201.pdf
[16] H.B.Kekre, Dhirendra Mishra, Performance comparison of four, eight
and twelve Walsh transform sectors feature vectors for image retrieval
from image databases, published in international journal of
Engineering, science and technology(IJEST) Vol.2(5) 2010, 1370-1374
ISSN 0975-5462 available online at
http://www.ijest.info/docs/IJEST10-02-05-62.pdf
[17] H.B.Kekre, Dhirendra Mishra, density distribution in walsh transfom
sectors ass feature vectors for image retrieval, published in international
journal of compute applications (IJCA) Vol.4(6) 2010, 30-36 ISSN
0975-8887 available online at
http://www.ijcaonline.org/archives/volume4/number6/829-1072
[18] H.B.Kekre, Dhirendra Mishra, Performance comparison of density
distribution and sector mean in Walsh transform sectors as feature
vectors for image retrieval, published in international journal of Image
Processing (IJIP) Vol.4(3) 2010, ISSN 1985-2304 available online at
http://www.cscjournals.org/csc/manuscript/Journals/IJIP/Volume4/Issue
3/IJIP-193.pdf
[19] H.B.Kekre, Dhirendra Mishra, Density distribution and sector mean
with zero-sal and highest-cal components in Walsh transform sectors as
feature vectors for image retrieval, published in international journal of
Computer scienece and information security (IJCSIS) Vol.8(4) 2010,
ISSN 1947-5500 available online http://sites.google.com/site/ijcsis/vol-
8-no-4-jul-2010
[20] H.B.Kekre, Vinayak Bharadi, Walsh Coefficients of the Horizontal &
Vertical Pixel Distribution of Signature Template, In Proc. of Int.
Conference ICIP-07, Bangalore University, Bangalore. 10-12 Aug 2007.
[21] J. L. Walsh, A closed set of orthogonal functions American Journal of
Mathematics, Vol. 45, pp.5-24,year 1923.
[22] H.B.Kekre, Dhirendra Mishra, DCT sectorization for feature vector
generation in CBIR, International journal of computer application
(IJCA),Vol.9, No.1,Nov.2010,ISSN:1947-5500
http://ijcaonline.org/archives/volume9/number1/1350-1820
[23] H.B.Kekre, Dhirendra Mishra, DST Sectorization for feature vector
generation, Universal journal of computer science and engineering
technology(Unicse),Vol.1,No.1Oct 2010 available at
http://www.unicse.oeg/index.php?option=com content and
view=article&id=54&itemid=27
[24] H.B.Kekre, Dhirendra Mishra, Content Based Image Retrieval using Weighted Hamming Distance Image hash Value, published in the proceedings of the international conference on contours of computing technology, pp. 305-309 (Thinkquest 2010), 13th & 14th March 2010.
[25] H.B.Kekre, Tanuja K. Sarode, Sudeep D. Thepade, Image Retrieval
using Color-Texture Features from DST on VQ Codevectors
obtained by Kekres Fast Codebook Generation, ICGST International
Journal on Graphics, Vision and Image Processing (GVIP),
Available online at http://www.icgst.com/gvip
AUTHORS PROFILE
H. B. Kekre has received a B.E. (Hons.) in Telecomm. Engg. from Jabalpur University in 1958, an M.Tech (Industrial Electronics) from IIT Bombay in 1960, an M.S.Engg. (Electrical Engg.) from the University of Ottawa in 1965 and a Ph.D. (System Identification) from IIT Bombay in 1970. He worked for over 35 years as Faculty and H.O.D. of Computer Science and Engg. at IIT Bombay. For the last 13 years he has worked as a professor in the Dept. of Computer Engg. at Thadomal Shahani Engg. College, Mumbai. He is currently a senior professor working with Mukesh Patel School of Technology Management and Engineering, SVKM's NMIMS University, Vile Parle West, Mumbai. He has guided 17 Ph.D.s, 150 M.E./M.Tech projects and several B.E./B.Tech projects. His areas of interest are digital signal processing, image processing and computer networking. He has more than 350 papers in national/international conferences and journals to his credit. Recently, ten students working under his guidance received best paper awards. Two research scholars working under his guidance have been awarded Ph.D. degrees by NMIMS University. Currently he is guiding 10 Ph.D. students. He is a life member of ISTE and a Fellow of IETE.
Dhirendra Mishra has received his B.E. (Computer Engg) and M.E. (Computer Engg) degrees from the University of Mumbai, Mumbai, India. He is a Ph.D. research scholar and works as an Associate Professor in the Computer Engineering department of Mukesh Patel School of Technology Management and Engineering, SVKM's NMIMS University, Mumbai, India. He is a life member of the Indian Society of Technical Education (ISTE), a member of the International Association of Computer Science and Information Technology (IACSIT), Singapore, and a member of the International Association of Engineers (IAENG). His areas of interest are image processing, image databases, pattern matching, operating systems, and information storage and management.
Dominating Sets and Spanning Tree based
Clustering Algorithms for Mobile Ad hoc Networks
R Krishnam Raju Indukuri
Dept of Computer Science
Padmasri Dr.B.V.R.I.C.E
Bhimavaram, A.P, India
[email protected]
Suresh Varma Penumathsa
Dept of Computer Science
Adikavi Nannaya University
Rajamundry, A.P, India.
[email protected]
Abstract - The infrastructure-less and dynamic nature of mobile ad hoc networks (MANETs) requires efficient clustering algorithms to improve network management and to design hierarchical routing protocols. Clustering algorithms in mobile ad hoc networks build a virtual backbone for network nodes. Dominating sets and spanning trees are widely used in clustering networks. Dominating set and spanning tree based MANET clustering algorithms are suitable for medium-size networks with respect to time and message complexities. This paper presents different clustering algorithms for mobile ad hoc networks based on dominating sets and spanning trees.
Keywords: mobile ad hoc networks, clustering, dominating sets and spanning trees
I. INTRODUCTION
MANETs do not have any fixed infrastructure and consist
of wireless mobile nodes that perform various data
communication tasks. MANETs have potential applications
in rescue operations, mobile conferences, battlefield
communications etc. Conserving energy is an important
issue for MANETs as the nodes are powered by batteries
only.
Clustering has become an important approach to managing MANETs. In large, dynamic ad hoc networks, it is very hard to construct an efficient network topology. By clustering the entire network, one can decompose the problem into small-sized clusters. Clustering has many advantages in mobile networks: it makes the routing process easier, and, by clustering the network, one can build a virtual backbone which makes multicasting faster. However, the overhead of cluster formation and maintenance is not trivial. In a typical clustering scheme, the MANET is first partitioned into a number of clusters by a suitable distributed algorithm. A Cluster Head (CH) is then allocated for each cluster, which performs various tasks on behalf of the members of the cluster. The performance metrics of a clustering algorithm are the number of clusters and the count of the neighbour nodes, which are the adjacent nodes between the clusters that are formed.
In this paper we discuss various clustering algorithms based on dominating sets [1] [4] [11] [14] [16] and spanning trees [6] [8] [15].
II. DOMINATING SETS BASED CLUSTERING ALGORITHMS
A dominating set [9] is a subset S of a graph G such that every vertex in G is either in S or adjacent to a vertex in S. Dominating sets are widely used in clustering networks. Dominating sets can be classified into three main classes: i) Independent Dominating Set, ii) Weakly Connected Dominating Set and iii) Connected Dominating Set.
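As a minimal sketch of these definitions (the graph encoding and names are ours, not from the cited papers):

def is_dominating_set(adj, s):
    # Every vertex must be in S or have at least one neighbour in S.
    s = set(s)
    return all(v in s or s & set(adj[v]) for v in adj)

def is_independent(adj, s):
    # No two vertices of S may be adjacent.
    s = set(s)
    return all(not (s & set(adj[v])) for v in s)

# Path graph 1-2-3-4-5: {2, 4} dominates every vertex and is independent, i.e. an IDS.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(is_dominating_set(adj, {2, 4}), is_independent(adj, {2, 4}))  # True True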
A. Independent Dominating Set (IDS)
An IDS [6] [11] is a dominating set S of a graph G in which there are no adjacent vertices. Fig. 1 shows a sample independent dominating set.
Figure 1. Independent Dominating Set.
B. Weakly Connected Dominating Sets (WCDS)
Given a subset S of a graph G, the weakly induced subgraph Sw contains the vertices of S, their neighbors, and all edges of the original graph G with at least one endpoint in S. A subset S is a weakly-connected dominating set (WCDS) [10] [12] if S is dominating and Sw is connected. Fig. 2 shows a weakly connected dominating set.
Figure 2. Weakly Connected Dominating Set.
C. Connected Dominating Set (CDS)
A CDS [11] [13] is a subset S of a graph G such that S forms a dominating set and S is connected. Fig. 3 shows a connected dominating set.
D. Determining Dominating Sets
Algorithms that construct a CDS in ad hoc networks can be divided into two categories: centralized algorithms, which depend on network-wide information or coordination, and decentralized algorithms, which depend on local information only. Centralized algorithms usually yield a smaller CDS than decentralized algorithms, but their application is limited due to the high maintenance cost.
Decentralized algorithms can be further divided into cluster-based algorithms and pure localized algorithms. Cluster-based algorithms have a constant approximation ratio in unit disk graphs and relatively slow convergence (O(n) in the worst case). Pure localized algorithms take constant steps to converge and produce a small CDS on average, but have no constant approximation ratio.
cluster-based algorithm usually contains two phases. In the
first phase, the network is partitioned into clusters and a
clusterhead is elected for each cluster. In the second phase,
clusterheads are interconnected to form a CDS. Several
clustering algorithms [2] [4] [7] have been proposed to
elect clusterheads that have the minimal id, maximal degree,
or maximal weight. A host v is a clusterhead if it has the
minimal id (or maximal degree or weight) in its 1-hop
neighbourhood. A clusterhead and its neighbours form a
cluster and these hosts are covered. The election process
continues on uncovered hosts and, finally, all hosts are
covered.
Wu and Li [9] proposed a simple and efficient localized
algorithm that can quickly determine a CDS in ad hoc
networks. This approach uses a marking process where
hosts interact with others in the neighbourhood.
Specifically, each host is marked true if it has two unconnected neighbours. These hosts achieve a desired global objective: the set of marked hosts forms a small CDS.
Figure 4. Example of ad hoc networks.
In Wu and Li's approach, the resultant dominating set derived from the marking process is further reduced by applying two dominant pruning rules. According to dominant pruning Rule 1, a marked host can unmark itself if its neighbour set is covered by another marked host; that is, if all neighbours of a gateway are connected with each other via another gateway, it can relinquish its responsibility as a gateway. In Fig. 4, either u or w can be unmarked (but not both). According to Rule 2, a marked host can unmark itself if its neighbourhood is covered by two other directly connected marked hosts. The combination of Rules 1 and 2 is fairly efficient in reducing the number of gateways while still maintaining a CDS.
III. LOCALIZED DOMINATING SET FORMATION ALGORITHM
A. Localized Dominating Set Formation
Fei Dai and Jie Wu [9] proposed a generalized dominant pruning rule, called Rule k, which can unmark gateways covered by k other gateways, where k can be any number. Rule k can be implemented in a restricted way with local neighbourhood information, with the same complexity as Rule 1 and, surprisingly, less complexity than Rule 2.
Given a simple directed graph G = (V,E), where V is a set of vertices (hosts) and E is a set of directed edges (unidirectional links), a directed edge from u to v is denoted by an ordered pair (u,v). If (u,v) is an edge in G, we say that u dominates v and v is an absorbent of u. The dominating neighbour set N_d(u) of vertex u is defined as {w : (w,u) ∈ E}, and the absorbent neighbour set N_a(u) as {v : (u,v) ∈ E}; N(u) = N_d(u) ∪ N_a(u). In Fig. 5, vertex x dominates vertex u, y is an absorbent of u, and v is both a dominating and an absorbent neighbour of u. The dominating neighbour set of vertex u is N_d(u) = {v,x}, the absorbent neighbour set of u is N_a(u) = {v,y}, and the neighbour set of u is N(u) = {v,x,y}. The general disk graph and the unit disk graph are special cases of directed graphs.
Figure 5. Example of dominating set reduction.
A set V' ⊆ V is a dominating set of G if every vertex v ∈ V - V' is dominated by at least one vertex u ∈ V'. Also, a set V' ⊆ V is called an absorbent set if for every vertex u ∈ V - V' there exists a vertex v ∈ V' which is an absorbent of u. For example, the vertex set {u,v} in Fig. 5 is both a dominating and an absorbent set of the corresponding directed graph. The following marking process can quickly find a strongly connected dominating and absorbent set in a given directed graph.
Algorithm: Marking process
1: Initially assign marker F to each u in V.
2: Each u exchanges its neighbour sets N_d(u) and N_a(u) with all its neighbours.
3: u changes its marker m(u) to T if there exist vertices v and w such that (w,u) ∈ E and (u,v) ∈ E, but (w,v) ∉ E.
The marking process is a localized algorithm in which hosts only interact with others in the neighbourhood. Unlike clustering algorithms, there is no sequential propagation of information. The marking process marks every vertex in G; m(v) is a marker for vertex v ∈ V, which is either T (marked) or F (unmarked). Suppose the marking process is applied to the network represented by Fig. 5: host u will be marked because (x,u) ∈ E and (u,y) ∈ E, but (x,y) ∉ E; host v will also be marked because (u,v) ∈ E and (v,z) ∈ E, but (u,z) ∉ E. All other hosts remain unmarked because no such pair of neighbour hosts can be found. V' is the set of vertices that are marked T in V; that is, V' = {v : v ∈ V ∧ m(v) = T}. The induced graph G' is the subgraph of G induced by V'; that is, G' = G[V']. Wu [9] showed that the marked vertices form a strongly connected dominating and absorbent set and, furthermore, can connect any two vertices with minimum hops.
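A centralized sketch of the marking process on the Fig. 5 example (the encoding is ours; an edge (u,v) means u dominates v):

def marking_process(vertices, edges):
    edges = set(edges)
    nd = {u: {w for (w, x) in edges if x == u} for u in vertices}  # dominating neighbours
    na = {u: {v for (x, v) in edges if x == u} for u in vertices}  # absorbent neighbours
    marked = set()
    for u in vertices:
        # Mark u when some w dominates u, u dominates some v, and (w, v) is missing.
        if any((w, v) not in edges for w in nd[u] for v in na[u] if w != v):
            marked.add(u)
    return marked

vertices = ['u', 'v', 'x', 'y', 'z']
edges = [('x', 'u'), ('u', 'y'), ('u', 'v'), ('v', 'u'), ('v', 'z')]
print(marking_process(vertices, edges))  # {'u', 'v'}, as derived in the text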
B. Dominating Set Reduction
In the marking process, a vertex is marked T because it may be the only connection between two of its neighbours. However, if there are multiple connections available, it is not necessary to keep all of them. We say a vertex is covered if its neighbours can reach each other via other connected marked vertices. The two dominant pruning rules below rest on the following observation: if a vertex is covered by no more than two connected vertices, removing this vertex from V' will not compromise its functionality as a CDS. To avoid the simultaneous removal of two vertices covering each other, a vertex is removed only when it is covered by vertices with higher ids. The node id, id(v), of each vertex v ∈ V serves as a priority. Nodes with high priorities have a high probability of becoming gateways. Id uniqueness is not necessary, but equal ids will produce more gateways.
Rule 1: Consider two vertices u and v in G'. If N_d(u) - {v} ⊆ N_d(v) and N_a(u) - {v} ⊆ N_a(v) in G and id(u) < id(v), change the marker of u to F; that is, G' is changed to G' - {u}.
Rule 2: Assume that v and w are bi-directionally connected in G'. If N_d(u) - {v,w} ⊆ N_d(v) ∪ N_d(w) and N_a(u) - {v,w} ⊆ N_a(v) ∪ N_a(w) in G and id(u) < min{id(v), id(w)}, then change the marker of u to F.
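A sketch of the Rule 1 test in the same style (nd, na and ids are the neighbour sets and priorities from the marking sketch above; names are ours):

def rule1_unmark(u, v, nd, na, ids):
    # u may be unmarked when v covers both of u's neighbour sets and has the higher id.
    covered = (nd[u] - {v}) <= nd[v] and (na[u] - {v}) <= na[v]
    return covered and ids[u] < ids[v]

# Tiny example: v dominates and absorbs everything u does, and v has the higher id.
nd = {'u': {'x', 'v'}, 'v': {'x', 'u'}}
na = {'u': {'y', 'v'}, 'v': {'y', 'u'}}
print(rule1_unmark('u', 'v', nd, na, {'u': 1, 'v': 2}))  # True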
C. Generalized Pruning Rule
Assume G' = (V',E') is the subgraph of a given directed graph G = (V,E) induced by the marked vertex set V'. In the following dominant pruning rule, N_d(V_k) (respectively N_a(V_k)) represents the dominating (absorbent) neighbour set of a vertex set V_k; that is, N_d(V_k) = ∪_{u_i ∈ V_k} N_d(u_i).
Rule k: Let V_k = {v_1, v_2, ..., v_k} be the vertex set of a strongly connected subgraph in G'. If N_d(u) - V_k ⊆ N_d(V_k) and N_a(u) - V_k ⊆ N_a(V_k) in G and id(u) < min{id(v_1), id(v_2), ..., id(v_k)}, then change the marker of u to F.
Rules 1 and 2 are special cases of Rule k, where |V_k| is restricted to 1 and 2, respectively. Note that V_k may contain two subsets: V_k1, which really covers u's neighbour set, and V_k2, which acts as the glue making them a connected set. Obviously, if a vertex can be removed from V' by applying Rule 1 or Rule 2, it can also be removed by applying Rule k. On the other hand, a vertex removed by Rule k is not necessarily removable via Rule 1 or Rule 2. For example, in Fig. 6(a), both vertices u and v can be removed using Rule k (for k >= 3) because they are covered by vertices w, x, y and z; in Fig. 6(b), vertex u can be removed because it is covered by vertices w, x and y. Note that, although x and y are not bi-directionally connected, they can reach each other via vertex w. However, none of these vertices can be removed via Rule 1 or Rule 2.
Figure 6. Limitation of Rule 1 and 2.
D. Performance Analysis
The restricted Rule k is a more efficient dominant
pruning rule than the combination of the restricted Rules 1
and 2, especially in dense networks with a relatively high
percentage of unidirectional links. For these networks, the
resultant dominating set can be greatly reduced by Rule k
without any performance or resource penalty. One
advantage of the marking process and the dominant pruning
rules is their capability to support unidirectional links. For
networks without unidirectional links, the marking process
and the restricted Rule k is as efficient as several cluster-
based schemes and another pure localized algorithm, in
terms of the size of the dominating set; this is achieved with
lower cost and higher converging speed.
IV. A ZONAL CLUSTERING ALGORITHM
The zonal distributed algorithm [3] finds a small weakly-connected dominating set of the input graph G = (V,E). The graph is first partitioned into non-overlapping regions. Then a greedy approximation algorithm [1] is executed to find a small weakly-connected dominating set of each region. Taking the union of these weakly-connected dominating sets, we obtain a dominating set of G. Some additional vertices from region borders are added to the dominating set to ensure that the zonal dominating set of G is weakly connected.
A. Graph partitioning using minimum spanning forests
The first phase of the zonal distributed clustering algorithm partitions a given graph G = (V,E) into non-overlapping regions. This is done by growing a spanning forest of the graph. At the end of this phase, the subgraph induced by each tree defines a region. This phase is based on the algorithm of Gallager, Humblet and Spira (GHS) [8], which is in turn based on Kruskal's classic centralized algorithm for the Minimum Spanning Tree (MST), assuming all edge weights are distinct and breaking ties using the vertex IDs of the endpoints.
The MST is unique for a given graph with distinct edge
weights. The algorithm maintains a spanning forest.
Initially, the spanning forest is a collection of trees of single
vertices. At each step the algorithm merges two trees by
including an edge in the spanning forest. During the process
of the algorithm, an edge can be in any of the three states:
tree edge, rejected edge, or candidate edge. All edges are
candidate edges at the beginning of the algorithm. When an
edge is included in the spanning forest, it becomes a tree
edge. If the addition of a particular edge would create a
cycle in the spanning forest, the edge is called a rejected
edge and will not be considered further. In each iteration,
the algorithm looks for the candidate edge with minimum
weight, and changes it to a tree edge merging two trees into
one. During the algorithm, the tree edges and all the vertices
form a spanning forest. The algorithm terminates when the
forest becomes a single spanning tree.
The partitioning process consists of a partial execution
of the GHS algorithm [8], which terminates before the MST
is fully formed. The size of components is controlled by
picking a value x. Once a component has exceeded size x, it
no longer participates.
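A centralized sketch of this partitioning phase (a Kruskal-style stand-in for the distributed GHS execution, assuming distinct edge weights; names are ours):

def partition_into_regions(n, weighted_edges, x):
    parent, size = list(range(n)), [1] * n

    def find(a):                      # union-find with path compression
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for w, u, v in sorted(weighted_edges):   # candidate edges in weight order
        ru, rv = find(u), find(v)
        # Merge only distinct trees that have not yet exceeded the size bound x.
        if ru != rv and size[ru] <= x and size[rv] <= x:
            parent[ru] = rv
            size[rv] += size[ru]
    return [find(u) for u in range(n)]       # region id per vertex

edges = [(1, 0, 1), (2, 1, 2), (3, 3, 4), (4, 4, 5), (5, 2, 3)]   # (weight, u, v)
print(partition_into_regions(6, edges, 2))   # two regions of three vertices each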
B. Computing Weakly-Connected Dominating Sets of the
Regions
Once the graph G is partitioned into regions and a spanning tree has been determined for each region, the following algorithm runs within each region. This color-based algorithm is a distributed implementation of the centralized greedy algorithm for finding small weakly-connected dominating sets [10] [12] in graphs.
Given a graph G = (V,E), a color (white, gray, or black) is assigned to each vertex. All vertices are initially white and change color as the algorithm progresses. The algorithm is essentially an iteration of the process of choosing a white or gray vertex to dye black. When any vertex is dyed black, any neighbouring white vertices are changed to gray. At the end of the algorithm, the black vertices constitute a weakly-connected dominating set.
The term piece refers to a particular substructure of the graph. A white piece is simply a white vertex. A black piece contains a maximal set of black vertices whose weakly induced subgraph is connected, plus any gray vertices that are adjacent to at least one of the black vertices of the piece. The improvement of a (non-black) vertex u is the number of distinct pieces within the closed neighborhood of u; that is, the improvement of u is the number of pieces that would be merged into a single black piece if u were dyed black. In each iteration, the algorithm chooses a single white or gray vertex to dye black. The vertex is chosen greedily so as to reduce the number of pieces as much as possible, until there is only one piece left. In particular, a vertex with maximum improvement value is chosen (with ties broken arbitrarily). The black vertices are the required weakly-connected dominating set S.
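A centralized sketch of this greedy colouring (assuming a connected graph given as an adjacency list; names are ours, and the distributed bookkeeping of the zonal algorithm is omitted):

def greedy_wcds(adj):
    color = {v: 'white' for v in adj}

    def pieces():
        ps, seen = [], set()
        for b in (v for v in adj if color[v] == 'black'):
            if b in seen:
                continue
            piece, stack = set(), [b]
            while stack:
                v = stack.pop()
                if v in piece:
                    continue
                piece.add(v)
                if color[v] == 'black':
                    seen.add(v)
                    stack.extend(adj[v])      # every edge out of a black vertex is kept
                else:                         # gray vertices link pieces via black neighbours
                    stack.extend(w for w in adj[v] if color[w] == 'black')
            ps.append(piece)
        return ps + [{v} for v in adj if color[v] == 'white']   # white pieces

    while True:
        ps = pieces()
        if len(ps) <= 1:
            return {v for v in adj if color[v] == 'black'}
        # Dye black the non-black vertex whose closed neighbourhood meets the most pieces.
        best = max((v for v in adj if color[v] != 'black'),
                   key=lambda u: sum(1 for p in ps if p & ({u} | set(adj[u]))))
        color[best] = 'black'
        for w in adj[best]:
            if color[w] == 'white':
                color[w] = 'gray'

adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5], 5: [4]}
print(greedy_wcds(adj))   # e.g. {3, 4}: black vertices forming a WCDS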
C. Fixing the Borders
After calculating a small weakly-connected dominating set S_i for each region R_i of G, combining these solutions does not necessarily give a weakly-connected dominating set of G. It is likely necessary to include some additional vertices from the borders of the regions in order to obtain a weakly-connected dominating set of G. The edges of G are either dominated (that is, they have at least one endpoint in some dominating set S_i) or free (in which case neither endpoint is in a dominating set). Two regions R_i and R_j joined by a dominated edge can comprise a single region with dominating set S_i ∪ S_j, and do not need to have their shared border fixed.
The root of region R can learn, by polling all the vertices in its region, which regions are adjacent and can determine which neighbouring regions are not joined by a dominated edge. For each such pair of adjacent regions, one of the regions must "fix the border". To break ties, the region with the lower region ID takes control of this process, where the region ID is the vertex ID of the region root. In other words, if neighbouring regions R_i and R_j are not joined by a shared dominated edge, the region with the lower region ID adds a new vertex from the R_i/R_j border into the dominating set.
Figure 7. Fixing Borders.
For example, in Fig. 7, the regions have weakly-connected dominating sets indicated by the solid black vertices. Region R1 is adjacent to regions R2, R3, R4 and R5. Among these, regions R2 and R3 do not share dominated edges with R1. As R1 has a lower region ID than either R2 or R3, R1 is responsible for fixing these borders. The root of R1 adds u and v into the dominating set. R2 is adjacent to two regions, R1 and R3, but it is only responsible for fixing the R2/R3 border, due to the region IDs. The root of R2 adds w to the dominating set. A detailed description of this process for a given region R follows. The goal is for the root r to find a small number of dominated vertices within R to add to the dominating set. Here every vertex knows the vertex ID, color, and region ID of all of its neighbors (this can be done with a single round of information exchange). Root r collects the above neighborhood information from all of the border vertices of R.
We define a problem region with regard to R to be any region R' that is adjacent to R, does not share dominated edges with R, and has a higher region ID than R. Region R is responsible for fixing its border with each problem region.
Figure 8. Bipartite Graph
A bipartite graph B(X,Y,E) can be constructed from the collected information for root r. Vertex set X contains a vertex for each problem region with regard to R, and vertex set Y contains a vertex for every border vertex in R. There is an edge between vertices y_i and x_j iff y_i is adjacent to a vertex in problem region x_j in the original graph. Fig. 8 shows the bipartite graph constructed by region R_1 in the example of Fig. 7. In this bipartite graph, X = {R_2, R_3} and Y = {u, y, v}. In this case, {u,v} is a possible solution for R_1 to add to the weakly-connected dominating set in order to fix its borders with R_2 and R_3. To find the smallest possible set of vertices to add to the dominating set, r must find a minimum-size subset of Y that dominates X.
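Finding a minimum-size subset of Y that dominates X is a set cover problem; a greedy sketch is shown below (the adjacency for the Fig. 8 example is assumed, since the figure itself is not reproduced, and names are ours):

def fix_borders(border_adj, problem_regions):
    # border_adj maps each border vertex of Y to the problem regions of X it touches.
    uncovered, chosen = set(problem_regions), []
    while uncovered:   # assumes every problem region touches some border vertex
        y = max(border_adj, key=lambda v: len(border_adj[v] & uncovered))
        chosen.append(y)
        uncovered -= border_adj[y]
    return chosen

# Assumed Fig. 8 adjacency: u and y border R2, v borders R3; {u, v} covers X.
print(fix_borders({'u': {'R2'}, 'y': {'R2'}, 'v': {'R3'}}, ['R2', 'R3']))  # ['u', 'v']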
D. Performance Analysis
The execution time of this algorithm is O(x(log x + |S_max|)) and it generates O(m + n(log x + |S_max|)) messages, where S_max is the largest weakly-connected dominating set generated by any region and can be trivially bounded by O(x) from above. This zonal algorithm is regulated by a single parameter x, which controls the size of the regions. When x is small, the algorithm finishes quickly but with a large weakly-connected dominating set. When x is large, it behaves more like the non-localized algorithm and generates smaller weakly-connected dominating sets.
V. CLUSTERING USING A MINIMUM SPANNING TREE
An undirected graph is defined as G = (V,E), where V is a finite nonempty set of nodes v and E ⊆ V × V is a set of edges e. A graph G is connected if there is a path between any two distinct nodes. A graph G_S = (V_S,E_S) is a spanning subgraph of G = (V,E) if V_S = V. A spanning tree [6] [8] [15] of a graph is an undirected, connected, acyclic spanning subgraph. Intuitively, a spanning tree has the minimum number of edges needed to maintain connectivity, and a minimum spanning tree (MST) is a spanning tree of minimum total edge weight.
Gallager, Humblet and Spira [8] proposed a distributed algorithm which determines a minimum-weight spanning tree for an undirected graph that has distinct finite weights for every edge. The aim of the algorithm is to combine small fragments into larger fragments using outgoing edges. A fragment of an MST is a subtree of the MST. An outgoing edge is an edge of a fragment if one node connected to the edge is in the fragment and one connected node is not. The combination rules of fragments are related to levels. A fragment with a single node has level L = 0. Suppose two fragments, F at level L and F' at level L':
If L < L', then fragment F is immediately absorbed as part of fragment F'. The expanded fragment is at level L'.
Else, if L = L' and fragments F and F' have the same minimum-weight outgoing edge, then the fragments combine immediately into a new fragment at level L + 1.
Else, fragment F waits until fragment F' reaches a high enough level for combination.
Under the above rules, the combining edge is called the core of the new fragment. The two essential properties of MSTs for the algorithm are:
Property 1: Given a fragment of an MST, let e be a minimum-weight outgoing edge of the fragment. Then joining e and its adjacent non-fragment node to the fragment yields another fragment of an MST.
Property 2: If all the edges of a connected graph have different weights, then the MST is unique.
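A sketch of the level-based combination rules (the fragment representation is ours):

def combine(f, g):
    # Rule 1: absorb the lower-level fragment into the higher-level one.
    if f['level'] < g['level']:
        return {'level': g['level'], 'nodes': f['nodes'] | g['nodes']}
    # Rule 2: equal levels sharing the minimum-weight outgoing edge merge one level up.
    if f['level'] == g['level'] and f['min_out_edge'] == g['min_out_edge']:
        return {'level': f['level'] + 1, 'nodes': f['nodes'] | g['nodes']}
    return None   # Rule 3: f must wait for g to reach a high enough level

f1 = {'level': 0, 'nodes': {1}, 'min_out_edge': (1, 2)}
f2 = {'level': 0, 'nodes': {2}, 'min_out_edge': (1, 2)}
print(combine(f1, f2))   # a level-1 fragment {1, 2} whose core is edge (1, 2)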
The algorithm defines three different states of operation for a node: Sleeping, Find and Found. The states determine which of the following seven messages are sent and how a node reacts to them: Initiate, Test, Reject, Accept, Report(W), Connect(L) and Change-core. The identifier of a fragment is the core edge, that is, the edge that connects the two fragments together. A sample MANET and a minimum spanning tree constructed with Gallager, Humblet and Spira's algorithm can be seen in Fig. 9, where the nodes other than the leaf nodes (shown in black) depict a connected set of nodes. The upper bound on the number of messages exchanged during the execution of the algorithm is 5N log2 N + 2E, where N is the number of nodes and E is the number of edges in the graph. The worst-case time for this algorithm is O(N log N).
Dagdeviren et al. proposed the Merging Clustering Algorithm (MCA) [6], which finds clusters in a MANET by merging clusters to form higher-level clusters, as in Gallager et al.'s algorithm [8]. However, they focused on the clustering operation, discarding the minimum spanning tree. This reduces the message complexity from O(n log n) to O(n). The second contribution is the use of upper and lower bound parameters for the clustering operation, which results in a balanced number of nodes in the clusters formed. The lower bound is limited by a parameter K and the upper bound by 2K.
Figure 9. A MANET and its Spanning Tree.
VI. CONCLUSIONS
In this paper we discussed dominating set and spanning tree based clustering in mobile ad hoc networks and its performance analysis. The efficiency of dominating set based routing mainly depends on the overhead introduced in the formation of the dominating set and on the size of the dominating set. We discussed two algorithms which have low overhead in dominating set formation. Finally, we discussed the spanning tree approach to clustering MANETs. Distributed spanning tree and dominating set approaches can be merged to improve clustering in MANETs.
VII. FUTURE WORK
An interesting open problem in mobile ad hoc networks is to study the efficient dynamic updating of the backbone when nodes are moving at a reasonable speed, integrating the mobility of the nodes. The work can be extended to develop connected dominating set construction algorithms for networks in which hosts have different transmission ranges.
REFERENCES
[1] Baruch Awerbuch, Optimal Distributed Algorithms for Minimum Weight Spanning Tree, Counting, Leader Election and Related Problems, 19th Annual ACM Symposium on Theory of Computing, (1987).
[2] Fabian, Ivan, Connectivity Based k-Hop Clustering in Wireless Networks, Telecom Systems, (2003).
[3] Chen, Y. P., Liestman, A. L., A Zonal Algorithm for Clustering Ad Hoc Networks, International Journal of Foundations of Computer Science, vol. 14(2), (2003).
[4] Chan, H., Luk, M., Perrig, A., Using Clustering Information for Sensor Network Localization, DCOSS (2005).
[5] Das, B., Bharghavan, V., Routing in ad-hoc networks using minimum connected dominating sets, Communications, ICC '97, (1997).
[6] Dagdeviren, O., Erciyes, K., Cokuslu, D., Merging Clustering Algorithms in Mobile Ad hoc Networks, ICDCIT (2005).
[7] Gerla, M., Tsai, J., Multicluster, mobile, multimedia radio network, Wireless Networks, vol. 1(3), (1995).
[8] Gallager, R. G., Humblet, P. A., Spira, P. M., A Distributed Algorithm for Minimum-Weight Spanning Trees, ACM Transactions on Programming Languages and Systems, (1983).
[9] Fei Dai, Jie Wu, An extended localized algorithm for connected dominating set formation in ad hoc wireless networks, IEEE Transactions on Parallel and Distributed Systems, (2004).
[10] Yuanzhu Peter Chen, Arthur L. Liestman, Maintaining weakly-connected dominating sets for clustering ad hoc networks, Elsevier, (2005).
[11] Deniz Cokuslu, Kayhan Erciyes, Orhan Dagdeviren, A Dominating Set Based Clustering Algorithm for Mobile Ad hoc Networks, Springer Computational Science, ICCS (2006).
[12] Bo Han, Weijia Jia, Clustering wireless ad hoc networks with weakly connected dominating set, Elsevier, (2007).
[13] K. Alzoubi, P. J. Wan, O. Frieder, New Distributed Algorithm for Connected Dominating Set in Wireless Ad Hoc Networks, Proceedings of the 35th Annual Hawaii International Conference on System Sciences (HICSS'02), Volume 9, (2002).
[14] Stojmenovic, I., Dominating sets and neighbor elimination-based broadcasting algorithms, IEEE Transactions on Parallel and Distributed Systems, (2001).
[15] P. Victer Paul, T. Vengattaraman, P. Dhavachelvan, R. Baskaran, Improved Data Cache Scheme Using Distributed Spanning Tree in Mobile Adhoc Network, International Journal of Computer Science & Communication, Vol. 1, No. 2, July-December (2010).
[16] G. N. Purohit, Usha Sharma, Constructing Minimum Connected Dominating Set: Algorithmic approach, International Journal on Applications of Graph Theory in Wireless Ad hoc Networks and Sensor Networks (GRAPHHOC), (2010).
AUTHORS PROFILE
Dr. Suresh Varma Penumathsa is currently a Principal and Professor in Computer Science at Adikavi Nannaya University, Rajahmundry, Andhra Pradesh, India. He received a Ph.D. in Computer Science and Engineering, with specialization in Communication Networks, from Acharya Nagarjuna University in 2008. His research interests include communication networks and ad hoc networks. He has several publications in reputed national and international journals. He is a member of ISTE, ORSI, ISCA, IISA and AMIE.
Mr. R Krishnam Raju Indukuri is currently working as Sr. Asst. Professor in the Department of CS, Padmasri Dr. B.V.R.I.C.E, Bhimavaram, Andhra Pradesh, India. He is a member of ISTE. He has presented and published papers in several national and international conferences and journals. His areas of interest are ad hoc networks and design and analysis of algorithms.
Distributed Group Key Management with Cluster
based Communication for Dynamic Peer Groups
Rajender Dharavath
Department of Computer Science & Engineering
Aditya Engineering College
Kakinada, India.
[email protected]
K Bhima
2
Department of Computer Science & Engineering
Brilliant Institute of Engineering &Technology
Hyderabad, India.
[email protected]
Abstract- Secure group communication is an increasingly popular research area that has received much attention in recent years. Group key management is a fundamental building block for secure group communication systems. This paper introduces a new family of protocols addressing cluster based communication and distributed group key agreement for secure group communication in dynamic peer groups. In this scheme, group members are divided into subgroups called clusters. We propose three cluster based communication protocols with tree-based group key management. The protocols (1) provide communication within a cluster by generating a common group key within the cluster, (2) provide communication between clusters by generating a common group key between the clusters, and (3) provide communication among all clusters by generating a common group key among all clusters. In our approach the group key is updated for each session or when a user joins or leaves a cluster. Moreover, we use a Certificate Authority, which guarantees key authentication and protects our protocol from a wide range of attacks.
Keywords- Secure Group Communication; Key Agreement; Key Tree; Dynamic Peer Groups; Cluster.
I. INTRODUCTION
As a result of the increased popularity of group-oriented applications such as pay-TV, distributed interactive games, video conferencing and chat rooms, there is a growing demand for security services to achieve secure group communication. A common method is to encrypt messages with a group key, so that entities outside the group cannot decode them. A satisfactory group communication system would possess the properties of group key security, forward secrecy, backward secrecy, and key independence [1, 2, 3].
In this paper, research effort has been put into the design of a group key management scheme and three different cluster based communication protocols. There are three approaches to generating such group keys: centralized, decentralized, and distributed. Centralized key distribution uses a dedicated key server, resulting in simpler protocols. However, centralized methods fail entirely once the server is compromised, so the central key server makes a tempting target for adversaries. In addition, centralized key distribution is not suitable for dynamic peer groups, in which all nodes play the same function and role; it is therefore unreasonable to make one node the key server and place all trust in it. In the decentralized approach, multiple entities are responsible for managing the group, as opposed to a single entity. In contrast to both approaches, distributed key management requires each member to contribute a share to generate the group key, resulting in more complex protocols, and each member is equally responsible for generating and maintaining the group key.
In this paper the group key, or common key, is generated using the distributed key management approach. The group key is updated on every membership change and for every session, to preserve forward and backward secrecy [1, 2, 3]; this is called group rekeying.

To reduce the number of rekeying operations, Wong et al. [7] proposed a logical data structure called a key tree. Kim et al. [1] proposed a tree-based key agreement protocol, TGDH, which combines a key tree with Diffie-Hellman key exchange to generate and maintain the group key; however, it suffers from impersonation attacks because keys are not updated regularly, and it generates unnecessary messages. Building on these two ideas, Zhou and Ravishankar [6] proposed AFTD (Authenticated Fault-tolerant Tree-based Diffie-Hellman key exchange protocol), which combines key trees and Diffie-Hellman key exchange for group key generation.
Assume that the total network topology is considered as a group, which can be divided into subgroups called clusters. The group is divided into clusters based on the location identification numbers (LIDs) of users, and each cluster is assigned a cluster identification number (CID); both are given by the Certificate Authority (CA) at the time a user joins a cluster or the group. Issuing the location identification number and the public key certificate to a new user are offline actions performed by the CA.
Each cluster member maintains its own cluster key tree and generates the cluster group key for secure communication. We assume that in every cluster, every node can receive a message broadcast from the other nodes. Each cluster is headed by a cluster head, or sponsor, who is responsible for generating the cluster group key; the sponsor is the shallowest rightmost node (in the cluster key tree) relative to the user who joins or leaves the cluster.
The cluster group key, or cluster common key, is shared by all the cluster members, who communicate using it. Authentication is provided by the certificate authority, which issues the public key certificate and the location identification number (LID) prior to the user joining the cluster or group.
The rest of the paper is organized as follows. Section 2 reviews related work in this field. We present our proposed scheme in Section 3; communication protocols and group key management techniques are discussed in Section 4. Dynamic network peer groups are presented in Section 5 and the security analysis in Section 6. Finally, we conclude in Section 7.
II. RELATED WORK
Key trees [6] were first proposed for centralized key distribution, and Kim et al. [1] adapted them to the distributed key agreement protocol TGDH. In TGDH [1] every group member creates a key tree separately. Each leaf node is associated with a real group member, while each non-leaf node is considered a virtual member. In TGDH, every node on the key tree has a Diffie-Hellman key pair based on the prime $p$ and generator $\alpha$, used to generate the group key. The secret-public key pair for real member $M_i$ is:

\{KM_i,\ BKM_i = \alpha^{KM_i} \bmod p\} \qquad (1)
The secret-public key pair for virtual member $V_i$ is:

\{KV_i,\ BKV_i = \alpha^{KV_i} \bmod p\} \qquad (2)
The public key $BKM_i$ is also called the blinded key. Consider a node $M_v$ whose left child is $M_{lv}$ and whose right child is $M_{rv}$ (to simplify the description, we do not distinguish real members from virtual members here). The secret key of $M_v$ can be computed in the usual Diffie-Hellman key exchange fashion as follows:

KM_v = (BKM_{lv})^{KM_{rv}} \bmod p = (BKM_{rv})^{KM_{lv}} \bmod p \qquad (3)
With all blinded keys well known, each group member can compute the secret keys of all nodes on its key path, comprising the nodes from its leaf node up to the root. The root node's secret key KV0 is known to all group members and becomes the group key. In Figure 2, cluster member U12 knows the key pairs of U12, V11 and V10; V10's secret key is the cluster group key.
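As a concrete illustration, the following is a minimal Java sketch of the blinded-key and node-key arithmetic of equations (1)-(3); the group parameters, key sizes and class name are illustrative assumptions rather than values from the paper.

import java.math.BigInteger;
import java.security.SecureRandom;

public class TgdhNodeKeys {
    private static final SecureRandom RNG = new SecureRandom();
    // Illustrative group parameters; a real deployment would use standardized ones.
    private static final BigInteger P = BigInteger.probablePrime(512, RNG);
    private static final BigInteger ALPHA = BigInteger.valueOf(2);

    // Blinded key BK = alpha^K mod p, as in equations (1) and (2).
    static BigInteger blind(BigInteger secret) {
        return ALPHA.modPow(secret, P);
    }

    // Parent secret: (sibling's blinded key)^(own secret) mod p, as in equation (3).
    static BigInteger parentSecret(BigInteger ownSecret, BigInteger siblingBlinded) {
        return siblingBlinded.modPow(ownSecret, P);
    }

    public static void main(String[] args) {
        BigInteger kLeft = new BigInteger(256, RNG);   // left child's secret
        BigInteger kRight = new BigInteger(256, RNG);  // right child's secret
        // Both children derive the same parent secret from the other's blinded key.
        BigInteger fromLeft = parentSecret(kLeft, blind(kRight));
        BigInteger fromRight = parentSecret(kRight, blind(kLeft));
        System.out.println(fromLeft.equals(fromRight)); // prints: true
    }
}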
In AFTD [6], as the group size increases, the group rekeying operation becomes complex; this degrades performance and generates more messages to distribute the group key, which is the main limitation of the AFTD protocol.

Renuka A. and K. C. Shet [9] proposed cluster based communications, which differ from our approach in key management and in the communication protocols. Our detailed communication protocols and key management scheme are discussed in this paper.
Lee et al. [4, 5] have designed several tree-based distributed key agreement protocols, reducing the rekeying complexity by performing interval based rekeying. They also present an authenticated key agreement protocol. As the success of their scheme is partially based on a certificate authority, their protocol encounters the same problems as centralized trust mechanisms.

Nen-Chung Wang and Shian-Zhang Fang [10] proposed a hierarchical key management scheme for secure group communications in mobile ad hoc networks. Their scheme involves a very complex process for forming clusters and for communication.

Gouda et al. [11] describe a new use of key trees. They are concerned with using the existing subgroup keys in the key tree to securely multicast data to different subgroups within the group. Unlike their approach, which depends on a centralized key server to maintain the unique key tree and manage all keys, our paper solves this problem in a distributed fashion.
III. PROPOSED SCHEME
A. System Model
To overcome the limitations of the AFTD [6] protocol, the entire set of group members in the network is divided into a number of subgroups called clusters; the layout of the network is shown in Figure 1.

A cluster is formed based on the location identification numbers (LIDs) of the users, and clusters are assigned cluster identification numbers (CIDs), which are given offline by the Certificate Authority (CA). If a user's LID is equal to a cluster's CID, then that user belongs to that particular cluster. The CID and LID are unique for each cluster.
In this paper each cluster member maintains its own cluster key tree as shown in Figure 2 (a, b, c); the leaf nodes in the cluster key tree are the cluster users (real users), and the non-leaf nodes are virtual users. We propose three different types of communication protocols with distributed tree-based group key management.
The cluster communication protocols are given below:
- Intra Cluster Communication protocol (ICC),
- Inter Cluster Communication protocol (IRCC), and
- Global Communication (GC) protocol.
Communication among the users within a cluster is called Intra Cluster Communication. Communication between two clusters is called Inter Cluster Communication; when IRCC occurs between clusters, the corresponding reduced cluster key tree is generated as shown in Figure 4 for generating the group key. Communication among all clusters is called Global Communication, and the corresponding cluster key tree is generated as shown in Figure 5 for generating the group key. The communications are illustrated in Figure 3.
Figure 1. Network Layout and Initialization

a. Cluster C1 Key Tree   b. Cluster C2 Key Tree   c. Cluster C3 Key Tree
Figure 2. Key trees of clusters

Figure 3. Illustration of communications.
B. Group Key Management Scheme
In fact, an update of a blinded key need be sent only to a cluster's members, instead of the entire group (all clusters), depending on the type of communication. We send each node's blinded keys only to its cluster members. In this paper each cluster member constructs a key tree independently. Each real user $U_{ij}$ of a cluster $C_i$ has two key pairs. The first is a Diffie-Hellman key pair, used to generate the group key:

\{KU_{ij},\ BKU_{ij} = \alpha^{KU_{ij}} \bmod p\} \qquad (4)
The second is an RSA secret-public key pair $\{D_{ij}, E_{ij}\}$, used to provide source authentication. In the key tree, non-leaf nodes are virtual users (virtual clusters in the case of global or inter cluster communication) and have only a Diffie-Hellman key pair:

\{KV_{ij},\ BKV_{ij} = \alpha^{KV_{ij}} \bmod p\} \qquad (5)
Group key management for user communication occurs in two phases:
- Initialization phase
- Group key generation and distribution phase
1) Initialization Phase
The certificate authority (CA) distributes the appropriate public key certificates to clusters; it does not issue renewed public key certificates for existing group members during the process of cluster or group key updating.

A new member wishing to join the group may obtain a joining certificate and an LID (based on the location where the user wants to join) from the CA at any time prior to joining.
The CA uses an RSA secret-public key pair $\{Sk, Pk\}$ and establishes a public key certificate for each cluster user $U_{ij}$ by signing $U_{ij}$'s public key with its secret key $Sk$. User $U_{ij}$'s public key certificate $\langle U_{ij}, PUBU_{ij}, E_{ij} \rangle_{Sk}$ is then distributed to its cluster users. Since the public key $Pk$ is well known, any cluster user can verify this certificate and obtain $U_{ij}$'s public key. A sketch of this signing step is given below.
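The following is a minimal sketch, using the standard java.security API, of how the CA's signing and a cluster user's verification of such a certificate might look; the key size, the SHA256withRSA algorithm choice and the class name are illustrative assumptions, as the paper does not fix them.

import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class CaCertificateSketch {
    public static void main(String[] args) throws Exception {
        // CA's RSA pair {Sk, Pk} and a user's RSA pair {Dij, Eij}; sizes illustrative.
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair ca = gen.generateKeyPair();
        KeyPair user = gen.generateKeyPair();

        // The CA signs the user's encoded public key with its secret key Sk.
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(ca.getPrivate());
        signer.update(user.getPublic().getEncoded());
        byte[] certificateSignature = signer.sign();

        // Any cluster user holding the well-known Pk can verify the certificate.
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(ca.getPublic());
        verifier.update(user.getPublic().getEncoded());
        System.out.println(verifier.verify(certificateSignature)); // prints: true
    }
}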
2) Group Key Generation and Distribution Phase
Group key generation and distribution for cluster
communication occurs in three different ways.
- Group key generation and distribution in ICC.
- Group key generation and distribution in IRCC.
- Group key generation and distribution in GC.
The above group key generation and distribution techniques for cluster communication are implemented in the respective communication protocols and in dynamic peer groups (Section 5).
IV. COMMUNICATION PROTOCOLS
The communication protocols are as follows.
- Intra Cluster Communications (ICC).
- Inter Cluster Communications (IRCC).
- Global Communications (GC).
A. Intra Cluster Communications (ICC)
Communication among the users within a cluster is called Intra Cluster Communication. An example of intra cluster communication is shown in Figure 3, and the corresponding cluster key tree is shown in Figure 2.

In order for users to communicate with each other within the cluster, they need a common cluster group key, which is generated from their cluster key tree in Diffie-Hellman key exchange fashion.
Steps for the generation and distribution of the cluster group key in ICC (algorithm for cluster common key generation in ICC):
- Select the cluster in which Intra Cluster Communication is to be done.
- Each cluster Ci generates its own cluster key tree.
- The secret key KVij of the root node Vij of cluster Ci is generated in DH key exchange fashion from its leaf nodes (the generation of the cluster group key, or common key, is explained under dynamic peer groups).
- The root node Vij's secret key KVij becomes the cluster group key, or common key, for cluster Ci and is shared by all members of the cluster.
- For each session the cluster group key is changed by the members changing their contributions.
- The newly generated cluster group key KVij is distributed among all members of the cluster.
B. Inter Cluster Communications (IRCC)
Communication of one cluster with another cluster is called Inter Cluster Communication. An example of IRCC is shown in Figure 3, and the corresponding reduced cluster key tree is generated as shown in Figure 4. In this figure VC0 is a virtual cluster, and it has only a DH key pair:

\{KVC_i,\ BKVC_i = \alpha^{KVC_i} \bmod p\} \qquad (6)

The secret-public key pair of virtual cluster VCi is for generating the clusters' common key, which is generated in DH key fashion and distributed to both clusters so that they can communicate with each other.
Figure 4. Reduced IRCC Key Tree.
The steps for generation and distribution of the common key for clusters in IRCC (algorithm for group key generation in IRCC):
- Select the clusters for IRCC and form the reduced cluster key tree as shown in Figure 4.
- Each cluster has its own cluster group key, or cluster common key, generated from its cluster key tree in DH key fashion.
- Cluster Ci's and cluster Cj's secret keys KCi and KCj are calculated respectively (as explained in the intra cluster communication algorithm).
- Using KCi and KCj, the root node VCi (the parent node of Ci and Cj, or virtual cluster) calculates its secret key KVCi in DH key exchange fashion.
- The root node VCi's secret key KVCi is the common key for both cluster Ci and cluster Cj.
- KVCi is distributed to both clusters and is shared by all members of each cluster so they can communicate with each other.
- For each session the common key for the clusters is recalculated by the members of each cluster changing their shares, and it is distributed to all members of both clusters.
C. Global Communication (GC)
Communication among all clusters in a group is called Global Communication. When clusters C1, C2 and C3 communicate, the reduced global communication key tree is generated as shown in Figure 5, and the common global key is generated in DH key exchange fashion. In this figure, leaf nodes are real clusters and non-leaf nodes are virtual clusters.
Figure 5. Reduced GC Key Tree
Steps for global key generation and distribution in GC (algorithm for global key generation and distribution in GC):
- Each cluster generates its own cluster key tree.
- For each cluster key tree, the root's secret key is generated; these are the common keys for the respective clusters.
- Cluster Ci's, Cj's and Ck's secret keys KCi, KCj and KCk are calculated respectively from their cluster key trees.
- With these three clusters, the reduced Global Communication key tree is formed as shown in Figure 5.
- The root node VCi's secret key KVCi (from the reduced GC key tree) is calculated in DH key fashion; it is the common key for all clusters Ci, Cj and Ck.
- VCi's secret key KVCi is distributed to all clusters and is shared by all members of each cluster for communicating globally.
- For each session the global key is recalculated by the members of each cluster changing their shares, and it is distributed to all members of all clusters.
V. DYNAMIC PEER GROUPS
The number of nodes or clusters in the network is not necessarily fixed. A new node (user) or cluster may join the network, or existing nodes or clusters may leave it.
A. User Joins the Cluster
Assume that a new user Uij+1 wishes to join a k-user cluster {U1, U2, ..., Uk}. Uij+1 is required to authenticate itself by presenting a join request signed with Sk. Uij+1 may obtain a signature on its join request by establishing credentials with the offline certificate authority.

When the users of the cluster receive the joining request, they independently determine Uij+1's insertion node in the key tree, defined as in [1]: the shallowest rightmost node, or the root node when the key tree is well balanced. They also independently determine a real user called the join sponsor Us [1] to take responsibility for coordinating the join; this is the rightmost leaf node in the subtree rooted at the insertion node.

No keys in the key tree change at a join, except the blinded keys for the nodes on the sponsor node's key path. The sponsor simply recomputes the cluster group key and sends updates for the blinded keys on its own key path to their corresponding clusters. The join works as shown below.
Steps for group key, or cluster common key, generation and distribution when a user joins a cluster (algorithm for user join):
- The new user Uij+1 obtains the LID and public key certificate from the CA.
- User Uij+1 selects the appropriate cluster by comparing its LID with the CID (LID = CID).
- The user Uij+1 broadcasts the signed join request to its cluster Ci.
- Cluster Ci's members determine the insertion point and update their key trees by creating a new intermediate node and promoting it to become the parent of the insertion node and Uij+1.
- Each cluster member adjusts the cluster key tree by adding Uij+1 adjacent to the insertion point.
- The sponsor Us computes the new cluster group key, or cluster common key.
- The sponsor Us then sends the updated blinded keys of the nodes on its key path to their corresponding clusters.
- These messages are signed by the sponsor Us.
- Uij+1 takes the public keys needed for generating the cluster group key and generates the group key.
The cluster group key (for cluster C3), or cluster common key, for Figure 6 is generated as follows (steps for group key or common key generation):
- Let U31's secret share be KU31; then the secret-public key pair of U31 (following DH key fashion) is:

\{KU_{31},\ BKU_{31} = \alpha^{KU_{31}} \bmod p\} \qquad (7)

- Let U32's secret share be KU32; then the secret-public key pair of U32 is:

\{KU_{32},\ BKU_{32} = \alpha^{KU_{32}} \bmod p\} \qquad (8)

- Let U33's secret share be KU33; then the secret-public key pair of U33 is:

\{KU_{33},\ BKU_{33} = \alpha^{KU_{33}} \bmod p\} \qquad (9)

- Let U34's secret share be KU34; then the secret-public key pair of U34 is:

\{KU_{34},\ BKU_{34} = \alpha^{KU_{34}} \bmod p\} \qquad (10)

- Let U35's secret share be KU35; then the secret-public key pair of U35 is:

\{KU_{35},\ BKU_{35} = \alpha^{KU_{35}} \bmod p\} \qquad (11)
- Now V33's secret and public keys (KV33, BKV33) are calculated as follows (in DH key exchange fashion from U31 and U32):

KV_{33} = (BKU_{31})^{KU_{32}} \bmod p = (BKU_{32})^{KU_{31}} \bmod p \qquad (12)
BKV_{33} = \alpha^{KV_{33}} \bmod p \qquad (13)

- Now V32's secret and public keys (KV32, BKV32) are calculated as follows (from U34 and U35):

KV_{32} = (BKU_{34})^{KU_{35}} \bmod p = (BKU_{35})^{KU_{34}} \bmod p \qquad (14)
BKV_{32} = \alpha^{KV_{32}} \bmod p \qquad (15)

- Now V31's secret-public key pair (from V33 and U33) is:

KV_{31} = (BKV_{33})^{KU_{33}} \bmod p = (BKU_{33})^{KV_{33}} \bmod p \qquad (16)
BKV_{31} = \alpha^{KV_{31}} \bmod p \qquad (17)

- Finally, V30's secret-public key pair (from V31 and V32) is:

KV_{30} = (BKV_{31})^{KV_{32}} \bmod p = (BKV_{32})^{KV_{31}} \bmod p \qquad (18)
BKV_{30} = \alpha^{KV_{30}} \bmod p \qquad (19)

- The root node V30's secret key is taken as cluster C3's group key, or cluster common key, through which communication is done.
- This common cluster key is distributed to all cluster members.
Following the same steps for group key, or common key, generation, the common key is generated for all the different cluster communications and for dynamic peer groups. A sketch of this bottom-up computation is given below.
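The following is a minimal Java sketch of the bottom-up computation of equations (7)-(19) for the cluster C3 key tree; the group parameters, key sizes and names are illustrative assumptions.

import java.math.BigInteger;
import java.security.SecureRandom;

public class ClusterC3GroupKey {
    static final SecureRandom RNG = new SecureRandom();
    static final BigInteger P = BigInteger.probablePrime(512, RNG);
    static final BigInteger ALPHA = BigInteger.valueOf(2);

    static BigInteger fresh() { return new BigInteger(256, RNG); }
    static BigInteger blind(BigInteger k) { return ALPHA.modPow(k, P); }
    // DH combination of two sibling secrets: (alpha^b)^a mod p, as in (12)-(19).
    static BigInteger combine(BigInteger a, BigInteger b) {
        return blind(b).modPow(a, P);
    }

    public static void main(String[] args) {
        BigInteger ku31 = fresh(), ku32 = fresh(), ku33 = fresh(),
                   ku34 = fresh(), ku35 = fresh();   // members' secret shares
        BigInteger kv33 = combine(ku31, ku32);       // V33 from U31 and U32
        BigInteger kv32 = combine(ku34, ku35);       // V32 from U34 and U35
        BigInteger kv31 = combine(ku33, kv33);       // V31 from V33 and U33
        BigInteger kv30 = combine(kv31, kv32);       // V30: cluster C3's group key
        System.out.println("group key bit length: " + kv30.bitLength());
    }
}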
In Figure 6, a new user U36 wants to join cluster C3. The join sponsor U33 creates a new intermediate node V34 in the key tree and promotes it to become the parent of U33 and U36. The sponsor U33 computes the new cluster group key and sends the updated BKV34 and BKV31 to the remaining members {U31, U32, U34, U35} of cluster C3.
Figure 6. User joins in Cluster C3
B. User Leaves the Cluster
Assume that a member Uij wishes to leave an n-member cluster. First, Uij initiates the leave protocol by sending a leave request. When the other users of the cluster receive the request, they independently determine the sponsor node, which is the rightmost leaf node of the subtree rooted at the leaving member's sibling node, as defined in [1]. The leave protocol works as given below.

Steps for group key generation and distribution when a user leaves the cluster (algorithm for user leave):
- User Uij broadcasts its leave request to the remaining users of its cluster Ci.
- The former sibling node of Uij is promoted to replace Uij's parent node.
- The size of the cluster that formerly contained Uij is decreased by one.
- The sponsor Us picks a new secret key KUs, computes the new cluster group key, and sends the updated blinded keys of the nodes on its key path to their corresponding cluster users.
- These messages are signed by the sponsor Us.
- The group key is prepared in DH key exchange fashion, as explained under dynamic peer groups.
In Figure 7, U36 leaves cluster C3. The sponsor U33 picks a new secret key KU33, computes the new group key, and sends the updated BKU33, BKV31 and BKV30 to the cluster users {U31, U32, U34, U35}.
Figure 7. User U36 leaves the Cluster C3.
C. Updating Secret Shares & RSA Keys
In this scheme, each group user is required to update its Diffie-Hellman keys before each group session, or during a session when it is selected as a sponsor on a user's leaving. Source authentication of the updated blinded keys is guaranteed by the sender's RSA signature. Further, to ensure the long-term secrecy of the RSA keys, each group user is required to renew its RSA key pair periodically and send it to its cluster users securely using its current RSA secret key.
VI. SECURITY ANALYSIS
Users in a network group are usually considered part of the security problem, since there are no fixed nodes to perform the authentication service. The Certificate Authority, which may be distributed, is online during initialization but remains offline subsequently. During initialization, the CA distributes key certificates and location IDs, so that the function of key authentication can be realized and distributed across the appropriate clusters.
A. Forward Secrecy
If a hacker (or old member) can compromise any node and obtain its key, it is possible for the hacker to start a new key agreement protocol by impersonating the compromised node. For our scheme we can conclude that a passive hacker who knows a contiguous subset of old group keys cannot discover any subsequent group key. In this way, forward secrecy is achieved.
B. Backward Secrecy
A passive hacker (or newly joined member) who knows a contiguous subset of group keys cannot discover any previous group key when the group key is changed upon a group join or leave.
C. Key Independence
This is the strongest property for dynamic peer groups. It guarantees that a passive adversary who knows some previous group keys cannot determine new group keys.
VII. CONCLUSION
In this paper, we have presented three communication protocols with distributed group key management for dynamic peer groups using key trees, dividing the group into subgroups called clusters. We provided strong authentication with LIDs and CIDs for cluster formation, and source authentication of users in communication with RSA keys. DH secret-public key pairs are used for common key generation. The Certificate Authority provides the RSA keys and LIDs for all users, and CIDs for all clusters, for all types of cluster communication.

In future work we can extend this application with cluster head communications, sponsor coordination, and cluster merging or disjoining in dynamic networks.
ACKNOWLEDGMENT
We would like to thank K Sahadeviah for helpful discussions about different key management schemes and modes of providing authentication. We thank Krishna Prasad for discussions on effective presentation of concepts. We also thank our friends for the design of the network framework.
REFERENCES
[1] Kim, Y., Perrig, A., Tsudik, G.: Simple and fault-tolerant key agreement for dynamic collaborative groups. In: Proceedings of CCS'00 (2000).
[2] Steiner, M., Tsudik, G., Waidner, M.: Key agreement in dynamic peer groups. IEEE Transactions on Parallel and Distributed Systems 11 (2000).
[3] Perrig, A.: Efficient collaborative key management protocols for secure autonomous group communication. In: Proceedings of CrypTEC'99 (1999).
[4] Lee, P., Lui, J., Yau, D.: Distributed collaborative key agreement protocols for dynamic peer groups. In: Proceedings of ICNP'02 (2002).
[5] Lee, P., Lui, J., Yau, D.: Distributed collaborative key agreement protocols for dynamic peer groups. Technical report, Dept. of Computer Science and Engineering, Chinese University of Hong Kong (2002).
[6] Zhou, L., Ravishankar, C.V.: Efficient, authenticated, and fault-tolerant key agreement for dynamic peer groups. Technical Report 88, Dept. of Computer Science and Engineering, University of California, Riverside (2004).
[7] Wong, C., Gouda, M., Lam, S.: Secure group communication using key graphs. In: Proceedings of ACM SIGCOMM'98, Vancouver, Canada (1998).
[8] Steiner, M., Tsudik, G., Waidner, M.: Cliques: A new approach to group key agreement. In: Proceedings of ICDCS'98, Amsterdam, Netherlands (1998).
[9] Renuka, A., Shet, K.C.: Cluster based group key management in mobile ad hoc networks (2009).
[10] Wang, N.-C., Fang, S.-Z.: A hierarchical key management scheme for secure group communications in mobile ad hoc networks (2007).
[11] Gouda, M.G., Huang, C., Elnozahy, E.N.: Key trees and the security of interval multicast. In: Proceedings of ICDCS'02, Vienna, Austria (2002).
[12] Wallner, D., Harder, E., Agee, R.: Key management for multicast: Issues and architecture. Internet Draft, draft-wallner-key-arch-01.txt (1998).
[13] Ateniese, G., Steiner, M., Tsudik, G.: New multiparty authentication services and key agreement protocols. IEEE Journal on Selected Areas in Communications 18 (2000).
[14] Pereira, O., Quisquater, J.: A security analysis of the cliques protocol suites. In: Proceedings of the 14th IEEE Computer Security Foundations Workshop (2001).
[15] Zhou, L., Haas, Z.J.: Securing ad hoc networks. IEEE Network Magazine 13(6) (1999).
[16] Huh, E.-N., Sultana, N.: Application driven cluster based group key management with identifier in mobile wireless sensor networks (2007).
[17] Balenson, D., McGrew, D., Sherman, A.: Key management for large dynamic groups: One-way function trees and amortized initialization. IETF (Feb 1999).
[18] Kim, Y., Perrig, A., Tsudik, G.: Communication-efficient group key agreement. In: Proceedings of IFIP SEC 2001, pp. 229-244 (2001).
[19] Del Valle Torres, G., Gomez Cardenas, R.: Overview of key management in ad hoc networks (2004).
[20] Rafaeli, S., Hutchison, D.: A survey of key management for secure group communication. ACM Computing Surveys 35(3), pp. 309-329 (2003).
[21] Wu, B., Wu, J., Dong, Y.: An efficient group key management scheme for mobile ad hoc networks. Int. J. Security and Networks (2008).
AUTHORS PROFILE
Mr. Rajendar Dharavath is currently an Assistant Professor in the Department of Computer Science and Engineering, Aditya Engineering College, Kakinada, Andhra Pradesh, India. He completed his B.Tech in CSE from CJITS Jangaon, Warangal, and his M.Tech in CSE from JNTU Kakinada. His research interests include mobile ad hoc networks, network security, and data mining & data warehousing.

Mr. Bhima K is currently an Associate Professor and Head of the Department of Computer Science and Engineering, Brilliant Institute of Engineering and Technology, Hyderabad, Andhra Pradesh, India. He completed his B.Tech in CSE from RVR&JC Engg. College, Guntur, and his M.Tech in SE from NIT Allahabad. His research interests include mobile ad hoc networks, network security, computer networks, and software engineering.
Extracting Code Resource from OWL by Matching
Method Signatures using UML Design Document
UML Extractor
Gopinath Ganapathy¹
¹Department of Computer Science
Bharathidasan University
Trichy, India.
[email protected]

S. Sagayaraj²
²Department of Computer Science
Sacred Heart College
Tirupattur, India
[email protected]
Abstract- Software companies develop projects in various domains, but hardly archive their programs for future use. In the proposed approach, method signatures are stored in an OWL ontology and the source code components are stored in HDFS; the OWL considerably reduces software development cost. The design phase generates many artifacts; one such artifact is the UML class diagram for the project, which contains classes, methods, attributes, relations, etc., as metadata. Methods needed for a project can be extracted from the OWL using this UML metadata. The UML class diagram is given as input and the metadata about each method is extracted. The method signature is searched in the OWL for similar method prototypes, and the appropriate code components are extracted from HDFS and reused in a project. Through this process the time, manpower, system resources and cost of software development are reduced.

Keywords- Unified Modeling Language; XML; XMI Metadata Interchange; Metadata; Web Ontology Language; Jena framework.
I. INTRODUCTION
The World Wide Web has changed the way people communicate with each other. The term Semantic Web comprises techniques that promise to dramatically improve the current Web and its use. Today's Web content is huge and not well suited for human consumption; the machine-processable Web is called the Semantic Web. The Semantic Web will not be a new global information highway parallel to the existing World Wide Web; instead it will gradually evolve out of the existing Web [1]. Ontologies are built in order to represent generic knowledge about a target world [2]. In the Semantic Web, ontologies can be used to encode meaning into a web page, which will enable intelligent agents to understand the contents of the page. Ontologies increase the efficiency and consistency of describing resources by enabling more sophisticated functionality in the development of knowledge management and information retrieval applications. From the knowledge management perspective, current technology suffers in searching, extracting, maintaining and viewing information. The aim of the Semantic Web is to allow much more advanced knowledge management systems.
To develop such knowledge management systems, software companies can make use of already developed code; that is, they can develop new software projects with reusable code. The concept of reuse is not a new one; it is, however, relatively new to the software profession. Every engineering discipline, from mechanical, industrial and hydraulic to electrical engineering, understands the concept of reuse. Software engineers, however, often feel the need to be creative and like to design one-time-use components; in effect they produce a unique solution for every problem. Reuse is a process, an applied concept and a paradigm shift for most people. There are many definitions of reuse. In plain and simple words, reuse is the process of creating new software systems from existing software assets rather than building new ones.

Systematic reuse of previously written code is a way to increase software development productivity as well as the quality of the software [3, 4, 5]. Reuse of software has been cited as the most effective means of improving productivity in software development projects [6, 7]. Many artifacts can be reused, including code, documentation, standards, test cases, objects, components and design models. Few organizations dispute the benefits of reuse, although these benefits certainly vary from organization to organization and, to a degree, in economic rationale. Some general reusability guidelines, which are quite often similar to general software quality guidelines, include [8]: ease of understanding, functional completeness, reliability, good error and exception handling, information hiding, high cohesion and low coupling, portability, and modularity. Reuse can provide improved profitability, higher productivity and quality, reduced project costs, quicker time to market and better use of resources. The challenge is to quantify these benefits.
For every new project, software teams design new components and code by employing new developers. If the company archives completed code and components, they can be reused with no further testing, unlike open source code and components. This has a recursive effect on development time, testing, deployment and developers, so there is a basic necessity to create a system that minimizes these factors.
Code reusability is a direct solution to this problem: it avoids re-developing and re-testing existing work. As the developed code has undergone a rigorous software development life cycle, it is robust and error free; there is no need to reinvent the wheel. To reuse code, a tool can be created that extracts metadata such as function, definition, type, arguments, brief description, author, and so on from the source code and stores it in OWL, while the source code itself is stored in the HDFS repository. For a new project, the developer can search for components in the OWL and retrieve them with ease. The OWL represents the company's knowledgebase of reusable code.

The project metadata is stored in OWL and the source code is stored in the Hadoop Distributed File System (HDFS) [9]. The client and the developer decide on and approve the design document; in this paper the UML class diagram is the design document considered as input to the system. The method metadata is extracted from the UML and passed to SPARQL to extract the available methods from the OWL. On selecting an appropriate method from the list, the code component is retrieved from HDFS. The purpose of using a UML diagram as input is that, before developing software, this tool can be used to estimate how many methods can be obtained by extraction rather than developed. The UML diagram is a powerful tool that acts between the developer and the user; it is like a contract where both parties agree on the software to be developed. After extracting the methods from the UML diagram, these methods are matched in the OWL. From the retrieved methods the developer can account for how many are already available in the repository and how many are still to be developed. The more methods retrieved, the shorter the development time. To obtain more method matches, the company should store more projects: as projects are uploaded into the OWL and HDFS, the corporate knowledge grows, and developers use more reused code rather than developing it themselves. Using reused code, development cost comes down, development time becomes shorter, resource utilization is lower and quality goes up.
The paper begins with a note on the related technology in Section 2. The detailed features and framework of the Source Code Retriever are found in Section 3. The Keyword Extractor for UML is in Section 4, the Method Retriever based on the Jena framework in Section 5, and the Source Retriever for HDFS in Section 6. The implementation scenario is in Section 7. Section 8 deals with the findings and future work of the paper.
II. RELATED WORK
A. Metadata
Metadata is defined as data about data, or descriptions of stored data. Metadata definition is about defining, creating, updating, transforming and migrating all types of metadata that are relevant and important to a user's objectives. Some metadata, such as file dates and file sizes, can be seen easily by users, while other metadata can be hidden. Metadata standards include not only those for modeling and exchanging metadata, but also the vocabulary and knowledge for ontology [10]. Many efforts have been made to standardize metadata, but all of these efforts belong to some specific group or class. The Dublin Core Metadata Initiative (DCMI) [11] is perhaps the largest candidate for defining metadata. It is a simple yet effective element set for describing a wide range of networked resources and comprises 15 elements; Dublin Core is most suitable for document-like objects. IEEE LOM [12] is a metadata standard for learning objects, with approximately 100 fields to define any learning object. Medical Core Metadata (MCM) [13] is a standard metadata scheme for health resources. The MPEG-7 [14] multimedia description schemes provide metadata structures for describing and annotating multimedia content. Standard knowledge ontology is also needed to organize such types of metadata as content metadata and data usage metadata.
B. Hadoop & HDFS
The Hadoop project promotes the development of open source software, and it supplies a framework for the development of highly scalable distributed computing applications [15]. Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment, and it also supports data-intensive distributed applications. Hadoop is designed to process large volumes of information efficiently [16]. It connects many commodity computers so that they can work in parallel, tying smaller and lower-priced machines into a compute cluster. It offers a simplified programming model that allows the user to write and test distributed systems quickly, and it performs efficient, automatic distribution of data and work across machines, in turn exploiting the underlying parallelism of the CPU cores. The monitoring system re-replicates data in response to system failures that can result in partial storage. Even though file parts are replicated and distributed across several machines, they form a single namespace, so their contents are universally accessible. MapReduce [17] is a functional abstraction which provides an easy-to-understand model for designing scalable, distributed algorithms.
C. Ontology
The key component of the Semantic Web is the collection of information called ontologies. Ontology is a term borrowed from philosophy that refers to the science of describing the kinds of entities in the world and how they are related. Gruber defined an ontology as a specification of a conceptualization [18]. An ontology defines the basic terms and their relationships, comprising the vocabulary of an application domain and the axioms for constraining the relationships among terms [19]. This definition explains what an ontology looks like [20]. The most typical kind of ontology for the Web has a taxonomy and a set of inference rules. The taxonomy defines classes of objects and relations among them. Classes, subclasses and relations among entities are a very powerful tool for Web use.
III. SOURCE CODE RETRIEVER FRAMEWORK
The Source Code Retriever makes use of the OWL constructed for the project, and the source code of the project is stored in HDFS [21]. All the project information of a software company is stored in the OWL. The project source can run to terabytes, and the corporate branches are
spread over various geographical locations, so the source is stored in a Hadoop repository to provide a distributed computing environment. The Source Code Retriever is a framework that takes a UML class diagram or XMI (XML Metadata Interchange) file as input from the user and suggests reusable methods for the given class diagram. The Source Code Retriever consists of three components: Keyword Extractor for UML, Method Retriever and Source Retriever. The process of the Source Code Retriever framework is presented in Fig. 1. The Keyword Extractor for UML extracts the metadata from the UML class diagram; the class diagram created by the Umbrello tool is passed as its input.

The input for the framework can be an existing UML class diagram or one created with the tool. Both types of input are loaded into Umbrello, and the file type for storing the UML class diagram is the XMI format. The file is parsed for metadata extraction; the parser extracts method signatures from the XMI file and passes them to the Method Retriever component.

Figure 1. Process of Source Retriever

The Method Retriever component retrieves the matched methods from the repository, constructing a SPARQL query to retrieve the matched results. The user selects the appropriate method from the list of methods and retrieves the source code through the Source Retriever component, which interacts with HDFS and displays the source code.
IV. KEYWORD EXTRACTOR FOR UML
Unified Modeling Language (UML) is a visual language for specifying, constructing and documenting the artifacts of systems. It is a standardized general-purpose modeling language in the field of software engineering. To create the UML class diagram, the Umbrello UML Modeller open source tool is used, and the diagram is stored in XMI format. Umbrello UML Modeller is a Unified Modeling Language diagram program for KDE. UML allows the user to create diagrams of software and other systems in a standard format, and Umbrello can support the software development process, especially during the analysis and design phases. Software ideas can be represented in UML using different types of diagrams. Umbrello UML Modeller 1.2 supports Class Diagrams, Sequence Diagrams, Collaboration Diagrams, Use Case Diagrams, State Diagrams, Activity Diagrams, Component Diagrams and Deployment Diagrams.

XMI is an Object Management Group (OMG) standard for exchanging metadata information using XML. The initial proposal of XMI "specifies an open information interchange model that is intended to give developers working with object technology the ability to exchange programming data over the Internet in a standardized way, thus bringing consistency and compatibility to applications created in collaborative environments." The main purpose of XMI is to enable easy interchange of metadata between modeling tools, and between tools and metadata repositories, in distributed heterogeneous environments. XMI integrates three key industry standards: (a) XML, a W3C standard; (b) UML, an OMG standard; and (c) MOF (Meta Object Facility), an OMG modeling and metadata repository standard. The integration of these three standards into XMI marries the best of OMG and W3C metadata and modeling technologies, allowing developers of distributed systems to share object models and other metadata over the Internet.
The process flow of the Keyword Extractor for UML is given in Fig. 2. The XMI or UML file is parsed with the help of the SAX (Simple API for XML) parser. SAX is a sequential-access parser API for XML; it provides a mechanism for reading data from an XML document. SAX loads the XMI or UML file, obtains the list of tags by name, and gets the attribute values of the tags through the attributes.getValue(<name of the attribute>) method. The methods used to retrieve the attributes are parse, Attributes and getValue(nameOfAttribute). The parse() method parses the XMI file, the Attributes object holds the attribute values, and getValue(nameOfAttribute) returns the class information, method information and parameter information of the attribute.

Figure 2. Process of Keyword Extractor for UML (inputs a UML or XMI file; outputs class name and scope, method information (name, type) and parameter information)

The XMI file consists of XML tags. Class information, method information and parameter information are identified with the appropriate tags, as given in Table I.
TABLE I. TAGS USED TO EXTRACT METADATA FROM XMI FILE

Tag: UML:DataType
Purpose: Holds the data type information.

Tag: UML:Class
Purpose: Holds the class information, such as the name of the class and its visibility.

Tag: UML:Attribute
Purpose: A sub-tag of class; holds information about the class attributes, such as the name, type and visibility of each attribute.

Tag: UML:Operation
Purpose: Holds the method information of the class, such as the name of the method, its return type and its visibility.

Tag: UML:BehavioralFeature.parameter
Purpose: Holds information about the method parameters, such as the name and data type of each parameter.
Using these tags, the metadata of the UML or XMI file is extracted. The extracted metadata, classes, methods, attributes and so on, is passed to the Method Retriever component. A sketch of such a SAX-based extractor is given below.
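The following is a minimal sketch of such a SAX handler, matching the tags of Table I; the file name design.xmi and the printed layout are illustrative assumptions.

import java.io.File;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class XmiMethodExtractor {
    public static void main(String[] args) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // React only to the tags listed in Table I.
                if (qName.equals("UML:Class")) {
                    System.out.println("class: " + attrs.getValue("name"));
                } else if (qName.equals("UML:Operation")) {
                    System.out.println("  method: " + attrs.getValue("name"));
                } else if (qName.equals("UML:BehavioralFeature.parameter")) {
                    System.out.println("    parameter: " + attrs.getValue("name"));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                        .parse(new File("design.xmi"), handler);
    }
}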
V. METHOD RETRIEVER
The Method Retriever component interacts with the OWL and returns the methods available in the OWL for the given class diagram, as represented diagrammatically in Fig. 3. The information extracted from the UML file by the Keyword Extractor for UML is passed to the Method Retriever component, which interacts with the OWL and retrieves the matching method information using a SPARQL query. SPARQL is a query language for RDF, and the SPARQL query is executed on the OWL file. Jena is a Java framework for building Semantic Web applications; it provides a programmatic environment for RDF, RDFS, OWL and SPARQL, and includes a rule-based inference engine. Jena can manipulate ontologies defined in RDFS and OWL Lite [22] and is a leading Semantic Web toolkit [23] for Java programmers. Jena1 and Jena2 were released in 2000 and August 2003, respectively. The main contribution of Jena1 was the rich Model API, around which Jena1 provided various tools, including I/O modules for RDF/XML [24], [25], N3 [26] and N-Triples [27], and the query language RDQL [28]. In response to various issues, Jena2 has a more decoupled architecture than Jena1 and provides inference support for both the RDF semantics [29] and the OWL semantics [30].
SPARQL is an RDF query language; its name is a recursive acronym that stands for SPARQL Protocol and RDF Query Language, and it is used here to retrieve information from the OWL. SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware. SPARQL contains capabilities for querying required and optional graph patterns along with their conjunctions and disjunctions. SPARQL also supports extensible value testing and constraining of queries by source RDF graph. The results of SPARQL queries can be result sets or RDF graphs.
A. Query processor
A query processor executes the SPARQL query and retrieves the matching results. The SPARQL Query Language for RDF [31] and the SPARQL Protocol for RDF [32] are increasingly used as a standardized query API for providing access to datasets on the public Web and within enterprise settings. The SPARQL query takes method parameters and returns the results. The retrieved results contain project details, such as the name and version of the project, and method details, such as the package name, class name, method name, method return type and method parameters. The query processor takes the extracted method name and method parameters as input and retrieves the method and project information from the OWL.
Figure 3. Method Retriever Process (the query processor takes the extracted method, queries the OWL, and returns the matched project and method details)
B. SPARQL query
The SPARQL query is constructed to extract the project name, project version, package name, class name, method name, return type, return identifier name, and method parameter names and types. The sample query is as follows:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?pname ?version ?packname ?cname ?mname
       ?rType ?identifier ?paramName ?parmDT ?parmT
WHERE {
  ?project rdf:type base:Project .
  ?project base:Name ?pname .
  ?project base:Project_Version ?version .
  ?project base:hasPackage ?pack .
  ?pack base:Name ?packname .
  ?pack base:hasClass ?class .
  ?class base:Name ?cname .
  ?class base:hasMethod ?subject .
  ?subject base:Name ?mname .
  ?subject base:Returns ?rType .
  ?subject base:Identifier ?identifier .
  ?subject base:hasParameter ?parameter .
  ?parameter base:Name ?paramName .
  ?parameter base:DataType ?parmDT .
  ?parameter base:DataType ?parmT .
  FILTER regex ( ?mname , "add" , "i" ) .
  FILTER regex ( ?parmT , "java.lang.String" , "i" ) .
}
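A minimal sketch of executing such a query with the Jena framework follows; the package names are those of current Apache Jena (older releases used the com.hp.hpl.jena prefix), and the OWL file name and the stand-in query string are illustrative assumptions.

import org.apache.jena.query.Query;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class QueryProcessorSketch {
    public static void main(String[] args) {
        // Load the project ontology into an in-memory model.
        Model model = ModelFactory.createDefaultModel();
        model.read("projects.owl");

        // Substitute the method-matching query shown above (with its base: PREFIX).
        String queryString = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10";
        Query query = QueryFactory.create(queryString);
        try (QueryExecution qexec = QueryExecutionFactory.create(query, model)) {
            ResultSet results = qexec.execSelect();
            while (results.hasNext()) {
                QuerySolution row = results.next();
                System.out.println(row); // one matched binding per row
            }
        }
    }
}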
VI. SOURCE RETRIEVER
The Source Retriever component retrieves the source code of the user-selected method from HDFS, the primary storage system used by Hadoop applications. HDFS creates multiple replicas of data blocks and distributes them on compute nodes throughout a cluster to enable reliable, extremely rapid computation. The source code file location within the Hadoop repository path is obtained from the OWL, and the file is retrieved from HDFS by the copyToLocal(fromFilePath, localFilePath) method.
QDox is a high speed, small footprint parser for extracting class, interface and method definitions from source files. When a Java source file, or a folder containing Java source files, is loaded into QDox, it automatically performs the iteration. The loaded information is stored in the JavaBuilder object, from which the list of packages is returned as an array of strings. This package list is looped over to obtain the class information, and from the class information the method information is extracted as an array of JavaMethod objects. From each JavaMethod, information such as the scope of the method, the name of the method, the return type of the method and the parameter information is extracted.

QDox finds the methods in the source code. The file retrieved from HDFS is stored in a local temporary file, which is passed to the QDox addSource() method for parsing. Through QDox, each method is retrieved one by one, and the retrieved methods are compared with the method for which the user requested source code retrieval. If a method matches, its source code is retrieved by the getSourceCode() method, and the temporary file is deleted after the process. In the Hadoop repository, files are organized in the same hierarchy as Java source folders, so the tool obtains the source location from the OWL, retrieves the Java source file into a temp file, loads the temporary file into QDox to identify methods, compares each method with the method being searched for, and, on a match, retrieves the source code of the method.
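A minimal sketch of this retrieval step follows, assuming current Hadoop and QDox 2.x APIs (FileSystem.copyToLocalFile and JavaProjectBuilder; the copyToLocal and JavaBuilder names above appear to be informal references to these). The HDFS path and the method name are illustrative assumptions.

import java.io.File;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import com.thoughtworks.qdox.JavaProjectBuilder;
import com.thoughtworks.qdox.model.JavaClass;
import com.thoughtworks.qdox.model.JavaMethod;

public class SourceRetrieverSketch {
    public static void main(String[] args) throws Exception {
        // Copy the source file from HDFS to a local temporary file.
        FileSystem fs = FileSystem.get(new Configuration());
        File temp = File.createTempFile("retrieved", ".java");
        fs.copyToLocalFile(new Path("/repo/com/cbr/my/engine/Login.java"),
                           new Path(temp.getAbsolutePath()));

        // Parse the local copy with QDox and print the requested method's body.
        JavaProjectBuilder builder = new JavaProjectBuilder();
        builder.addSource(temp);
        for (JavaClass cls : builder.getClasses()) {
            for (JavaMethod m : cls.getMethods()) {
                if (m.getName().equals("validateLogin")) {
                    System.out.println(m.getSourceCode());
                }
            }
        }
        temp.delete(); // remove the temporary copy after use
    }
}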
VII. CASE STUDY
The input for the framework is a UML class diagram; a sample class diagram is given below.

The entire process of the framework is given in Table II. The Keyword Extractor for UML uses the class diagram and retrieves the method validateLogin(username: string). The output is given to the Method Retriever, which generates the SPARQL query and extracts the matching methods listed in Table III. From the list, the appropriate method is selected, and QDox retrieves the source code from HDFS and displays the method definition of the selected method, as shown in the Source Retriever output in Table II.
TABLE II. PROCESS FLOW OF THE FRAMEWORK

Process: UML Extraction
Input: Given class diagram
Output: Method information - Name: validateLogin; Return: Boolean; Visibility: public; Parameter: UserName; DataType: username

Process: Method Retriever
Input: validateLogin(String userName)
Output: Refer to Table III

Process: Source Retriever
Input: validateLogin(String userName)
Output:
boolean returnStatus = false;
DatabaseOperation databaseOperation = new DatabaseOperation();
String strQuery = "SELECT * FROM login WHERE uname='" + userId + "'";
ResultSet resultSet = databaseOperation.selectFromDatabase(strQuery);
try {
    while (resultSet.next()) {
        returnStatus = true;
    }
} catch (SQLException e) {
    e.printStackTrace();
}
return returnStatus;
To test the performance of this framework, reusable OWL files were created by uploading completed projects. The first OWL file was loaded with the first Java project; the second OWL file with the first and second Java projects; and the third OWL file with the first, second and third Java projects. Similarly, five OWL files were constructed. The purpose of creating these OWL files is to show how reusability increases as the knowledgebase grows. A sample new project is considered, containing ten methods to be developed. The OWL files are listed with their numbers of packages, classes, methods and parameters. The new methods are matched against the OWL files, and the number of matches is listed in Table IV.
TABLE III. METHOD RETRIEVER OUTPUT

1. Project Name: CBR_1.0; Package: com.cbr.my.engine; Class Name: Login
   Method Name: ValidateLogin; Parameters: UserName; Return Type: boolean

2. Project Name: RBR_1.0; Package: com.my.rbr.utils.engine; Class Name: LoginManger
   Method Name: LoginLog; Parameters: UserName, ActivityCode; Return Type: Boolean
   Method Name: LoginContol; Parameters: UserName, password; Return Type: Boolean

3. Project Name: BHR_1.0; Package: com.boscoits.BHR.utils.Action; Class Name: ControlManager
   Method Name: ManageLogin; Parameters: UserName, password, memberId, ActionId; Return Type: Boolean
   Method Name: ValidateLogin; Parameters: UserName, password; Return Type: Boolean
The rows of Table IV show the number of matched methods. The reusability graph in Fig. 4 shows how the matches increase as the number of projects in the OWL grows. For the graph, only five of the ten new method names listed in Table IV are used. The X-axis represents the OWL file number and the Y-axis represents the number of methods matched for each new method legend. This progression shows that uploading more projects into the knowledgebase can provide nearly one hundred percent of the methods for reuse during software development.
TABLE IV. NEW METHOD MATCHES WITH VARIOUS KNOWLEDGEBASES

                 1 OWL   2 OWL   3 OWL   4 OWL   5 OWL
Classes             86     116     129     297     321
Methods             50    1088    1130    3405    3697
Packages            12      15      22      27      31
Parameters         765    1119    1174    4552    4802
Method Name
ValidateLogin        5      26      27      46      40
getUserType          0       0       0       0       2
addStudent           4       6       6      18      18
ManageRole           6      14       0      28      29
connect              4       5       8      11      16
InsertQuery          2       3       3       5       6
deleteQuery          2       3       3       5       6
updateQuery          2       3       3       5       6
selectQuery          2       3       3       5       6
connect              2       3       3       5       6
Figure 4. Number of method matches against the projects in the OWL
VIII. CONCLUSION
This paper presents a framework to extract method code components from the OWL using the UML design document. OWL is semantically much more expressive than is needed for our search results. With these sample tests the paper argues that it is indeed possible to extract code from the OWL using the UML class diagram. The purpose of the paper is to achieve code reusability for software development. The OWL for the source code has already been created; this paper searches for and extracts the code and components and reuses them to shorten the software development life cycle. Before the coding phase begins, the framework helps the software development team assess how much code can be reused and how much must be developed. This assessment can help the project manager allocate resources to the project and reduce cost, time and effort. Software companies can use this framework to develop projects quickly and to win projects at a lower cost than their competitors.

After developing the OWL ontology and storing the source code in the HDFS, the code components can be reused. This paper takes a design document from the user as input, extracts the method signatures, and searches for matches in the OWL. As the knowledgebase is loaded with more and more projects, the reuse rate grows. Future work can take the SRS as input: text mining can be performed to extract the keywords as classes and the processes as methods. The SRS artifact belongs to a much earlier phase than the UML, so a considerable amount of time can be saved compared with using UML as input. The method prototype can be used to search and match within the OWL, and the required method definition can be retrieved from the HDFS. The purpose of storing the metadata in OWL is to minimize factors such as development time, testing time, deployment time and developer effort; creating OWL files with this framework can reduce all of these.
REFERENCES
[1] Grigoris Antoniou and Frank van Harmelen, A Semantic Web Primer, PHI Learning Private Limited, New Delhi, 2010, pp. 1-3.
[2] Bunge, M., Treatise on Basic Philosophy. Ontology I: The Furniture of the World. Vol. 3, Boston: Reidel.
[3] Gaffney Jr., J. E., Durek, T. A., Software reuse - key to enhanced productivity: Some quantitative models, Information and Software Technology 31(5): 258-267.
[4] Banker, R. D., Kauffman, R. J., Reuse and Productivity in Integrated Computer-Aided Software Engineering: An Empirical Study, MIS Quarterly 15(3): 374-401.
[5] Basili, V. R., Briand, L. C., Melo, W. L., How Reuse Influences Productivity in Object-Oriented Systems, Communications of the ACM 39(10): 104-116.
[6] Boehm, B. W., Pendo, M., Pyster, A., Stuckle, E. D., and William, R. D., An Environment for Improving Software Productivity, IEEE Computer, June 1984.
[7] Paul, R. A., Metric-Guided Reuse, in Proceedings of the 7th International Conference on Tools with Artificial Intelligence (TAI'95), 5-8 November 1995, pp. 120-127.
[8] Poulin, Jeffrey S., Measuring Software Reusability, in Proceedings of the 3rd International Conference on Software Reuse, Brazil, 1-4 November 1994, pp. 126-138.
[9] Gopinath Ganapathy and S. Sagayaraj, Automatic Ontology Creation by Extracting Metadata from the Source Code, Global Journal of Computer Science and Technology, Vol. 10, Issue 14 (Ver. 1.0), Nov. 2010, pp. 310-314.
[10] Won Kim, On Metadata Management Technology: Status and Issues, Journal of Object Technology, vol. 4, no. 2, 2005, pp. 41-47.
[11] Dublin Core Metadata Initiative, http://dublincore.org/documents/, 2002.
[12] IEEE Learning Technology Standards Committee, IEEE Standards for Learning Object Metadata, http://ltsc.ieee.org/wg12.
[13] Darmoni, Thirion, Metadata Scheme for Health Resources, American Medical Informatics Association, 2000 Jan-Feb; 7(1): 108-109.
[14] MPEG-7 Overview: ISO/IEC JTC1/SC29/WG11 N4980, Klagenfurt, July 2002.
[15] Jason Venner, Pro Hadoop: Build Scalable, Distributed Applications in the Cloud, Apress, 2009.
[16] Gopinath Ganapathy and S. Sagayaraj, Circumventing Picture Archiving and Communication Systems Server with Hadoop Framework in Health Care Services, Science Publication 6(3), 2010, pp. 310-314.
[17] Tom White, Hadoop: The Definitive Guide, O'Reilly Media, Inc., 2009.
[18] Gruber, T., What is an Ontology? (September 2010): http://www.ksl-stanford.edu/kst/what-is-an-ontology.html.
[19] Yang, X., Ontologies and How to Build Them. (January 2011): http://www.ics.uci.edu/~xwy/publications/area-exam.ps.
[20] Bugaite, D., O. Vasilecas, Ontology-Based Elicitation of Business Rules, in A. G. Nilsson, R. Gustas, W. Wojtkowski, W. G. Wojtkowski, S. Wrycza, J. Zupancic (eds.), Information Systems Development: Proc. of ISD2004, Springer-Verlag, Sweden, 2006, pp. 795-806.
[21] Gopinath Ganapathy and S. Sagayaraj, To Generate the Ontology from Java Source Code, International Journal of Advanced Computer Science and Applications (IJACSA), Volume 2, No. 2, February 2011.
[22] McCarthy, P., Introduction to Jena, www-106.ibm.com/developerworks/java/library/j-jena/, 22.01.2011.
[23] B. McBride, Jena, IEEE Internet Computing, July 2002.
[24] J. J. Carroll, CoParsing of RDF & XML, HP Labs Technical Report HPL-2001-292, 2001.
[25] J. J. Carroll, Unparsing RDF/XML, WWW2002, http://www.hpl.hp.com/techreports/2001/HPL-2001-292.html.
[26] T. Berners-Lee et al., Primer: Getting into RDF & Semantic Web using N3, http://www.w3.org/2000/10/swap/Primer.html.
[27] J. Grant, D. Beckett, RDF Test Cases, 2004, W3C.
[28] L. Miller, A. Seaborne, and A. Reggiori, Three Implementations of SquishQL, a Simple RDF Query Language, 2002, p. 423.
[29] P. Hayes, RDF Semantics, 2004, W3C.
[30] P. F. Patel-Schneider, P. Hayes, I. Horrocks, OWL Semantics & Abstract Syntax, 2004, W3C.
[31] Prud'hommeaux, E., Seaborne, A., SPARQL Query Language for RDF, W3C Recommendation, Retrieved November 20, 2010, http://www.w3.org/TR/rdf-sparql-query/.
[32] Kendall, G. C., Feigenbaum, L., Torres, E. (2008), SPARQL Protocol for RDF, W3C Recommendation, Retrieved November 20, 2009, http://www.w3.org/TR/rdf-sparql-protocol/.
AUTHORS PROFILE
Gopinath Ganapathy is the Professor & Head, Department of Computer Science and Engineering, Bharathidasan University, India. He obtained his under-graduation and post-graduation from Bharathidasan University, India, in 1986 and 1988 respectively. He submitted his Ph.D in 1996 at Madurai Kamaraj University, India. He received the Young Scientist Fellow Award for the year 1994 and subsequently did research work at IIT Madras. He has published around 20 research papers. He is a member of IEEE, ACM, CSI, and ISTE. He was a consultant for 8.5 years with international firms in the USA and the UK, including IBM, Lucent Technologies (Bell Labs) and Toyota. His research interests include Semantic Web, NLP, Ontology, and Text Mining.

S. Sagayaraj is Associate Professor in the Department of Computer Science, Sacred Heart College, Tirupattur, India. He did his Bachelor's degree in Mathematics at Madras University, India, in 1985, completed his Master of Computer Applications at Bharathidasan University, India, in 1988, and received the Master of Philosophy in Computer Science from Bharathiar University, India, in 2001. He registered for the Ph.D. programme at Bharathidasan University, India, in 2008. His research interests include Data Mining, Ontologies and the Semantic Web.
Magneto-Hydrodynamic Antenna Design and Development Analysis with Prototype

Rajveer S Yaduvanshi
Electronics and Communication Department, AIT, Govt of Delhi, India-110031
E-mail: [email protected]

Harish Parthasarathy
Electronics and Communication Department, NSIT, Govt of Delhi, India-110075
E-mail: [email protected]

Asok De
Principal, AIT, Govt of Delhi, India-110031
E-mail: [email protected]
Abstract: A new class of antenna based on the magnetohydrodynamic technique is presented. A magneto-hydrodynamic antenna, using an electrically conducting fluid such as NaCl solution under controlled electromagnetic fields, is formulated and developed. The fluid resonator volume decides the resonant frequency, and the electric field together with the magnetic field decides the return loss, making the antenna tuneable in the frequency range 4.5 to 9 GHz. The Maxwell equations, the Navier-Stokes equations and the equations of mass conservation for the conducting fluid and field have been set up. These are expressed as partial differential equations, first order in time, for the stream function and the electric and magnetic fields. By discretizing these equations, we are able to numerically evaluate the velocity field of the fluid in the near-field region and the electromagnetic field in the far-field region. We propose to design, develop, formulate and fabricate a prototype MHD antenna [1-3]. Formulations of a rotating fluid frame, the evolution of the Poynting vector, and the permeability and permittivity of the MHD antenna have been worked out. The proposed work presents a tuning mechanism for the resonant frequency and dielectric constant for frequency agility and reconfigurability. Measured results show that the prototype antenna possesses a return loss of up to -51.1 dB at the 8.59 GHz resonant frequency, while the simulated resonant frequency comes out to be 10.5 GHz.

Keywords: Frequency agility; reconfigurability; MHD; radiation pattern; saline water.
I. INTRODUCTION
An MHD antenna uses fluid as its dielectric. The word magnetohydrodynamics (MHD) is derived from magneto-, meaning magnetic field, hydro-, meaning liquid, and -dynamics, meaning movement. MHD is the study of the flow of electrically conducting liquids in electric and magnetic fields [3-5]. Here we have developed and tested a magneto-hydrodynamic prototype antenna with detailed physics. Ting and King determined in 1970 that a dielectric tube can resonate. To our knowledge no work has been done on an MHD antenna as described here. Based on our own developed theory, we have proposed this prototype model with return loss results. A fluid antenna has the advantages of shape reconfigurability and better coupling of the electromagnetic signal with the probe, as no air is present in between [12]. We have developed the physics, as per equations (1)-(12), for electromagnetic wave coupling with a conducting fluid in the presence of electric and magnetic fields. The design and testing stages of the MHD antenna are shown in Figs. 1-13. Here we demonstrate how the directivity, radiation resistance and total energy radiated by this magnetohydrodynamic antenna can be computed by elementary surface integrals. We have developed equations for the rotating frame of the conducting fluid, the velocity field, electric field, magnetic field, Poynting vector, current density, permittivity, permeability and vector potentials to realise an MHD antenna [6-8]. We have used saline water, ionised with a DC voltage applied through electrodes, in the presence of a permanent magnetic field. The fluid acts as the radiating element in the PPR (polypropylene random copolymer) cylindrical tube. An SMA connector is used to supply the RF input. The volume and shape of the fluid decide the resonant frequency. Excellent radiation parameters were obtained from measurements of return loss and radiation pattern on the prototype, as listed in Tables 1-5. We have divided this paper into five parts. The first part is an introduction to the MHD antenna system. The second part deals with the formulations [9-11]. Section three briefly explains the prototype development. The fourth section describes the working of the prototype system. Section five presents the conclusion, possible applications and scope for future work.
II. FORMULATIONS
A. Motion of fluid in rotating frame
The equation of motion of a fluid in a frame rotating uniformly with angular velocity $\vec{\Omega}$ is given by

$$\vec{v}_{,t} + (\vec{v}\cdot\nabla)\vec{v} + 2\vec{\Omega}\times\vec{v} + \vec{\Omega}\times(\vec{\Omega}\times\vec{r}) = -\frac{1}{\rho}\nabla p + \nu\nabla^{2}\vec{v}. \qquad (1)$$

Assuming the flow to be two-dimensional and the fluid incompressible, we obtain an equation for the stream function. The velocity of the fluid is

$$\vec{v} = v_x(t,x,y)\,\hat{x} + v_y(t,x,y)\,\hat{y}, \qquad (2)$$

and the angular velocity is $\vec{\Omega} = \Omega\,\hat{z}$. The incompressibility condition $\nabla\cdot\vec{v} = 0$ gives

$$v_x = \psi_{,y}(t,x,y), \qquad v_y = -\psi_{,x}(t,x,y) \qquad (3)$$

for some scalar function $\psi$ called the stream function. As we know, the vorticity is

$$\vec{\omega} = \nabla\times\vec{v} = -\nabla^{2}\psi\,\hat{z}. \qquad (4)$$

Using this in the equation obtained by taking the curl of the Navier-Stokes equation, we have

$$\vec{\omega}_{,t} + \nabla\times(\vec{\omega}\times\vec{v}) + 2\,\nabla\times(\vec{\Omega}\times\vec{v}) + \nabla\times(\vec{\Omega}\times(\vec{\Omega}\times\vec{r})) = \nu\nabla^{2}\vec{\omega}. \qquad (5)$$

Note that

$$\vec{\Omega}\times(\vec{\Omega}\times\vec{r}) = \vec{\Omega}(\vec{\Omega}\cdot\vec{r}) - \vec{r}\,\Omega^{2}, \qquad (6)$$

so that

$$\nabla\times(\vec{\Omega}\times(\vec{\Omega}\times\vec{r})) = 0. \qquad (7)$$

Since $\vec{\Omega}$ is assumed to be constant, the Navier-Stokes equation thus gives

$$\vec{\omega}_{,t} + \nabla\times(\vec{\omega}\times\vec{v}) + 2\,\nabla\times(\vec{\Omega}\times\vec{v}) = \nu\nabla^{2}\vec{\omega}. \qquad (8)$$
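To illustrate the discretization mentioned in the abstract, the sketch below advances equation (8) by one explicit finite-difference step on a uniform grid. It is a minimal illustration under stated assumptions, not the authors' solver: grid size, time step and viscosity are arbitrary sample values, boundary handling is simplified, and for constant rotation and 2D incompressible flow the Coriolis curl term drops out, leaving only advection and diffusion of vorticity.

public class VorticityStep {
    static int N = 64;                       // grid points per side (assumption)
    static double h = 0.01, dt = 1e-4, nu = 1e-3;

    // One forward-Euler update of vorticity w, given stream function psi.
    static void step(double[][] w, double[][] psi) {
        double[][] wNew = new double[N][N];
        for (int i = 1; i < N - 1; i++) {
            for (int j = 1; j < N - 1; j++) {
                double u  =  (psi[i][j + 1] - psi[i][j - 1]) / (2 * h);  // v_x = psi_y
                double v  = -(psi[i + 1][j] - psi[i - 1][j]) / (2 * h);  // v_y = -psi_x
                double wx = (w[i + 1][j] - w[i - 1][j]) / (2 * h);
                double wy = (w[i][j + 1] - w[i][j - 1]) / (2 * h);
                double lap = (w[i + 1][j] + w[i - 1][j] + w[i][j + 1]
                            + w[i][j - 1] - 4 * w[i][j]) / (h * h);
                // w_t = -(v . grad) w + nu * laplacian(w); the Coriolis curl term
                // vanishes for constant rotation in 2D incompressible flow.
                wNew[i][j] = w[i][j] + dt * (nu * lap - u * wx - v * wy);
            }
        }
        for (int i = 0; i < N; i++) w[i] = wNew[i];
    }

    // A few Jacobi sweeps for the Poisson equation laplacian(psi) = -w,
    // which follows from equation (4).
    static void solvePsi(double[][] psi, double[][] w, int sweeps) {
        for (int s = 0; s < sweeps; s++)
            for (int i = 1; i < N - 1; i++)
                for (int j = 1; j < N - 1; j++)
                    psi[i][j] = 0.25 * (psi[i + 1][j] + psi[i - 1][j]
                              + psi[i][j + 1] + psi[i][j - 1] + h * h * w[i][j]);
    }
}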
B. Far Field Radiation Pattern
Here $r$ is the radial distance, $\theta$ the angle of elevation and $\phi$ the azimuth angle, so that

$$x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta.$$

For the fluid, $\vec{v}\times\vec{B}$ contributes to the Poynting vector; $\vec{E}\times\vec{H}$ gives the Poynting vector, where $\vec{H}$ embeds the effect of the velocity $\vec{v}$ of the conducting fluid. Solving the resulting field equations gives the $\theta$ and $\phi$ components of the far field, and hence the resulting Poynting vector

$$\vec{S} = \vec{E}\times\vec{H}. \qquad (9)$$

From this we can evaluate the total magnitude of the radiated energy per unit frequency per unit volume. This spectral density can be evaluated by applying Parseval's theorem (the mathematics of the DFT). The electric field is $\vec{E} = -j\omega\vec{A} - \nabla\Phi$, from which the energy spectral density is computed; on integration we can evaluate the total radiated energy, and likewise the $x$, $y$ and $z$ components of the Poynting vector. Here $\vec{r}'$ denotes the source point and $\vec{r}$ the far-field point; the potentials are functions of the retarded argument $t - |\vec{r}-\vec{r}'|/c$ and, at large distance,

$$\vec{H} = \frac{1}{\mu}\nabla\times\vec{A},$$

assuming that the real part effectively contributes. The radial unit vector is

$$\hat{r} = \cos\phi\sin\theta\,\hat{x} + \sin\phi\sin\theta\,\hat{y} + \cos\theta\,\hat{z}.$$

Our objective is to evaluate the total energy radiated per unit frequency per unit volume. Writing the Poynting vector in terms of its $\theta$ and $\phi$ components, the energy spectral density is evaluated by applying Parseval's theorem,

$$\int_{-\infty}^{\infty} |E(t)|^{2}\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty} |E(\omega)|^{2}\,d\omega, \qquad (10)$$

which provides the total energy radiated by the MHD antenna system.
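The Parseval step behind equation (10) can be checked numerically: for N samples of a field component, the time-domain energy equals the DFT energy divided by N. The following self-contained sketch, using arbitrary test samples rather than measured fields, illustrates this identity.

public class ParsevalCheck {
    public static void main(String[] args) {
        int n = 256;
        double[] e = new double[n];
        for (int k = 0; k < n; k++) e[k] = Math.cos(2 * Math.PI * 5 * k / n);

        double timeEnergy = 0;                     // sum of |e(t)|^2
        for (double s : e) timeEnergy += s * s;

        double freqEnergy = 0;                     // naive DFT, O(n^2)
        for (int k = 0; k < n; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double ang = -2 * Math.PI * k * t / n;
                re += e[t] * Math.cos(ang);
                im += e[t] * Math.sin(ang);
            }
            freqEnergy += (re * re + im * im) / n; // Parseval normalisation
        }
        // Both printed values agree (here 128.0 for a pure cosine).
        System.out.printf("time %.6f  freq %.6f%n", timeEnergy, freqEnergy);
    }
}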
C. Permeability of MHD Antenna
We evaluate the permeability of the MHD antenna taking the conductivity and permittivity as constant. The permeability $\mu$ then becomes a polynomial function of $(E, H, v)$ in the MHD system, where $E$ is the electric field, $H$ the magnetic field and $v$ the velocity of the fluid:

$$\mu = \sum_{p,q,r} c_{pqr}\, E^{p} H^{q} v^{r},$$

where $p$, $q$, $r$ are integers and the index $a = 1, 2, 3$ labels the field components. From Maxwell's equations we have

$$\nabla\times\vec{E} = -\frac{\partial(\mu\vec{H})}{\partial t}.$$

Thus we observe that the permeability becomes a coupled function of $E$, $H$ and $v$. We can minimise the difference, or error, $H - \hat{H}$, where $\hat{H}$ is the desired outcome, by the variational method with Lagrange multipliers $\lambda_i$. For the 3D analysis the fields are functions of $(t, x, y, z)$. Taking the inner product and applying the Lagrange multipliers in the variational method gives the stationarity conditions for $i = 1, 2, \ldots, r$, whose solution yields the error value, the Lagrange multipliers, and $E$, $H$, $v$ as functions of the permeability.
D. Permittivity of MHD Antenna
As per Maxwell's equations,

$$\nabla\times\vec{E} = -\frac{\partial\vec{B}}{\partial t},$$

$$\nabla\times(\nabla\times\vec{E}) = \nabla(\nabla\cdot\vec{E}) - \nabla^{2}\vec{E},$$

and

$$\nabla\times\vec{H} = \vec{J} + \frac{\partial(\epsilon\vec{E})}{\partial t}.$$

On substitution we obtain an equation in which the permittivity $\epsilon$ is a function of $(E, H, v)$ (11), with the coupling expressed through a cyclic tensor. Summing over the component indices $a$, $b$ and inverting the resulting matrix of elements gives the permittivity; the solution can be worked out by the difference method or by the test-function method (12).
III. PROTOTYPE DEVELOPMENT
Two cylindrical PPR tubes, of diameters 10.5 cm and 6.2 cm and length 6.5 cm, were mounted on a copper-coated circular plate of 35 cm diameter serving as the ground plane. An SMA connector was mounted on the outer tube for the RF input. Two tin electrodes of size 1.2 cm x 4.1 cm were mounted on the inside wall of the bigger tube, in direct contact with the conducting fluid. A DC voltage of 5-25 V was applied from a DC source through a Bias TEE arrangement. The copper-coated plate, of 0.2 mm thickness, was connected 0.9 cm to the ground of the SMA connector; the copper-coated circular plate was used to create the ground plane. The probe, 0.08 cm in diameter and protruding 0.75 cm from the SMA connector, was inserted in such a way that it makes direct contact with the conducting fluid. Two permanent bar magnets of 15 cm x 4 cm x 2 cm were placed perpendicular to the electric field to produce the Lorentz force that creates the fluid flow. Inside the tube, saline water with a TDS (total dissolved solids) value of 1200-9000 was used in the ionised state to produce radiation. A 300 ml volume of saline water was used for perfect impedance matching at the resonant frequency. The RF signal, with the DC voltage mixed in, was supplied to the fluid by the network analyser through the SMA connector. S11 parameters were recorded as per Figs. 3-5.
IV. DETAILED DESCRIPTION OF MHD ANTENNA
In this antenna, only the ionised currents in the conducting fluid contribute to the radiated energy. The radiation resistance and resonant frequency depend on the shape of the fluid inside the tube and on the nanoparticles in the fluid. An external magnetic field was applied to the tube; it interacts with the electric field to produce Lorentz forces, resulting in fluid flow with velocity v. There are then three main fields, i.e. the electric field, the magnetic field and the velocity field, which are responsible for the possible radiation. The radiated energy and its pattern are functions of the RF input excitation, the applied fields, the fluid shape and the nanoparticles in the fluid. Hence an adaptive mechanism can be built into the antenna to produce versatility in the radiation pattern and broadband effects, due to dynamic material perturbations.

We have formulated equations (1)-(12) to focus on the physics of the design analysis of an MHD antenna. Here we describe the complete mechanism of beam formation, radiating patterns and resonance. The radiation pattern in the far field depends not only on the electromagnetic field but also on the fluid velocity field. We have described mathematical relations for the permeability as a function of E, H and v when the conductivity and permittivity are kept constant. With proper filtering techniques, the MHD antenna can be made to operate at one single frequency. The fluid shape together with the fields decides the resonant frequency. The effective permeability can be controlled by applying a static magnetic field. This leads to the possibility of magnetically tuning the polarisation of the antenna. The polarisation tuning of the antenna was measured as a function of field strength for magnetisation parallel to the x- and y-directions. The effects of magnetic bias on the antenna have been investigated.

The principle of this class of antenna is essentially that of a dielectric resonator, where the salt (in solution) and the electric field modify the dielectric properties. The resonator column shape determines the operating frequency, allowing the impedance match and frequency of operation to be fully tuneable. Figure 1 presents the complete test set-up of the MHD antenna under electric and magnetic fields, with the RF input for S11 measurements. Figures 1-13 present the results obtained and the steps of prototype development. A VNA-L5230 was used to measure the return loss at the resonant frequency. We varied the fluid salinity, electric field, magnetic field and fluid height for all possible radiation measurements in the experiments. We recorded the return loss and radiation patterns for all the combinations listed in Tables 1-2.

This antenna with conducting fluid may have multiple advantages, viz. reconfigurability, frequency agility, polarisation agility, broadband operation and beam-steering capability. Here we developed control of polarisation with magnetic field biasing, frequency control with fluid height, and return loss control with the electric field. The non-reflecting stealth property of the fluid when no field is present makes it most suitable for military applications.
Fig. 1. Measurement of return loss/resonant frequency of the MHD antenna with VNA and power supply with Bias TEE.
Fig. 2. Return loss -33.1 dB at resonant frequency 8.59 GHz, with TDS 9000 and 15 V DC electric field applied, with permanent magnetic field.
Fig. 3. Return loss -51.1 dB at resonant frequency 8.59 GHz, with TDS 9000 and 17 V DC electric field applied, with permanent magnetic field.
Fig. 4. Return loss -49.1 dB at resonant frequency 8.59 GHz, with TDS 9000 and 16.9 V DC electric field applied, with permanent magnetic field.
Fig. 5. Return loss -34.1 dB at resonant frequency 8.59 GHz, with TDS 9000 and 15.0 V DC electric field applied, with permanent magnetic field.
Fig. 6. Complete set-up for measurement of VSWR on the MHD antenna with additional magnetic field.
Fig. 7. MHD antenna with Bias TEE.
Fig. 8. Fabricated MHD antenna, SMA connector and filled saline water, top view.
Fig. 9. View of the fabricated MHD antenna without ground plane.
Fig. 10. View of the outer part of the MHD antenna tube with two tin electrodes attached.
Fig. 11. View of the fabricated ground plane of the MHD antenna with copper coating.
Fig. 12. HFSS-generated cylindrical antenna.
Fig. 13. HFSS-simulated resonant frequency of 10 GHz.
Table 1
Freq       TDS    Electric field   Return Loss
8.59 GHz   9000   17.0 V           -51.1 dB
4.50 GHz   same   same             -16.7 dB
2.09 GHz   same   same             -11.7 dB

Table 2
Freq       TDS    Electric field   Return Loss
8.59 GHz   9000   16.9 V           -49.1 dB
4.50 GHz   same   same             -16.1 dB
2.09 GHz   same   same             -11.9 dB

Table 3
Freq       TDS    Electric field   Return Loss
8.59 GHz   9000   15.0 V           -39.1 dB
4.50 GHz   same   same             -16.2 dB
2.09 GHz   same   same             -11.7 dB

Table 4
Freq       TDS    Electric field   Return Loss
8.59 GHz   9000   13.2 V           -35.1 dB
4.50 GHz   same   same             -15.2 dB
2.09 GHz   same   same             -11.1 dB

Table 5
TDS    Fluid Height   Resonant frequency
5000   3.5 cm         4.59 GHz
7000   6.0 cm         8.58 GHz
For measuring return loss, VSWR and resonant frequency, we used a PNA-L network analyser (10-40 GHz) with a DC power supply. The resonant frequency for which the antenna was designed was 8.59 GHz; however, this antenna offers frequency agility and reconfigurability. The fluid column height was varied from 2.5 cm to 6 cm and the electric field from 2 V DC to 17 V DC, and the relevant VSWR and return loss results were recorded.

We measured the return loss with an Agilent VNA (vector network analyser); the fluid tube height was kept fixed at 6 cm and the resonant frequency at 8.58 GHz, while the DC voltage was varied from 9 V to 17 V. The return loss was found to vary in proportion to the electric and magnetic fields. Also, when the TDS was increased from 200 to 9000, a significant improvement in return loss was observed. The mixed DC and RF signal was fed to the SMA connector of the antenna through the Bias TEE. This test set-up extended safety to the network analyser.
V. CONCLUSION
It was observed from the measured results that there is a significant improvement in return loss when the salinity of the fluid is increased. The return loss also improved with the electric and magnetic field intensity. We observed that the electric field has a significant impact on the return loss; these measured results are given in Tables 1-5. A Bias TEE was used to feed the mixed signal from the same port. The return loss was significantly high at 17 V DC. The height of the fluid column (fluid shape) and the nanoparticles in the fluid contribute to the resonant frequency of the fluid antenna. When the height of the fluid was 3.5 cm, our antenna resonated at 4.59 GHz, and when the height was increased to 6.0 cm, the same antenna resonated at 8.59 GHz. We have also simulated the antenna in HFSS, taking saline water as the dielectric, for resonant frequency evaluation as per Figs. 12-13. We could thus achieve reconfigurability and frequency agility in this antenna. It has a stealth property, as the reflector is voltage dependent, and hence can be
most suitable for military applications. We can also use this antenna for MIMO (multiple-input, multiple-output) operation. More work towards micro-fluidic frequency reconfiguration and fluidic tuning of matching networks for bandwidth enhancement needs to be explored.

As future work, we will investigate the radiation patterns of this cylindrical antenna as a special case, with the detailed physics involved.
VI. ACKNOWLEDGEMENT
The authors thank Prof Raj Senani, Director, NSIT, who inspired this research work and provided all the necessary resources at the college. Special thanks go to the lab technician Mr Raman, who helped in the lab in developing this prototype MHD antenna.
REFERENCES
[1] Rajveer S Yaduvanshi and Harish Parthasarathy, Design, Development and Simulations of MHD Equations with its prototype implementations, (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 1, No. 4, October 2010.
[2] Rajveer S Yaduvanshi and Harish Parthasarathy, EM Wave transport 2D and 3D investigations, (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 1, No. 6, December 2010.
[3] Rajveer S Yaduvanshi and Harish Parthasarathy, Exact solution of 3D Magnetohydrodynamic system with nonlinearity analysis, IJATIT, Jan 2011.
[4] E. M. Lifshitz and L. D. Landau, Theory of Elasticity, 3rd edition, Elsevier.
[5] E. M. Lifshitz and L. D. Landau, Classical Theory of Fields, 4th edition, Elsevier.
[6] Bahadir, A. R. and T. Abbasov (2005), A numerical investigation of the liquid flow velocity over an infinity plate which is taking place in a magnetic field, International Journal of Applied Electromagnetics and Mechanics 21, 1-10.
[7] E. M. Lifshitz and L. D. Landau, Electrodynamics of Continuous Media, Butterworth-Heinemann.
[8] E. M. Lifshitz and L. D. Landau, Fluid Mechanics, Vol. 6, Butterworth-Heinemann.
[9] E. M. Lifshitz and L. D. Landau, Theory of Fields, Vol. 2, Butterworth-Heinemann.
[10] J. D. Jackson, Classical Electrodynamics, third edition, Wiley.
[11] C. A. Balanis, Antenna Theory, Wiley.
[12] Gregory H. Huff, David L. Rolando, Phillip Walters and Jacob McDonald, A Frequency Reconfigurable Dielectric Resonator Antenna using Colloidal Dispersions, IEEE Antennas and Wireless Propagation Letters, Vol. 9, 2010.
AUTHORS PROFILE
Author: Rajveer S Yaduvanshi, Asst Professor. The author has 21 years of teaching and research experience. He has successfully implemented fighter aircraft arresting barrier projects at select flying stations of the Indian Air Force. He has worked on indigenization projects of 3D radars at BEL and visited France for radar modernisation as Senior Scientific Officer in the Ministry of Defence. Currently he is working on MHD projects. He teaches in the ECE Dept. of AIT, Govt of Delhi-110031. He is a fellow member of IETE. He has published ten research papers in international journals and conferences.

Co-author: Prof Harish Parthasarathy is an eminent academician and researcher. He is a professor in the ECE Dept. at NSIT, Dwarka, Delhi. He has an extraordinary research instinct and is a prolific author in the field of signal processing. He has published more than ten books and has supervised seven PhD scholars in the ECE Dept. of NSIT, Delhi.

Co-author: Prof Asok De is an eminent researcher and effective administrator. He has set up an engineering college of repute under the Delhi Government. Currently he is the Principal of AIT. His research interests are in microstrip antenna design.
An Architectural Decision Tool Based on Scenarios and Nonfunctional Requirements

Mr. Mahesh Parmar
Department of Computer Engineering, Lakshmi Narayan College of Tech. (LNCT), Bhopal (MP), INDIA
Email: [email protected]

Prof. W. U. Khan
Department of Computer Engineering, Shri G.S. Institute of Tech. & Science (SGSITS), Indore (M.P.), INDIA
Email: [email protected]

Dr. Binod Kumar
HOD & Associate Professor, MCA Department, Lakshmi Narayan College of Tech. (LNCT), Bhopal (MP), INDIA
Email: [email protected]
Abstract: Software architecture design is often based on the architect's intuition and previous experience. Little methodological support is available, and there are still no effective solutions to guide architectural design. The most difficult activity is the transformation from a non-functional requirement specification into a software architecture. To address this, we propose an architectural decision tool based on scenarios and nonfunctional requirements. In the proposed tool, scenarios are first utilized to gather information from the user. Each scenario is created to have a positive or negative effect on a non-functional quality attribute. The non-functional quality attributes are then computed and compared to one another to identify a set of design principles relevant to the system. Finally, the optimal architecture is selected by finding its compatibility with the design principles.
Keywords: Software Architecture; Automated Design; Non-functional Requirements; Design Principle.
I. INTRODUCTION
Software architecture is the very first step in the software lifecycle in which the nonfunctional requirements are addressed [7, 8]. The nonfunctional requirements (e.g., security) are the ones that are blamed for a system re-engineering, and they are orthogonal to system functionality [7]. Therefore, software architecture must be confined to a particular structure that best meets the quality of interest, because the structure of a system plays a critical role in the process (i.e., strategies) and the product (i.e., notations) utilized to describe and provide the final solution.

In this paper, we discuss an architectural decision tool based on the software quality model discussed in [14] in order to select the software architecture of a system. In [14], we proposed a method that attempted to bridge the chasm between the problem domain, namely requirement specifications, and the first phase in the solution domain, namely software architecture. The proposed method is a systematic approach based on the fact that the functionality of any software system can be met by all kinds of structures, but the structure that also supports and embodies non-functional requirements (i.e., quality) is the one that best meets user needs. To this end, we have developed a method based on the nonfunctional requirements of a system. The method applies a scenario-based approach. Scenarios are first utilized to gather information from the user. Each scenario is created to have a positive or negative effect on a non-functional quality attribute. When creating scenarios, we decided to start with some basic scenarios involving only a single quality attribute; multiple scenarios were then mapped to each attribute so as to have a positive or negative effect when the user found the scenario to be true. Finally, it became clear to us that we needed to allow each scenario to affect an attribute positively or negatively in varying degrees.

In this work, we have studied and classified architectural styles in terms of design principles and a subset of nonfunctional requirements. These classifications, in turn, can be utilized to correlate styles, design principles, and quality. Once we establish the relationship between qualities, design principles, and styles, we should be able to establish the proper relationship between styles and qualities, and hence we should be able to select an architectural style for a given set of requirements [8], [13].
II. NON-FUNCTIONAL REQUIREMENTS
Developers of critical systems are responsible for identifying the requirements of the application, developing software that implements the requirements, and allocating appropriate resources (processors and communication networks). It is not enough to merely satisfy functional requirements. A non-functional requirement is a requirement that specifies criteria that can be used to judge the operation of a system, rather than specific behaviours. This should be contrasted with functional requirements, which define specific behaviour or functions. Functional requirements define what a system is supposed to do, whereas non-functional requirements define how a system is supposed to be. Non-functional requirements are often called the qualities of a system. Critical systems in general must satisfy non-functional requirements such as security, reliability, modifiability, performance, and other, similar requirements as well. Software quality is the degree to which software possesses a desired combination of attributes [15].
III. SCENARIOS
Scenarios are widely used in product line software engineering: abstract scenarios capture behavioural requirements, and quality-sensitive scenarios specify architecturally significant quality attributes. Making a scenario system-specific means translating it into concrete terms for the particular quality requirement. Thus, a general scenario is "A request arrives for a change in functionality, and the change must be made at a particular time within the development process within a specified period." A system-specific version might be "A request arrives to add support for a new browser to a Web-based system, and the change must be made within two weeks." Furthermore, a single scenario may have many system-specific versions: the same system that has to support a new browser may also have to support a new media type. A quality attribute scenario is a quality-attribute-specific requirement.

The assessment of a software quality using scenarios is done in these steps:

A. Define a Representative Set of Scenarios
A set of scenarios is developed that concretizes the actual meaning of the attribute. For instance, the maintainability quality attribute may be specified by scenarios that capture typical changes in requirements, underlying hardware, etc.

B. Analyse the Architecture
Each individual scenario defines a context for the architecture. The performance of the architecture in that context for this quality attribute is assessed by analysis. Posing typical questions [15] for the quality attributes can be helpful.

C. Summarise the Results
The results from each analysis of the architecture and scenario are then summarized into overall results, e.g., the number of accepted scenarios versus the number not accepted.

We have proposed a set of six independent high-level non-functional characteristics, which are defined as a set of attributes of a software product by which its quality is described and evaluated. In practice, some influence may appear among the characteristics; however, they will be considered independent to simplify our presentation. The quality characteristics are used as the targets for validation (external quality) and verification (internal quality) at the various stages of development. They are refined (see Figure 1) into sub-characteristics, until the quality attributes are obtained. Sub-characteristics (maturity, fault tolerance, confidentiality, changeability, etc.) are refined into scenarios. Each non-functional characteristic may have more than one sub-characteristic, and each sub-characteristic is refined into a set of scenarios. When a particular attribute is characterized, a set of scenarios is developed to describe it.
Figure 1. Analysis Scenario Diagram
IV. THE APPROACH
The goal is to establish the correct relationship between architectural styles and non-functional requirements. The proposed recommendation tool consists of four activities:
- Create a set of simple scenarios, each relevant to a single nonfunctional requirement.
- Identify the scenarios that may have positive or negative impacts on one or more nonfunctional requirements.
- Establish a relationship between the set of quality attributes obtained in step 2 and a set of universally accepted design principles (tactics).
- Select a software architecture style that supports the set of design principles identified in step 3.
A. Quality Attribute
Product considerations and market demands impose expectations, or qualities, that must be fulfilled by a system's architecture. These expectations normally have to do with how the system performs a set of tasks (i.e., quality) rather than what the system does (i.e., functionality). The functionality of a system, which is its ability to perform correctly the work for which it was intended, and the quality of a system are orthogonal to one another.

In general, the quality attributes of a system are divided into two groups: 1) operational quality attributes, such as performance, and 2) non-operational ones, such as modifiability [8]. In this study, we have selected both operational and non-operational quality attributes as follows:
- Reliability (the extent to which we can expect a system to do what it is supposed to do at any given time)
- Security (the extent to which we can expect the system to be secure from tampering/illegal access)
- Modifiability (how difficult or time consuming it is to perform a change on the system)
- Performance (how fast the system will run, i.e., throughput, latency, number of clock cycles spent finishing a task)
- Usability (the ease with which the user can interact with the system in order to accomplish a task)
- Availability (the extent to which we expect the system to be up and running)
- Reusability (the extent to which a part or the entire system can be utilized)

Usability involves both architectural and nonarchitectural aspects of a system. Examples of nonarchitectural features include the graphical user interface (GUI); examples of architectural features include undo, cancel, and redo. Modifiability involves the decomposition of system functionality and the programming techniques utilized within a component. In general, a system is modifiable if changes involve the minimum number of decomposed units. Performance involves the complexity of a system, which is the dependency (e.g., structure, control, and communication) among the elements of a system, and the way system resources are scheduled and/or allocated. In general, the quality of a system can never be achieved in isolation. Therefore, the satisfaction of one quality may contribute (or be contrary) to the satisfaction of another quality [12]. For example, consider security and availability: security strives for minimality while availability strives for maximality, and it is difficult to achieve a highly secure system without compromising its availability. In this case security contradicts availability. This can be resolved by negotiating with the user to make up her/his mind. Another example has to do with security and usability: security inhibits usability because the user must do additional things, such as creating a password. Table I documents the correlation among quality attributes.

To summarise, five types of quality attribute relationships are identified, together with a value for "not available". These relationships are defined by numerical values between 0 and 1: Very Strong (0.9), Strong (0.7), Average (0.5), Below Average (0.3), Very Low (0.1), Not Available (0.0).
TABLE I. QUALITY VS QUALITY

S.N.  Quality Attribute  Related Attribute  Relationship    Value
1)    Reliability        Performance        Very Strong     0.9
                         Security           Very Strong     0.9
2)    Performance        Reliability        Very Strong     0.9
                         Security           Below Average   0.3
3)    Security           Reliability        Average         0.5
                         Performance        Very Low        0.1
B. Design Principle
According to [1, 2], a design can be evaluated in many ways using different criteria. The exact selection of criteria depends heavily on the application domain. In this work, we adopted what are known as commonly accepted design principles [3, 6, 7] and a set of design decisions known as tactics [3, 8, 13, 14]. Tactics are a set of proven design decisions and are orthogonal to particular software development methods. Tactics and design principles have been around for years and were originally advocated by people like Parnas and Dijkstra. Our set of design principles and tactics includes: 1) generality (or abstraction), 2) locality and separation of concerns, 3) modularity, 4) concurrency, 5) replicability, 6) operability, and 7) complexity.

Examples of design principles and tactics include: a high degree of parallelism and asynchronized communication is needed in order to partially meet the performance requirement; a high degree of replicability (e.g., data, control, computation replicability) is needed in order to partially meet availability; a high degree of locality, modularity, and generality is needed in order to achieve modifiability and understandability; a high degree of controllability, such as authentication and authorization, is needed in order to achieve security and privacy; and a high degree of locality and operability (i.e., the efficiency with which a system can be utilized by end-users) is needed in order to achieve usability. Table II shows the correlation among qualities and tactics.
C. Architecture Styles
In order to extract the salient features of each style, we have compiled its description, advantages and disadvantages. This information was later utilized to establish a link between styles and design principles. We have chosen, for the sake of this work, main/subroutine, object-oriented, pipe/filter, blackboard, client/server, and layered systems.

A main/subroutine (MS) architectural style advocates a top-down design strategy by decomposing the system into components (calling units) and connectors (caller units). The coordination among the units is highly synchronized and interactions are done by parameter passing.

An object-oriented (OO) system is described in terms of components (objects) and connectors (method invocations). Objects are responsible for the integrity of their internal representation. The coordination among the units is highly asynchronized and interactions are done by method invocations. The style supports reusability, usability, modifiability, and generality.

A pipe/filter (P/F) style advocates a bottom-up design strategy by decomposing a system in terms of filters (data transformation units) and pipes (data transfer mechanisms). The coordination among the filters is asynchronized, control being transferred upon the arrival of data at the input. Upstream filters typically have no control over this behavior.

A client/server (C/S) system is decomposed into two sets of components: clients (or masters) and servers (or slaves). The interactions among components use a remote procedure call (RPC) type of communication protocol. The coordination and control transformation among the units are highly synchronized.

A blackboard (BKB) system is similar to a database system; it decomposes a system into components: storage and computational units known as knowledge sources (KSs). In a blackboard system, the interaction among units is done through shared memory. The coordination among the units is, for the most part, asynchronized when there is no race for a particular data item; otherwise it is highly synchronized. The blackboard style enjoys some level of replication, of data (e.g., the distributed database and distributed blackboard systems) and of computation.
TABLE II. TACTICS VS QUALITIES

S.N.  Tactic      Quality Attribute  Relationship    Value
1)    Generality  Reliability        Very Strong     0.9
                  Security           Average         0.5
                  Performance        Very Strong     0.9
2)    Locality    Reliability        Very Strong     0.9
                  Security           Not Available   0.0
                  Performance        Very Strong     0.9
3)    Modularity  Reliability        Very Strong     0.9
                  Security           Very Strong     0.9
                  Performance        Strong          0.7
A layered (LYR) system typically decomposes a system into a group of components (subtasks). The communication between layers is achieved by protocols that define how the layers interact. The coordination and control transformation among the units (or subtasks) is highly synchronized, and interactions are done by parameter passing. A layered system incurs a performance penalty stemming from the rigid chain of hierarchy among the layers. Table III illustrates the relationships among architectural styles and design principles/tactics.
V. PROPOSED WORK
The implementation of our tool consists of six different modules. The average weight module calculates the average weight corresponding to the selected scenarios. The effective weight module calculates the effective weight of each nonfunctional requirement; each nonfunctional requirement has a list of scenarios, and the scenarios and their corresponding weights are selected by the user. The quality attribute weight module calculates the quality attribute weight, depending on the responses of the average and effective weight modules. The quality attribute rank module calculates the rank of each quality attribute, depending on the quality attribute weight module. The tactics rank module calculates the tactics rank, and the architecture style rank module calculates the architecture style rank.
These are described in detail as follows.

1) Calculate the average weight of each quality attribute selected by the user. The user first selects the scenarios corresponding to a non-functional requirement and chooses a weight for each according to his choice. The average weight for each non-functional requirement is then calculated as

$$AQA_i = \frac{\sum_{n=1}^{N} QWt_n}{N}$$

where $AQA_i$ is the average weight of the $i$th quality attribute, $QWt_n$ is the weight of the $n$th selected scenario, and $N$ is the total number of selected scenarios.

2) Calculate the effective weight of each quality attribute. Each scenario may affect more than one other scenario. All affected scenario questions are stored in the effect table in the database, which maintains the list of affected scenario questions. The effective weight for each quality attribute is calculated as

$$EWtQA_i = \sum_{n=1}^{m}\ \sum_{j=1}^{e} EQ_j \times QWt_n$$

where $EWtQA_i$ is the effective weight of the $i$th quality attribute, $EQ_j$ is the $j$th effective scenario, $e$ is the number of effective scenarios, and $m$ is the number of scenarios.

3) Calculate the quality attribute weight for each quality attribute. Using the outputs of steps 1 and 2,

$$QAWt_i = EWtQA_i + AQA_i$$

where $QAWt_i$ is the $i$th quality attribute weight.

4) Calculate the quality attribute rank. The quality-to-quality relationship table stored in the database maintains the relationship values between quality attributes. The quality attribute rank is calculated as

$$QAR_i = \sum_{q=1}^{Q} QAWt_i \times QtoQ_q$$

where $QAR_i$ is the $i$th quality attribute rank, $Q$ is the number of quality attributes, and $QtoQ_q$ is the $q$th quality-to-quality relationship value.

5) Calculate the tactics rank. The quality-to-tactics relationship table stored in the database maintains the relationship values between qualities and tactics. The tactics rank is calculated as

$$TR_i = \sum_{t=1}^{T} QAR_i \times QtoT_t$$

where $TR_i$ is the $i$th tactics rank, $T$ is the number of tactics, and $QtoT_t$ is the $t$th quality-to-tactics relationship value.

6) Calculate the architecture style rank. The tactics-to-architecture-style relationship table stored in the database maintains the relationship values between tactics and architecture styles. The architecture style rank is calculated as

$$ASR_i = \sum_{a=1}^{A} TR_i \times TtoAS_a$$

where $ASR_i$ is the $i$th architecture style rank, $A$ is the number of architecture styles, and $TtoAS_a$ is the $a$th tactics-to-architecture-style relationship value.
TABLE III. ARCHITECTURAL STYLES VS TACTICS

S.N.  Architecture Style  Tactic      Relationship  Value
1)    Pipe & Filter       Generality  Very Strong   0.9
                          Locality    Very Strong   0.9
                          Modularity  Average       0.5
2)    Black Board         Generality  Very Strong   0.9
                          Locality    Very Strong   0.9
                          Modularity  Very Strong   0.9
3)    Object Oriented     Generality  Very Strong   0.9
                          Locality    Very Strong   0.9
                          Modularity  Very Strong   0.9
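To make the six-step pipeline concrete, the sketch below wires the formulas together in Java. It is a minimal illustration under stated assumptions, not the tool's actual implementation: the relationship rows are hard-coded sample values drawn from the tables above, and since the combining operator in the quality-attribute-weight formula is not fully recoverable from the text, the sketch assumes QAWt_i = EWtQA_i + AQA_i.

public class StyleRanker {
    // AQA_i: mean of the user-selected scenario weights.
    static double averageWeight(double[] qwt) {
        double sum = 0;
        for (double w : qwt) sum += w;
        return sum / qwt.length;
    }

    // EWtQA_i: sum over scenarios and their effected scenarios of EQ * QWt.
    static double effectiveWeight(double[] qwt, double[] eq) {
        double sum = 0;
        for (double w : qwt)
            for (double e : eq) sum += e * w;
        return sum;
    }

    // QAR_i, TR_i and ASR_i all share this shape: a weight multiplied by a
    // row of relationship values and summed.
    static double rank(double weight, double[] relationshipRow) {
        double sum = 0;
        for (double rel : relationshipRow) sum += weight * rel;
        return sum;
    }

    public static void main(String[] args) {
        double[] qwt = {0.9, 0.7, 0.5};   // user-selected scenario weights (sample)
        double[] eq  = {0.3, 0.1};        // weights of effected scenarios (sample)
        double qaWt  = effectiveWeight(qwt, eq) + averageWeight(qwt); // QAWt_i (assumed operator)
        double[] qToQ = {0.9, 0.9};       // row of Table I for this attribute
        double qar = rank(qaWt, qToQ);    // quality attribute rank
        double[] qToT = {0.9, 0.5, 0.9};  // row of Table II
        double tr = rank(qar, qToT);      // tactics rank
        double[] tToAS = {0.9, 0.9, 0.5}; // row of Table III
        System.out.println("ASR = " + rank(tr, tToAS)); // architecture style rank
    }
}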
The user first opens the main page and selects the non-functional requirements. He then selects the scenario questions and the corresponding weights according to his requirements; a single user can select more than one non-functional requirement. When the user presses the submit button, the result page opens, showing the Average Weight, Effective Weight, Quality Attribute Weight, Quality Attribute Rank, Tactics Rank, and the Architectural Styles with their Ranks.
VI. RELATED WORK
The work in this paper is inspired by the original work in the area of architectural design guidance tools by Thomas Lane [4], and it is partially influenced by the research in [2, 5], [6], [7], [8], [9], [11], [12], [13], and [14]. In [5], NFRs such as accuracy, security, and performance have been utilized to study software systems.

Figure 2. A simple form-based scenario
Figure 3. The evaluation results

In [6], the authors analyzed the architectural styles using modifiability, performance, and reusability. Their study provided preliminary support for the usefulness of architectural styles. The work by Bass et al. [8] introduces the notion of design principles and scenarios that can be utilized to identify and implement the quality characteristics of a system. In [7], the authors discussed the identification of the architecturally significant requirements and their impact and role in assessing and recovering software architecture. In [9], the authors proposed an approach to elicit NFRs and a process by which software architects can obtain the conceptual models.

In [14], the authors proposed a systematic method to extract architecturally significant requirements and the manner in which these requirements would be integrated into the conceptual representation of the system under development. The method works with the computation, communication, and coordination aspects of a system to select the most suitable generic architecture. The selected architecture is then deemed the starting point and is subjected to further assessment and/or refinement to meet all other user expectations.

In [13], the authors developed a set of systematic approaches based on tactics that can be applied to select appropriate software architectures. More specifically, they developed a set of methods, namely ATAM (Architecture Tradeoff Analysis Method), SAAM (Software Architecture Analysis Method), and ARID (Active Reviews for Intermediate Designs). Our approach has been influenced by [13]; we applied tactics and QAs to select an optimal architecture. However, the main differences between our approach and the methods developed by Clements et al. [13] are: 1) our method utilizes a different set of design principles and proven designs, 2) it establishes the correlation within QAs and tactics using tables, 3) it establishes the proper correlation between QAs, tactics, and architectural styles using a set of tables, and 4) it implements scenarios, which are meant to increase the accuracy of the evaluation and of the architectural recommendations.
VII. CONCLUSIONS AND FUTURE WORK
In this paper, we created a tool based on a set of scenarios that allows the user to select an architecture based on non-functional requirements. Non-functional requirements are mapped to tactics using weighting, and the architecture is then selected by its compatibility with the highest-scoring design principles. We believe this approach has a lot of merit. However, more research work will be required to create a complete set of scenarios with a closer coupling to quality attributes. Additional work may also be required in fine-tuning the mappings between nonfunctional and functional requirements.

Currently, our tool can be utilized to derive and/or recommend architectural styles based on NFRs. To validate the practicality and usefulness of our approach, we plan to conduct a series of experiments in the form of case studies in which the actual architectural recommendations from our tool will be compared to the design recommendations of architects. We have discussed some quality attributes, some design tactics and some architecture styles; more research work is needed on other quality attributes, tactics and architecture styles, and further work on non-functional requirements might be done by project members in the future. Our tool provides facilities for the addition, deletion and modification of new non-functional requirements, new tactics, and new architecture styles.
REFERENCES
[1] R. Aris, Mathematical Modeling Techniques, London; San Francisco: Pitman (Dover, New York), 1994.
[2] N. Medvidovic, P. Gruenbacher, A. Egyed, and B. Boehm, Proceedings of the 13th International Conference on Software Engineering and Knowledge Engineering (SEKE'01), Buenos Aires, Argentina, June 2001.
[3] F. Buschmann, R. Meunier, H. Rohnert, P. Sommerlad, and M. Stal, Pattern-Oriented Software Architecture: A System of Patterns, John Wiley, 1996.
[4] T. Lane, User Interface Software Structure, Ph.D. thesis, Carnegie Mellon University, May 1990.
[5] L. Chung and B. Nixon, Dealing with Non Functional Requirements: Three Experimental Studies of a Process-Oriented Approach, Proceedings of the International Conference on Software Engineering (ICSE'95), Seattle, USA, 1995.
[6] M. Shaw and D. Garlan, Software Architecture: Perspectives on an Emerging Discipline, Prentice Hall, 1996.
[7] M. Jazayeri, A. Ran, and F. Linden, Software Architecture for Product Families, Addison-Wesley, 2000.
[8] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice, second edition, Addison-Wesley, 2003.
[9] L. Cysneiros and J. Leite, Nonfunctional Requirements: From Elicitation to Conceptual Models, IEEE Transactions on Software Engineering, vol. 30, no. 5, May 2004.
[10] B. Boehm, A. Egyed, J. Kwan, D. Port, A. Shah, and R. Madachy, Using the WinWin Spiral Model: A Case Study, IEEE Computer, 1998.
[11] A. Lamsweerde, From System Goals to Software Architecture, in Formal Methods for Software Architecture, LNCS 2804, Springer-Verlag, 2003.
[12] L. Chung, B. Nixon, E. Yu, and J. Mylopoulos, Nonfunctional Requirements in Software Engineering, Kluwer Academic, Boston, 2000.
[13] P. Clements, R. Kazman, and M. Klein, Evaluating Software Architectures: Methods and Case Studies, Addison-Wesley, 2002.
[14] H. Reza and E. Grant, Quality Oriented Software Architecture, The IEEE International Conference on Information Technology: Coding and Computing (ITCC'05), Las Vegas, USA, April 2005.
[15] J. A. McCall, Quality Factors, Software Engineering Encyclopedia, Vol. 2, J. J. Marciniak ed., Wiley, 1994, pp. 958-971.
[16] B. Boehm and H. Hoh, Identifying Quality-Requirement Conflicts, IEEE Software, pp. 25-36, Mar. 1996.
[17] M. C. Paulk, The ARC Network: A Case Study, IEEE Software, vol. 2, pp. 61-69, May 1985.
[18] M. Chen and R. J. Norman, A Framework for Integrated CASE, IEEE Software, vol. 9, pp. 18-22, March 1992.
[19] S. T. Albin, The Art of Software Architecture: Design Methods and Techniques, John Wiley and Sons, 2003.
[20] P. Bengtsson, Architecture-Level Modifiability Analysis, Doctoral Dissertation Series No. 2002-2, Blekinge Institute of Technology, 2002.
AUTHORS PROFILE
Mr. Mahesh Parmar is an Assistant Professor in the CSE Department at LNCT
Bhopal, with 2 years of academic and professional experience. He has
published 5 papers in international journals and conferences. He received
his M.E. degree in Computer Engineering from SGSITS Indore in July 2010;
his other qualifications include a B.E. (Computer Science and Engineering,
2006). His areas of expertise are Software Architecture and Software
Engineering.
Dr. W.U. Khan holds a PhD (Computer Engg.) and a Post Doctorate (Computer
Engg.). He is a Professor in the Computer Engineering Department at Shri
G.S. Institute of Technology and Science, Indore, India.
Dr. Binod Kumar is HOD and Associate Professor in the MCA Department at
LNCT Bhopal, with 12.5 years of academic and professional experience. He
is an editorial board member and technical reviewer for seven (07)
international journals in computer science, and has published 11 papers in
international and national journals. He received his Ph.D. degree in
Computer Science from Saurashtra University in June 2010; his other
qualifications include an M.Phil (Computer Sc., 2006), MCA (1998) and
M.Sc. (1995). His areas of expertise are Data Mining, Bioinformatics and
Software Engineering.
To Generate the Ontology from Java Source Code: OWL Creation
Gopinath Ganapathy (1)
(1) Department of Computer Science,
Bharathidasan University,
Trichy, India.
[email protected]

S. Sagayaraj (2)
(2) Department of Computer Science,
Sacred Heart College,
Tirupattur, India
[email protected]
Abstract: Software development teams design new components and code by
employing new developers for every new project. If the company archives
the completed code and components, they can be reused with no further
testing, unlike open source code and components. Program file components
can be extracted from the application files and folders using APIs. The
proposed framework extracts the metadata from the source code using the
QDox code generator and stores it in OWL using the Jena framework
automatically. The source code is stored in the HDFS repository, and code
stored in the repository can be reused for software development. Archiving
all the project files into one ontology enables developers to reuse the
code efficiently.

Keywords: Metadata; QDox; Parser; Jena; Ontology; Web Ontology Language;
Hadoop Distributed File System
I. INTRODUCTION
Today's Web content is huge and designed primarily for human
consumption; it is not well-suited for machine processing. An
alternative approach is to represent Web content in a form that is
more easily machine-processable by using intelligent techniques.
This machine-processable Web is called the Semantic Web. The Semantic
Web will not be a new global information highway parallel to the
existing World Wide Web; instead it will gradually evolve out of the
existing Web [1]. Ontologies are built in order to represent generic
knowledge about a target world [2]. In the Semantic Web, ontologies
can be used to encode meaning into a web page, which enables
intelligent agents to understand the contents of the page. Ontologies
increase the efficiency and consistency of describing resources,
enabling more sophisticated functionality in the development of
knowledge management and information retrieval applications. From the
knowledge management perspective, current technology suffers in
searching, extracting, maintaining and viewing information. The aim
of the Semantic Web is to allow much more advanced knowledge
management systems.
For every new project, software teams design new components and
code by employing new developers. If the company archives the
completed code and components, they can be reused with no further
testing, unlike open source code and components. File content
metadata can be extracted from the application files and folders
using APIs. During development, each developer follows his or her
own methods and logic to perform a task, so different code exists
for the same functionality. For instance, the code to calculate a
factorial can be recursive or non-recursive, and written with
different logic. At the organizational level, a lot of time is spent
re-doing work that has already been done, which has a cumulative
effect on the time spent in development, testing and deployment, and
on the developers themselves. So there is a basic necessity to
create a system that minimizes these factors.
Code reusability is the solution to this problem. It avoids
re-developing and re-testing existing work; as the archived code has
already undergone a rigorous software development life cycle, it will
be robust and error-free. There is no need to re-invent the wheel.
Code reusability has been studied for more than two decades, but it
is still largely syntactic in nature. The aim of this paper is to
extract the methods of a project and store the metadata about those
methods in OWL; the OWL ontology stores the structure of the methods.
The code itself is then stored in a distributed environment, so that
a software company located in various geographical areas can access
it. To reuse the code, a tool can be created that extracts metadata
such as function, definition, type, arguments, brief description,
author, and so on from the source code and stores it in OWL, while
the source code is stored in the HDFS repository. For a new project,
developers can search for components in the OWL and retrieve them
with ease [3].
The paper begins with a note on the related technology in Section 2.
The features and framework of the source code extractor are described
in Section 3. The metadata extraction from the source code is covered
in Section 4, and the storage of the extracted metadata in OWL using
the Jena framework in Section 5. The implementation scenario is
presented in Section 6. Section 7 deals with the findings and future
work of the paper.
II. RELATED WORK
A. Metadata
Metadata is defined as data about data, or descriptions of stored
data. Metadata management is about defining, creating, updating,
transforming, and migrating all types of metadata that are relevant
and important to a user's objectives. Some metadata can be seen
easily by users, such as file dates and file sizes, while other
metadata can be hidden. Metadata standards include not only those for
modeling and exchanging metadata, but also the vocabulary and
knowledge for ontology [4]. Many efforts have been made to
standardize metadata, but all of them belong to some specific group
or class. The Dublin Core Metadata Initiative (DCMI) [5] is perhaps
the largest candidate for defining metadata. It is a simple yet
effective element set for describing a wide range of networked
resources and comprises 15 elements; Dublin Core is most suitable for
document-like objects. IEEE LOM [6] is a metadata standard for
learning objects, with approximately 100 fields to define any
learning object. Medical Core Metadata (MCM) [7] is a standard
metadata scheme for health resources. The MPEG-7 [8] multimedia
description schemes provide metadata structures for describing and
annotating multimedia content. Standard knowledge ontology is also
needed to organize such types of metadata as content metadata and
data usage metadata.
B. Hadoop & HDFS
The Hadoop project promotes the development of open source software
and supplies a framework for the development of highly scalable
distributed computing applications [9]. Hadoop is a free, Java-based
programming framework that supports the processing of large data sets
in a distributed computing environment, as well as data-intensive
distributed applications. Hadoop is designed to process large volumes
of information efficiently [10]. It connects many commodity computers
so that they can work in parallel, tying these smaller, low-priced
machines into a compute cluster. It offers a simplified programming
model which allows the user to write and test distributed systems
quickly, and it distributes data automatically and efficiently across
machines, in turn exploiting the underlying parallelism of the CPU
cores.
In a Hadoop cluster, data is distributed to all the nodes of the
cluster as it is being loaded in. The Hadoop Distributed File System
(HDFS) breaks large data files into smaller parts which are managed
by different nodes in the cluster. In addition, each part is
replicated across several machines, so that a single machine failure
does not make any data unavailable. The monitoring system
re-replicates the data in response to system failures which would
otherwise result in partial storage. Even though the file parts are
replicated and distributed across several machines, they form a
single namespace, so their contents are universally accessible.
MapReduce [11] is a functional abstraction which provides an
easy-to-understand model for designing scalable, distributed
algorithms.
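To make the MapReduce abstraction concrete, the classic word-count job
below shows the mapper/reducer pair at the heart of the model: mappers
emit (token, 1) pairs in parallel across the cluster, and reducers sum
the counts per token. This is a minimal illustrative sketch using the
standard Hadoop Java API; it is not part of the proposed framework.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Word count: Hadoop schedules many mapper instances in parallel,
// one per HDFS block of the input file.
public class WordCount {

    public static class TokenMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(line.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, ONE);   // emit (token, 1)
            }
        }
    }

    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> counts,
                              Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) {
                sum += c.get();
            }
            context.write(key, new IntWritable(sum)); // (token, total)
        }
    }
}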
C. Ontology
The key component of the Semantic Web is the collection of
information called ontologies. Ontology is a term borrowed from
philosophy that refers to the science of describing the kinds of
entities in the world and how they are related. Gruber defined an
ontology as a "specification of a conceptualization" [12]. An
ontology defines the basic terms and their relationships comprising
the vocabulary of an application domain, and the axioms for
constraining the relationships among terms [13]. This definition
explains what an ontology looks like [14]. The most typical kind of
ontology for the Web has a taxonomy and a set of inference rules.
The taxonomy defines classes of objects and relations among them.
Classes, subclasses and relations among entities are a very powerful
tool for Web use.
A large number of relations among entities can be expressed by
assigning properties to classes and allowing subclasses to inherit
such properties. Inference rules in ontologies supply further power.
An ontology may express rules on the classes and relations in such a
way that a machine can deduce some conclusions. The computer does not
truly "understand" any of this information, but it can now manipulate
the terms much more effectively in ways that are useful and
meaningful to the human user. More advanced applications will use
ontologies to relate the information on a page to the associated
knowledge structures and inference rules.
III. SOURCE CODE EXTRACTOR FRAMEWORK
After the completion of a project, all the project files are sent to
the source code extraction framework, which extracts metadata from
the source code. Only Java projects are used in this framework. A
Java source file, or a folder containing Java files, is passed as
input along with project information such as the description and
version of the project. The framework extracts the metadata from the
source code using the QDox code generator and stores it in OWL using
the Jena framework. The source code is stored in Hadoop's HDFS. A
sketch of the source code extractor tool is shown in Fig. 1.
The source code extraction framework performs two processes:
extracting metadata from the source code using QDox, and storing the
metadata in OWL using Jena. Both operations are performed through
APIs, and the source code extractor integrates them in sequence. The
pseudo code given below describes the entire process of the
framework.
Figure 1. The process of Semantic Stimulus Tool
The framework takes a project folder as input and counts the number
of packages. Each package's information is stored in the OWL. Each
package contains various classes, and each class has many methods;
the class and method information is stored in the OWL. For each
method, information such as the return type, parameters and parameter
types is stored in the OWL. The framework places all this information
in a persistence model, which is then written to the OWL file.
1. Get the package count by passing the file path.
2. Initialize packageCounter to zero.
3. While packageCounter is less than the package count:
   3.1 Store the package[packageCounter] information into the OWL model.
   3.2 Initialize classCounter to zero.
   3.3 Get the class count.
   3.4 While classCounter is less than the class count:
       3.4.1 Store the class[classCounter] information into the OWL model.
       3.4.2 Initialize methodCounter to zero.
       3.4.3 Get the method count of class[classCounter].
       3.4.4 While methodCounter is less than the method count:
             3.4.4.1 Store the method[methodCounter] information into the OWL model.
             3.4.4.2 Store the modifier information of method[methodCounter].
             3.4.4.3 Store the return type of method[methodCounter].
             3.4.4.4 Initialize paramCounter to zero.
             3.4.4.5 Get the parameter count of method[methodCounter].
             3.4.4.6 While paramCounter is less than the parameter count:
                     3.4.4.6.1 Store the parameter[paramCounter] information into the OWL model.
                     3.4.4.6.2 Increase paramCounter by one.
             3.4.4.7 Increase methodCounter by one.
       3.4.5 Increase classCounter by one.
   3.5 Increase packageCounter by one.
4. Write the OWL model to the OWL file.
IV. EXTRACTING METADATA
QDox is a high speed small footprint parser for extracting
classes, interfaces, and method definitions from the source
code. It is designed to be used by active code generators or
documentation tools. This tool extracts the metadata from the
given java source code. To extract the meta-data of the source,
the given order has to be followed. When the java source file or
folder that has the java source file is loaded to QDox, it
automatically performs the iteration. The loaded information is
stored in the JavaBuilder object. From the java builder object
the list of packages, as an array of string, are returned. This
package list has to be looped to get the class information. From
the class information, the method information is extracted. It
returns the array of JavaMethod. Out of these methods, the
information like scope of the method, name of method, return
type of the method and parameter information is extracted.
The QDox process uses its own methods to extract various
metadata from the source code. The getPackage() method lists
all the available packages for a given source. The getClasses()
method lists all the available classes in the package. The
getMethods() method lists all the available methods in a class.
The getReturns() method returns the return type of the method.
The getParameters() method lists all the parameters available
for the method. The getType() method returns the type of the
method. And when the getComment() method is used with
packages, classes and methods, it returns the appropriate
comments. Using the above methods the project informations
such as package, class, method, retune type of the method,
parameters of the method, method type and comments are
extracted by the QDox. These metadata are passed to the next
section for storing in the OWL.
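The loop below is a minimal sketch of this extraction sequence. It
assumes the JavaProjectBuilder entry point of a recent QDox release
(the exact builder class and accessor names vary slightly across QDox
versions; the JavaBuilder object mentioned above plays the same role),
and the source path is the illustrative one from Fig. 2.

import java.io.File;

import com.thoughtworks.qdox.JavaProjectBuilder;
import com.thoughtworks.qdox.model.JavaClass;
import com.thoughtworks.qdox.model.JavaMethod;
import com.thoughtworks.qdox.model.JavaParameter;

public class MetadataDumper {
    public static void main(String[] args) {
        // Parse every .java file under the project source folder.
        JavaProjectBuilder builder = new JavaProjectBuilder();
        builder.addSourceTree(new File("/home/prathap/SourceExtractor/src/"));

        for (JavaClass cls : builder.getClasses()) {
            System.out.println("package: " + cls.getPackageName());
            System.out.println("class:   " + cls.getName()
                    + " (comment: " + cls.getComment() + ")");
            for (JavaMethod m : cls.getMethods()) {
                System.out.println("  method: " + m.getName()
                        + " returns " + m.getReturnType().getFullyQualifiedName());
                for (JavaParameter p : m.getParameters()) {
                    System.out.println("    param: " + p.getName()
                            + " : " + p.getType().getFullyQualifiedName());
                }
            }
        }
    }
}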
V. STORING METADATA IN OWL
To store the metadata extracted by QDox, the Jena framework is used.
Jena is a Java framework for manipulating ontologies defined in RDFS
and OWL Lite [15], and a leading Semantic Web toolkit [16] for Java
programmers. Jena1 and Jena2 were released in 2000 and August 2003
respectively. The main contribution of Jena1 was its rich Model API;
around this API, Jena1 provided various tools, including I/O modules
for RDF/XML [17], [18], N3 [19] and N-Triple [20], and the query
language RDQL [21]. Jena2 has a more decoupled architecture than
Jena1 and provides inference support for both the RDF semantics [22]
and the OWL semantics [23].
Jena contains many APIs, of which only a few are used in this
framework, such as addProperty(), createIndividual() and the write
methods. The addProperty() method stores data and object properties
in the OWL ontology, and createIndividual() creates an individual of
a particular concept. Jena uses an in-memory model to hold the
persistent data, so the model has to be written to the OWL ontology
using the write() method.
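The fragment below is a minimal sketch of these three calls against
the Jena 2 ontology API (the same com.hp.hpl.jena packages used in the
case-study code of Section VI). The namespace and the getModel
individual are illustrative values, not output of the framework.

import java.io.FileOutputStream;

import com.hp.hpl.jena.ontology.DatatypeProperty;
import com.hp.hpl.jena.ontology.Individual;
import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.rdf.model.ModelFactory;

public class OntoWriter {
    static final String NS = "http://www.owl-ontologies.com/SourceExtractorj.owl#";

    public static void main(String[] args) throws Exception {
        OntModel model = ModelFactory.createOntologyModel();

        // A concept for methods and a datatype property for their names.
        OntClass methodClass = model.createClass(NS + "Method");
        DatatypeProperty name = model.createDatatypeProperty(NS + "Name");

        // One individual per extracted method, annotated with its name.
        Individual m = methodClass.createIndividual(NS + "getModel");
        m.addProperty(name, "getModel");

        // Flush the in-memory model to an OWL file.
        model.write(new FileOutputStream("SourceExtractorj.owl"), "RDF/XML-ABBREV");
    }
}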
The OWL construction is done with Protégé, an open source tool for
managing and manipulating OWL [24]. Protégé [25] is the most
complete, supported and widely used framework for building and
analyzing ontologies [26, 27, 28]. The result generated in Protégé is
a static ontology definition [29] that can be analyzed by the end
user. Protégé provides a growing user community with a suite of tools
to construct domain models and knowledge-based applications with
ontologies. At its core, Protégé implements a rich set of
knowledge-modeling structures and actions that support the creation,
visualization, and manipulation of ontologies in various
representation formats. Protégé can be customized to provide
domain-friendly support for creating knowledge models and entering
data. Further, Protégé can be extended by way of a plug-in
architecture and a Java-based API for building knowledge-based tools
and applications.
Based on a study of Java source code, the ontology domain is created
with the following attributes. To store the extracted metadata, the
ontology is created with project, package, class, method and
parameter concepts. The project is a concept that holds information
such as the name, project repository location, project version and
the packages. The package is a concept that holds information such as
the name and the classes. The class is a concept that holds class
information such as the author, class comment, class path,
identifier, name and the methods. The method is a concept that holds
information such as the method name, method comment, method
identifier, isConstructor, return type, and the parameters. The
parameter is a concept that holds information such as the name and
the data type.
Concepts/classes provide an abstraction mechanism for grouping
resources with similar characteristics. Project, package, class,
method and parameter are the concepts in the source code extractor
ontology. An individual is an instance of a concept/class.
A property describes the relation between concepts and objects. It is
a binary relationship on individuals, and each property has a domain
and a range. There are two types of property, namely object
properties and datatype properties.
An object property links individuals to individuals. In the source
code ontology, the object properties are hasClass, hasMethod,
hasPackage and hasParameter. hasClass is an object property with
domain Package and range Class. hasMethod is an object property with
domain Class and range Method. hasPackage is an object property with
domain Project and range Package. hasParameter is an object property
with domain Method and range Parameter.
A datatype property links individuals to data values. Author is a
datatype property with domain Class and range string. ClassComment is
a datatype property with domain Class and range string. DataType is a
datatype property with domain Parameter and range string. Identifier
is a datatype property with domain Method and Class, whose range is
an enumeration of the modifier strings (public, private, protected).
IsConstructor is a datatype property with domain Method and range
string. MethodComment is a datatype property with domain Method and
range string. Name is a datatype property with domain Project,
Package, Class, Method and Parameter, and range string. Project_Date,
Project_Description, Project_Repository_Location and Project_Version
are datatype properties with domain Project and range string. Returns
is a datatype property with domain Method and range string.
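The paper builds this ontology template in Protégé; as an assumption
for illustration, the fragment below shows how the same kind of
schema, one object property and one datatype property with their
domains and ranges, could equally be declared programmatically with
Jena.

import com.hp.hpl.jena.ontology.DatatypeProperty;
import com.hp.hpl.jena.ontology.ObjectProperty;
import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.vocabulary.XSD;

public class SchemaBuilder {
    static final String NS = "http://www.owl-ontologies.com/SourceExtractorj.owl#";

    public static OntModel buildSchema() {
        OntModel model = ModelFactory.createOntologyModel();
        OntClass clazz  = model.createClass(NS + "Class");
        OntClass method = model.createClass(NS + "Method");

        // hasMethod links a Class individual to its Method individuals.
        ObjectProperty hasMethod = model.createObjectProperty(NS + "hasMethod");
        hasMethod.setDomain(clazz);
        hasMethod.setRange(method);

        // Author is a datatype property from Class to a string literal.
        DatatypeProperty author = model.createDatatypeProperty(NS + "Author");
        author.setDomain(clazz);
        author.setRange(XSD.xstring);
        return model;
    }
}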
VI. CASE STUDY
To evaluate the proposed framework, the following simple Java code is used.
package com.sourceExtractor.ontology;

import org.apache.log4j.Logger;

import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.shared.Lock;
import com.hp.hpl.jena.util.FileManager;

/**
 * Manages the ontology-related operations.
 * @author Sagayaraj
 */
public class OntoManager {

    private static final Logger LOGGER = Logger.getLogger(OntoManager.class);

    /** Reads an OWL file from the given location into an in-memory ontology model. */
    public OntModel getModel(String modelLocation) {
        OntModel ontModel = ModelFactory.createOntologyModel();
        ontModel.read(FileManager.get().open(modelLocation), "");
        return ontModel;
    }

    /**
     * Creates an individual of the given concept in the OWL model.
     * addNameSpace() (not shown in this excerpt) prepends the ontology
     * namespace to a local name.
     */
    public void createIndividual(OntModel model, String concept, String individual) {
        OntClass ontClass = model.getOntClass(addNameSpace(concept, model));
        model.enterCriticalSection(Lock.WRITE);
        try {
            if (ontClass != null) {
                ontClass.createIndividual(addNameSpace(individual, model));
            } else {
                LOGGER.error("Direct Class is null");
            }
        } finally {
            model.leaveCriticalSection();
        }
    }
}
The sample Java code is given as input to the QDox document generator
through the graphical user interface (GUI) shown in Fig. 2.
Figure 2. GUI for locating folder
Using the QDox APIs, the metadata is extracted as given in Table I.
The output of QDox stores the metadata in the form of strings. To
store the metadata, an OWL ontology template is created using
Protégé; the strings are passed to the Jena framework, whose APIs
place the metadata into the OWL ontology. The entire project folder,
stored in the HDFS, is linked to the method signatures in the OWL
ontology for retrieval purposes, so the components can be reused
appropriately in new projects. The resulting OWL ontology loads
successfully in both the Protégé editor and Altova SemanticWorks. A
sample OWL file is given below as the output of the framework.
<owl:Ontology rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#"/>
<owl:Class rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Package"/>
<owl:Class rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Project"/>
<owl:Class rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Class"/>
<owl:Class rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Method"/>
<owl:Class rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Parameter"/>
<owl:ObjectProperty rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#hasPackage">
  <rdfs:domain rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Project"/>
  <rdfs:range rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Package"/>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#hasParameter">
  <rdfs:domain rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Method"/>
  <rdfs:range rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Parameter"/>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#hasClass">
  <rdfs:domain rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Package"/>
  <rdfs:range rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Class"/>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#hasMethod">
  <rdfs:range rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Method"/>
  <rdfs:domain rdf:resource="http://www.owl-ontologies.com/SourceExtractorj.owl#Class"/>
</owl:ObjectProperty>
<owl:DatatypeProperty rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Project_Date"/>
<owl:DatatypeProperty rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Identifier">
  <rdfs:domain>
    <owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <owl:Class rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Method"/>
        <owl:Class rdf:about="http://www.owl-ontologies.com/SourceExtractorj.owl#Class"/>
      </owl:unionOf>
    </owl:Class>
  </rdfs:domain>
  <rdfs:range>
    <owl:Class>
      <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#DataRange"/>
      <owl:oneOf rdf:parseType="Resource">
        <rdf:first rdf:datatype="http://www.w3.org/2001/XMLSchema#string">public</rdf:first>
        <rdf:rest rdf:parseType="Resource">
          <rdf:first rdf:datatype="http://www.w3.org/2001/XMLSchema#string">private</rdf:first>
          <rdf:rest rdf:parseType="Resource">
            <rdf:first rdf:datatype="http://www.w3.org/2001/XMLSchema#string">protected</rdf:first>
            <rdf:rest rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
          </rdf:rest>
        </rdf:rest>
      </owl:oneOf>
    </owl:Class>
  </rdfs:range>
</owl:DatatypeProperty>
VII. CONCLUSION AND FUTURE WORK
This paper presents an approach for generating ontologies from source
code using the source code extractor tool. The approach helps to
integrate source code into the Semantic Web. OWL is semantically much
more expressive than needed for the results of our mapping. With
these sample tests, the paper argues that it is indeed possible to
transform source code into OWL using the Source Code Extractor
framework. The OWL created by the framework will increase the
efficiency and consistency of developing knowledge management and
information retrieval applications. The purpose of the paper is to
achieve code reusability for software development. With OWL created
for the source code, future work will be to search for and extract
code and components and reuse them to shorten the software
development life cycle. Open source code can also be used to create
OWL, so that a huge number of components become available for reuse.
By storing projects in OWL and in the HDFS, the corporate knowledge
grows, and developers will reuse code rather than develop it
themselves. Through code reuse, development cost comes down,
development time becomes shorter, resource utilization is lower and
quality goes up.
After developing the OWL and storing the source code in the HDFS, the
code components can be reused. The future work can take off in two
ways. First, one can take a design document from the user as input,
extract the method signatures, and search for matches in the OWL; if
the user is satisfied with a method definition, it can be retrieved
from the HDFS where the source code is stored. Second, one can take
the project specification as input and perform text mining to extract
the keywords as classes and the processes as methods; the method
prototypes can then be used to search and match in the OWL, and the
required method definitions retrieved from the HDFS. The purpose of
storing the metadata in OWL is to minimize factors such as the time
of development, testing and deployment, and the number of developers;
creating OWL using this framework can reduce these factors.
REFERENCES
[1] Grigoris Antoniou and Frank van Harmelen, A Semantic Web Primer, PHI Learning Private Limited, New Delhi, 2010, pp. 1-3.
[2] M. Bunge, Treatise on Basic Philosophy. Ontology I: The Furniture of the World, Vol. 3, Boston: Reidel.
[3] Gopinath Ganapathy and S. Sagayaraj, "Automatic Ontology Creation by Extracting Metadata from the Source Code," in Global Journal of Computer Science and Technology, Vol. 10, Issue 14 (Ver. 1.0), Nov. 2010, pp. 310-314.
[4] Won Kim, "On Metadata Management Technology: Status and Issues," in Journal of Object Technology, vol. 4, no. 2, 2005, pp. 41-47.
[5] Dublin Core Metadata Initiative, <http://dublincore.org/documents/>, 2002.
[6] IEEE Learning Technology Standards Committee, http://ltsc.ieee.org/wg12, IEEE Standard for Learning Object Metadata (1484.12.1).
[7] Darmoni, Thirion, "A Standard Metadata Scheme for Health Resources," Journal of the American Medical Informatics Association, 2000 Jan-Feb; 7(1): 108-109.
[8] MPEG-7 Overview: ISO/IEC JTC1/SC29/WG11 N4980, Klagenfurt, July 2002.
[9] Jason Venner, Pro Hadoop: Build Scalable, Distributed Applications in the Cloud, Apress, 2009.
[10] Gopinath Ganapathy and S. Sagayaraj, "Circumventing Picture Archiving and Communication Systems Server with Hadoop Framework in Health Care Services," in Journal of Social Science, Science Publications 6 (3): pp. 310-314.
[11] Tom White, Hadoop: The Definitive Guide, O'Reilly Media, Inc., 2009.
[12] Gruber, T., "What is an Ontology?," September 2005: http://www.ksl-stanford.edu/kst/what-is-an-ontology.html.
[13] Yang, X., "Ontologies and How to Build Them," (March 2006): http://www.ics.uci.edu/~xwy/publications/area-exam.ps.
[14] Bugaite, D., O. Vasilecas, "Ontology-Based Elicitation of Business Rules," in A. G. Nilsson, R. Gustas, W. Wojtkowski, W. G. Wojtkowski, S. Wrycza, J. Zupancic (eds.), Information Systems Development: Proc. of ISD2004, Springer-Verlag, Sweden, 2006, pp. 795-806.
[15] McCarthy, P., "Introduction to Jena," www-106.ibm.com/developerworks/java/library/j-jena/, 22.02.2005.
[16] B. McBride, "Jena," IEEE Internet Computing, July/August 2002.
[17] J.J. Carroll, "CoParsing of RDF & XML," HP Labs Technical Report HPL-2001-292, 2001.
[18] J.J. Carroll, "Unparsing RDF/XML," WWW2002: http://www.hpl.hp.com/techreports/2001/HPL-2001-292.html.
[19] T. Berners-Lee et al., "Primer: Getting into RDF & Semantic Web using N3," http://www.w3.org/2000/10/swap/Primer.html.
[20] J. Grant, D. Beckett, "RDF Test Cases," 2004, W3C.
[21] L. Miller, A. Seaborne, and A. Reggiori, "Three Implementations of SquishQL, a Simple RDF Query Language," 2002, p. 423.
[22] P. Hayes, "RDF Semantics," 2004, W3C.
[23] P.F. Patel-Schneider, P. Hayes, I. Horrocks, "OWL Semantics & Abstract Syntax," 2004, W3C.
[24] Protégé Semantic Web Framework, http://protege.stanford.edu/overview/protege-owl.html, accessed 16th October 2010.
[25] Protégé, http://protege.stanford.edu/ontologies/ontologyOfScience.
[26] 9th Intl. Protégé Conference, July 23-26, 2006, Stanford, California: http://protege.stanford.edu/conference/2006.
[27] 10th Intl. Protégé Conference, July 15-18, 2007, Budapest, Hungary: http://protege.stanford.edu/conference/2007.
[28] 11th Intl. Protégé Conference, June 23-26, 2009, Amsterdam, Netherlands: http://protege.stanford.edu/conference/2009.
[29] Hai H. Wang, Natasha Noy, Alan Rector, Mark Musen, Timothy Redmond, Daniel Rubin, Samson Tu, Tania Tudorache, Nick Drummond, Matthew Horridge, and Julian Sedenberg, "Frames and OWL Side by Side," in 10th International Protégé Conference, Budapest, Hungary, July 2007.
AUTHORS PROFILE
Gopinath Ganapathy is the Professor & Head, Department of Computer Science
and Engineering, Bharathidasan University, India. He obtained his
under-graduation and post-graduation from Bharathidasan University, India
in 1986 and 1988 respectively, and submitted his Ph.D in 1996 at Madurai
Kamaraj University, India. He received the Young Scientist Fellow Award
for the year 1994 and subsequently did research work at IIT Madras. He has
published around 20 research papers. He is a member of IEEE, ACM, CSI, and
ISTE. He was a consultant for 8.5 years in international firms in the USA
and the UK, including IBM, Lucent Technologies (Bell Labs) and Toyota. His
research interests include the Semantic Web, NLP, Ontology, and Text
Mining.
S. Sagayaraj is an Associate Professor in the Department of Computer
Science, Sacred Heart College, Tirupattur, India. He received his
Bachelor's degree in Mathematics from Madras University, India in 1985,
completed his Master of Computer Applications at Bharathidasan University,
India in 1988, and obtained a Master of Philosophy in Computer Science
from Bharathiar University, India in 2001. He registered for the Ph.D.
programme at Bharathidasan University, India in 2008. His research
interests include Data Mining, Ontologies and the Semantic Web.
TABLE I. METADATA EXTRACTED FROM THE SAMPLE CODE

Project
  Project Name: Ontology_Learn
  Project Version: 1.0.0
  Project Date: 10/10/10
  Repository Location: /opt/SourceCodeExtrctor/
  HasPackage: com.sourceExtractor.ontology

Package
  Name: com.sourceExtractor.ontology
  HasClass: OntoManager

Class
  Name: OntoManager
  Class Comment: It manages the ontology operation
  Class Path: /SampleOntology/com/sourceExtractor/ontology/OntoManager.java
  Author: Sagayaraj
  Identifier: Public
  HasMethod: getModel, createIndividual

Method              getModel        | createIndividual
  Identifier:       Public          | Public
  Returns:          OntModel        | Void
  Method Comment:   -undefined-     | To add the data property in owl file
  IsConstructor:    FALSE           | FALSE
  HasParameter:     modelLocation   | model, concept, individual

Parameter
  Name: modelLocation   Data Type: java.lang.String
  Name: individual      Data Type: java.lang.String
  Name: model           Data Type: OntModel
  Name: concept         Data Type: java.lang.String
Query based Personalization in Semantic Web Mining
Mahendra Thakur
Department of CSE
Samrat Ashok Technological
Institute
Vidisha, M.P., India
Yogendra Kumar Jain
Department of CSE
Samrat Ashok Technological
Institute
Vidisha, M.P., India
Geetika Silakari
Department of CSE
Samrat Ashok Technological
Institute
Vidisha, M.P., India
Abstract: To provide personalized support in online course resource
systems, a Semantic Web-based personalized learning service is
proposed to enhance the learner's learning efficiency. When a
personalization system relies solely on usage-based results, however,
valuable information conceptually related to what is finally
recommended may be missed. Moreover, the structural properties of the
web site are often disregarded. In this paper, we present a
personalized Web search system which helps users to get relevant web
pages based on their selection from a domain list. In the first part
of our work we present Semantic Web Personalization, a
personalization system that integrates usage data with content
semantics, expressed in ontology terms, in order to compute
semantically enhanced navigational patterns and effectively generate
useful recommendations. To the best of our knowledge, the proposed
technique is the only semantic web personalization system that may be
used by non-semantic web sites. In the second part of our work, we
present a novel approach for enhancing the quality of recommendations
based on the underlying structure of a web site. We introduce UPR
(Usage-based PageRank), a PageRank-style algorithm that relies on the
recorded usage data and link analysis techniques, based on the user's
interested domains and the user's query.

Keywords: Semantic Web Mining; Personalized Recommendation;
Recommender System
I. INTRODUCTION
Compared with the traditional face-to-face learning style, e-learning
is indeed a revolutionary way to provide education in the life-long
term. However, different learners have different learning styles,
goals, previous knowledge and other preferences; the traditional
"one-size-fits-all" learning method is no longer enough to satisfy
the needs of learners. Nowadays more and more personalized systems
are being developed, trying to personalize the learning process,
which affects the learning outcome.
The Semantic Web is not a separate web but an extension of the
current one, in which information is given well-defined meaning,
better enabling computers and people to work in cooperation [1]. In a
Semantic Web-based learning system, the learning information is
well-defined, and the machine can understand and deal with the
semantics of the learning contents, providing adaptable learning
services with powerful technical support.
Figure 1: The web personalization process
The problem of providing recommendations to the visitors of a web
site has received a significant amount of attention in the related
literature. Most of the research efforts in web personalization
correspond to the evolution of extensive research in web usage
mining, taking into consideration only the navigational behavior of
the (anonymous or registered) visitors of the web site. Pure
usage-based personalization, however, presents certain shortcomings.
This may happen when, for instance, there is not enough usage data
available in order to extract patterns related to certain
navigational actions, or when the web site's content changes and new
pages are added but are not yet included in the web logs. Moreover,
taking into consideration the temporal characteristics of the web in
terms of its usage, such systems are very vulnerable to the training
data used to construct the predictive model. As a result, a number of
research approaches integrate other sources of information, such as
the web content or the web structure, in order to enhance the web
personalization process [1] and [2].
As already implied, the user's navigation is largely driven by
semantics. In other words, in each visit, the user usually aims at
finding information concerning a particular subject. Therefore, the
underlying content semantics should be a dominant factor in the
process of web personalization. The web site's content
characterization process involves feature extraction from the web
pages. Usually these features are keywords subsequently used to
retrieve similarly characterized content. Several methods for
extracting keywords that characterize web content have been proposed.
The similarity between documents is usually based on exact matching
between these terms. This way, however, only a binary matching
between documents is achieved, and no actual semantic similarity is
taken into consideration. The need for a more abstract representation
that will enable a uniform and more flexible document matching
process imposes the use of semantic web structures, such as
ontologies. By mapping the keywords to the concepts of an ontology,
or topic hierarchy, the problem of binary matching can be surpassed
through the use of the hierarchical relationships and/or the semantic
similarities among the ontology terms, and therefore the documents.
Finally, we should take into consideration that the web is not just a
collection of documents browsed by its users. The web is a directed
labeled graph, including a plethora of hyperlinks that interconnect
its web pages. Both the structural characteristics of the web graph
and the underlying semantics of the web pages and hyperlinks are
important and determinative factors in the user's navigational
process. The main contribution of this paper is a set of novel
techniques and algorithms aimed at improving the overall
effectiveness of the web personalization process through the
integration of the content and the structure of the web site with the
users' navigational patterns. In the first part of our work we
present the Semantic Web Personalization system, which integrates
usage data with content semantics in order to compute semantically
enhanced navigational patterns and effectively generate useful
recommendations. Similar to previously proposed approaches, the
proposed personalization framework uses ontology terms to annotate
the web content and the users' navigational patterns. The key
departure from earlier approaches, however, is that Semantic Web
Personalization is the only web personalization framework that
employs automated keyword-to-ontology mapping techniques, while
exploiting the underlying semantic similarities between ontology
terms. Apart from the novel recommendation algorithms we propose, we
also emphasize a hybrid structure-enhanced method for annotating web
content. To the best of our knowledge, Semantic Web Personalization
is the only semantic web personalization system that can be used by
any web site, given only its web usage logs and a domain-specific
ontology [3] and [4].
II. BACKGROUND
The main data source in the web usage mining and personalization
process is the information residing in the web site's logs. Web logs
record every visit to a page of the web server hosting the site. The
entries of a web log file consist of several fields which represent
the date and time of the request, the IP address of the visitor's
computer (client), the URI requested, the HTTP status code returned
to the client, and so on. The web log file format is based on the
so-called extended log format.
Prior to processing the usage data using web mining or
personalization algorithms, the information residing in the web logs
should be preprocessed. Web log data preprocessing is an essential
phase in the web usage mining and personalization process, and an
extensive description of it can be found in the literature. In the
sequel, we provide a brief overview of the most important
preprocessing techniques, providing the related terminology in
parallel. The first issue in the preprocessing phase is data
preparation. Depending on the application, the web log data may need
to be cleaned of entries involving page accesses that returned, for
example, an error, or accesses to graphics files. Furthermore,
crawler activity usually should be filtered out, because such entries
do not provide useful information about the site's usability. A very
common problem to be dealt with is web page caching. When a web
client accesses an already cached page, this access is not recorded
in the web site's log, so important information concerning web path
visits is missed. Caching is heavily dependent on the client-side
technologies used and therefore cannot be dealt with easily. In such
cases, cached pages can usually be inferred using the referrer
information from the logs and certain heuristics, in order to
re-construct the user paths, filling in the missing pages. After all
page accesses are identified, page view identification should be
performed. A page view is defined as the visual rendering of a web
page in a specific environment at a specific point in time. In other
words, a page view consists of several items, such as frames, text,
graphics and scripts, that construct a single web page. Therefore,
the page view identification process involves determining the
distinct log file accesses that contribute to a single page view.
Again, such a decision is application-oriented. In order to
personalize a web site, the system should be able to distinguish
between different users or groups of users. This process is called
user profiling. In case no information other than what is recorded in
the web logs is available, this process results in the creation of
aggregate, anonymous user profiles, since it is not feasible to
distinguish among individual visitors. However, if user registration
is required by the web site, the information residing in the web log
data can be combined with the users' demographic data, as well as
with their individual ratings or purchases. The final stage of log
data preprocessing is the partition of the web log into distinct user
and server sessions. A user session is defined as a delimited set of
user clicks across one or more web servers, whereas a server session,
also called a visit, is defined as a collection of user clicks to a
single web server during a user session. If no other means of session
identification, such as cookies or session ids, is used, session
identification is performed using time heuristics, such as setting a
minimum timeout and assuming that consecutive accesses within it
belong to the same session, or a maximum timeout, assuming that two
consecutive accesses that exceed it belong to different sessions [1]
and [5] and [6].
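As an illustration of the maximum-timeout heuristic, the sketch below
groups cleaned log hits into sessions per IP address, starting a new
session whenever the gap between consecutive hits from the same
visitor exceeds 20 minutes (the limit used in Section IV). The Hit
record and class names are illustrative, not part of the described
system.

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class Sessionizer {
    static final long MAX_GAP_MS = 20 * 60 * 1000; // 20-minute timeout

    /** A single cleaned log entry: visitor IP and request timestamp. */
    public static class Hit {
        final String ip;
        final long timestamp;
        Hit(String ip, long timestamp) { this.ip = ip; this.timestamp = timestamp; }
    }

    /** Splits hits (assumed sorted by time) into sessions per IP. */
    public static List<List<Hit>> sessionize(List<Hit> hits) {
        Map<String, List<Hit>> open = new LinkedHashMap<>();
        List<List<Hit>> sessions = new ArrayList<>();
        for (Hit h : hits) {
            List<Hit> current = open.get(h.ip);
            if (current == null
                    || h.timestamp - current.get(current.size() - 1).timestamp > MAX_GAP_MS) {
                current = new ArrayList<>();   // gap exceeded: start a new session
                sessions.add(current);
                open.put(h.ip, current);
            }
            current.add(h);
        }
        return sessions;
    }
}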
A. Web Usage Mining and Personalization:
Web usage mining is the process of identifying representative trends
and browsing patterns describing the activity on the web site, by
analyzing the users' behavior. Web site administrators can then use
this information to redesign or customize the web site according to
the interests and behavior of its visitors, or to improve the
performance of their systems. Moreover, the managers of e-commerce
sites can acquire valuable business intelligence, creating consumer
profiles and achieving market segmentation. Various methods exist for
analyzing the web log data. Some research studies use well-known data
mining techniques such as association rules discovery, sequential
pattern analysis, clustering, probabilistic models, or a combination
of them. Since web usage mining analysis was initially strongly
correlated with data warehousing, there also exist some research
studies based on OLAP cube models. Finally, some proposed web usage
mining approaches require registered user profiles, or combine the
usage data with semantic meta-tags incorporated in the web site's
content. Furthermore, this knowledge can be used to automatically or
semi-automatically adjust the content of the site to the needs of
specific groups of users, i.e. to personalize the site. As already
mentioned, web personalization may include the provision of
recommendations to the users, the creation of new index pages, or the
generation of targeted advertisements or product promotions.
Usage-based personalization systems use association rules and
sequential pattern discovery, clustering, Markov models, machine
learning algorithms, or collaborative filtering in order to generate
recommendations. Some research studies also combine two or more of
the aforementioned techniques [2] and [4].
B. Integrating Content Semantics in Web Personalization:
Several frameworks supporting the claim that the incorporation of
information related to the web site's content enhances the web
personalization process have been proposed prior or subsequent to our
work. In this section we overview in detail the ones that are most
similar to ours, in terms of using a domain ontology to represent the
web site's content.
Dai and Mobasher proposed a web personalization framework that uses
ontologies to characterize the usage profiles used by a collaborative
filtering system. These profiles are transformed to domain-level
aggregate profiles by representing each page with a set of related
ontology objects. In this work, the mapping of content features to
ontology terms is assumed to be performed either manually, or using
supervised learning methods. The defined ontology includes classes
and their instances, therefore the aggregation is performed by
grouping together different instances that belong to the same class.
The recommendations generated by the proposed collaborative system
are in turn derived by binary matching of the current user visit,
expressed as ontology instances, against the derived domain-level
aggregate profiles, and no semantic similarity measure is used. The
idea of semantically enhancing the web logs using ontology concepts
is independently described in recent work. That framework is based on
a semantic web site built on an underlying ontology, and the authors
present a general framework where data mining can then be performed
on these semantic web logs to extract knowledge about groups of
users, users' preferences, and rules. Since the proposed framework is
built on a semantic web knowledge portal, the web content is already
semantically annotated; the work focuses solely on web mining and
thus does not perform any further processing in order to support web
personalization.
In that work the semantic annotation is obtained through the existing
RDF annotations, and no further automation is provided. Another work
proposes a general personalization framework based on the conceptual
modeling of the users' navigational behavior. The proposed
methodology involves mapping each visited page to a topic or concept,
imposing a concept hierarchy (taxonomy) on these topics, and then
estimating the parameters of a semi-Markov process defined on this
tree, based on the observed user paths. In this Markov-model-based
work, the semantic characterization of the content is performed
manually, and no semantic similarity measure is exploited for
enhancing the prediction process, except for
generalizations/specializations of the ontology terms. Finally, a
subsequent work explores the use of ontologies in the user profiling
process within collaborative filtering systems. This work focuses on
recommending academic research papers to the academic staff of a
university. The authors represent the acquired user profiles using
terms of a research paper ontology (an is-a hierarchy), and research
papers are also classified using ontological classes. In this hybrid
recommender system, which is based on collaborative and content-based
recommendation techniques, the content is characterized with ontology
terms using document classifiers (therefore a manual labeling of the
training set is needed), and the ontology is again used for making
generalizations/specializations of the user profiles [7] and [8] and
[9].
C. Integrating Structure in Web Personalization:
Although the connectivity features of the web graph have been
extensively used for personalizing web search results, only a few
approaches take them into consideration in the web site
personalization process. One approach uses citation and coupling
network analysis techniques in order to conceptually cluster the
pages of a web site; the proposed recommendation system is based on
Markov models. Previous work uses the degree of connectivity between
the pages of a web site as the determinant factor for switching among
recommendation models based on either frequent itemset mining or
sequential pattern discovery. Nevertheless, none of the
aforementioned approaches fully integrates link analysis techniques
into the web personalization process by exploiting the notion of the
authority or importance of a web page in the web graph.
A very recent work addresses the data sparsity problem of
collaborative filtering systems by creating a bipartite graph and
calculating linkage measures between unconnected pairs in order to
select candidates and make recommendations; in this study the graph
nodes represent both users and rated/purchased items. Finally, a
subsequent work independently proposed two link analysis ranking
methods, Site Rank and Popularity Rank, which are in essence very
much like the proposed variations of our UPR algorithm (PR and SUPR
respectively). That work focuses on the comparison of the
distributions and the rankings of the two methods rather than
proposing a web personalization algorithm [9] and [10].
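To make the notion of a usage-based, PageRank-style ranking concrete
before it is used in the sequel, the sketch below runs a damped power
iteration in which each page's rank is distributed to its successors
in proportion to recorded click counts. This is a minimal illustration
of the general idea only, not the exact UPR formulation; the damping
factor 0.85 is the conventional PageRank choice.

public class UsagePageRank {
    /**
     * usage[i][j] = recorded click count from page i to page j.
     * Returns an importance score per page after a fixed number of
     * damped power iterations.
     */
    public static double[] rank(double[][] usage, int iterations) {
        int n = usage.length;
        double d = 0.85;
        double[] rank = new double[n];
        java.util.Arrays.fill(rank, 1.0 / n);
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            java.util.Arrays.fill(next, (1 - d) / n);
            for (int i = 0; i < n; i++) {
                double out = 0;
                for (int j = 0; j < n; j++) out += usage[i][j];
                if (out == 0) continue;          // dangling page: no outgoing clicks
                for (int j = 0; j < n; j++) {
                    // distribute rank proportionally to observed click traffic
                    next[j] += d * rank[i] * (usage[i][j] / out);
                }
            }
            rank = next;
        }
        return rank;
    }
}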
III. PROPOSED TECHNIQUE
In this paper, we present Semantic Enhancement for Web
Personalization, a web personalization framework that integrates
content semantics with the users' navigational patterns, using
ontologies to represent both the content and the usage of the web
site. In our proposed framework we employ web content mining
techniques to derive semantics from the web site's pages. These
semantics, expressed in ontology terms, are used to create
semantically enhanced web logs, called C-logs (concept logs).
Additionally, the site is organized into thematic document clusters.
The C-logs and the document clusters are in turn used as input to the
web mining process, resulting in the creation of a broader,
semantically enhanced set of recommendations. The whole process
bridges the gap between the Semantic Web and Web Personalization
areas, to create a Semantic Web Personalization system.
A. Semantic Enhancement for Web Personalization System Architecture:
Semantic Enhancement for Web Personalization uses a combination of
web mining techniques to personalize a web site. In short, the web
site's content is processed and characterized by a set of ontology
terms (categories). The web personalization process includes (a) the
collection of Web data, (b) the modeling and categorization of these
data (preprocessing phase), (c) the analysis of the collected data,
and (d) the determination of the actions that should be performed.
When a user sends a query to a search engine, the search engine
returns the URLs of documents matching all or one of the terms,
depending on both the query operator and the algorithm used by the
search engine. Ranking is the process of ordering the returned
documents in decreasing order of relevance, that is, so that the best
answers are at the top. When the user enters a query, the query is
first analyzed. The query is given as input to the semantic search
algorithm for separation of nouns, verbs, adjectives and negations,
and for assigning weights respectively. The processed data is then
given to the personalized URL-rank algorithm for personalizing the
results according to the user's domain, interest and need. The sorted
results are those in which the user is interested. The
personalization can be enhanced by categorizing the results according
to their types. Thus, after building the knowledge base, the system
can give the user recommendations based on the similarity of the
user's interested domain and the user's query. The recommendation
procedure of the system has two steps:
- The system gives the user a list of interested domains and detects
  the user's current interested domain.
- Based on the user's current interested domain, combined with his or
  her profile, the system gives him or her a set of URLs with ranking
  scores.
In this way, the system can help the user to retrieve his or her
potentially interesting domains. Besides, a user can change his or
her current interested domain by clicking the interested domain list
on the same page, with more convenience. In the beginning, if the
user does not have a profile in the database, the system displays the
available domains to the user, and then keeps track of the user's
selections. The user's selections are used to construct a table that
is used in the URL weight calculation. The current interested domains
recommendation is based on the last selections. Figure 2 shows the
complete process.
Figure 2: Web Personalization architecture
B. Recommendation process:
The learner's implicit query, defined previously under both of its
shapes, constitutes the input of the recommendation phase. The
recommendation task is accomplished using two basic approaches:
content-based filtering (CBF) and collaborative filtering (CF)
(Figure 3). First, we apply the CBF approach alone, using the search
functionalities of the search engine: we submit the term vector to
the search engine in order to compute recommendation links, and
results are ranked according to the cosine similarity of their
content (vector of TF-IDF weighted terms) with the submitted term
vector. Second, we apply the collaborative approach (CF) alone by
first comparing the sliding-window pages to clusters (groups of
learners obtained in the offline phase by applying a two-level
model-based collaborative filtering approach) in order to classify
the active learner into one of the learner groups. Then, we use the
association rules (ARs) of the corresponding group to give
personalized recommendations; the current session window is matched
against the "condition", or left side, of each rule.
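The ranking step above amounts to the standard cosine measure over
sparse TF-IDF term vectors, sketched below as an illustrative helper
(not code from the system):

import java.util.Map;

public class CosineSimilarity {
    /** Cosine similarity between two sparse TF-IDF weighted term vectors. */
    public static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double w = b.get(e.getKey());
            if (w != null) dot += e.getValue() * w;   // shared terms only
            normA += e.getValue() * e.getValue();
        }
        for (double w : b.values()) normB += w * w;
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}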
It is worth noting that several recommendation strategies
using these approaches have been investigated in our work.
After applying a CF and CBF approaches alone, we included
next the possibility to combine both of the recommendation
approaches (CBF and CF) in order to improve the
recommendation quality and generate the most relevant
learning objects to learners. Hence, two approaches are to be
considered: Hybrid content via profile based collaborative
filtering with cascaded/feature augmentation combination,
which performs collaborative recommendation followed by
content recommendation (the reverse order could also be
considered); and Hybrid content and profile based collaborative
filtering with weighted combination, where the collaborative
filtering and content based filtering recommendations are
performed simultaneously, then the results of both techniques
are combined together to produce a single recommendation set.
In the Hybrid content via profile-based collaborative filtering with cascaded/feature augmentation combination approach, we first apply the CF approach, giving as output a set of recommended links; then we apply the CBF approach to these links. In fact,
recommended links are mapped to a set of content terms in
order to compose a term vector (top k frequent terms); a parser tool must be used for this task. Finally, these terms are submitted to the search engine, which returns the final
recommended links.
Figure 3: Recommendation process
In the Hybrid content and profile-based collaborative filtering with weighted combination approach, the collaborative filtering and content-based filtering are performed separately; then the results of both techniques are combined to produce a single recommendation set.
C. The weighted combination process uses the following steps:
I. Step 1 is performed in the same way as in the CF approach; the result is called Recommended Set 1;
II. Step 2 maps each LO reference in the sliding window to a set of content terms (top k frequent terms). These terms are then submitted to the search engine, which returns recommended links. This result is called Recommended Set 2;
III. Final collaborative and content-based filtering recommendation combination: both recommended sets obtained previously are combined to form a coherent list of related recommendation links, ranked by their overlap ratio (a minimal sketch follows).
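A minimal Python sketch of this final combination step: links recommended by both techniques float to the top. Scoring by simple overlap count is the plainest reading of "ranked based on their overlap ratio" and is our assumption.

def combine_recommendations(set1, set2):
    """Merge two recommended link sets; links found by both methods rank first."""
    def overlap(link):
        return (link in set1) + (link in set2)   # 2 if both methods agree
    return sorted(set(set1) | set(set2), key=overlap, reverse=True)

rec_cf  = ["lo/12", "lo/7", "lo/3"]   # Recommended Set 1 (collaborative)
rec_cbf = ["lo/7", "lo/9"]            # Recommended Set 2 (content based)
print(combine_recommendations(rec_cf, rec_cbf))   # 'lo/7' comes first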
IV. METHODOLOGY
Data Set: The two key advantages of using this data set are that the web site contains web pages in several formats (such as pdf, html, ppt, doc, etc.), written both in Greek and English, and that a domain-specific concept hierarchy is available (the web administrator created a concept hierarchy of 150 categories that describe the site's content). On the other hand, its context is rather narrow, as opposed to web portals, and its visitors are divided into two main groups: students and researchers. Therefore, the subsequent analysis (e.g. association rules) uncovers these trends: visits to course material, or visits to publications and researcher details. It is essential to point out that the need for processing online (up-to-date) content made it impossible for us to use other publicly available web log sets, since all of them were collected many years ago and the relevant sites' content is no longer available. Moreover, the web logs of popular web sites or portals, which would be ideal for our experiments, are considered to be personal data and are not disclosed by their owners. To overcome these problems, we collected web logs over a 1-year period (01/01/10 to 31/12/10). After preprocessing, the total web log size was approximately 10^5 hits, including a set of over 67,700 distinct anonymous user sessions on a total of 360 web pages. The sessionizing was performed using distinct IP and time-limit considerations (setting 20 minutes as the maximum time between consecutive hits from the same user), as sketched below.
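The sessionizing rule lends itself to a short sketch. The Python code below splits log hits into sessions per IP with the 20-minute timeout described above; the (ip, timestamp, url) tuple format is a simplifying assumption about the log layout.

from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=20)     # maximum gap between hits of a session

def sessionize(hits):
    """hits: (ip, timestamp, url) tuples sorted by timestamp.
    Returns a dict mapping session id -> list of urls."""
    sessions, last_seen, current = {}, {}, {}
    counter = 0
    for ip, ts, url in hits:
        if ip not in last_seen or ts - last_seen[ip] > SESSION_GAP:
            counter += 1
            current[ip] = counter
            sessions[counter] = []
        sessions[current[ip]].append(url)
        last_seen[ip] = ts
    return sessions

log = [("1.2.3.4", datetime(2010, 5, 1, 10, 0), "/index"),
       ("1.2.3.4", datetime(2010, 5, 1, 10, 5), "/pubs"),
       ("1.2.3.4", datetime(2010, 5, 1, 11, 0), "/index")]   # > 20 min: new session
print(sessionize(log))    # two sessions for the same IP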
Keyword Extraction and Category Mapping: We extracted up to 7 keywords from each web page using a combination of all three methods (raw term frequency, inlinks, outlinks). We then mapped these keywords to ontology categories and kept at most 5 for each page.
Document Clustering: We used the clustering scheme described in recent work, i.e. the DBSCAN clustering algorithm and the similarity measure for sets of keywords; however, other web document clustering schemes (algorithm and similarity measure) may be employed as well.
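The clustering step could be approximated as follows, assuming scikit-learn is available; since the exact keyword-set similarity measure is only cited, Jaccard distance over the per-page keyword sets is used here as a stand-in.

import numpy as np
from sklearn.cluster import DBSCAN

pages = {
    "p1": {"course", "exam", "lecture"},
    "p2": {"course", "lecture", "slides"},
    "p3": {"publication", "journal", "research"},
}
names = list(pages)

def jaccard_distance(a, b):
    return 1.0 - len(a & b) / len(a | b)

# DBSCAN accepts a precomputed pairwise distance matrix.
dist = np.array([[jaccard_distance(pages[i], pages[j]) for j in names]
                 for i in names])
labels = DBSCAN(eps=0.6, min_samples=2, metric="precomputed").fit(dist).labels_
print(dict(zip(names, labels)))   # cluster label per page (-1 = noise)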
Association Rules Mining: We created both URI-based
and category-based frequent item sets and association
rules. We subsequently used the ones over a 40%
confidence threshold.
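For illustration, a toy miner for single-antecedent rules with the 40% confidence threshold might look like the sketch below; a production system would run Apriori or FP-growth over the URI- and category-based item sets.

from itertools import combinations
from collections import Counter

MIN_CONFIDENCE = 0.40       # the 40% threshold used above

def mine_rules(sessions):
    """Toy rule miner: A -> B with confidence = support(A, B) / support(A)."""
    item_count, pair_count = Counter(), Counter()
    for s in sessions:
        items = set(s)
        item_count.update(items)
        pair_count.update(combinations(sorted(items), 2))
    rules = []
    for (a, b), n_ab in pair_count.items():
        for ante, cons in ((a, b), (b, a)):
            conf = n_ab / item_count[ante]
            if conf >= MIN_CONFIDENCE:
                rules.append((ante, cons, conf))
    return rules

sessions = [["/courses", "/exams"], ["/courses", "/exams", "/staff"],
            ["/pubs", "/staff"]]
for ante, cons, conf in mine_rules(sessions):
    print(f"{ante} -> {cons} (confidence {conf:.2f})")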
V. RESULTS
In this work we compare the performance of three ranking methods, based on pure similarity, plain PageRank and weighted (personalized) URL Rank.
The personalization accuracy was found to be 75%, while the random-search accuracy is 74.6% and the average personalization accuracy is 74.7%. Because the interested-domains personalization is done considering the user-selected domain, the accuracy is higher than that of the random recommendation in our experiment. Figure 4 compares the interested-domains personalization accuracy based on random selection and based on our personalization method (relevance of query results vs. random and personalized selection).
Figure 4: Random vs. personalized selection accuracy (y-axis: percentage accuracy) for pure_similarity, plain_PageRank and weighted_URL_Rank
The URL personalization accuracy based on the interested-domains selection is 71.3%, while the URL personalization accuracy without the interested-domains selection assistance is 31.9% (Fig. 5). From this result, we can see that the interested-domains recommendation helps the system filter out many URLs that the user might not be interested in. Moreover, the system can focus on the domains that the user is interested in when selecting the relevant URLs.
Figure 5: Personalization accuracy in the interested domain
VI. CONCLUSION
In this paper, our contribution is a core technology and reusable software engine for the rapid design of a broad range of applications in the field of personalized recommendation systems and beyond. We present a web personalization system for web search which not only gives the user a set of personalized pages, but also gives the user a list of domains he or she may be interested in. Thus, the user can switch between different interests while surfing the web for information. Besides, the system focuses on the domains that the user is interested in and won't waste time searching for information in irrelevant domains. Moreover, the recommendation won't be affected by the irrelevant domains, and the accuracy of the recommendation is increased.
REFERENCES
[1] Changqin Huang, Ying Ji, Rulin Duan, A semantic web-based
personalized learning service supported by on-line course resources, 6th
IEEE International Conference on Networked Computing (INC), 2010.
[2] V. Gorodetsky, V. Samoylov, S. Serebryakov, Ontology-based context-dependent personalization technology, IEEE/WIC/ACM
International Conference on Web Intelligence and Intelligent Agent
Technology, vol. 3, pp. 278-283, 2010.
[3] Pasi Gabriella, Issues on preference-modelling and personalization in
information retrieval, IEEE/WIC/ACM International Conference on
Web Intelligence and Intelligent Agent Technology, pp. 4, 2010.
[4] Wei Wenshan and Li Haihua, Base on rough set of clustering algorithm
in network education application, IEEE International Conference on
Computer Application and System Modeling (ICCASM 2010), vol. 3,
pp. V3-481 - V3-483, 2010.
[5] Shuchih Ernest Chang, and Chia-Wei Wang, Effectively generating and
delivering personalized product information: Adopting the web 2.0
approach, 24th IEEE International Conference on Advanced
Information Networking and Applications Workshops (WAINA), pp.
401-406, 2010.
[6] Xiangwei Mu, Van Chen, and Shuyong Liu, Improvement of similarity
algorithm in collaborative filtering based on stability degree, 3rd
International Conference on Advanced Computer Theory and
Engineering (ICACTE), vol .4, pp. V4-106 - V4-110, 2010
[7] Dario Vuljanić, Lidia Rovan, and Mirta Baranović, Semantically enhanced web personalization approaches and techniques, 32nd IEEE International Conference on Information Technology Interfaces (ITI),
pp. 217-222, 2010.
[8] Raymond Y. K. Lau, Inferential language modeling for selective web
search personalization and contextualization, 3rd IEEE International
Conference on Advanced Computer Theory and Engineering (ICACTE),
vol. 1, pp. V1-540 - V1-544, 2010.
[9] Esteban Robles Luna, Irene Garrigos, and Gustavo Rossi, Capturing
and validating personalization requirements in web applications, 1st
IEEE International Workshop on Web and Requirements Engineering
(WeRE), pp. 13-20, 2010.
[10] B. Annappa, K. Chandrasekaran, K. C. Shet, Meta-Level constructs in
content personalization of a web application, IEEE International
conference on Computer & Communication Technology-ICCCT10, pp.
569 574, 2010.
[11] F. Murtagh, A survey of recent advances in hierarchical clustering
algorithms, Computer Journal, vol. 26, no. 4, pp. 354 -359, 1983.
[12] Wang Jicheng, Huang Yuan, Wu Gangshan, and Zhang Fuyan, Web mining: Knowledge discovery on the web, IEEE International Conference on Systems, Man and Cybernetics, vol. 2, pp. 137-141, 1999.
[13] B. Mobasher, Web usage mining and personalization in practical
handbook of internet computing, M.P. Singh, Editor. 2004, CRC Press,
pp. 15.1-37.
[14] T. Maier, A formal model of the ETL process for OLAP-based web
usage analysis, 6th WEBKDD- workshop on Web Mining and Web
Usage Analysis, part of the ACM KDD: Knowledge Discovery and
Data Mining Conference, pp. 23-34, Aug. 2004
[15] R. Meo, P. Lanzi, M. Matera, R. Esposito, Integrating web conceptual
modeling, WebKDD, vol. 3932, pp. 135-148, 2006.
[16] B. Mobasher, R. Cooley, and J. Srivastava, Automatic personalization
based on web usage mining, Communications of the ACM, vol. 43, no.
8, pp. 142151, Aug 2000.
[17] B. Mobasher, H. Dai, T. Luo, and M. Nakagawa, Effective
personalization based on association rule discovery from web usage
data, 3rd international ACM Workshop on Web information and data
management, 2001.
[18] O. Nasraoui, R. Krishnapuram, and A. Joshi, Mining web access logs
using a relational clustering algorithm based on a robust estimator, 8th
International World Wide Web Conference, pp. 40-41, 1999.
[19] D. Pierrakos, G. Paliouras, C. Papatheodorou, V. Karkaletsis, and M.
Dikaiakos, Web community directories: A new approach to web
personalization, 1st European Web Mining Forum (EWMF'03),
vol. 3209, pp. 113-129, 2003.
[20] Schafer J. B., Konstan J., and Reidel J., Recommender systems in e-
commerce, 1st ACM Conference on Electronic commerce, pp. 158-166.
1999
AUTHORS PROFILE
Dr. Yogendra Kumar Jain is presently working as Head of the Department of Computer Science & Engineering at Samrat Ashok Technological Institute, Vidisha, M.P., India. He secured the B.E. (Hons) degree in E&I from SATI Vidisha in 1991 and the M.E. (Hons) in Digital Tech. & Instrumentation from SGSITS, DAVV Indore (M.P.), India in 1999. The Ph.D. degree was awarded by Rajiv Gandhi Technical University, Bhopal (M.P.), India in 2010.
His research interests include Image Processing, Image Compression, Network Security, Watermarking and Data Mining. He has published more than 40 research papers in various journals and conferences, including 10 in international journals. Tel: +91-7592-250408, E-mail: [email protected].
Geetika Silakari is presently working as Asst. Professor in Computer Science & Engineering at Samrat Ashok Technological Institute, Vidisha, M.P., India. She secured the B.E. (Hons) degree in Computer Science & Engineering and the M.Tech in Computer Science and Engineering from Vanasthali University. She is currently pursuing a Ph.D. in Computer Science and Engineering. E-mail: [email protected]
Mr. Mahendra Thakur is a research scholar pursuing the M.Tech in Computer Science & Engineering at Samrat Ashok Technological Institute, Vidisha, M.P., India. He secured the B.E. degree in IT from Rajiv Gandhi Technical University, Bhopal (M.P.), India in 2007.
[email protected]
A Short Description of Social Networking Websites
And Its Uses
Ateeq Ahmad
Department of Computer science & Engineering
Singhania University
Pacheri Bari, Disst. Jhunjhunu (Rajasthan)-333515, India
Email:[email protected]
Abstract- Nowadays the use of the Internet for social networking is popular among youngsters. The use of collaborative technologies and social networking sites leads to instant online communities in which people communicate rapidly and conveniently with each other. The basic aim of this research paper is to find out which kinds of social networks are commonly used by people.
Keywords-Social Network, kinds, Definition, Social Networking
web sites, Growth.
I. INTRODUCTION
A web site that provides a social community for people interested in a particular subject brings them together. Members create their own online profiles with data, pictures, and any other information. They communicate with each other by voice, chat, instant message and videoconferencing, and the service typically provides a way for members to connect by making connections through individuals; this is known as social networking. Nowadays there are many web sites dedicated to social networking; some popular websites, such as Facebook, Orkut, Twitter, Bebo, MySpace, Friendster, hi5 and Bharatstudent, are very commonly used. These websites are also known as community network sites. Social networking websites function like an online community of internet users. Depending on the website in question, many of these online community members share common interests in hobbies and discussion. Once you have access to a social networking website you can begin to socialize. This socialization may include reading the profile pages of other members and possibly even contacting them.
II. DEFINITION
Boyd and Ellison (2007) define social network services as web-based services which allow individuals to construct a public or semi-public profile within a bounded system, communicate with other users, and view the pages and details provided by other users within the system. The social networking websites have evolved as a combination of a personalized media experience within a social context of participation. The practices that differentiate social networking sites from other types of computer-mediated communication are the uses of profiles, friends and comments or testimonials: profiles are publicly viewed, friends are publicly articulated, and comments are publicly visible.
Users who join social networking websites are required to make a profile of themselves by filling out a form. After filling out the form, users are supposed to give out information about their personality attributes and personal appearance. Some social networking websites require photos, but most of them will give details about one's age, preferences, likes and dislikes. Some social networking websites, like Facebook, allow users to customize their profiles by adding multimedia content. (Geroimenko & Chen, 2007)
III. CHARACTERISTICS OF SOCIAL NETWORKING SITES
Social networking websites provide rich information about
the person and his network, which can be utilized for various
business purposes. Some of the main characteristics of social
networking sites are:
They act as a resource for advertisers to promote their
brands through word-of-mouth to targeted customers.
They provide a base for a new teacher-student
relationship with more interactive sessions online.
They promote the use of embedded advertisements in
online videos.
They provide a platform for new artists to show their
profile.
IV. OBJECTIVE
The basic objective of this research is to analyze the awareness and frequency of use of social networking websites.
V. HISTORY OF SOCIAL NETWORKING WEBSITES
The first social networking website, SixDegrees.com, was launched in 1997. This company was the first of its kind; it allowed users to list their profiles, provide a list of friends and then contact them. However, the company did not do very well, and it eventually closed three years later. The reason for this was that many people using the internet at that time had not formed many social networks, hence there was little room for manoeuvre. It should be noted that there were also other elements that hinted at social network websites. For instance, dating sites required users to give their profiles, but they could not share other people's profiles. Additionally,
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 2, No.2, February 2011
125 | P a g e
http://ijacsa.thesai.org/
there were some websites that would link former schoolmates, but the lists could not be shared with others. (Cassidy, 2006)
After this came the creation of LiveJournal in 1999, created in order to facilitate one-way exchanges of journals between friends. A company in Korea called Cyworld added some social networking features in 2001. This was then followed by LunarStorm in Sweden during the same year; these included things like diary pages and friends lists. Additionally, Ryze.com also established itself in the market. It was created with the purpose of linking businessmen within San Francisco, and was closely tied to Friendster, LinkedIn and Tribe.net; of these, Ryze was the least successful. Tribe.net specialized in the business world, while Friendster initially did well, though this did not last for long. (Cohen, 2003)
VI. SOCIAL NETWORKING WEBSITES THAT ARE COMMONLY
USED BY THE PEOPLE
The most significant social networking websites commonly used by people, especially by youngsters, are Friendster, MySpace, Facebook, Downelink, Ryze, SixDegrees, Hi5, LinkedIn, Orkut, Flickr, YouTube, Reddit, Twitter, FriendFeed, BharatStudent and Fropper.
A. Friendster
Friendster began its operations in 2002. It was a sister company to Ryze but was designed to deal with the social aspect of their market. The company was like a dating service; however, matchmaking was not done in the typical way where strangers met. Instead, friends would propose which individuals were most compatible with one another. At first, there was exponential growth of the company, especially after the introduction of a network for gay men and an increase in the number of bloggers. The latter would usually tell their friends about the advantages of social networking through Friendster, and this led to further expansion. However, Friendster had established a market base in one small community. After their subscribers reached overwhelming numbers, the company could no longer cope with the demand. There were numerous complaints about the way their servers were handled, because subscribers would experience communication breakdowns. As if this was not enough, social networks in the real world were not doing well; some people would find themselves dating their bosses or former classmates, since the virtual community created by the company was rather small. The company also started limiting the level of connection between enthusiastic users. (Boyd, 2004)
B. MySpace
By 2003, there were numerous companies formed with the purpose of providing a social networking service. However, most of them did not attract much attention, especially in the US market. For instance, LinkedIn and Xing were formed for business persons, while services like MyChurch, Dogster and Couchsurfing were formed for social services. Other companies that had been engaging in other services also started offering social networking. For instance, YouTube and Last.FM were initially formed to facilitate video and music sharing respectively; however, they later adopted social networking services. (Backstrom et al, 2006)
C. Facebook
This social networking service was introduced with the purpose of linking friends at Harvard University in 2004. Thereafter, the company expanded to other universities and then colleges. Eventually, they invited corporate communities. But this does not mean that profiles could be interchanged at will: there are lots of restrictions, as friends who join a university's social network have to have a .edu address, and those joining a corporate network must have a .com address. This company prides itself on its ability to maintain privacy and niche communities and has been instrumental in learning institutions. (Charnigo & Barnett-Ellis, 2007)
D. Downelink
This website was founded in 2004 for the lesbian, gay,
bisexual, and transgender community. Some features include
social networking, weblogs, internal emails, a bulletin board,
DowneLife and in the future, a chat.
E. Ryze
The first of the online social networking sites, Ryze was founded by Adrian Scott as a business-oriented online community in 2001. Business people can expand their business networks by meeting new people and joining business groups, called Networks, organized by industry, interest, and geographic area.
F. SixDegrees
SixDegrees was launched in 1997 and was the first modern social network. It allowed users to create a profile and to become friends with other users. While the site is no longer functional, at one time it was actually quite popular and had around a million members.
G. Hi5
Hi5 was established in 2003 and currently boasts more than 60 million active members, according to its own claims. Users can set their profiles to be seen only by their network members. While Hi5 is not particularly popular in the U.S., it has a large user base in parts of Asia, Latin America and Central Africa.
H. LinkedIn
LinkedIn was founded in 2003 and was one of the first
mainstream social networks devoted to business. Originally,
LinkedIn allowed users to post a profile and to interact through
private messaging.
I. Orkut
Launched in January 2004, Orkut is Google's social network, and while it's not particularly popular in the U.S., it's very popular in Brazil and India, with more than 65 million users. Orkut lets users share media and status updates, and communicate through IM.
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 2, No.2, February 2011
126 | P a g e
http://ijacsa.thesai.org/
J. Flickr
Flickr has become a social network in its own right in
recent years. They claim to host more than 3.6 billion images
as of June 2009. Flickr also has groups, photo pools, and allows
users to create profiles, add friends, and organize images and
video.
K. YouTube
YouTube was the first major video hosting and sharing site,
launched in 2005. YouTube now allows users to upload HD
videos and recently launched a service to provide TV shows
and movies under license from their copyright holders.
L. Reddit
Reddit is another social news site, founded in 2005. Reddit operates in a similar fashion to other social news sites.
M. Twitter
Twitter was founded in 2006 and gained a lot of popularity during 2007. Status updates have become the new norm in
social networking.
N. FriendFeed
FriendFeed launched in 2007 and was recently purchased by Facebook; it allows you to integrate most of your online activities in one place. It's also a social network in its own right, with the ability to create friends lists and post updates.
O. BharatStudent
Bharatstudent is a social utility that brings together young Indians living across the globe. It is for every young Indian, whether a student, a fresh graduate, a working professional or an entrepreneur, and is focused on providing comprehensive solutions for personal and professional issues.
P. Fropper
Fropper is all about meeting people, making new friends and having fun with photos, videos, games and blogs. Come and become a part of the 4-million-strong Fropper community.
VII. GROWTH OF SOCIAL NETWORKING WEBSITES.
Nowadays the popularity of social networking is increasing rapidly around the world. Social networking behemoth MySpace.com attracted more than 114 million global visitors age 15 and older in June 2007, representing a 72-percent increase versus a year earlier. Facebook.com experienced even stronger growth during that same time frame, jumping 270 percent to 52.2 million visitors. Bebo.com (up 172 percent to 18.2 million visitors) and Tagged.com (up 774 percent to 13.2 million visitors) also grew substantially. (ComScore)
A. Worldwide Growth of Selected social Networking Sites
between June 2006 and June 2007
During the past year, social networking has really taken off globally. Literally hundreds of millions of people around the world are visiting social networking sites each month, and many are doing so on a daily basis (Bob Ivins); see Table I.
TABLE I. ANALYSIS OF SOCIAL NETWORKING SITES: WORLDWIDE GROWTH (VISITORS IN THOUSANDS)

Site          June 2006   June 2007   Percent Change
MySpace          66,401     114,147        72
Facebook         14,083      52,167       270
Hi5              18,098      28,174        56
Friendster       14,917      24,675        65
Orkut            13,588      24,120        78
Bebo              6,694      18,200       172
Tagged            1,506      13,167       774
B. Worldwide Growth of Selected social Networking Sites
between June 2007 and June 2008
During the past year, many of the top social networking
sites have demonstrated rapid growth in their global user bases.
Facebook.com, which took over the global lead among social
networking sites in April 2008, has made a concerted effort to
become more culturally relevant in markets outside the U.S. Its
introduction of natural language interfaces in several markets
has helped propel the site to 153 percent growth during the past
year. Meanwhile, the emphasis Hi5.com has put on its full-
scale localization strategy has helped the site double its visitor
base to more than 56 million. Other social networking sites,
including Friendster.com (up 50 percent), Orkut (up 41
percent), and Bebo.com (up 32 percent) have demonstrated
particularly strong growth on a global basis. See table II.
TABLE II. ANALYSIS OF SOCIAL NETWORKING SITES: WORLDWIDE GROWTH BY REGION (VISITORS IN THOUSANDS)

Region               June 2007   June 2008   Percent Change
Asia Pacific           162,738     200,555        23
Europe                 122,527     165,256        35
North America          120,848     131,255         9
Latin America           40,098      53,248        33
Middle East-Africa      18,226      30,197        66
C. Worldwide Growth of Selected social Networking Sites
between July 2009 and July 2010
Among social networking sites in India, Facebook.com grabbed the number one ranking in the category for the first time in July, with 20.9 million visitors, up 179 percent versus a year earlier. The social networking phenomenon continues to gain steam worldwide, and India represents one of the fastest growing markets at the moment; though Facebook has tripled its audience in the past year to pace the growth for the category, several other social networking sites have posted their own sizeable gains. (Will Hodgman) See Table III.
More than 33 million Internet users age 15 and older in
India visited social networking sites in July, representing 84
percent of the total Internet audience. India now ranks as the
seventh largest market worldwide for social networking, after
the U.S., China, Germany, Russian Federation, Brazil and the
U.K. The total Indian social networking audience grew 43
percent in the past year, more than tripling the rate of growth of
the total Internet audience in India.
TABLE III. ANALYSIS OF SOCIAL NETWORKING SITES: GROWTH BY COUNTRY (VISITORS IN THOUSANDS)

Country              July 2009   July 2010   Percent Change
United States          131,088     174,429        33
China                      N/A      97,151       N/A
Germany                 25,743      37,938        47
Russian Federation      20,245      35,306        74
Brazil                  23,966      35,221        47
United Kingdom          30,587      35,153        15
India                   23,255      33,158        43
France                  25,121      32,744        30
Japan                   23,691      31,957        35
South Korea             15,910      24,962        57
VIII. CONCLUSION
Social networking websites are one of the social media tools which can be used in the education industry to generate online traffic and a pipeline for new entrants. The use of these websites is growing rapidly, while other traditional online media are on the decrease. Social network user numbers are staggering, vastly increasing the exposure potential of the education industry through advertising. Social networks offer people great convenience for social networking: they allow people to keep in touch with friends and old friends, meet new people, and even conduct business meetings online. You can find people with interests similar to yours and get to know them better, even if they are in a different country. Every day people are joining social networks, and the growth and uses of social networking are increasing all over the world.
REFERENCES
[1] Budden C B, Anthony J F, Budden M C and Jones M A (2007),
"Managing the Evolution of a Revolution: Marketing Implications of
Internet Media Usage Among College Students", College Teaching
Methods and Styles Journal, Vol. 3, No. 3, pp. 5-10.
[2] Danielle De Lange, Filip Agneessens and Hans Waege (2004), "Asking
Social Network Questions: A Quality Assessment of Different
Measures", Metodoloki zvezki, Vol. 1, No. 2, pp. 351-378.
[3] David Marmaros and Bruce Sacerdote (2002), "Peer and Social
Networks in Job Search", European Economic Review, Vol. 46, Nos. 4
and 5, pp. 870-879.
[4] Feng Fu, Lianghuan Liu and Long Wang (2007), "Empirical Analysis of
Online Social Networks in the Age of Web 2.0", Physica A: Statistical
Mechanics and its Applications, Vol. 387, Nos. 2 and 3, pp. 675-684.
[5] Kautz H, Selman B and Shah M (1997), "Referral Web: Combining
Social Net Works and Collaboration Filtering", Communication of the
ACM, Vol. 40, No. 3, pp. 63-65.
[6] Mark Pendergast and Stephen C Hayne (1999), "Groupware and Social
Networks: Will Life Ever Be the Same Again?", Information and
Software Technology, Vol. 41, No. 6, pp. 311-318.
[7] Mayer Adalbert and Puller Steven L (2008), "The Old Boy (and Girl)
Network: Social Network Formation on University Campuses", Journal
of Public Economics, Vol. 92, Nos. 1 and 2, pp. 329-347.
[8] Reid Bate and Samer Khasawneh (2007), "Self-Efficacy and College
Student's Perceptions and Use of Online Learning Systems", Computers
in Human Behavior, Vol. 23, No. 1, pp. 175-191.
[9] Sorokou C F and Weissbrod C S (2005), "Men and Women's Attachment
and Contact Patterns with Parents During the First Year of College",
Journal of Youth and Adolescence, Vol. 34, No. 3, pp. 221-228.
[10] Boyd Danah and Ellison Nicole (2007), "Social Network Sites:
Definition, History and Scholarship", Journal of Computer-Mediated
Communication, Vol. 13, No. 1.
[11] Boyd Danah (2007), "Why Youth (Heart) Social Network Sites: The
Role of Networked Publics in Teenage Social Life", MacArthur
Foundation Series on Digital Learning-Youth, Identity and Digital
Media Volume, David Buckingham (Ed.), MIT Press, Cambridge, MA.
[12] Madhavan N (2007), "India Gets More Net Cool", Hindustan Times,
July 6, http://www.hindustantimes.com
[13] Cotriss, David (2008-05-29). Where are they now: Theglobe.com. The
Industry Standard.
[14] Romm-Livermore, C., & Setzekorn, K. (2008). Social Networking Communities and E-Dating Services: Concepts and Implications. IGI Global, p. 271.
[15] Knapp, E. (2006). A Parent's Guide to MySpace. DayDream Publishers.
ISBN 1-4196-4146-8.
[16] Acquisti, A, & Gross, R (2006): Imagined communities: Awareness,
information sharing, and privacy on the Facebook: Cambridge, UK:
Robinson College
[17] Backstrom, L et al (2006): Group formation in large social networks:
Membership, growth, and evolution, pp. 44-54, New York ACM Press.
[18] Boyd, D. (2004): Friendster and publicly articulated social networks.
Proceeding of ACM
[19] Cassidy, J, (2006): Me media: How hanging out on the Internet Became
big Business, The new Yorker, 82,13,50
[20] Charnigo, L & Barnett-Ellis, p. (2007): Checking out Facebook.com:
The impact of a digital trend on academic libraries; Information
Technology and Libraries, 26,1,23.
[21] Choi, H. (2006): Living in Cyworld: Contextualising CyTies in South Korea, pp. 173-186, New York: Peter Lang.
[22] Cohen, R. (2003): Livewire: Web sites try to make Internet dating less
creepy, Reuters, retrieved from http://asia.reuters.com Accessed 01 Feb
2011
[23] ComScore (2007): Social networking goes global. Reston,
http://www.comscore.com Accessed 01 Feb 2011
[24] Cameron Chapman (2009), History and evolution of social media. Retrieved from http://www.webdesignerdepot.com
[25] http://wikipedia.com Accessed 01 Feb 2011.
[26] Worldwide social networking websites(ComScore)
http://www.comscore.com Accessed 01 Feb 2011
[27] Geroimenko, V. & Chen, C. (2007): Visualizing the Semantic Web,
pp. 229-242, Berlin: Springer.
[28] Shahjahan S. (2004), Research Methods for Management, Jacco
Publishing House
[29] Cooper Donald R. and Shindler Panda S (2003), Business Research
Methods, Tata McGraw Hill Co. Ltd., New Delhi.
[30] Shah Kruti and D'Souza Alan (2009), Advertising & Promotions: An
IMC perspective, Tata McGraw Hill Publishing Company Limited,
New Delhi.
[31] Panneerselvam R. (2004), Research Methodology, Prentice Hall of
India Pvt. Ltd., New Delhi.
[32] Opie Clive, (June 2004), Doing Educational Research- A Guide for
First Time Researchers, Sage Publications, New Delhi,
[33] Kothari C.R. (2005), Research Methodology: Methods and
Techniques, India, and New Age International Publisher.
AUTHOR PROFILE
Ateeq Ahmad received the Master's degree in computer science in 2003. He is a PhD student in the Department of Computer Science & Engineering, Singhania University, Rajasthan, India. His research interests include computer networks, social networks, network security, and web development.
Multilevel Security Protocol using RFID

Syed Faiazuddin (1), S. Venkat Rao (2), S.C.V. Ramana Rao (3), M.V. Sainatha Rao (4), P. Sathish Kumar (5)
(1) Asst. Professor, SKTRM College of Engg & Tech, Dept. of CSE, Kondair, A.P., [email protected]
(2) Asst. Professor, S.R.K. P.G College, Dept. of MCA, Nandyal, Kurnool (Dist.), A.P., [email protected]
(3) Asst. Professor, RGM Engg & Tech, Dept. of CSE, Nandyal, Kurnool (Dist.), A.P., [email protected]
(4) Asst. Professor, RGM Engg & Tech, Dept. of IT, Nandyal, Kurnool (Dist.), A.P., [email protected]
(5) Asst. Professor, SVIST, Dept. of CSE, Madanapalle, Chittoor (Dist.), A.P., [email protected]
Abstract- Though RFID provides automatic object identification, it is vulnerable to various security threats that put consumer and organization privacy at stake. In this work, we have
considered some existing security protocols of RFID system and
analyzed the possible security threats at each level. We have
modified those parts of protocol that have security loopholes and
thus finally proposed a modified four-level security model that
has the potential to provide fortification against security threats.
Keywords- RFID, Eavesdropping, Slotted ID, Spoofing, Tracking.
I. INTRODUCTION
Radio Frequency Identification is a generic term for
identifying living beings or objects using Radio Frequency.
The benefit of RFID technology is that it scans and identifies
objects accurately and efficiently without visual or physical
contact with the object [1], [3].
A typical RFID system consists of:
An RFID tag
A tag reader
A host system with a back-end database[2]
Each object contains a tag that carries a unique ID [3]. The
tags are tamper resistant and can be read even in visually and
environmentally challenging conditions [3] such as snow, ice,
fog, inside containers and vehicles etc [2]. It can be used in
animal tracking, toxic and medical waste management, postal
tracking, airline baggage management, anti-counterfeiting in
the drug industry, access control etc. It can directly benefit the
customer by reducing waiting time and checkout lines [3] due
to its very fast response time. Hence, it should be adopted
pervasively.
For low-cost RFID implementation, inexpensive passive tags are used; these do not contain a battery [5] and get activated only by drawing power from the transmission of the reader [4] through inductive coupling. Tags don't contain any microprocessor [6], but incorporate a ROM (to store security data, the unique ID and OS instructions) and a RAM (to store data during reader interrogation and response) [2], [6].
Fig. 1
In the simplest case, on reader interrogation the tag sends back its secret ID (Fig. 1). The universally unique ID makes the tag vulnerable to tracking as it moves from one place to another; this violates "location privacy". Unprotected tags could be monitored and tracked by business rivals. An ID, if known to an illegal reader, could be used to produce fake tags that would successfully pass through security checks in the future. Hence, the security of RFID tags and the stored ID is of extreme importance; sensing the probable security loopholes, we have proposed a protocol that would reduce the security threats due to eavesdropping and tracking.
II. SECURITY THREATS
A. Eavesdropping Scenario:
Eavesdropping normally occurs when the attacker
intercepts the communication between an RFID token and
authorized reader. The attacker does not need to power or
communicate with the token, so it has the ability to execute the
attack from a greater distance than is possible for skimming. It
is, however, limited in terms of location and time window, since it has to be in the vicinity of an authorized reader when the transaction that it is interested in is conducted. The attacker
needs to capture the transmitted signals using suitable RF
equipment before recovering and storing data of interest [4],
[8].
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 2, No.2, February 2011
130 | P a g e
http://ijacsa.thesai.org/
Fig. 2
B. Forward privacy:
Forward privacy ensures that messages transmitted today
will be secure in the future even after compromising the tag.
Privacy also includes the fact that a tag must not reveal any
information about the kind of item it is attached to [9], [10].
C. Spoofing:
It is possible to fool an RFID reader into believing it is receiving data from an RFID tag; this is called "spoofing". In spoofing, someone with a suitably programmed portable reader covertly reads and records a data transmission from a tag that could contain the tag's ID. When this data transmission is retransmitted, it appears to be a valid tag, and thus the reader system cannot determine that the data transmission is not authentic [11], [12].
D. Tracking:
A primary security concern is the illicit tracking of RFID tags. Tags which are world-readable pose a risk to both personal location privacy and corporate security, since a tag can be read from inside wallets, suitcases, etc. Even in places where items are not expected to move often, it can be tempting to find ways to track them. Current RFID deployments can be used to track people via the tags they carry. To solve this problem, we cannot use a fixed identifier [7], [12].
III. RELATED WORK
To resolve the security concerns raised in the previous section, many protocols have been proposed in various research papers.
In the work [4], the authors proposed a 'Hash Lock Scheme'. In this scheme, the tag carries a key and a meta ID that is nothing but the hashed key. Upon request from a reader, the tag sends its stored meta ID back to the reader. The reader then forwards this meta ID to the back-end database, where the key of the tag is found by looking up the database using the meta ID as the search key. The reader forwards the key found in the database to the tag, which hashes this key value and matches the calculated hash with the stored meta ID. On a successful match the tag is unlocked for further information fetch.
Fig. 3
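The hash-lock exchange can be summarized in a few lines of Python. SHA-256 stands in for the unspecified hash function; this is a sketch of the message flow, not the authors' implementation.

import hashlib

def h(value: bytes) -> bytes:
    return hashlib.sha256(value).digest()

key = b"tag-secret-key"
meta_id = h(key)                    # meta ID = hashed key
database = {meta_id: key}           # back-end database: meta ID -> key

# 1. Reader queries the tag; the tag answers with its stored meta ID.
tag_response = meta_id
# 2. Reader looks up the key in the back-end database and forwards it.
forwarded_key = database[tag_response]
# 3. Tag hashes the received key and compares with its stored meta ID.
print("tag unlocked:", h(forwarded_key) == meta_id)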
The drawback of this protocol is that the meta ID is still unique. A tag can still be tracked using this meta ID, even without knowing the original ID, so "location privacy" is still under threat. Again, during transmission of the key from the back-end database through the reader, it can easily be captured by an eavesdropper even though the connection between the reader and the tag is an authenticated one. Hence, eavesdropping is still a major problem. From this, it is inferred that no unique and 'static' value can ever be sent back to the reader.
To overcome this problem, a new protocol was proposed [4] in which tag responses change with every query. To realize this, the tag sends, upon request, a pair <r, h(ID, r)>, where r is a random number. The database searches exhaustively through its list of known IDs until it finds the one that matches h(ID, r) for the given r. Though this technique resolves the tracking problem, it increases the overhead of the database, and the search complexity grows with the number of stored IDs. This is handled by the protocol discussed in the next section.
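A sketch of this randomized variant shows why the database cost grows with the number of known IDs (SHA-256 again stands in for h):

import hashlib, os

def h(tag_id: bytes, r: bytes) -> bytes:
    return hashlib.sha256(tag_id + r).digest()

known_ids = [b"ID-001", b"ID-002", b"ID-003"]     # back-end database

# Tag side: a fresh random r for every query, so responses never repeat.
tag_id = b"ID-002"
r = os.urandom(16)
response = (r, h(tag_id, r))

# Database side: brute-force search over every known ID.
r_recv, digest = response
match = next((i for i in known_ids if h(i, r_recv) == digest), None)
print("identified tag:", match)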
Our tag contains a unique meta ID. As we cannot send the unique meta ID, we generate a random number in the tag. This random number is fed to a down counter. The down counter counts down to zero, sending a clock pulse to a sequence generator with each down count; on receiving each pulse, the sequence generator generates a new state, starting from the state equivalent to the meta ID. When the down counter reaches zero, the state of the sequence generator is recorded. The tag then sends a pair <r, q>, where r is the random number and q is the final state generated by the sequence generator. At the reader end, a reverse sequence generator is implemented, through which the state equal to the original meta ID is found.
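To make the exchange concrete, the sketch below models the sequence generator as a 16-bit Fibonacci LFSR; the paper does not fix a concrete generator, so this choice is our assumption. The reader recovers the meta ID by stepping the inverse generator r times.

import random

def lfsr_step(state):
    """One forward step of a 16-bit Fibonacci LFSR (taps at bits 0, 2, 3, 5)."""
    fb = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (fb << 15)

def lfsr_back(state):
    """Exact inverse of lfsr_step: recover the previous state."""
    fb = (state >> 15) & 1                 # feedback bit that was shifted in
    partial = (state << 1) & 0xFFFE        # previous bits 1..15 restored
    b0 = fb ^ ((partial >> 2) ^ (partial >> 3) ^ (partial >> 5)) & 1
    return partial | b0

META_ID = 0xBEEF                           # tag's stored meta ID

def tag_response():
    r = random.randint(1, 255)             # down counter start value
    q = META_ID
    for _ in range(r):                     # one generator step per down count
        q = lfsr_step(q)
    return r, q

def reader_recover(r, q):
    for _ in range(r):                     # reverse sequence generator
        q = lfsr_back(q)
    return q

r, q = tag_response()
print("recovered meta ID:", hex(reader_recover(r, q)))   # 0xbeef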
Fig. 4
Fig. 5
A. Reader Identification:
Since the reader plays an important role in the RFID system, the tag must identify its authenticated reader. An authenticated reader has the capability to modify, change, insert or delete the tag's data. As an extension to the previous section, after generating the original meta ID the system looks into the back-end database and retrieves the corresponding key. Now, before sending the key to the tag, the logic circuit effaces some of the bits from the key and sends the modified key to the tag. Which bits are to be deleted is determined by the random number r.
IV. SECURITY PROPOSALS
A. Mitigating Eavesdropping:
In the first part of our work, we came up with a novel idea
to alleviate eavesdropping introducing meta ID concept in a
new light.
Fig. 6
At the tag end, the missing bits of the covert key are copied from the original key stored in the tag. Then the stored key and the reconstructed key are compared. On a successful match, the tag considers the reader to be valid and unlocks itself for further access by the reader. Otherwise, it rejects the query request, sensing the reader to be a false one.
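A minimal model of this reader-authentication step follows. How the effaced bit positions are derived from r is not specified in the paper, so a PRNG seeded with r and shared by both sides is assumed here.

import random

KEY_BITS = 16
STORED_KEY = 0xA5C3        # key shared by the tag and the back-end database

def efface(key, r, n_bits=4):
    """Reader side: zero out n_bits key bits at positions derived from r."""
    rng = random.Random(r)                 # both sides derive positions from r
    modified = key
    for p in rng.sample(range(KEY_BITS), n_bits):
        modified &= ~(1 << p)
    return modified

def tag_accepts(modified, r, n_bits=4):
    """Tag side: refill the effaced bits from the stored key and compare."""
    rng = random.Random(r)
    rebuilt = modified
    for p in rng.sample(range(KEY_BITS), n_bits):
        rebuilt |= STORED_KEY & (1 << p)
    return rebuilt == STORED_KEY

r = 37                     # random number from the earlier exchange
print("reader accepted:", tag_accepts(efface(STORED_KEY, r), r))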
B. Slotted ID Read:
Up to this stage, only a valid reader has been given the privilege to gain access to the next level of the tag. Still, the unique ID of the tag cannot be sent openly to the reader, as it could readily get skimmed and tracked by an eavesdropper. To deal with this, the ID is divided into a number of slots of varying length. Some additional bits are added at the beginning of each slot that hold the length of the ID fragment belonging to that slot. Then the entire data packet is encrypted. As only the authenticated reader knows the number of bits used to specify the length of a slot, this provides extra security. The transmission of data packets in several slots is continued until the end of the ID.
Fig. 7. Typical packet: [Length of Data | Data + Padding Bits]
C. Tag Identification:
At the reader end, after receiving each packet, the reader first decrypts the data and then eliminates the bits used to specify the length of that slot, recovering a part of the original ID. This continues for each packet, and the decrypted ID fragments are then combined to reform the entire unique ID. Thus, the unique ID is transmitted to the authenticated reader while false readers are at the same time stymied from reading it.
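The slotting and reassembly can be sketched as follows. The 4-bit length field and the bit-flip 'cipher' are placeholders for the unspecified length-field width and encryption scheme.

LEN_BITS = 4               # width of each slot's length field (assumption)

def flip(bits):            # toy symmetric "cipher": invert every bit
    return "".join("1" if b == "0" else "0" for b in bits)

def make_packets(id_bits, slot_lengths):
    """Tag side: split the ID into slots, prefix each with its length, encrypt."""
    packets, pos = [], 0
    for length in slot_lengths:
        payload = format(length, f"0{LEN_BITS}b") + id_bits[pos:pos + length]
        packets.append(flip(payload))
        pos += length
    return packets

def read_packets(packets):
    """Reader side: decrypt, strip the length field, reassemble the ID."""
    id_bits = ""
    for pkt in packets:
        payload = flip(pkt)
        length = int(payload[:LEN_BITS], 2)
        id_bits += payload[LEN_BITS:LEN_BITS + length]
    return id_bits

tag_id = "1011001110001111"                 # 16-bit unique ID
packets = make_packets(tag_id, [5, 7, 4])   # slots of varying length
print(read_packets(packets) == tag_id)      # True: ID reassembled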
Fig. 8
V. SECURITY ANALYSIS
In our protocol, we have provided a four-step security scheme for the ID that prevents the tag from getting cloned and reduces the risk of spoofing and eavesdropping many fold.
As the <r, q> pair sent to the reader from the tag changes every time, an eavesdropper can never track a tag through its meta ID. Though this was achieved in work [4], it increased the database overhead and the complexity of the brute-force search; in our method, the same goal is met while the problem of work [4] is also resolved.
The key retrieved from the back-end database of the reader is not sent directly to the tag, as any false reader could catch this key on its way to the tag and could prove itself to be a valid reader at any moment. Hence, the key is modified by a special method, and as the same key is modified in a different manner each time, a false reader or eavesdropper cannot discover the key.
The received modified key is reconstructed and matched with the key stored in the tag to authenticate a reader. This feature bars all readers apart from the valid one from gaining further access to the tag contents.
The entire ID is slotted, and each slot is of different length. The first few bits of each slot represent the number of bits of the ID belonging to that slot. The data of the entire slot is then encrypted and sent to the reader. The ID is sent in several steps, and the unique ID is never sent in its original form. This entire method allows only an authenticated reader to find the original ID.
Thus, we have beefed up the security of the ID through our protocol and provided secure tag-to-reader transactions.
The proposed protocol can be combined with other research work to make it more beneficial for practical implementation towards the goal of manufacturing low-cost RFID. With the passage of time, new ideas along with new technology will spring up, which will definitely make RFID technology more preferable and cost effective. As the radiation from RFID is not good for human exposure and may inadvertently damage human cells and tissues, there is also wide scope for work on minimizing its effect on human beings. Hence there is a plethora of fields in which we can work.
VI. CONCLUSION
As our work revolves around security, we have provided a multi-level security scheme in our proposal. With our limited resources, we have tried our best to give tag-reader identification a higher priority, since both have their own importance in security analysis. By combining the random-variable concept with tag-reader identification, we have provided additional security. The most important characteristic of our protocol is that at no point in time do we leave our IDs/keys in their original form; even if a false reader reads any information, it is of no use to that reader. That said, our proposed security definitions are just a starting point. They certainly do not capture the full spectrum of real-world needs. We have proposed important areas for further work.
REFERENCES
[1] Gildas Avoine and Philippe Oechslin, "A Scalable and Provably Secure Hash-Based RFID Protocol", The 2nd IEEE International Workshop on Pervasive Computing and Communication Security, 2005.
[2] C. M. Roberts, "Radio frequency identification", Computers & Security, vol. 25, pp. 18-26, Elsevier, 2006.
[3] Tassos Dimitriou, "A Secure and Efficient RFID Protocol that could make Big Brother (partially) Obsolete", Proceedings of the Fourth Annual IEEE International Conference on Pervasive Computing and Communications, 2006.
[4] Stephen A. Weis, Sanjay E. Sarma, Ronald L. Rivest and Daniel W. Engels, "Security and Privacy Aspects of Low-Cost Radio Frequency Identification Systems", 1st International Conference on Security in Pervasive Computing (SPC), 2003.
[5] Mike Burmester, Breno de Medeiros and Rossana Motta, "Provably Secure Grouping-Proofs for RFID Tags".
[6] Gildas Avoine and Philippe Oechslin, "RFID Traceability: A Multilayer Problem".
[7] Mike Burmester and Breno de Medeiros, "RFID Security: Attacks, Countermeasures and Challenges".
[8] G. P. Hancke, "Eavesdropping Attacks on High-Frequency RFID Tokens".
[9] Raphael C.-W. Phan, Jiang Wu, Khaled Ouafi and Douglas R. Stinson, "Privacy Analysis of Forward and Backward Untraceable RFID Authentication Schemes".
[10] Mayla Brusò, Konstantinos Chatzikokolakis, and Jerry den Hartog, "Formal Verification of Privacy for RFID Systems".
[11] Dale R. Thompson, Neeraj Chaudhry, Craig W. Thompson, "RFID Security Threat Model".
[12] Dong-Her Shih, "Privacy and Security Aspects of RFID Tags".
AUTHORS PROFILE
Syed Faiazuddin received the M.Tech degree in Computer Science and Engineering from JNTU Kakinada University. He is a faculty member in the Department of M.Tech at KTRMCE, Kondair. His research interests are in the areas of Computer Networks, Mobile Computing, DSP, MATLAB, Data Mining and Sensor Networks. He worked for 2 years as a Research and Development Engineer at ELICO Ltd. He is a member of the National Service
Scheme (NSS).
Mr. S. Venkat Rao is currently working as
Assistant Professor in the Department of Computer
Science and Applications in Sri Ramakrishna
Degree & P.G. College, Nandyal, Andhra Pradesh,
India. He completed his B.Sc., M.Sc., and M.Phil.
with Computer Science as specialization in
1995,1997 and 2006 respectively from Sri
Krishnadevaraya University, Anantapur. He is
currently a part-time Research Scholar pursuing
the Ph.D. degree. His areas of interest include Computer Networks and WDM Networks. He has participated in a number of national seminars and presented papers at national conferences.
S.C.V. Ramana Rao received the M.Tech (CSE) degree in Computer Science and Engineering from JNTU Anantapur University. He is a faculty member in the Department of C.S.E. His research interests are in the areas of Computer Networks, Mobile Computing, DSP, MATLAB, Data Mining and Sensor Networks.
M.V. Sainatha Rao is currently working as Assistant Professor in the Department of Information Technology. He has 8 years of experience in Computer Science. He has attended 2 international conferences, 6 national conferences and 4 workshops. His research interests are in the areas of Computer Networks, Mobile Computing, DSP, MATLAB and Sensor Networks.
P. Sathish Kumar received the M.Tech degree in Computer Science and Engineering from JNTU Anantapur University. He is an Asst. Professor in the Department of C.S.E at Sri Vishveshwaraiah Institute of Science & Technology (approved by AICTE, affiliated to JNTU Anantapur), Madanapalle, Chittoor Dist., A.P. His research interests are in the areas of Computer Networks, Mobile Computing, DSP, MATLAB, Data Mining and Sensor Networks.