Bioinformatics Paper



Isha Noohi Chishty

[email protected]

Bioinformaticians are the tool builders, and it is critical that they
understand biological problems as well as computer solutions in order to
produce useful tools. A flood of data means that many of the challenges in
biology are now challenges in computing. Bioinformatics, the application of
computational techniques to analyse the information associated with
biomolecules on a large scale, has now firmly established itself as a
discipline in molecular biology, and encompasses a wide range of subject
areas from structural biology and genomics to gene expression studies.

This paper deals with some of the applications of bioinformatics, which are
as follows:

1). Designing drugs
2). Finding homologues
3). Overall genome characterization

We propose a solution by which maximum utilization of laboratories for
research work in the field of informatics can be achieved.

Bioinformatics is the application of statistics and computer science to the
field of molecular biology. Over the past few decades, rapid developments in
genomic and other molecular research technologies and developments in
information technologies have combined to produce a tremendous amount of
information related to molecular biology.

Insight into the three-dimensional (3D) structure of a protein is of great
assistance when planning experiments aimed at the understanding of protein
function and during the drug / vaccine / antibody / enzyme / protein design
process. The experimental elucidation of the 3D structure of proteins is,
however, often hampered by difficulties in obtaining sufficient protein,
diffracting crystals and many other technical aspects.

Design of vaccines has attained new dimensions with the availability of
complete genome sequences of diseased organisms and of three-dimensional and
two-dimensional structural information (coordinate values) for proteins
involved in the interaction of MHC, epitopes and T cell receptors, stored in
the PDB / MDB databases. Besides, there are different algorithms that can
assess the potential of generated vaccines.

In the present study, we present a case study of an accurate method: the
modelling of probable vaccine epitopes against visceral leishmaniasis. The
software tools used in the study have generated three-dimensional
coordinates of the desired epitopes together with the stability and validity
analysis. Hence, the accuracy as well as the efficiency of the software are
the points of significant emphasis.
Keywords: designing, homology, genome


Bioinformatics is defined as an interdisciplinary field involving biology,
computer science, mathematics and statistics, used to analyze biological
sequence data, to predict genes and regulatory elements and their
arrangement, and to carry out proteome analysis involving prediction of the
2D and 3D structures of proteins [1]. In other words, bioinformatics
is a subset of the larger field of computational biology,
which includes the application of quantitative


Figure 1

The term "bioinformatics" first came into use in the 1990s and was
originally synonymous with the management and analysis of DNA, RNA and
protein sequence data. Computational tools for sequence analysis had been
available since the 1960s, but this was a minority interest until advances
in sequencing technology led to a rapid expansion in the number of stored
sequences in databases such as GenBank. Now, the term has expanded to
incorporate many other types of biological data, for example protein
structures, gene expression profiles and protein interactions. Each of these
areas requires its own set of databases, algorithms and statistical methods.

Computers are essential to this work for two reasons. First, many
bioinformatics problems require the same task to be repeated millions of
times, for example comparing a new sequence to every other sequence stored
in a database, or comparing a group of sequences systematically to determine
evolutionary relationships. In such cases, the ability of computers to
process information and test alternative solutions rapidly is indispensable.

Second, computers are required for their problem-solving power. Typical
problems that might be addressed using bioinformatics include solving the
folding pathway of a protein given its amino acid sequence, or deducing a
biochemical pathway given a collection of RNA expression profiles. Computers
can help with such problems, but it is important to note that expert input
and robust original data are also required.

Figure 2

The future of bioinformatics is integration. For example, integration of a
wide variety of data sources such as clinical and genomic data will allow us
to use disease symptoms to predict genetic mutations and vice versa. The
integration of GIS data, such as maps and weather systems, with crop health
and genotype data will allow us to predict successful outcomes of
agriculture experiments. Another future area of research in bioinformatics
is large-scale comparative genomics. For example, the development of tools
that can do 10-way comparisons of genomes will push forward the discovery
rate in this field of bioinformatics. Along these lines, the modelling and
visualization of full networks of complex systems could be used in the
future to predict, for example, how the system (or cell) reacts to a drug.
A set of technical challenges faces bioinformatics and is being addressed by
faster computers, technological advances in disk storage space, and
increased bandwidth. Finally, a key research question for the future of
bioinformatics will be how to computationally compare complex biological
observations, such as gene expression patterns and protein networks.
Bioinformatics is about converting biological observations to a model that a
computer will understand. This is a very challenging task since biology can
be very complex. The problem of how to digitize phenotypic data such as
behaviour, electrocardiograms, and crop health into a computer-readable form
offers exciting challenges for future bioinformaticians [2].

The aims of bioinformatics are threefold. First, at its simplest,
bioinformatics organises data in a way that allows researchers to access
existing information and to submit new entries as they are produced, e.g.
the Protein Data Bank for 3D macromolecular structures [6,7]. While data
curation is an essential task, the information stored in these databases is
essentially useless until analysed. Thus the purpose of bioinformatics
extends much further.

The second aim is to develop tools and resources that aid in the analysis of
data. For example, having sequenced a particular protein, it is of interest
to compare it with previously characterised sequences. This needs more than
just a simple text-based search, and programs such as FASTA [8] and
PSI-BLAST [9] must consider what comprises a biologically significant match.
Development of such resources dictates expertise in computational theory as
well as a thorough understanding of biology.

The third aim is to use these tools to analyse the data and interpret the
results in a biologically meaningful manner. Traditionally, biological
studies examined individual systems in detail, and frequently compared them
with a few that are related. In bioinformatics, we can now conduct global
analyses of all the available data with the aim of uncovering common
principles that apply across many systems and highlighting novel features.
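
To make the idea of comparing a new sequence with every stored sequence more
concrete, the following is a minimal sketch in Python, not the method used
by FASTA or PSI-BLAST. It scores pairs of sequences with a plain
Smith-Waterman local alignment and ranks a small in-memory "database" by
that score. The function names, the scoring values (match 2, mismatch -1,
gap -2) and the toy sequences are all assumptions chosen for illustration.
Real tools add heuristics for speed, substitution matrices, and statistics
that estimate whether a match is biologically significant.

```python
from typing import Dict, List, Tuple


def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Return the best Smith-Waterman local alignment score of a and b.

    Uses a linear gap penalty and keeps only two rows of the score matrix.
    The scoring values are illustrative assumptions, not tuned parameters.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def rank_database(query: str, database: Dict[str, str]) -> List[Tuple[str, int]]:
    """Score the query against every stored sequence, best match first."""
    scored = [(name, local_alignment_score(query, seq))
              for name, seq in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy database; real databases hold millions of entries.
    toy_database = {"seq_x": "TTTT", "seq_y": "GGACGTGG"}
    for name, score in rank_database("ACGT", toy_database):
        print(name, score)
```

The loop in rank_database is the "same task repeated millions of times" in
miniature: each comparison is independent of the others, which is why such
searches benefit directly from faster computers.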

Data source                 Bioinformatics subject areas

Raw DNA sequence            Separating coding and non-coding regions
                            Identification of introns and exons
                            Gene product prediction
                            Forensic analysis

Protein sequence            Sequence comparison algorithms
                            Multiple sequence alignment algorithms
                            Identification of conserved sequence motifs

Macromolecular structure    Secondary, tertiary structure prediction
                            3D structural alignment algorithms
                            Protein geometry measurements
                            Surface and volume shape calculations
                            Intermolecular interactions
                            Molecular simulations (force-field calculations,
                            molecular movements, docking predictions)

Genomes                     Characterisation of repeats
                            Structural assignments to genes
                            Phylogenetic analysis
                            Genomic-scale censuses (characterisation of
                            protein content, metabolic pathways)
                            Linkage analysis relating specific genes to
                            diseases

Gene expression             Correlating expression patterns
                            Mapping expression data to sequence, structural
                            and biochemical data

Other data
  Literature                Digital libraries for automated bibliographical
                            searches
                            Knowledge databases of data from literature
  Metabolic pathways        Pathway simulations

Table 1. Sources of data used in bioinformatics, the
quantity of each type of data that is currently (August
2000) available, and bioinformatics subject areas that
utilise this data.

One of the earliest medical applications of bioinformatics has been in
aiding rational drug design. Figure 3 outlines the commonly cited approach,
taking the MLH1 gene product as an example drug target. MLH1 is a human gene
encoding a mismatch repair protein (mmr) situated on the short arm of
chromosome 3 [125]. Through linkage analysis and its similarity to mmr genes
in mice, the gene has been implicated in nonpolyposis colorectal cancer
[126]. Given the nucleotide sequence, the probable amino acid sequence of
the encoded protein can be determined using translation software. Sequence
search techniques can then be used to find homologues in model organisms,
and based on sequence similarity, it is possible to model the structure of
the human protein on experimentally characterised structures. Finally,
docking algorithms could design molecules that could bind the model
structure, leading the way for biochemical assays to test their biological
activity on the actual protein.

Figure 3. Above is a schematic outlining how scientists can use
bioinformatics to aid rational drug discovery. MLH1 is a human gene encoding
a mismatch repair protein (mmr) situated on the short arm of chromosome 3.
Through linkage analysis and its similarity to mmr genes in mice, the gene
has been implicated in nonpolyposis colorectal cancer. Given the nucleotide
sequence, the probable amino acid sequence of the encoded protein can be
determined using translation software. Sequence search techniques can be
used to find homologues in model organisms, and based on sequence
