Bioinformatics Paper



Isha Noohi Chishty


[email protected]

A flood of data means that many of the challenges in biology are now challenges in computing. Bioinformatics, the application of computational techniques to analyse the information associated with biomolecules on a large scale, has now firmly established itself as a discipline in molecular biology, and encompasses a wide range of subject areas from structural biology and genomics to gene expression studies. This paper deals with some of the applications of bioinformatics, which are as follows:

1) Designing drugs
2) Finding homologues
3) Overall genome characterization

We propose a solution that maximizes the utilization of laboratories for research work in the field of informatics.

Bioinformatics is the application of statistics and computer science to the field of molecular biology. Over the past few decades, rapid developments in genomic and other molecular research technologies and developments in information technologies have combined to produce a tremendous amount of information related to molecular biology.

Keywords: drug designing, homology, genome characterization.

Bioinformatics is defined as an interdisciplinary field involving biology, computer science, mathematics and statistics to analyze biological sequence data, predict genes and regulatory elements and their arrangement, and perform proteome analysis, including prediction of the 2D and 3D structures of proteins [1]. In other words, bioinformatics is a subset of the larger field of computational biology, which includes the application of quantitative techniques.

Bioinformaticians are the tool builders, and it is critical that they understand biological problems as well as computer solutions in order to produce useful tools.

Insights into the three-dimensional (3D) structure of a protein are of great assistance when planning experiments aimed at understanding protein function, and during the drug, vaccine, antibody, enzyme and protein design process. The experimental elucidation of the 3D structure of proteins is, however, often hampered by difficulties in obtaining sufficient protein and diffracting crystals, and by many other technical aspects.

The design of vaccines has attained new dimensions with the availability of complete genome sequences of disease-causing organisms, and of three-dimensional and two-dimensional structural information (coordinate values) for proteins involved in the interaction of MHC, epitopes and T cell receptors, stored in the PDB / MDB databases. In addition, different algorithms can assess the potential of generated vaccines.

In the present study, we present a case study of an accurate method: the modelling of probable vaccine epitopes against visceral leishmaniasis. The software tools used in the study have generated three-dimensional coordinates of the desired epitopes, together with stability and validity analyses. Hence, the accuracy as well as the efficiency of the software are the points of significant emphasis.

Figure 1
The term bioinformatics first came into use in the 1990s and was originally synonymous with the management and analysis of DNA, RNA and protein sequence data. Computational tools for sequence analysis had been available since the 1960s, but this was a minority interest until advances in sequencing technology led to a rapid expansion in the number of stored sequences in databases such as GenBank. Now, the term has expanded to incorporate many other types of biological data, for example protein structures, gene expression profiles and protein interactions. Each of these areas requires its own set of databases, algorithms and statistical methods.

Computing is essential to this work for two reasons. First, many bioinformatics problems require the same task to be repeated millions of times, for example comparing a new sequence to every other sequence stored in a database, or comparing a group of sequences systematically to determine evolutionary relationships. In such cases, the ability of computers to process information and test alternative solutions rapidly is indispensable.

Second, computers are required for their problem-solving power. Typical problems that might be addressed using bioinformatics include solving the folding pathway of a protein given its amino acid sequence, or deducing a biochemical pathway given a collection of RNA expression profiles. Computers can help with such problems, but it is important to note that expert input and robust original data are also required.

Figure 2

The future of bioinformatics is integration. For example, integration of a wide variety of data sources such as clinical and genomic data will allow us to use disease symptoms to predict genetic mutations and vice versa. The integration of GIS data, such as maps and weather systems, with crop health and genotype data will allow us to predict successful outcomes of agricultural experiments. Another future area of research in bioinformatics is large-scale comparative genomics. For example, the development of tools that can do 10-way comparisons of genomes will push forward the discovery rate in this field. Along these lines, the modelling and visualization of full networks of complex systems could be used in the future to predict how the system (or cell) reacts to, for example, a drug. A technical set of challenges faces bioinformatics and is being addressed by faster computers, technological advances in disk storage space, and increased bandwidth. Finally, a key research question for the future of bioinformatics will be how to computationally compare complex biological observations, such as gene expression patterns and protein networks. Bioinformatics is about converting biological observations to a model that a computer will understand. This is a very challenging task since biology can be very complex. The problem of how to digitize phenotypic data such as behaviour, electrocardiograms and crop health into a computer-readable form offers exciting challenges for future bioinformaticians.2

The aims of bioinformatics are threefold. First, at its simplest, bioinformatics organises data in a way that allows researchers to access existing information and to submit new entries as they are produced, e.g. the Protein Data Bank for 3D macromolecular structures [6,7]. While data curation is an essential task, the information stored in these databases is essentially useless until analysed. Thus the purpose of bioinformatics extends much further. The second aim is to develop tools and resources that aid in the analysis of data. For example, having sequenced a particular protein, it is of interest to compare it with previously characterised sequences. This needs more than just a simple text-based search, and programs such as FASTA [8] and PSI-BLAST [9] must consider what comprises a biologically significant match. Development of such resources dictates expertise in computational theory as well as a thorough understanding of biology. The third aim is to use these tools to analyse the data and interpret the results in a biologically meaningful manner. Traditionally, biological studies examined individual systems in detail, and frequently compared them with a few that are related. In bioinformatics, we can now conduct global analyses of all the available data with the aim of uncovering common principles that apply across many systems and highlighting novel features.
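
The point that sequence comparison needs more than a simple text-based search can be illustrated with a small sketch. The code below implements Smith-Waterman local alignment scoring, a classical dynamic-programming method. It is not the algorithm used by FASTA or PSI-BLAST, which rely on faster heuristics and on substitution matrices. The function name and the scoring values (match, mismatch, gap) are illustrative assumptions, not values taken from this paper.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    The scoring parameters are illustrative assumptions; real tools use
    substitution matrices such as BLOSUM and affine gap penalties.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score of a local alignment ending at
    # a[i-1] and b[j-1]; row 0 and column 0 stay at zero.
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            diagonal = score[i - 1][j - 1] + pair   # align a[i-1] with b[j-1]
            up = score[i - 1][j] + gap              # gap in b
            left = score[i][j - 1] + gap            # gap in a
            # A local alignment may restart anywhere, so never go below 0.
            score[i][j] = max(0, diagonal, up, left)
            best = max(best, score[i][j])
    return best


if __name__ == "__main__":
    query = "ACGTTGCA"
    subject = "ACGATGCA"
    # An exact substring test fails because of the single substitution,
    # while the alignment score still reports strong similarity.
    print(query in subject)
    print(smith_waterman(query, subject))
```

An exact text search treats the two toy sequences as unrelated, whereas the alignment score tolerates the single substitution. Deciding whether such a score is biologically significant requires a statistical model, which is beyond this sketch.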

Data source                 Bioinformatics subject areas

Raw DNA sequence            Separating coding and non-coding regions
                            Identification of introns and exons
                            Gene product prediction
                            Forensic analysis

Protein sequence            Sequence comparison algorithms
                            Multiple sequence alignment algorithms
                            Identification of conserved sequence motifs

Macromolecular structure    Secondary, tertiary structure prediction
                            3D structural alignment algorithms
                            Protein geometry measurements
                            Surface and volume shape calculations
                            Intermolecular interactions
                            Molecular simulations (force-field calculations,
                            molecular movements, docking predictions)

Genomes                     Characterisation of repeats
                            Structural assignments to genes
                            Phylogenetic analysis
                            Genomic-scale censuses (characterisation of
                            protein content, metabolic pathways)
                            Linkage analysis relating specific genes to diseases

Gene expression             Correlating expression patterns
                            Mapping expression data to sequence, structural
                            and biochemical data

Other data
  Literature                Digital libraries for automated bibliographical searches
                            Knowledge databases of data from literature
  Metabolic pathways        Pathway simulations

Table 1. Sources of data used in bioinformatics, the quantity of each type of data that is currently (August 2000) available, and bioinformatics subject areas that utilise this data.

Figure 3. A schematic outlining how scientists can use bioinformatics to aid rational drug discovery.

One of the earliest medical applications of bioinformatics has been in aiding rational drug design. Figure 3 outlines the commonly cited approach, taking the MLH1 gene product as an example drug target. MLH1 is a human gene encoding a mismatch repair protein (mmr) situated on the short arm of chromosome 3 [125]. Through linkage analysis and its similarity to mmr genes in mice, the gene has been implicated in nonpolyposis colorectal cancer [126]. Given the nucleotide sequence, the probable amino acid sequence of the encoded protein can be determined using translation software. Sequence search techniques can then be used to find homologues in model organisms and, based on sequence similarity, it is possible to model the structure of the human protein on experimentally characterised structures. Finally, docking algorithms could design molecules that could bind the model structure, leading the way for biochemical assays to test their biological activity on the actual protein.
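
The translation step mentioned above, deriving the probable amino acid sequence from a nucleotide sequence, can be sketched in a few lines. This is a minimal illustration that assumes the standard genetic code and a sequence already in the correct reading frame. The names `CODON_TABLE` and `translate` are hypothetical, and the input is a toy sequence, not MLH1.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order for each
# of the three positions. "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence, to_stop=True):
    """Translate a DNA or RNA coding sequence into one-letter amino acids.

    Assumes the sequence starts in frame. Trailing bases that do not form a
    complete codon are ignored, and unrecognised codons become "X".
    """
    dna = sequence.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[start:start + 3], "X")
        if amino_acid == "*" and to_stop:
            break  # stop translating at the first stop codon
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # Toy sequence for illustration only.
    print(translate("ATGGCCAAGTAA"))
```

Real translation software also has to choose the reading frame and strand and to handle introns, which this sketch leaves out.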
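References

1. Reichhardt T. It's sink or swim as a tidal wave of data approaches. Nature 1999;399(6736):517-20.
2. Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Rapp BA, Wheeler DL. GenBank. Nucleic Acids Res 2000;28(1):15-8.
3. Bairoch A, Apweiler R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res 2000;28(1):45-8.
4. Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 1995;269(5223):496-512.
5. Drowning in data. The Economist, 26 June 1999.
6. Bernstein FC, Koetzle TF, Williams GJ, Meyer EF Jr, Brice MD, Rodgers JR, et al. The Protein Data Bank. A computer-based archival file for macromolecular structures. Eur J Biochem 1977;80(2):319-24.
7. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The Protein Data Bank. Nucleic Acids Res 2000;28(1):235-42.
8. Pearson WR, Lipman DJ. Improved tools for biological sequence comparison. Proc Natl Acad Sci U S A 1988;85(8):2444-8.
9. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997;25(17):3389-402.
10. Holstege FC, Jennings EG, Wyrick JJ, Lee TI, Hengartner CJ, Green MR, Golub TR, Lander ES, Young RA. Dissecting the regulatory circuitry of a eukaryotic genome. Cell 1998;95(5):717-28.
11. Pedersen AG, Jensen LJ, Brunak S, Staerfeldt HH, Ussery DW. A DNA structural atlas for Escherichia coli. J Mol Biol 2000;299(4):907-30.
12. Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res 2000;28(1):27-30.
13. Jeffery CJ. Moonlighting proteins. Trends Biochem Sci 1999;24(1):8-11.
14. Chothia C. Proteins. One thousand families for the molecular biologist. Nature 1992;357(6379):543-4.
15. Orengo CA, Jones DT, Thornton JM. Protein superfamilies and domain superfolds. Nature 1994;372(6507):631-4.
16. Lesk AM, Chothia C. How different amino acid sequences determine similar protein structures: the structure and evolutionary dynamics of the globins. J Mol Biol 1980;136(3):225-70.
17. Russell RB, Saqi MA, Sayle RA, Bates PA, Sternberg MJ. Recognition of analogous and homologous protein folds: analysis of sequence and structure conservation. J Mol Biol 1997;269(3):423-39.
18. Russell RB, Saqi MA, Bates PA, Sayle RA, Sternberg MJ. Recognition of analogous and homologous protein folds: assessment of prediction success and associated alignment accuracy using empirical substitution matrices. Protein Eng 1998;11(1):1-9.
19. Fitch WM. Distinguishing homologous from analogous proteins. Syst Zool 1970;19:99-110.
20. Tatusov RL, Koonin EV, Lipman DJ. A genomic perspective on protein families. Science 1997;278(5338):631-7.
21. Gerstein M, Hegyi H. Comparing genomes in terms of protein structure: surveys of a finite parts list. FEMS Microbiol Rev 1998;22(4):277-304.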
