Chapter 1 - Introduction


Introduction to Bioinformatics
BIF 415 – CSC 412
Some useful information ….
➢ Dr. Nancy Fayad
➢ Email: [email protected]
➢ Office: CHSC 6204
➢ Office hours: by appointment
➢ TR 14:00 – 15:15
➢ Block A room 401
➢ Grading (subject to small variation)
• Multi-phase project, oral presentation, participation → 50%
• Exam I, Exam II → 50%

BIF415/CSC412 - CHAPTER 1 2
Textbooks
• Bioinformatics: Genes, Proteins and Computers, by Christine Orengo, David Jones and Janet Thornton, 2005
• Bioinformatics: Databases, Tools, and Algorithms, by Orpita Bosu and Simminder Kaur Thukral, 2007
• Introduction to Bioinformatics, 4th edition, by Arthur Lesk, 2014
• Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, 2nd edition, by Andreas D. Baxevanis and B. F. Francis Ouellette, 2001

Some remarks and tips for success

Not a passive learning experience!
Course objectives
• Use online databases to search for genes and proteins using different annotations
• Perform sequence similarity searches to identify similar genes and proteins
• Look for special SNPs
• Annotate nucleotide sequences and predict genes
• Perform sequence alignment, clustering and phylogenetic analysis
• Efficiently explore protein 3D structures
• Predict protein secondary structure from sequence
• Perform protein homology searches
• Use BioMart to perform multi-condition queries

Chapter 1
Introduction
THE RISE OF BIOINFORMATICS

Let’s brainstorm!

Bioinformatics?

Bioinformatics: a brief history
Bioinformatics is the application of computer technology to the understanding and effective use of
biological and biomedical data. It is the discipline that stores, analyses and interprets the Big Data
generated by life-science experiments, or collected in a clinical context. This multidisciplinary field is
driven by experts from a variety of backgrounds: biologists, computer scientists, mathematicians,
statisticians and physicists.

Swiss Institute of Bioinformatics (https://www.sib.swiss/what-is-bioinformatics)

Interdisciplinary! Emerging! Indispensable!

Bioinformatics: a brief history
• First use of the term “Bioinformatics” → 1980s
• The use of computers to retrieve, process, analyze, store and simulate biological information
[Diagram: Bioinformatics at the intersection of Computer sciences and Biology]

Bioinformatics: a brief history
• 1950s: Protein analysis as the starting point: automation and analysis of Edman degradation
• Margaret Dayhoff: the first bioinformatician (COMPROTEIN). David J. Lipman, former director of the National Center for Biotechnology Information (NCBI), called her ‘the mother and father of bioinformatics’.
• 1965: Dayhoff later developed the one-letter amino acid code that is still in use today. This one-letter code was first used in Dayhoff and Eck’s 1965 Atlas of Protein Sequence and Structure.
• 1970: Needleman and Wunsch developed the first dynamic programming algorithm for pairwise protein sequence alignments.
• 1980s: First MSA (multiple sequence alignment) algorithm. Da-Fei Feng and Russell F. Doolittle in 1987 → improvement of MSA
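The dynamic programming idea behind the Needleman and Wunsch pairwise alignment mentioned above can be illustrated with a minimal sketch. The scoring values (match = 1, mismatch = -1, gap = -1) and the function name are assumptions chosen for illustration; real protein alignments normally use substitution matrices and more refined gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b.

    Returns (score, aligned_a, aligned_b). Scoring values are
    illustrative assumptions, not a recommended scheme.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Traceback from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = needleman_wunsch("ACGT", "AGT")
print(score)
print(top)
print(bottom)
```

The table is filled in O(n·m) time, which is what made exhaustive pairwise comparison tractable compared with enumerating every possible alignment.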

COMPROTEIN, the first bioinformatics software.
(A) An IBM 7090 mainframe, the machine COMPROTEIN was written to run on.
(B) A punch card containing one line of FORTRAN code (the language COMPROTEIN was written in).
(C) An entire program’s source code in punch cards.
(D) A simplified overview of COMPROTEIN’s input (i.e. Edman peptide sequences) and output (a consensus protein sequence).

A shift from Protein to DNA
Central Dogma
• DNA sequences: basis of the genetic information
• Transcribed into RNA: following maturation → mRNA determines the protein sequence
• Translated into amino acids: amino acids have different natures
• Amino acids determine the structure of a protein
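The flow from DNA to mRNA to amino acids described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the input is taken to be the coding (sense) strand, RNA maturation (splicing) is ignored, the standard genetic code is used, and the function names `transcribe` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order
# for the first, second and third base; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Coding-strand DNA -> mRNA (assumption: no introns to remove)."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first base, codon by codon, until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")  # X = unknown codon
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCAAATGGTAA")
print(mrna)
print(translate(mrna))
```

The result uses the one-letter amino acid code, so each letter of the output corresponds to one codon of the input.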

DNA analysis

1. Comparisons (e.g. finding homology between sequences from different organisms)

2. Calculations (e.g. building a phylogenetic tree of multiple protein orthologs)

3. Pattern matching (e.g. finding open reading frames in a DNA sequence)

4. Protein sequence prediction
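As an illustration of the pattern-matching task in the list above, open reading frames can be located with a regular expression. This is a minimal sketch, assuming an ORF is an ATG start codon followed by whole codons up to the first in-frame stop codon, and scanning the forward strand only; the function name `find_orfs` and the `min_codons` threshold are hypothetical, and real gene prediction involves considerably more than this.

```python
import re

# Lookahead so that overlapping ORFs in different frames are all reported.
# The lazy quantifier stops at the first in-frame stop codon.
ORF_PATTERN = re.compile(r"(?=(ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)))")


def find_orfs(dna, min_codons=2):
    """Return (start_index, orf_sequence) pairs found on the forward strand.

    min_codons counts the start and stop codons as well.
    """
    orfs = []
    for match in ORF_PATTERN.finditer(dna.upper()):
        orf = match.group(1)
        if len(orf) // 3 >= min_codons:
            orfs.append((match.start(), orf))
    return orfs


print(find_orfs("CCATGAAATAGGG"))
```

Scanning the reverse complement in the same way would cover the other strand, which this sketch deliberately leaves out.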

➢ Interlocking domains
➢ Basic bioinformatics vs advanced bioinformatics
[Diagram: Sequence, Structure and Function as interlocking domains, connected through evolution and genomics, 3D modeling, and functional domain prediction]

Potential applications of
Bioinformatics in different fields?

Limitations

Data quality

Data availability

Computational power

Recommendations
✓ Important to keep in mind the potential for errors produced by bioinformatics programs
✓ Caution should always be exercised when interpreting prediction results
✓ Good practice to use multiple programs, if they are available, and perform multiple evaluations
✓ A more accurate prediction can often be obtained if one draws a consensus by comparing results from different algorithms
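One simple way to draw a consensus from several programs, as recommended above, is a per-position majority vote. The sketch below assumes each program returns a prediction string of equal length (for example secondary-structure states H, E, C); the input strings are made-up illustrations rather than real program output, and the function name `consensus` is hypothetical.

```python
from collections import Counter


def consensus(predictions, tie_symbol="?"):
    """Majority vote at each position across equally long prediction strings.

    Positions where the top two labels are tied are marked with tie_symbol.
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("need at least one prediction, all of the same length")
    result = []
    for column in zip(*predictions):
        ranked = Counter(column).most_common()
        top_label, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        result.append(tie_symbol if tied else top_label)
    return "".join(result)


# Hypothetical outputs from three different prediction programs
print(consensus(["HHHEC", "HHEEC", "HCEEC"]))
```

Marking ties explicitly, rather than picking a label arbitrarily, keeps disagreement between programs visible for manual review.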

Article of the day…

A brief history of bioinformatics. Gauthier et al., 2019. doi: 10.1093/bib/bby063
