Mega6 Tutorial


MOLECULAR EVOLUTIONARY

GENETICS ANALYSIS
Authors: Koichiro Tamura, Glen Stecher, Daniel Peterson, and Sudhir Kumar
Version 6.0.5
Tutorial
Guide to Notations Used
Item                     Convention                   Example
Directory & file names   Small Cap + Bold             INSTALL.TXT
File name extensions     Small Cap + Bold             .TXT, .DOC, .MEG
Email address/URLs       Underlined                   www.megasoftware.net
Pop-up help links        Dotted Underlined + Green    statement
Help Jumps               Underlined + Green           set of rules
Menu/Screen Items        Italic                       Data Menu
User-Entered Text        Monospace font               !Title
Introduction to Walk through MEGA
This walk-through provides several brief tutorials that explain how to perform common tasks in MEGA. Each tutorial uses sample
data files, which can be found in the MEGA/Examples folder (the default location for Windows users is C:\Program
Files\MEGA\Examples\; the location for Mac users is $HOME/MEGA/Examples, where $HOME is the user's home directory). We
recommend following the examples for a given tutorial in the order presented, because the techniques explained in the earlier
examples are used again in the later ones.
In the tutorials, the following conventions are used:
Keystrokes are indicated by bold letters (e.g., F4).
If two keys must be pressed simultaneously, they are shown with a + sign between them (e.g., Alt + F3 means that the Alt and
F3 keys should be pressed at the same time).
Italicized words indicate the name of a menu or window.
Italicized bold words indicate individual commands that are found in menus, submenus, and toolbars.
Main menu refers to the menu bar at the top of the currently active window (File, Analysis, Help, etc.).
Main MEGA menu refers to the menu on the main window of MEGA where you launch all of the analyses from.
Launch bar refers to the toolbar located directly below the main menu of the currently active window (Align, Data, Models,
Distance, etc.).
For brevity, a sequence of menu / button clicks is indicated by a sequence of commands separated by pipes (e.g., File | Open
indicates that you should click on the File main menu item and then click on the Open sub menu item that is displayed).
I want to learn about:
1. Mega Basics
2. Aligning Sequences
3. Estimating Evolutionary Distances
4. Building Trees from Sequence Data
5. Testing Tree Reliability
6. Working with Genes and Domains
7. Testing for Selection
8. Managing Taxa with Groups
9. Computing Sequence Statistics
10. Building Trees from Distance Data
11. Constructing Likelihood Trees
12. Editing Data Files
Page 1 of 10 MEGA :: Molecular Evolutionary Genetics Analysis
27/08/2014 http://www.megasoftware.net/tutorial.php
Aligning Sequences
In this tutorial, we will show how to create a multiple sequence alignment from protein sequence data that will be imported into the
alignment editor using different methods. All of the data files used in this tutorial can be found in the MEGA\Examples\ folder (The
default location for Windows users is C:\Program Files\MEGA\Examples\. The location for Mac users is $HOME/MEGA/Examples,
where $HOME is the user's home directory).
Opening an Alignment
The Alignment Explorer is the tool for building and editing multiple sequence alignments in MEGA.
Example 2.1:
Launch the Alignment Explorer by selecting Align | Edit/Build Alignment on the launch bar of the main MEGA
window.
Select Create New Alignment and click Ok. A dialog will appear asking Are you building a DNA or Protein sequence
alignment? Click the button labeled DNA.
From the Alignment Explorer main menu, select Data | Open | Retrieve sequences from File. Select the "hsp20.fas" file from
the MEGA/Examples directory.
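For readers who want to inspect a FASTA file such as the one opened above outside of MEGA, the following is a minimal Python sketch of a FASTA reader. It is not part of MEGA; the function name parse_fasta and the two short records are made up for illustration and are not the contents of hsp20.fas.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {sequence name: sequence} from FASTA-formatted lines."""
    records: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            words = line[1:].split()
            # The first word after '>' is used as the name (an assumption).
            name = words[0] if words else "unnamed_%d" % (len(records) + 1)
            records[name] = ""
        elif name is not None:
            # Sequence data may be wrapped over several lines.
            records[name] += line.replace(" ", "")
    return records


# Hypothetical records for illustration only.
example = """>seq1 first made-up record
ATGGCC
TTAG
>seq2
ATGGCA
""".splitlines()

records = parse_fasta(example)
for seq_name, sequence in records.items():
    print(seq_name, len(sequence))
```

To read a real file, pass an open file object instead of the example lines.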
Aligning Sequences by ClustalW
You can create a multiple sequence alignment in MEGA using either the ClustalW or Muscle algorithms. Here we align a set of
sequences using the ClustalW option.
Example 2.2:
Select the Edit | Select All menu command to select all sites for every sequence in the data set.
Select Alignment | Align by ClustalW from the main menu to align the selected sequence data using the ClustalW algorithm.
Click the Ok button to accept the default settings for ClustalW.
Once the alignment is complete, save the current alignment session by selecting Data | Save Session from the main menu.
Give the file an appropriate name, such as "hsp20_Test.mas". This will allow the current alignment session to be restored for
future editing.
Exit the Alignment Explorer by selecting Data | Exit Aln Explorer from the main menu.
Aligning Sequences Using Muscle
Here we describe how to create a multiple sequence alignment using the Muscle option.
Example 2.3:
Starting from the main MEGA window, select Align | Edit/Build Alignment from the launch bar. Select Create a new
alignment and then select DNA.
From the Alignment Explorer window, select Data | Open | Retrieve sequences from a file and select the
Chloroplast_Martin.meg file from the MEGA/Examples directory.
On the Alignment Explorer main menu, select Edit | Select All.
On the Alignment Explorer launch bar, you will find an icon that looks like a flexing arm. Click on it and select Align DNA.
Near the bottom of the MUSCLE - AppLink window, you will see a row called Alignment Info. You can scroll through the text
to read information about the Muscle program.
Click on the Compute button (accept the default settings). A Progress window will keep you informed of Muscle alignment
status. In this window, you can click on the Command Line Output tab to see the command-line parameters which were
passed to the Muscle program. Note: The analysis may complete so quickly that you won't be able to click on this tab or read it.
The information in this tab isn't essential; it's just interesting.
When the Muscle program has finished, the aligned sequences will be passed back to MEGA and displayed in the Alignment
Explorer window.
Close the Alignment Explorer by selecting Data | Exit Aln Explorer. Select No when asked if you would like to save the
current alignment session to file.
Obtaining Sequence Data from the Internet (GenBank)
Using MEGA's integrated browser, you can fetch GenBank sequence data from the NCBI website if you have an active internet
connection.
Example 2.4:
From the main MEGA window, select Align | Edit/Build Alignment from the main menu.
When prompted, select Create New Alignment and click Ok. Select DNA.
Activate MEGA's integrated browser by selecting Web | Query Genbank from the main menu.
When the NCBI: Nucleotide site is loaded, enter CFS as a search term into the search box at the top of the screen. Press the
Search button.
When the search results are displayed, check the box next to any item(s) you wish to import into MEGA.
If you have checked one box: Locate the dropdown menu labeled Display Settings (located near the top left hand side
of the page directly under the tab headings). Change its value to FASTA and then click Apply. The page will reload
with all the search results in FASTA format.
If you have checked more than one box: locate the Display Settings dropdown (located near the top left hand side of
the page directly under the tab headings). Change the value to FASTA (Text) and click the Apply button. This will
output all the sequences you selected as a text in the FASTA format.
Press the Add to Alignment button (with the red + sign) located above the web address bar. This will import the sequences
into the Alignment Explorer.
With the data now displayed in the Alignment Explorer, you can close the Web Browser window.
Align the new data using the steps detailed in the previous examples.
Close the Alignment Explorer window by clicking Data | Exit Aln Explorer. Select No when asked if you would like to save
the current alignment session to file.
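As an alternative to the integrated browser, FASTA records can also be requested from NCBI programmatically through the E-utilities efetch service. The sketch below is not how MEGA retrieves data; it only builds the request URL and, optionally, downloads the text. The accession strings are placeholders, not real search results for the term used above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accessions):
    """Build an efetch URL that asks for nucleotide records as FASTA text."""
    query = {
        "db": "nuccore",              # NCBI nucleotide database
        "id": ",".join(accessions),   # comma-separated identifiers
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(query)


def fetch_fasta(accessions):
    """Download the records (needs an internet connection; not called here)."""
    with urlopen(efetch_fasta_url(accessions)) as response:
        return response.read().decode("utf-8")


# Placeholder identifiers for illustration only.
url = efetch_fasta_url(["ACC0001", "ACC0002"])
print(url)
```

The downloaded text can be saved to a .fas file and then opened with Data | Open | Retrieve sequences from File.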
Note: We have aligned some sequences and they are now ready to be analyzed. Whenever you need to edit/change your sequence
data, you will need to open it in the Alignment Editor and edit or align it there. Then export it to the MEGA format and open the
resulting file.
Estimating Evolutionary Distances
In this tutorial, we will estimate evolutionary distances for sequences from 11 Drosophila species using various models. The data
files used in this tutorial can be found in the MEGA/Examples folder (The default location for Windows users is C:\Program
Files\MEGA\Examples. The default location for Mac users is $HOME/MEGA/Examples, where $HOME is the user's home
directory).
Estimating Evolutionary Distances Using Pairwise Distance
In MEGA, you can estimate evolutionary distances between sequences by computing the proportion of nucleotide differences
between each pair of sequences.
Example 3.1:
Open the "Drosophila_Adh.meg" data file. If needed, refer to the MEGA Basics tutorial.
From the main MEGA launch bar, select Distance | Compute Pairwise Distance.
In the Analysis Preferences window, click the Substitutions Type pull-down and then select the Nucleotide option.
Click the pull-down for Model/Method and select the p-distance model. For this example we will be using the defaults for the
remaining options. Click Compute to begin the computation.
A progress indicator will appear briefly and then the distance computation results will be displayed in grid form in a new
window. Leave this window open so we can compare the results from the next steps.
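The p-distance used in this example is simply the proportion of compared sites at which two sequences differ. The sketch below illustrates the calculation in Python. It assumes that sites where either sequence has a gap or an ambiguous base are skipped for that pair (a pairwise-deletion style treatment), which may differ from the option selected in the Analysis Preferences window; the function name and the short sequences are made up.

```python
VALID_BASES = set("ACGT")


def p_distance(seq_a, seq_b):
    """Proportion of nucleotide differences between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in VALID_BASES or b not in VALID_BASES:
            continue  # assumption: skip gaps and ambiguous bases for this pair
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Made-up sequences: one difference in eight sites.
print(p_distance("ACGTACGT", "ACGTACGA"))
```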
Compute and Compare Distances Using Other Models/Methods
MEGA supports a wide collection of models for estimating evolutionary distances. Here we compare evolutionary distances
calculated by using different models.
Example 3.2:
Repeat Example 3.1 above, but select the Jukes/Cantor model under the Model/Method pull-down instead of the p-distance
model, leaving all the other options the same. Again, leave the results window open for comparison.
Repeat the analysis, this time selecting the Tamura-Nei model under the Model/Method pull-down, leaving all the other
options the same. Again, leave the results window open for comparison.
You are now able to compare the three open result windows which contain the distances estimated by the different
methods.
After you have compared the results, select the File | Quit Viewer option for each result window. Do not close the
"Drosophila_Adh.meg" data file.
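When comparing the result windows, it helps to know how the Jukes-Cantor distance relates to the p-distance: it corrects the observed proportion p for multiple substitutions at the same site using d = -(3/4) ln(1 - (4/3) p). The following Python sketch shows this standard formula; the function name is made up, and the Tamura-Nei model (which needs base frequencies and separate transition and transversion counts) is not covered.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance for an observed proportion of differences p."""
    if not 0.0 <= p < 0.75:
        # The correction is undefined once p reaches 3/4.
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


for p in (0.05, 0.10, 0.30):
    print(p, round(jukes_cantor(p), 4))
```

The corrected distance is always at least as large as p, and the gap widens as p grows, which is the pattern to look for when comparing the p-distance and Jukes-Cantor grids.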
Compute the Proportion of Amino Acid Differences
You can also calculate evolutionary distances based on the proportion of amino acid differences.
Note: MEGA will automatically translate nucleotide sequences into amino acid sequences using the selected genetic code table. The
genetic code table can be edited by Data | Select Genetic Code Table from the main MEGA launch bar.
Example 3.3:
From the main MEGA window, select Distance | Compute Pairwise Distances from the main menu. This will display the
Analysis Preferences window.
Click the Substitutions Type pull-down, select Amino Acid and then select p-distance under Model/Method.
Click the Compute button to accept the default values for the rest of the options and begin the computation. A progress dialog
box will appear briefly. As with the nucleotide estimation, a results viewer window will be displayed, showing the distances
in a grid format.
After you have inspected the results, use the File | Quit Viewer command to close the results viewer.
Close the data by selecting the Close Data button on the main MEGA task bar.
Building Trees from Sequence Data
In this tutorial, we will illustrate the procedures for building trees and in-memory sequence data editing, using the commands
available in the Data and Phylogeny menus. We will be using the "Crab_rRNA.meg" file which can be found in the
MEGA/Examples directory. This file contains nucleotide sequences for the large subunit mitochondrial rRNA gene from different
crab species (Cunningham et al. 1992). Since the rRNA gene is transcribed, but not translated, it falls in the category of non-coding
genes.
The Crab_rRNA.meg file used in this tutorial can be found in the MEGA/Examples folder (The default location for Windows
users is C:\Program Files\MEGA\Examples. The default location for Mac users is $HOME/MEGA/Examples, where $HOME is the
user's home directory).
Building a Neighbor-Joining (NJ) Tree
In this example, we will illustrate the basics of phylogenetic tree re-construction using MEGA and become familiar with the Tree
Explorer window.
Example 4.1:
Activate the "Crab_rRNA.meg" data file. If necessary, refer to Example 1.2 of the MEGA Basics tutorial.
From the main MEGA launch bar, select Phylogeny | Construct/Test Neighbor-Joining Tree menu option.
In the Analysis Preferences window select the p-distance option from the Model/Method drop-down.
Click Compute to accept the defaults for the rest of the options and begin the computation. A progress indicator will appear
briefly before the tree displays in the Tree Explorer window.
To select a branch, click on it with the left mouse button. If you click on a branch with the right mouse button, you will get a
small options menu that will let you flip the branch and perform various other operations on it.
Select a branch and then press the Up, Down, Left, and Right arrow keys to see how the cursor moves through the tree.
Change the branch style by selecting the View | Tree/Branch Style command from the Tree Explorer main menu.
Select the View | Topology Only command from the Tree Explorer main menu to display the branching pattern on the
screen.
You can display the numerical branch lengths in the Topology Only option by selecting View | Options and clicking on the
Branch tab. Check the box labeled Display Branch Length and click Ok.
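To give a sense of what the neighbor-joining method does with the distances, the sketch below shows only its first step in Python: computing the standard criterion Q(i, j) = (n - 2) d(i, j) - r_i - r_j (where r_i is the sum of distances from taxon i) and joining the pair with the smallest Q. This is a textbook illustration, not MEGA's implementation; the taxon names and distance matrix are made up and are not the crab data.

```python
def nj_first_join(names, dist):
    """Return (taxon_i, taxon_j, branch_i, branch_j) for the first NJ join."""
    n = len(names)
    if n < 3:
        raise ValueError("need at least three taxa")
    totals = [sum(row) for row in dist]  # r_i: sum of distances from taxon i
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best
    # Branch lengths from the joined taxa to their new common node.
    branch_i = dist[i][j] / 2 + (totals[i] - totals[j]) / (2 * (n - 2))
    branch_j = dist[i][j] - branch_i
    return names[i], names[j], branch_i, branch_j


# Made-up symmetric distance matrix for five hypothetical taxa.
names = ["a", "b", "c", "d", "e"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
first_join = nj_first_join(names, dist)
print(first_join)
```

A full neighbor-joining run repeats this step on a reduced matrix until the tree is resolved.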
Printing the NJ Tree (For Windows users)
Windows users can print directly from Tree Explorer.
Example 4.2a:
Select the File | Print option from the Tree Explorer main menu to bring up a standard Print window. This will print the tree
full-sized and may take multiple sheets of paper. Press Cancel.
To restrict the size of the printed tree to a single sheet of paper, choose the File | Print in a Sheet command from the Tree
Explorer main menu. Press Ok.
Select the File | Exit Tree Explorer command to exit the Tree Explorer. Click the OK button to close the Tree Explorer
without saving the tree session.
Printing the NJ Tree (For Mac users)
MEGA does not support printing directly from Tree Explorer when running on a Mac system. To print a tree using a Mac, users can
save the tree image to a PDF file and then print it by normal means.
Example 4.2b:
Select the Image | Save as PDF File option from the Tree Explorer main menu to bring up a standard Save window. Save the
image to the desired location.
Once the document is saved, you can open it with your PDF reader and print the document in the same manner as any other
PDF document.
Select the File | Exit Tree Explorer command to exit the Tree Explorer. Click the OK button to close the Tree Explorer
without saving the tree session.
Construct a Maximum Parsimony (MP) Tree Using the Branch-&-Bound Search Option
Using MEGA, you can re-construct a phylogeny using Maximum Likelihood, Minimum Evolution, UPGMA, and Maximum
Parsimony methods in addition to Neighbor-Joining. Here we re-construct the phylogeny for the Crab_rRNA.meg data using the
Maximum Parsimony (MP) method.
Example 4.3:
Select the Phylogeny | Construct/Test Maximum Parsimony Tree(s) menu option from the main MEGA launch bar. In the
Analysis Preferences window, choose Max-mini Branch-&-bound for the MP Search Method option.
Click the Compute button to accept the defaults for the other options and begin the calculation. A progress window will
appear briefly, and the tree will be displayed in Tree Explorer.
(Windows users) Now print this tree by selecting either of the Print options from the Tree Explorer's File menu.
(Mac users) Save the tree to a PDF file as described in Example 4.2b above.
Compare the NJ and MP trees. For this data set, the branching pattern of these two trees is identical.
Select the File | Exit Tree Explorer command to exit the Tree Explorer. Click OK to close Tree Explorer without saving the
tree session.
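The quantity that Maximum Parsimony minimizes is the tree length, that is, the minimum number of substitutions needed to explain the data on a given tree. The Python sketch below computes it for a fixed rooted binary tree with Fitch's algorithm. It only scores a tree; the branch-and-bound or heuristic search over candidate trees is not shown. The taxa A to D, the sequences, and the function names are made up.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for one site."""
    if isinstance(tree, str):  # leaf: its observed state, no changes
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Otherwise one substitution is required on this part of the tree.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Sum the per-site minimum number of changes over all sites."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for site in range(n_sites):
        states = {name: seq[site] for name, seq in alignment.items()}
        total += fitch(tree, states)[1]
    return total


# Made-up alignment and two candidate trees written as nested tuples.
alignment = {"A": "AAG", "B": "AGG", "C": "GAT", "D": "GGT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print(parsimony_length(tree_1, alignment), parsimony_length(tree_2, alignment))
```

The tree with the smaller length is the more parsimonious of the two.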
Constructing a MP Tree using the Heuristic Search
For each method of phylogenetic inference, MEGA provides numerous options. In this example, we conduct MP analysis using the
Min-Mini Heuristic search.
Example 4.4:
Follow the steps in Example 4.3 and instead of choosing Max-mini Branch-&-bound, choose Min-Mini Heuristic for MP
Search Method. Change the MP Search Level to 2 and click Compute.
Note: In this example, the Min-Mini Heuristic search finds the same tree as the Max-mini Branch-&-bound search as
long as the MP Search Level is set to 2. However, the computational time is much shorter for the heuristic search.
Examining Data Editing Features
For noncoding sequence data, OTUs (Operational Taxonomic Units) as well as sites can be selected for analysis.
Example 4.5:
From the main MEGA window select the Data | Select Taxa and Groups option from the launch bar. A dialog box is
displayed.
All the OTU labels are checked in the left panel. This indicates that all OTUs are included in the current active data subset. To
remove the first OTU from the data, uncheck the checkbox next to the first OTU name in the left panel. Click the Close
button.
Now, when you construct a neighbor-joining tree from this data set, it will contain 12 OTUs instead of 13. Close out of the
Tree Explorer window by selecting File | Exit Tree Explorer and do not save. Deactivate the operational data set by selecting
the Close Data icon from the main MEGA window.
Testing Tree Reliability
In this example, we will conduct two different tests of reliability using protein-coding genes from the chloroplast genomes of nine
different species.
The data file Chloroplast_Martin.meg which is used in this tutorial can be found in the MEGA/Examples folder (The default
location for Windows users is C:\Program Files\MEGA\Examples. The default location for Mac users is $HOME/MEGA/Examples,
where $HOME is the user's home directory).
Bootstrap Testing for a Neighbor-Joining Tree
Example 5.1:
Activate the "Chloroplast_Martin.meg" file. If necessary, refer to Example 1.2 of MEGA Basics.
On the main MEGA window task bar, select the Phylogeny | Construct/Test Neighbor-Joining Tree option.
The Analysis Preferences window appears on the screen. For the Model/Method, select p-distance. Select Bootstrap method
for the Test of Phylogeny option.
Click Compute to accept the default values for the rest of the options. A progress indicator provides the progress of the test as
well as the details of your analysis preferences.
Once the computation is complete, the Tree Explorer appears and displays two tree tabs. The first tab is the original tree and
the second is the Bootstrap consensus tree.
To produce a condensed tree, use the Compute | Condensed Tree main menu command from the Tree Explorer window. You
can further manipulate the appearance of the condensed tree here. To change the cutoff value, select the View | Options menu
command and click the Cutoff tab. For now, keep the Cut-off value at 50% and click the OK button.
This tree shows all the branches that are supported at the default cutoff value of BCL 50. Select the Compute | Condensed
Tree main menu command and the original NJ tree will reappear.
From the Tree Explorer window, select the Image | Save as PDF File option and save a PDF image of the tree to a
convenient location.
From the Tree Explorer window, select the File | Exit Tree Explorer command to exit the Tree Explorer. A warning box will
inform you that your tree data has not been saved. Click Ok to close Tree Explorer without saving the tree.
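The bootstrap test works by resampling the columns (sites) of the alignment with replacement, rebuilding a tree from each resampled data set, and reporting how often each branch of the original tree reappears. The Python sketch below illustrates only the resampling step; tree building and the counting of branch support are omitted. The alignment and the function name are made up.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: sites drawn with replacement."""
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    # Draw as many column indices as there are sites, with replacement.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(alignment[name][c] for c in columns) for name in names
    }


# Made-up alignment of three short sequences.
alignment = {"tax1": "ACGTAC", "tax2": "ACGTTC", "tax3": "AGGTTC"}
rng = random.Random(2014)  # fixed seed so the run is repeatable
replicate = bootstrap_alignment(alignment, rng)
for name, seq in replicate.items():
    print(name, seq)
```

Because whole columns are drawn together, every sequence in a replicate is resampled at the same sites.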
Interior-branch testing for the Neighbor-Joining Tree
For neighbor-joining trees, you may conduct the standard error test for every interior branch by using the Interior branch test of
phylogeny.
Example 5.2:
From the main MEGA window, select Phylogeny | Construct/Test Neighbor-Joining Tree from the launch bar.
In the Analysis Preferences dialog, make sure the Substitutions Type option is set to Amino Acid and the Model/Method is set
to p-distance. Set the Test of Phylogeny option to Interior-branch test.
Click Compute to begin the computation. A progress indicator window will appear briefly. When the tree appears, confidence
probabilities (CP) from the standard error test of branch lengths are displayed on the screen.
Compare the CP values on this tree with the BCL values of the tree that you saved as a PDF file in the previous exercise.
Now close the Tree Explorer by selecting File | Exit Tree Explorer from the main menu. Close the current data by clicking
the Close Data icon on the main MEGA window.
Working With Genes and Domains
Defining and Editing Gene and Domain Definitions
In this example we will demonstrate how to specify coding and non-coding regions of a sequence. We will be using the file
Contigs.meg, which is located in the MEGA/Examples folder (The default location for Windows users is C:\Program
Files\MEGA\Examples. The default location for Mac users is $HOME/MEGA/Examples, where $HOME is the user's home
directory).
Example 6.1:
Activate the data file "Contigs.meg". If necessary, refer to Example 1.2 of the MEGA Basics tutorial.
From the main MEGA window launch bar, select Data | Select Genes and Domains.
Notice the column header bar across the top (Name, From, To, #Sites, Coding?, Codon Start). Domains are
listed under the column header labeled Name. Click on the domain labeled Data underneath the Genes/Domains group, then
click on the button labeled Delete/Edit. Select Delete Gene/Domain to delete the data domain.
Click on the Genes/Domains label and then click the Add Domain button. Select Add New Domain from the popup menu.
Right-click on the new domain and select Edit Name from the popup menu. Change the name to Exon1 and press the Enter
key.
Select the ellipsis (...) button next to the first question mark in the From column to set the first site of the domain. When the
Start site for Exon1 window appears, select site number 1 for the AC087512 chimp row and push the Ok button.
Select the ellipsis (...) button in the To column to set the last site of the domain. When the End site for Exon1 window
appears, select site number 3918 for the AC087512 chimp row and push the OK button.
Check the box in the Coding? column to indicate that this domain is protein coding. You will need to click the box three
times before the check mark appears.
Add two more domains to the Genes/Domains item using the same steps. One of these domains will be named Intron1 and
will begin at site 3919 and end at site 5191. The other will be named Exon2 and will begin at site 5192 and end at site 8421.
Be sure to check the checkbox in the Coding? column for Exon2 to indicate a protein-coding domain.
Click on the Genes/Domains item to highlight it and then click the Add Gene button at the bottom of the screen. From the
popup menu choose Add new gene at the end. Right click on this new gene and change the name to Predicted Gene. Click
and drag all of the newly created domains to the Predicted Gene so that they now appear under the new gene.
Press the Close button at the bottom of the window to exit the Gene/Domain Organization window.
Using Domain Definitions to Compute Pairwise Distances
Now, if we compute pairwise distances between our sequences, the non-coding regions that we specified in the example above will
be ignored.
Example 6.2:
From the main MEGA window, select the Distance | Compute Pairwise Distances option from the launch bar.
In the Analysis Preferences window, click on the Substitutions Type drop-down and select Nucleotide. The Select Codon
Positions row is now enabled. Make sure that the Noncoding sites option does not have a checkmark next to it. Click the
Compute button to begin the analysis.
When the computation is complete, the Pairwise Distances window will display the pairwise distance computed using only
the sequence data from exonic domains of the Predicted Gene. Close the Pairwise Distances window by selecting File | Quit
Viewer and the Sequence Data Explorer window by selecting the Close Data icon on the main MEGA window.
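The effect of the domain definitions can be pictured as a simple filter on site positions. The Python sketch below records the three domains defined in Example 6.1 (sites are numbered from 1 and both ends are inclusive), reports the number of sites in each, and shows how only the coding domains would be kept from a sequence. It is an illustration, not MEGA's internal representation; the names DOMAINS, domain_length, and coding_sites are made up.

```python
# (name, first site, last site, protein coding?) as entered in Example 6.1.
DOMAINS = [
    ("Exon1", 1, 3918, True),
    ("Intron1", 3919, 5191, False),
    ("Exon2", 5192, 8421, True),
]


def domain_length(start, end):
    """Number of sites in a domain with 1-based, inclusive boundaries."""
    return end - start + 1


def coding_sites(sequence, domains):
    """Concatenate the parts of a sequence that fall in coding domains."""
    return "".join(
        sequence[start - 1:end]  # convert to Python's 0-based slicing
        for _name, start, end, coding in domains
        if coding
    )


lengths = {name: domain_length(s, e) for name, s, e, _coding in DOMAINS}
print(lengths)

# Small made-up sequence and domains to show the filtering itself.
toy_domains = [("x", 1, 3, True), ("y", 4, 6, False), ("z", 7, 9, True)]
toy_coding = coding_sites("AAACCCGGG", toy_domains)
print(toy_coding)
```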
Testing for Selection
In this example, we describe how to perform a codon-based test of positive selection for five alleles from the human HLA-A locus
(Nei and Hughes 1991).
The "HLA-3Seq.meg" data file, which is used in this tutorial, can be found in the MEGA/Examples folder (The default location for
Windows users is C:\Program Files\MEGA\Examples. The default location for Mac users is $HOME/MEGA/Examples, where
$HOME is the user's home directory).
Computing Synonymous and Non-synonymous Distances
Example 7.1:
Activate the "HLA-3Seq.meg" file. If necessary, refer to Example 1.2 in the MEGA Basics tutorial.
From the main MEGA window launch bar, select Selection | Codon-based Z-Test of Selection.
An Analysis Preferences window appears. For the Model/Method, select the Nei-Gojobori method (Proportion) model.
In the Test Hypothesis (HA: alternative) row, select Positive Selection (HA: dN > dS) from the pull-down menu.
From the Scope row, select the Overall Average option.
For the Gaps/Missing Data Treatment option, select Pairwise Deletion.
Click on "Compute" to accept the default values for the remaining options. A progress indicator appears briefly, and then the
computation results are displayed in a results window in grid format.
The column labeled "Prob" contains the computed probability (it must be < 0.05 to reject the null hypothesis at the 5% level). The
column labeled "Stat" contains the test statistic used to compute the probability. For these data, the difference between synonymous
and non-synonymous substitutions should be significant at the 5% level.
Close the Test of Positive Selection window.
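The "Stat" and "Prob" columns can be understood through the usual form of a codon-based Z-test: Z = (dN - dS) / sqrt(Var(dN) + Var(dS)), with a one-tailed probability taken from the standard normal distribution when testing dN > dS. The Python sketch below shows that calculation under the assumption that the distances and their variances are already available (MEGA estimates them itself); the numbers and the function name are made up and are not results for the HLA data.

```python
import math


def z_test_positive_selection(d_n, d_s, var_n, var_s):
    """Return (Z, one-tailed P) for the alternative hypothesis dN > dS."""
    z = (d_n - d_s) / math.sqrt(var_n + var_s)
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p_value


# Made-up inputs for illustration only.
z_stat, prob = z_test_positive_selection(0.20, 0.10, 0.0016, 0.0009)
print(round(z_stat, 3), round(prob, 4))
```

A probability below 0.05 would lead to rejecting the null hypothesis of neutrality in favor of positive selection at the 5% level.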
Managing Taxa with Groups
The Crab_rRNA.meg file, which is used in this tutorial, can be found in the MEGA/Examples folder (The default location for
Windows users is C:\Program Files\MEGA\Examples. The default location for Mac users is $HOME/MEGA/Examples, where
$HOME is the user's home directory).
Defining and Editing Groups of Taxa
In MEGA, you can partition data into distinct groups and then evaluate distances within groups, distances between groups, and the
net distance between groups.
Example 8.1:
From the main MEGA window, activate the data present in the "Crab_rRNA.meg" file. If necessary, refer to Example 1.2 in
the MEGA Basics tutorial.
From the main MEGA window launch bar, select Data | Select Taxa and Groups. Notice the left pane called Taxa/Groups
and the right pane labeled Ungrouped Taxa.
Press the New Group button found below the Taxa/Groups pane to add a new group to the data. Name this new group
Pagurus and press Enter.
While holding the Ctrl button on the keyboard, click on all of the items in the Ungrouped Taxa pane that begin with Pagurus.
This will highlight them. When they are all highlighted, press the left-facing arrow button found on the vertical toolbar
between the two panels (make sure the Pagurus group on the left side is also highlighted; otherwise the arrow will not
appear).
Select the All group in the Taxa/Groups panel and press the + (add) button found on the vertical toolbar between the two
window panes to add a second group. Name this group "Non-Pagurus".
Add the remaining unassigned taxa to this group by using the left arrow and press the Close button at the bottom of the
window to exit this view.
Note: Now that groups have been defined, the Compute Within Group Mean, Compute Between Group Means, and Compute Net
Between Group Means menu commands from the Distance option on the launch bar may be used to analyze the data.
Close all of the open windows.
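The three group-based quantities mentioned in the note can be illustrated with a small calculation: the within-group mean averages the pairwise distances inside one group, the between-group mean averages the distances across two groups, and the net between-group mean subtracts the average of the two within-group means from the between-group mean. The Python sketch below uses made-up taxon names and distances, not values computed from Crab_rRNA.meg.

```python
from itertools import combinations, product


def within_mean(group, dist):
    """Mean pairwise distance among the members of one group."""
    pairs = list(combinations(group, 2))
    return sum(dist[frozenset(p)] for p in pairs) / len(pairs)


def between_mean(group_a, group_b, dist):
    """Mean distance over all pairs with one member in each group."""
    pairs = list(product(group_a, group_b))
    return sum(dist[frozenset(p)] for p in pairs) / len(pairs)


def net_between_mean(group_a, group_b, dist):
    """Between-group mean minus the average of the two within-group means."""
    within = (within_mean(group_a, dist) + within_mean(group_b, dist)) / 2
    return between_mean(group_a, group_b, dist) - within


# Made-up groups and pairwise distances for illustration only.
pagurus = ["p1", "p2"]
non_pagurus = ["n1", "n2"]
dist = {
    frozenset(("p1", "p2")): 0.02,
    frozenset(("n1", "n2")): 0.04,
    frozenset(("p1", "n1")): 0.10,
    frozenset(("p1", "n2")): 0.10,
    frozenset(("p2", "n1")): 0.10,
    frozenset(("p2", "n2")): 0.10,
}
net = net_between_mean(pagurus, non_pagurus, dist)
print(round(net, 4))
```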
Computing Sequence Statistics
The Drosophila_Adh.meg data file, which is used in this tutorial, can be found in the MEGA/Examples folder (The default
location for Windows users is C:\Program Files\MEGA\Examples. The default location for Mac users is $HOME/MEGA/Examples,
where $HOME is the user's home directory).
Using Sequence Data Explorer
The Sequence Data Explorer provides various tools for visually analyzing sequence data as well as calculating compositional
statistics. In the following examples we will demonstrate the basic usage of the Sequence Data Explorer.
Example 9.1:
Activate the "Drosophila_Adh.meg" file. If necessary, refer to Example 1.2 in the MEGA Basics tutorial.
Select the Data | Explore Active Data (F4) command.
Use the arrow keys on your keyboard or the mouse to move from site to site. At the bottom left corner of the window, you will
find an indicator that displays the column and the total number of sites. As you move through the columns, the column
indicator changes.
Highlighting
If you look at the bottom of the Sequence Data Explorer window, the Highlighted Sites indicator displays "None" because no special
site attributes are yet highlighted.
You can highlight variable sites in various ways:
Select the Highlight | Variable Sites main menu option on the Sequence Data Explorer main screen.
Click the icon labeled V from the launch bar.
Press the V key on the keyboard.
Example 9.2:
Use one of the above methods to highlight variable sites in the Drosophila data. All sites that are variable are now highlighted.
The Highlighted indicator at the bottom of the window has been replaced with the Variable indicator. The number of sites
which are variable is displayed, along with the total number of sites (Variable sites/Total # of sites). When you press the V
key again, the sites return to the normal color. The Highlighted indicator again displays "None".
Now highlight the parsimony-informative sites by pressing the P key, clicking on the button labeled Pi from the shortcut bar
below the main menu, or selecting the Highlight | Parsim-Info sites menu option. The Highlighted indicator turns into the
Parsim-info indicator.
To highlight 0, 2, and 4-fold degenerate sites, press the 0, 2, or 4 keys, respectively, or click on the corresponding buttons
from the shortcut bar below the main menu, or select the corresponding command from the Highlight menu. Once again, the
Highlighted indicator will turn into the Zero-fold indicator, Two-fold indicator, and Four-fold indicator respectively.
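The first two highlighting categories follow standard definitions: a site is variable if it shows at least two different states, and parsimony-informative if at least two states each occur in at least two sequences. The Python sketch below counts both for a small alignment. It assumes an alignment without gaps or ambiguous bases, and the sequences and function name are made up.

```python
from collections import Counter


def count_site_classes(sequences):
    """Return (number of variable sites, number of parsimony-informative sites)."""
    variable = 0
    informative = 0
    for column in zip(*sequences):  # iterate over alignment columns
        counts = Counter(column)
        if len(counts) > 1:
            variable += 1
        # Informative: at least two states that each appear at least twice.
        if sum(1 for c in counts.values() if c >= 2) >= 2:
            informative += 1
    return variable, informative


# Made-up alignment of four sequences and four sites.
sequences = ["AAGT", "AAGT", "ATAT", "ATAC"]
variable_sites, informative_sites = count_site_classes(sequences)
print(variable_sites, informative_sites)
```

In this example the last site is variable but not parsimony-informative, because its second state appears in only one sequence.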
Statistics
The Statistics main menu option allows you to calculate Nucleotide Composition, Nucleotide Pair Frequencies and Codon Usage.
Before selecting one of these options, you will need to select whether to use all sites or only the highlighted sites. You will also need
to select the format in which you want the results displayed.
Example 9.3:
Select Statistics | Use All Selected Sites. To display the results of the calculation in a text file using the built-in text editor,
click the Statistics menu option again and select the Display Results in Text Editor option. To calculate the nucleotide base
frequencies, select the option, Nucleotide Composition, from the Statistics menu.
To compute codon usage, go back to the Sequence Data Explorer and select the Statistics | Codon Usage menu command.
This will calculate the codon usage and display the results of the calculation in a text file using the built-in text editor.
To compute nucleotide pair frequencies, select the Statistics | Nucleotide Pair Frequencies | Directional (16 pairs), or the
Statistics | Nucleotide Pair Frequencies | Undirectional (10 pairs) main menu option. This will calculate the pair frequencies
and display the results of the calculation in a text file using the built-in text editor.
Note: The Amino Acid Composition option on the Statistics menu is disabled (grayed out). This option is available only
after the sequences have been translated.
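The difference between the directional (16 pairs) and undirectional (10 pairs) options can be sketched in a few lines of Python. The code below is an illustration only: the function names and the two short sequences are hypothetical, and it compares a single pair of aligned sequences. How MEGA pools the counts over all sequence pairs is not covered here.

```python
from collections import Counter

BASES = "ACGT"  # this sketch assumes DNA; other symbols are skipped

def nucleotide_composition(seq):
    """Percentage of each base, ignoring gaps and ambiguous symbols."""
    counts = Counter(b for b in seq.upper() if b in BASES)
    total = sum(counts.values())
    return {b: (100.0 * counts[b] / total if total else 0.0) for b in BASES}

def pair_counts(seq1, seq2, directional=True):
    """Count aligned base pairs between two sequences.

    Directional: (C, T) and (T, C) are separate classes (16 possible).
    Undirectional: they are pooled into one class (10 possible).
    """
    counts = Counter()
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x in BASES and y in BASES:
            key = (x, y) if directional else tuple(sorted((x, y)))
            counts[key] += 1
    return counts

# Hypothetical sequences for illustration.
s1 = "ATGCATGC"
s2 = "ATGTACGC"
print(nucleotide_composition(s1))
print(pair_counts(s1, s2, directional=True))
print(pair_counts(s1, s2, directional=False))
```

With these two sequences, the directional count keeps one C/T pair and one T/C pair apart, while the undirectional count reports them as two pairs of the same class.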
Using the Amino Acid Composition Option
Example 9.4:
To translate these protein-coding sequences into amino acid sequences and back again, select the Data | Translate Sequences
main menu command from the Sequence Data Explorer window.
Once the sequences are translated, calculate the amino acid composition by selecting the Statistics | Amino Acid Composition
main menu command from the Sequence Data Explorer window.
Close the Text File Editor and Format Converter window without saving your work. Close the Sequence Data Explorer and
click the Close Data icon on the main MEGA window.
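The translate-then-count idea behind Example 9.4 can be sketched as follows. This is a hedged illustration rather than MEGA's procedure: it assumes the standard genetic code and reading frame 1, and the function names and the short test sequence are hypothetical. The genetic code table selected in MEGA for a given data file may differ.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(_BASES, repeat=3), _AMINO)}

def translate(dna):
    """Translate a coding sequence in frame 1; unknown codons become 'X'."""
    dna = dna.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # drop a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, usable, 3))

def amino_acid_composition(protein):
    """Percentage of each residue, excluding stops ('*') and unknowns ('X')."""
    counts = Counter(a for a in protein if a not in "*X")
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in sorted(counts.items())}

protein = translate("ATGGCCGCCAAATAA")  # hypothetical coding sequence
print(protein)                          # MAAK*
print(amino_acid_composition(protein))  # {'A': 50.0, 'K': 25.0, 'M': 25.0}
```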
Building Trees from Distance Data
This tutorial illustrates procedures for building phylogenetic trees using distance data.
The "Hum_Dist.meg" data file, which is used in this tutorial, can be found in the MEGA/Examples folder (the default location for
Windows users is C:\Program Files\MEGA\Examples; the default location for Mac users is $HOME/MEGA/Examples, where
$HOME is the user's home directory).
Making a Phylogenetic Tree from Distance Data
Example 10.1:
Activate the "Hum_Dist.meg" file. If necessary, refer to Example 1.2 in the MEGA Basics tutorial.
From the main MEGA window, select Phylogeny | Construct/Test Neighbor-Joining Tree from the launch bar.
The Analysis Preferences window will appear. For distance data files, none of the options shown here can be changed. Click
on the button labeled Compute. A progress meter will appear briefly.
The Tree Explorer will display a neighbor-joining (NJ) tree on the screen when the analysis completes.
From the Tree Explorer launch bar, click on the i icon. The number of tabs shown here depends on the type of tree that was
constructed. For a Neighbor-Joining tree, the tabs are General, Tree and Branch. Take a look at each to see the information
they contain.
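For readers who want to see what the neighbor-joining step does with a distance matrix, the following is a compact Python sketch of the textbook algorithm. It is not MEGA's implementation. The function name is hypothetical, and the five-taxon matrix is a toy example, not the contents of Hum_Dist.meg. The sketch needs at least three taxa and returns the unrooted tree as a Newick string.

```python
def neighbor_joining(names, dist):
    """Build an unrooted NJ tree and return it as a Newick string.

    names: list of taxon labels (at least three)
    dist:  symmetric matrix (list of lists) with zeros on the diagonal
    """
    if len(names) < 3:
        raise ValueError("need at least three taxa")
    # Work on a dict of dicts so nodes can be added and dropped freely.
    d = {a: {b: float(dist[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    nodes = list(names)
    while len(nodes) > 3:
        n = len(nodes)
        r = {a: sum(d[a][b] for b in nodes) for a in nodes}
        # Pick the pair minimising Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the new internal node to a and to b.
        la = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        lb = d[a][b] - la
        new = "(%s:%.4f,%s:%.4f)" % (a, la, b, lb)
        # Distances from the new node to every remaining node.
        d[new] = {new: 0.0}
        for c in nodes:
            if c not in (a, b):
                d[new][c] = d[c][new] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [new]
    # Three nodes remain: join them at a single central node.
    a, b, c = nodes
    la = 0.5 * (d[a][b] + d[a][c] - d[b][c])
    lb = 0.5 * (d[a][b] + d[b][c] - d[a][c])
    lc = 0.5 * (d[a][c] + d[b][c] - d[a][b])
    return "(%s:%.4f,%s:%.4f,%s:%.4f);" % (a, la, b, lb, c, lc)

# Toy distance matrix for five hypothetical taxa.
taxa = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(neighbor_joining(taxa, matrix))
```

When two pairs tie for the smallest Q value, this sketch takes the first pair it meets; other implementations may break ties differently and still produce an equivalent unrooted tree.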
Saving your Results
MEGA allows you to save trees in MEGA's native format or in the Newick format.
Example 10.2:
From the Tree Explorer window, select File | Save Current Session. In the Save As dialog, use the Save in drop-down menu
to select the location, and then type in a name for the session in the File Name area. The tree will be saved with the MEGA
".mts" extension.
Now, from the Tree Explorer window, select File | Export Current Tree from the main menu. In the Save As dialog, use the
Save in drop-down to select the location. In the File Name area, type a name for the tree file. The tree will be saved in Newick
format with the ".nwk" extension.
Go to the File menu and click on the Exit Tree Explorer option.
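The Newick format mentioned above is plain text: nested parentheses for clades, commas between siblings, an optional ":length" after each node, and a closing semicolon. The sketch below shows one way to write such a string in Python. The tuple layout, the function name, and the taxon labels are hypothetical and are not related to the files MEGA writes.

```python
def to_newick(node):
    """Serialise a nested (label, branch_length, children) tuple to Newick."""
    label, length, children = node
    text = ""
    if children:
        text = "(" + ",".join(to_newick(child) for child in children) + ")"
    text += label
    if length is not None:
        text += ":%g" % length
    return text

# Hypothetical four-taxon tree; labels and lengths are illustrative only.
tree = ("", None, [
    ("taxon_A", 0.1, []),
    ("taxon_B", 0.1, []),
    ("", 0.05, [("taxon_C", 0.2, []), ("taxon_D", 0.3, [])]),
])
newick = to_newick(tree) + ";"
print(newick)  # (taxon_A:0.1,taxon_B:0.1,(taxon_C:0.2,taxon_D:0.3):0.05);
```

A string like this is what a ".nwk" file contains, which is why other tree programs can usually read a tree exported in Newick format.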
Constructing Likelihood Trees
MEGA provides options for performing various calculations relating to likelihood. In this tutorial, we will focus on the one you'll
probably use most often, constructing Maximum Likelihood trees.
The "Drosophila_Adh.meg" data file, which is used in this tutorial, can be found in the MEGA/Examples folder (the default
location for Windows users is C:\Program Files\MEGA\Examples; the default location for Mac users is $HOME/MEGA/Examples,
where $HOME is the user's home directory).
Constructing your Tree
Example 11.1:
Activate the "Drosophila_Adh.meg" file. If necessary, refer to Example 1.2 of the MEGA Basics tutorial.
Select the Phylogeny | Construct/Test Maximum Likelihood Tree option from the main MEGA window launch bar.
The Analysis Preferences window will appear. For the Drosophila data file, you can choose between Nucleotide and Amino
Acid substitution types. Select Amino Acid. Now, click on the drop-down for Models/Methods. Note the models available.
Notice that the option to Select Codon Positions is disabled for Amino Acid sequences.
Change the Substitution Type to Nucleotide. The list of Models/Methods changes, showing only models which are applicable
to nucleotide sequences. Select the Tamura-Nei model. Note that the option to Select Codon Positions is now available. Click
on the button labeled Compute. A progress indicator will appear briefly.
The Tree Explorer will display the resulting Maximum Likelihood tree on the screen.
From the Tree Explorer toolbar, click on the i icon. The number of tabs shown here depends on the type of tree that was
constructed. For a Maximum Likelihood tree, the tabs are General, Tree, Branch and Character States. Take a look at each to
see the information they contain.
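As background on the Tamura-Nei model selected above, here is a small Python sketch of its substitution rate matrix in the common textbook form: one rate factor for A/G transitions, another for C/T transitions, a third for all transversions, each multiplied by the frequency of the target base. It makes no claim about how MEGA parameterises, scales, or estimates the model. The function name, parameter names, and numeric values are assumptions chosen for illustration.

```python
def tamura_nei_rate_matrix(freqs, alpha_purine, alpha_pyrimidine, beta):
    """Return the 4x4 instantaneous rate matrix Q as a dict of dicts.

    freqs:            equilibrium base frequencies keyed by "A", "C", "G", "T"
    alpha_purine:     rate factor for A <-> G transitions
    alpha_pyrimidine: rate factor for C <-> T transitions
    beta:             rate factor for all transversions
    The matrix is left unscaled (no normalisation of the mean rate).
    """
    purines = {"A", "G"}
    bases = "ACGT"
    q = {i: {} for i in bases}
    for i in bases:
        for j in bases:
            if i == j:
                continue
            if (i in purines) != (j in purines):
                factor = beta              # transversion
            elif i in purines:
                factor = alpha_purine      # A <-> G
            else:
                factor = alpha_pyrimidine  # C <-> T
            q[i][j] = factor * freqs[j]
        # The diagonal entry makes each row sum to zero.
        q[i][i] = -sum(q[i].values())
    return q

# Illustrative values only; real analyses estimate these from the data.
freqs = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
Q = tamura_nei_rate_matrix(freqs, alpha_purine=4.0, alpha_pyrimidine=6.0, beta=1.0)
for base in "ACGT":
    print(base, [round(Q[base][other], 3) for other in "ACGT"])
```

Setting both transition factors equal gives a simpler model with a single transition rate, which is one way to see why the list of available models contains several related entries.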
Saving your Tree
MEGA allows you to save trees in MEGA's native format or in the Newick format.
Example 11.2:
From the Tree Explorer window, select File | Save Current Session from the main menu. In the Save As dialog, use the Save
in drop-down to select the location, then type in a name for the session in the File Name area. The tree will be saved with the
MEGA ".mts" extension.
From the Tree Explorer window, select File | Export Current Tree from the main menu. In the Save As dialog, use the Save
in drop-down to select the location, then type in a name for the tree file in the File Name area. The tree will be saved in Newick
format with the ".nwk" extension.
From the Tree Explorer window, select File | Exit Tree Explorer from the main menu. Click the Ok button without saving.
Editing Data Files
There may be times when you want to make changes to a data file. With the MEGA Alignment Explorer, you can rearrange the taxa,
delete blocks of taxa, or delete blocks of sites. The altered data file can then be saved in MEGA, FASTA, or PAUP format.
The "Chloroplast_Martin.meg" data file, which is used in this tutorial, can be found in the MEGA/Examples folder (the default
location for Windows users is C:\Program Files\MEGA\Examples; the default location for Mac users is $HOME/MEGA/Examples,
where $HOME is the user's home directory).
Using Alignment Explorer
Example 12.1:
From the main MEGA window, select Align | Edit/Build Alignment. Select Create new alignment | DNA. Then click Data |
Retrieve sequences from a file and press the Ok button.
In the Open window, find and select the "Chloroplast_Martin.meg" file.
Rearranging Data
Example 12.2:
In the Alignment Explorer window, click the row header for the row named Pinus. Hold the left mouse button down and drag
the row up, then release the mouse button when the position indicator is just below the Porphyra row.
Deleting rows
Example 12.3:
Now, click the mouse to highlight Porphyra. Select Edit | Delete on the main menu of the Alignment Explorer. Do the same
for the row Pinus.
Deleting sites
Example 12.4:
Click on the horizontal scroll bar at the bottom of the Alignment Explorer window. Drag it all the way to the right. Now click
on any cell in the last column. Notice that the Site # display changes to show the highest-numbered site, 11039.
You can delete blocks of sites in the same way that you can delete rows of data. Click on the gray header above any column of
sites, hold down the left mouse button and drag across to any other column header to select multiple columns. On the toolbar,
click the X icon to delete the selected sites.
Save the altered data file
Example 12.5:
On the Alignment Explorer menu, click on Data, and then select Export Alignment. Choose MEGA format, FASTA
format, or PAUP format. In the Save As window, select the folder in which you want to save your data file and then type a
name in the File Name area. Click the Save button.
Close the Alignment Explorer and click Ok without saving.
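The editing operations in Examples 12.2 to 12.5 (move a row, delete rows, delete a block of sites, export) can also be pictured as simple list and string operations. The Python sketch below is only an analogy for what the Alignment Explorer does interactively. The helper names are hypothetical, the short sequences are invented, and "Taxon_B" is a placeholder label; only the names Pinus and Porphyra come from the example file.

```python
def move_row(rows, name, after):
    """Move the row called `name` so that it sits directly below `after`."""
    row = next(r for r in rows if r[0] == name)
    rest = [r for r in rows if r[0] != name]
    idx = next(i for i, r in enumerate(rest) if r[0] == after)
    return rest[:idx + 1] + [row] + rest[idx + 1:]

def delete_rows(rows, names):
    """Drop every row whose taxon name is in `names`."""
    return [r for r in rows if r[0] not in names]

def delete_sites(rows, start, end):
    """Remove columns start..end (1-based, inclusive) from every row."""
    return [(name, seq[:start - 1] + seq[end:]) for name, seq in rows]

def to_fasta(rows, width=60):
    """Format the rows as FASTA text, wrapping sequences at `width`."""
    lines = []
    for name, seq in rows:
        lines.append(">" + name)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"

# Invented mini-alignment; the real file has far more taxa and sites.
rows = [("Porphyra", "ATGCATGC"), ("Taxon_B", "ATGCATGA"), ("Pinus", "ATGTATGC")]

rows = move_row(rows, "Pinus", after="Porphyra")  # as in Example 12.2
rows = delete_rows(rows, {"Porphyra", "Pinus"})   # as in Example 12.3
rows = delete_sites(rows, 2, 4)                   # as in Example 12.4
print(to_fasta(rows))                             # as in Example 12.5
```

Because every step returns a new list, the original data stays untouched until the result is written out, which mirrors the way the tutorial closes the Alignment Explorer without saving.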
Citing MEGA in a Publication
Citation for MEGA 6:
Tamura K, Stecher G, Peterson D, Filipski A, and Kumar S (2013) MEGA6: Molecular Evolutionary Genetics Analysis Version
6.0. Molecular Biology and Evolution 30: 2725-2729.
Citation for MEGA-CC:
Kumar S, Stecher G, Peterson D, and Tamura K (2012) MEGA-CC: Computing Core of Molecular Evolutionary Genetics
Analysis Program for Automated and Iterative Data Analysis. Bioinformatics 28:2685-2686.
Citation for MEGA 5:
Tamura K, Peterson D, Peterson N, Stecher G, Nei M, and Kumar S (2011) MEGA5: Molecular Evolutionary Genetics Analysis
using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods. Molecular Biology and Evolution 28:
2731-2739.
Citation for MEGA 4:
Tamura K, Dudley J, Nei M and Kumar S (2007) MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software
version 4.0. Molecular Biology and Evolution 24: 1596-1599.
Citation for MEGA 3:
Kumar S, Tamura K, Nei M (2004) MEGA3: Integrated Software for Molecular Evolutionary Genetics Analysis and Sequence
Alignment. Briefings in Bioinformatics 5:150-163.
Citation for MEGA 2:
Kumar S, Tamura K, Jakobsen IB, Nei M (2001) MEGA2: Molecular Evolutionary Genetics Analysis Software. Bioinformatics
17:1244-1245.
Citation for MEGA 1:
Kumar S, Tamura K, Nei M. (1994) MEGA: Molecular Evolutionary Genetics Analysis Software for Microcomputers.
Computer Applications in Biosciences 10:189-191.
Copyright 1993-2014.