Presentation about phylogenetic tree and its construction methods.

garimarathee9 130 views 25 slides Oct 19, 2024

About This Presentation

Includes information about phylogenetic trees and their construction methods in an easy and simplified manner.


Slide Content

Phylogenetic Tree: Types, Construction Methods
Submitted by: Garima, M.Sc. Biotechnology (04)

Introduction: A phylogenetic tree is a diagram that depicts the lines of evolutionary descent of different species, organisms, or genes from a common ancestor.

Basic terminology:
1. Branches: The branches of a phylogenetic tree represent evolutionary lineages, or lines of descent. They connect the nodes (divergence points).
2. Nodes: A node represents a common ancestor of the species or groups that diverged from it. Nodes that represent non-observable (inferred) common ancestors are called internal nodes.
3. Tips or Leaves: The tips or leaves of a phylogenetic tree represent the species or groups being compared, typically extant (living) ones. Taxa located at the ends of branches are called terminal taxa.
4. Root: The root of the phylogenetic tree represents the most recent common ancestor of all the included species or groups. It is typically depicted at the base of the tree.

5. Branch Length: In a phylogenetic tree, the branch length represents the amount of evolutionary change that has occurred along a particular branch. It can be quantified in units of time (e.g., millions of years) or genetic change (e.g., DNA substitutions).
6. Phylogenetic Distance: The phylogenetic distance between two species or groups quantifies how closely they are related. It is typically estimated from genetic or morphological differences.
7. Taxa: Taxa are the groups of organisms or species that make up the phylogenetic tree. They range from individual species to higher taxonomic levels such as genera, families, orders, and even larger groups.
8. Clades: Clades are monophyletic groups in a phylogenetic tree, each composed of an ancestor and all of its descendants. Members of a clade share unique characteristics derived from their common ancestor.

Types:
On the basis of presence or absence of a common root:
Unrooted tree: there is no common root, so no common ancestor is specified.
Rooted tree: a common root represents the common ancestor.

On the basis of topology:
Cladogram: displays only the branching pattern of evolutionary relationships among organisms. Cladograms are unscaled, which means that the branch lengths do not reflect the amount of evolutionary divergence between taxa or operational taxonomic units (OTUs).
Phylogram: represents the evolutionary relationships among organisms by showing both the branching pattern and the amount of evolutionary divergence. Phylograms are scaled, which means that the branch lengths are proportional to the amount of evolutionary divergence.
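The difference between the two tree types can be sketched with text-encoded trees. The slides do not name a file format, so the Newick notation used here is an assumption, and `to_cladogram` is a hypothetical helper: it drops the `:length` annotations from a scaled tree, leaving only the branching pattern.

```python
import re

def to_cladogram(newick):
    """Strip ':<branch length>' annotations from a Newick string.

    Assumes taxon names contain no colons and that branch lengths
    are plain or scientific-notation numbers.
    """
    return re.sub(r":[0-9.eE+\-]+", "", newick)

# A scaled tree (phylogram): each branch carries a length.
phylogram = "((A:0.1,B:0.2):0.05,C:0.3);"

# The same topology with lengths removed (cladogram).
cladogram = to_cladogram(phylogram)
print(cladogram)
```

Both strings describe the same relationships; only the phylogram says how much divergence lies along each branch.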

Construction:
Choice of molecular markers and taxon sampling
Amplification / Sequencing
Alignment
Choice of evolutionary model
Phylogenetic analysis
Tree
User-defined trees and topology testing
Results

Selecting sequences for phylogenetic analysis:
The rate of mutation is assumed to be the same in coding and non-coding regions. However, the substitution rate differs: non-coding DNA regions accumulate more substitutions than coding regions. Proteins are much more conserved, since they ‘need’ to conserve their function. It is therefore generally better to use slowly changing sequences (proteins) than DNA. However, if the genes are very small or mutate slowly, their DNA sequences can be used for building the trees.

Basic Steps:
1. Choice of Molecular Markers
2. Alignment
3. Determining the Substitution Model
4. Tree-building
5. Tree-evaluation

1. Choice of Molecular Markers
A molecular marker is a molecule contained within a sample taken from an organism or other matter. For an ideal marker:
A single-copy gene may be more useful than a multiple-copy gene; this condition can be satisfied by mitochondrial and nuclear genes.
Its alignment should be easy.
The substitution rate should be optimal, so as to provide informative sites.
Too much base variation among taxa is not desirable, as it may not reflect true ancestry.
Ribosomal RNA is considered the best target for phylogenetic analysis, as it is universal and is composed of both highly conserved and variable regions.

2. Alignment
After DNA is obtained from the organism, the chosen markers are amplified by PCR, using the isolated DNA as template and marker-specific oligonucleotides as primers. The amplified PCR products are then sequenced.
1. Only a successful sequence alignment produces a tree that reflects genealogical relationships.
2. A typical alignment procedure in phylogenetic studies involves applying CLUSTAL W, followed by manual editing of the alignment and submission to a tree-building program.

CLUSTAL W: a general-purpose multiple alignment program for DNA or proteins. It greatly improves the sensitivity of the commonly used progressive multiple sequence alignment method, especially for the alignment of divergent protein sequences. If a tree is used to generate the alignment for phylogenetic analysis, the tree inferred from that alignment should logically have the same topology.

3. Determining the Substitution Model
To correct for homoplasy (multiple substitutions at the same site), statistical methods known as substitution models or evolutionary models are needed to infer the true evolutionary distances between sequences. Two important substitution models:
1. Jukes-Cantor model: assumes that all nucleotides, purines as well as pyrimidines, are substituted with equal probability. This model can only analyse reasonably closely related sequences.
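As a rough illustration of the Jukes-Cantor correction, the sketch below computes the observed proportion of differing sites p and converts it to a corrected distance with the standard formula d = -(3/4) ln(1 - (4/3) p). The formula is not given in the slides, and the function names and the handling of gaps are assumptions made for this example.

```python
import math

def p_distance(seq_a, seq_b):
    """Proportion of aligned sites that differ, skipping gap positions."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion p."""
    if p >= 0.75:
        # The correction is undefined once sequences look random.
        raise ValueError("p must be below 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

p = p_distance("ACGTACGTAC", "ACGTACGTTC")
d = jukes_cantor_distance(p)
print(p, round(d, 4))
```

The corrected distance is always at least as large as p, and the gap widens as sequences diverge, which is one way to see why the model is only trusted for reasonably close sequences.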

2. Kimura model: The Kimura two-parameter model assumes that transitions occur more often than transversions. Because it takes into account the different rates of transitions and transversions, it is more realistic.
For protein sequences, the evolutionary distances from an alignment can be corrected using a PAM or other amino acid substitution matrix.
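The Kimura two-parameter idea can be sketched as follows. With P the proportion of sites showing a transition (A<->G or C<->T) and Q the proportion showing a transversion, the standard distance is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q). The formula is not stated in the slides, and the function name and the restriction to uppercase A, C, G, T are assumptions of this example.

```python
import math

PURINES = {"A", "G"}

def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Assumes uppercase A/C/G/T; positions with a gap ('-') are skipped.
    """
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    # Transition: both purines or both pyrimidines, but different bases.
    transitions = sum(
        1 for a, b in pairs
        if a != b and (a in PURINES) == (b in PURINES)
    )
    # Transversion: one purine and one pyrimidine.
    transversions = sum(
        1 for a, b in pairs
        if (a in PURINES) != (b in PURINES)
    )
    P, Q = transitions / n, transversions / n
    if 1 - 2 * P - Q <= 0 or 1 - 2 * Q <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -0.5 * math.log(1 - 2 * P - Q) - 0.25 * math.log(1 - 2 * Q)

# One transition (A->G) and one transversion (A->C) in ten sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAC"), 4))
```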

4. Tree-Building Methods
Character-Based Methods: Maximum Parsimony, Maximum Likelihood
Distance-Based Methods: Neighbor Joining (NJ), UPGMA

Character-Based Methods: Maximum Parsimony Method
A popular technique used in cladistics to infer a phylogenetic tree for a set of taxa on the basis of observed similarities and differences among the taxa.
Principle: searches for the tree that requires the smallest number of evolutionary changes to explain the differences among OTUs.
Invariant sites are not used in parsimony (they yield no information on character state changes).
Informative sites (at least 2 different kinds of residues, each present at least 2 times) are used in parsimony because they discriminate between topologies: different topologies require different numbers of changes between residues.
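The two ideas on this slide can be sketched in a few lines. `is_informative` applies the stated rule for informative sites. `fitch_score` counts the minimum number of changes one site needs on a given rooted binary tree, using Fitch's small-parsimony procedure, which is a standard technique the slides do not name. The nested-tuple tree encoding and both function names are assumptions of this example.

```python
from collections import Counter

def is_informative(column):
    """True if at least 2 residue kinds each occur at least 2 times."""
    counts = Counter(column)
    return sum(1 for c in counts.values() if c >= 2) >= 2

def fitch_score(tree, states):
    """Return (possible ancestral states, number of changes) for one site.

    tree   -- a leaf name (str) or a 2-tuple of subtrees
    states -- dict mapping each leaf name to its residue at this site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], states)
    right_set, right_cost = fitch_score(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # Children disagree: one change is required on this part of the tree.
    return left_set | right_set, left_cost + right_cost + 1

site = {"A": "A", "B": "A", "C": "G", "D": "G"}
print(is_informative("AAGG"))
print(fitch_score((("A", "B"), ("C", "D")), site)[1])
print(fitch_score((("A", "C"), ("B", "D")), site)[1])
```

The same informative site costs a different number of changes on the two topologies, which is what lets parsimony choose between them.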

Character-Based Methods: Maximum Likelihood Method
Maximum likelihood methods create all the possible trees containing the set of organisms considered, and then use statistics to evaluate which tree is most likely. This is possible only for a small number of organisms. The analysis is performed on each position of the multiple alignment. Using a model of nucleotide substitution, the method tries to find the most likely tree. Maximum likelihood methods are very slow and computationally expensive.
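To see why evaluating every possible tree is only feasible for a few organisms, the sketch below counts the distinct unrooted, fully bifurcating trees for n taxa using the standard result (2n - 5)!! = 3 x 5 x ... x (2n - 5). The formula is not in the slides, and the function name is an assumption.

```python
def count_unrooted_trees(n_taxa):
    """Number of distinct unrooted bifurcating trees for n_taxa >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count

for n in (4, 5, 10, 20):
    print(n, count_unrooted_trees(n))
```

The count grows so quickly that exhaustive evaluation stops being practical well before 20 taxa.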

Distance-Based Methods: Neighbor Joining Method
This algorithm is commonly applied in distance-based tree building, regardless of the optimisation criterion. The fully resolved tree is ‘decomposed’ from a fully unresolved star tree by successively inserting branches between a pair of closest (actually, most isolated) neighbors and the remaining terminals in the tree. This neighbor pair is then consolidated, effectively re-forming a star tree, and the process is repeated. The method is rapid, requiring only a few seconds or less for a 50-sequence tree.
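One step of neighbor joining can be sketched as below. It uses the standard criterion Q(i, j) = (n - 2) d(i, j) - r(i) - r(j), where r(i) is the sum of distances from taxon i to all others, and joins the pair with the smallest Q. The criterion is not spelled out in the slides; the function name and the small distance matrix are illustrative assumptions, not data from the presentation.

```python
def nj_pick_pair(names, dist):
    """Choose the pair to join in one neighbor-joining step.

    names -- list of taxon names (at least 3)
    dist  -- symmetric matrix (list of lists) of pairwise distances
    Returns (name_i, name_j, Q value, branch length i, branch length j).
    """
    n = len(names)
    totals = [sum(row) for row in dist]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    q, i, j = best
    # Branch lengths from the joined taxa to their new internal node.
    len_i = 0.5 * dist[i][j] + (totals[i] - totals[j]) / (2 * (n - 2))
    len_j = dist[i][j] - len_i
    return names[i], names[j], q, len_i, len_j

names = ["a", "b", "c", "d", "e"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_pick_pair(names, dist))
```

In this matrix the smallest raw distance is between d and e, yet the Q criterion selects a and b, which illustrates the slide's remark that the joined pair is the most isolated rather than simply the closest.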

Distance-Based Methods: UPGMA (Unweighted Pair Group Method with Arithmetic Mean)
It is a clustering algorithm: it joins tree branches on the criterion of greatest similarity among pairs and averages of joined pairs. It is not strictly an evolutionary method. UPGMA is expected to generate an accurate topology with true branch lengths only when divergence follows a molecular clock, or is approximately equal to raw sequence dissimilarity.
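A minimal sketch of the clustering loop, assuming a symmetric distance matrix as input: the closest pair of clusters is merged repeatedly, and the distance from the merged cluster to every other cluster is the size-weighted average of the two old distances. The nested-tuple output and the function name are assumptions of this example.

```python
def upgma(names, dist):
    """Cluster taxa by UPGMA; return (nested-tuple tree, root height)."""
    n = len(names)
    # cluster id -> (subtree, number of leaves in it)
    clusters = {i: (names[i], 1) for i in range(n)}
    # distances between live clusters, keyed by (smaller id, larger id)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    height = 0.0
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        (a, b), d_ab = min(d.items(), key=lambda item: item[1])
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        del d[(a, b)]
        # Average the distances to every remaining cluster.
        for c in clusters:
            d_ac = d.pop((min(a, c), max(a, c)))
            d_bc = d.pop((min(b, c), max(b, c)))
            d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        # Under a molecular clock the new node sits at half the distance.
        height = d_ab / 2.0
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, height

names = ["A", "B", "C", "D"]
dist = [
    [0, 2, 8, 8],
    [2, 0, 8, 8],
    [8, 8, 0, 4],
    [8, 8, 4, 0],
]
print(upgma(names, dist))
```

The example matrix is clock-like (ultrametric), which is the situation in which the slide says UPGMA can be expected to recover the topology and branch lengths.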

5. Tree Evaluation Methods
Bootstrapping
A method for testing how well a dataset fits an evolutionary model. It can check the branch arrangement (topology) of a phylogenetic tree. In bootstrapping, the program resamples columns, with replacement, from the multiple sequence alignment and creates many new alignments that stand in for the original dataset. These new sets represent the population. The process is repeated at least 100 times, and phylogenetic trees are generated from all the sets. Part of the results shows the number of times a particular branch point occurred out of all the trees that were built.
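The resampling step can be sketched as follows, assuming the alignment is held as a dict of equal-length strings. `bootstrap_alignment` draws columns with replacement, and `clade_support` reports the fraction of replicate trees that contain a given group. Both names are hypothetical, and the tree-building step between them is left out.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Build one replicate by sampling columns with replacement.

    alignment -- dict mapping sequence name to aligned sequence
    rng       -- a random.Random instance (seeded for repeatability)
    """
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }

def clade_support(clade, replicate_clades):
    """Fraction of replicate trees (as sets of clades) containing clade."""
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return hits / len(replicate_clades)

alignment = {"A": "ACGTAC", "B": "ACGTTC", "C": "AGGTTC"}
rng = random.Random(0)
replicates = [bootstrap_alignment(alignment, rng) for _ in range(100)]
print(replicates[0])
```

Each replicate has the same length as the original alignment, but some columns appear more than once and others not at all.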

Jackknife Method
It is also a resampling technique. It resamples the original dataset by dropping one or more alignment positions in each replicate. As a consequence, each jackknife replicate is smaller than the original dataset and cannot contain duplicated data points. In practice, the jackknife is used much less frequently than the bootstrap.
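A jackknife replicate can be sketched in the same style, again assuming the alignment is a dict of equal-length strings. The function name and the default of dropping half of the columns are assumptions; the slides only say that one or more positions are dropped.

```python
import random

def jackknife_alignment(alignment, rng, drop_fraction=0.5):
    """Build one replicate by deleting a fraction of the columns.

    Columns are sampled without replacement, so no column is duplicated
    and the replicate is shorter than the original alignment.
    """
    n_sites = len(next(iter(alignment.values())))
    n_keep = n_sites - int(n_sites * drop_fraction)
    kept = sorted(rng.sample(range(n_sites), n_keep))
    return {
        name: "".join(seq[c] for c in kept)
        for name, seq in alignment.items()
    }

alignment = {"A": "ACGTACGTAC", "B": "ACGTTCGTAC"}
replicate = jackknife_alignment(alignment, random.Random(0))
print(replicate)
```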