Distance based method

41,417 views 31 slides Jul 31, 2016

About This Presentation

Construction of a phylogenetic tree using the distance-based method


Slide Content

Phylogenetic tree construction by distance based method

INTRODUCTION
A phylogenetic tree, also known as a phylogeny, is a diagram that depicts the lines of evolutionary descent of different species, organisms, or genes from a common ancestor. It is an attempt to reconstruct evolutionary ancestors and to estimate the time of divergence from an ancestor.

Phylogenetic trees can be used to solve a number of interesting problems: forensics (HIV mutates rapidly); predicting the evolution of influenza viruses; predicting the functions of uncharacterized genes (ortholog detection); drug discovery; and vaccine development (targeting an inferred common ancestor).

HOW TO CONSTRUCT A PHYLOGENETIC TREE
Step 1: Make a multiple alignment of the nucleotide or amino acid sequences (for example with MUSCLE or another multiple-alignment tool; BLAST is typically used to collect homologous sequences rather than to build the multiple alignment itself).
Step 2: Check whether the multiple alignment reflects the evolutionary process.
Step 3: Choose the method to use, then either calculate the distances or work from the alignment directly, depending on the method.
Step 4: Verify the result statistically.
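Step 4 does not say which statistical check is meant. One common choice is the bootstrap, in which alignment columns are resampled with replacement and the tree is rebuilt from each replicate. The sketch below shows only the resampling step and assumes that interpretation; the function name and the toy alignment are hypothetical.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate).

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    rng: a random.Random instance, passed in so results are reproducible.
    """
    length = len(next(iter(alignment.values())))
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()}

# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "seqA": "ACGTACGTAC",
    "seqB": "ACGTACGTAT",
    "seqC": "ACGAACGTTT",
}
replicate = bootstrap_alignment(toy_alignment, random.Random(0))
```

In a full analysis, a tree would be built from each of many replicates, and the fraction of replicate trees containing a given grouping would be reported as its support.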

TYPES OF APPROACHES
CHARACTER BASED APPROACH
The character-based approach makes use of all known evolutionary information, i.e. the individual substitutions among the sequences, to determine the most likely ancestral sequences.

DISTANCE BASED APPROACH
Distance-matrix methods of phylogenetic analysis rely explicitly on a measure of "genetic distance" between the sequences being classified, and therefore they require an MSA (multiple sequence alignment) as input.

Distance-based methods must first transform the sequence data into a pairwise distance (dissimilarity) matrix, which is then used during tree inference.
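As a rough illustration of that transformation, the sketch below computes the proportion of differing sites (the p-distance) for each pair of aligned sequences and applies the Jukes-Cantor correction. The slides do not name a distance model, so the choice of Jukes-Cantor, the function names, and the toy alignment are all assumptions.

```python
import math
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def jukes_cantor(p):
    """Jukes-Cantor corrected nucleotide distance; defined only for p < 0.75."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def distance_matrix(alignment):
    """Symmetric dict-of-dicts matrix of corrected pairwise distances."""
    names = list(alignment)
    dist = {name: {name: 0.0} for name in names}
    for a, b in combinations(names, 2):
        d = jukes_cantor(p_distance(alignment[a], alignment[b]))
        dist[a][b] = dist[b][a] = d
    return dist

# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "seqA": "ACGTACGTAC",
    "seqB": "ACGTACGTAT",
    "seqC": "ACGAACGTTT",
}
matrix = distance_matrix(toy_alignment)
```

The correction makes the estimated distance grow faster than the raw proportion of differences, to account for multiple substitutions at the same site.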

VARIOUS DISTANCE BASED METHODS
UPGMA
NJ (Neighbor Joining)
FM (Fitch-Margoliash)
Minimum evolution

UPGMA
UPGMA stands for Unweighted Pair Group Method with Arithmetic mean. It was originally developed for numerical taxonomy in 1958 by Sokal and Michener. This method uses a sequential clustering algorithm.

This method follows a clustering procedure:
Assume that initially each species is a cluster on its own.
Join the two closest clusters and recalculate the distances from the joined pair to every other cluster by taking the average.
Repeat this process until all species are connected in a single cluster.
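The clustering procedure above can be sketched as follows. This is a minimal illustration, not an optimized implementation: the helper `symmetric`, the string labels for merged clusters, and the four-taxon example distances are hypothetical.

```python
def symmetric(pairs):
    """Build a dict-of-dicts distance matrix from {(a, b): distance}."""
    d = {}
    for (a, b), value in pairs.items():
        d.setdefault(a, {a: 0.0})[b] = value
        d.setdefault(b, {b: 0.0})[a] = value
    return d

def upgma(dist):
    """Return (root label, list of (cluster_a, cluster_b, merge height))."""
    d = {a: dict(row) for a, row in dist.items()}   # working copy
    sizes = {name: 1 for name in dist}              # each species starts alone
    merges = []
    while len(sizes) > 1:
        # Find the closest pair of current clusters.
        a, b = min(((x, y) for x in sizes for y in sizes if x < y),
                   key=lambda pair: d[pair[0]][pair[1]])
        new = f"({a},{b})"
        na, nb = sizes[a], sizes[b]
        d[new] = {new: 0.0}
        for c in sizes:
            if c in (a, b):
                continue
            # Average over all members: weight each cluster by its size.
            avg = (na * d[a][c] + nb * d[b][c]) / (na + nb)
            d[new][c] = d[c][new] = avg
        merges.append((a, b, d[a][b] / 2.0))        # node height is half the distance
        del sizes[a], sizes[b]
        sizes[new] = na + nb
    (root,) = sizes
    return root, merges

# Hypothetical distances that happen to fit a clock-like tree.
example = symmetric({("A", "B"): 2, ("A", "C"): 4, ("A", "D"): 6,
                     ("B", "C"): 4, ("B", "D"): 6, ("C", "D"): 6})
root, merges = upgma(example)
print(root)  # (((A,B),C),D)
```

Each merge height is half the distance between the joined clusters, which is what places all tips at the same distance from the root.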

CONSTRUCTION OF PHYLOGENETIC TREE

DRAWBACK
Strictly speaking, this algorithm is phenetic: it does not aim to reflect evolutionary descent. It assigns equal weight to the distances and assumes a constant rate of evolution across lineages (a molecular clock). WPGMA (Weighted Pair Group Method with Arithmetic mean) is a similar algorithm but assigns different weights to the distances.
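The difference between the two averaging rules shows up when the joined clusters have different sizes. In the sketch below, UPGMA weights each cluster's distance by its number of members, whereas WPGMA takes a plain average of the two cluster distances. The function names and the numbers are hypothetical.

```python
def upgma_update(d_ac, d_bc, size_a, size_b):
    """Distance from merged cluster (a, b) to c: every original member counts equally."""
    return (size_a * d_ac + size_b * d_bc) / (size_a + size_b)

def wpgma_update(d_ac, d_bc):
    """Distance from merged cluster (a, b) to c: the two clusters count equally."""
    return (d_ac + d_bc) / 2.0

# Cluster a has 3 members, cluster b has 1 (hypothetical values).
print(upgma_update(4.0, 8.0, size_a=3, size_b=1))  # 5.0
print(wpgma_update(4.0, 8.0))                      # 6.0
```

Because WPGMA ignores cluster sizes, members of a small cluster end up with more influence per member than members of a large one.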

NEIGHBOUR JOINING METHOD
Neighbor-joining methods apply general data clustering techniques to sequence analysis, using genetic distance as the clustering metric. The method was developed in 1987 by Saitou and Nei. The simple neighbor-joining method produces unrooted trees, and it does not assume a constant rate of evolution (i.e., a molecular clock) across lineages.

It begins with an unresolved star-like tree. Each pair of taxa is evaluated for being joined, and the sum of all branch lengths of the resulting tree is calculated. The pair that yields the smallest sum is considered the closest neighbors and is joined. A new branch is inserted between the joined pair and the rest of the tree, and the branch lengths are recalculated. This process is repeated, treating each joined pair as a single terminal, until the tree is fully resolved.
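A compact sketch of this procedure follows. Instead of summing the branch lengths of every candidate tree, it uses the Q criterion, a commonly used formulation that selects the same pair. The helper `symmetric`, the node labels, and the five-taxon example distances are hypothetical.

```python
def symmetric(pairs):
    """Build a dict-of-dicts distance matrix from {(a, b): distance}."""
    d = {}
    for (a, b), value in pairs.items():
        d.setdefault(a, {a: 0.0})[b] = value
        d.setdefault(b, {b: 0.0})[a] = value
    return d

def neighbor_joining(dist):
    """Return the unrooted tree as a list of (node, neighbor, branch_length)."""
    d = {a: dict(row) for a, row in dist.items()}   # working copy
    nodes = list(d)
    edges = []
    while len(nodes) > 2:
        n = len(nodes)
        # r[i]: sum of distances from i to every other current node.
        r = {i: sum(d[i][k] for k in nodes if k != i) for i in nodes}
        # Join the pair with the smallest Q(i, j) = (n - 2) d(i, j) - r(i) - r(j).
        i, j = min(((a, b) for idx, a in enumerate(nodes) for b in nodes[idx + 1:]),
                   key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        u = f"({i},{j})"                            # new internal node
        len_i = 0.5 * d[i][j] + (r[i] - r[j]) / (2 * (n - 2))
        len_j = d[i][j] - len_i
        edges += [(i, u, len_i), (j, u, len_j)]
        d[u] = {u: 0.0}
        for k in nodes:
            if k not in (i, j):
                d[u][k] = d[k][u] = 0.5 * (d[i][k] + d[j][k] - d[i][j])
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))                   # connect the last two nodes
    return edges

# Hypothetical five-taxon distance matrix.
example = symmetric({
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
})
edges = neighbor_joining(example)
```

Note that the two branches leading to a joined pair can have different lengths, which is how the method accommodates unequal rates across lineages.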

DRAWBACKS
NJ produces only one tree and neglects other possible trees, which might be as good as the NJ tree, if not significantly better. Moreover, since errors in distance estimates are exponentially larger for longer distances, under some conditions this method will yield a biased tree.

WEIGHTED NEIGHBOUR JOINING (WEIGHBOR)
Weighbor is a more recent method, proposed in 2000 by Bruno, Socci, and Halpern. The Weighbor criterion for joining a pair consists of two terms that quantify the implications of joining that pair: 1. an additivity term (of external branches) and 2. a positivity term (of internal branches).

Weighbor gives less weight to the longer distances in the distance matrix, so the resulting trees are less sensitive to specific biases than NJ trees and are relatively immune to the "long-branch attraction/distraction" drawbacks observed with other methods.

FITCH-MARGOLIASH METHOD
Proposed in 1967. Produces unrooted trees. Provides a criterion for fitting trees to distance matrices. Uses a weighted least-squares method for clustering based on genetic distance. Closely related sequences are given more weight in the tree construction process to correct for the increased inaccuracy in measuring distances between distantly related sequences.
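The weighted least-squares idea can be sketched as a scoring function: each squared difference between an observed distance and the distance implied by a candidate tree is divided by a power of the observed distance, so errors on short distances cost more. The weight 1/d^2 used below is the form usually associated with this method, but the slides do not state it, so treat it as an assumption. All names and numbers are hypothetical.

```python
def fitch_margoliash_score(observed, tree_distances, power=2):
    """Weighted sum of squared errors; a lower score means a better fit.

    observed, tree_distances: dicts mapping (taxon_a, taxon_b) -> distance.
    power: exponent of the observed distance in the weight 1 / d**power.
    """
    total = 0.0
    for pair, d_obs in observed.items():
        d_tree = tree_distances[pair]
        total += (d_obs - d_tree) ** 2 / d_obs ** power
    return total

observed = {("A", "B"): 2.0, ("A", "C"): 4.0, ("B", "C"): 5.0}

# Two hypothetical candidate trees, each off by 1.0 on a single pair.
error_on_long = {("A", "B"): 2.0, ("A", "C"): 4.0, ("B", "C"): 4.0}
error_on_short = {("A", "B"): 3.0, ("A", "C"): 4.0, ("B", "C"): 5.0}

score_long = fitch_margoliash_score(observed, error_on_long)
score_short = fitch_margoliash_score(observed, error_on_short)
```

Both candidates miss one distance by the same absolute amount, but the candidate that misses the short A-B distance scores worse, which is the weighting described above.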

MINIMUM EVOLUTION
First described by Kidd & Sgaramella-Zonta in 1971, then later by Rzhetsky & Nei in 1992. Based on the assumption that the tree with the smallest sum of branch length estimates is most likely to be the true one. Produces unrooted metric trees.

In ME, the tree that minimizes the total tree length, that is, the sum of the lengths of its branches, is regarded as the estimate of the phylogeny:

$$S = \sum_{i=1}^{2n-3} v_i$$

where n is the number of taxa in the tree and v_i is the length of the ith branch.

DRAWBACKS
In principle, all different tree topologies have to be investigated to find the minimum tree. However, this is impossible in practice because of the explosive increase in the number of tree topologies. ME is slower than clustering methods. Information is lost when characters are transformed into distances.
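To give a sense of that growth, the number of distinct unrooted, fully bifurcating topologies for n taxa is the product of the odd numbers 3 × 5 × ... × (2n − 5). The short sketch below evaluates it; the function name is hypothetical.

```python
def unrooted_topologies(n):
    """Number of unrooted bifurcating tree topologies for n >= 3 taxa."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n - 4, 2):
        count *= k
    return count

for n in (4, 5, 10, 20):
    print(n, unrooted_topologies(n))
```

With 10 taxa there are already 2,027,025 topologies, which is why practical ME searches rely on heuristics rather than full enumeration.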

Advantages of distance based approach
Less sensitive to variations in evolutionary rate than cluster analysis. Fast. Can handle many sequences at a time. Produces a reasonable estimate of phylogeny.

Disadvantages of distance based approach
More sensitive than Parsimony or Maximum Likelihood to systematic errors. The relationship between the individual characters and the tree is lost in the process of reducing characters to distances. The strength of the technique depends on the accuracy of the distance estimate, and thus on the model used to obtain the distance matrix.

THANK YOU