ADANSONIAN CLASSIFICATION Presented by: Dr. Devyashree Medhi Second Year PGT, Dept. of Microbiology, GMCH
CONTENTS: INTRODUCTION TO BACTERIAL TAXONOMY; CLASSIFICATION; PRINCIPLES USED IN CLASSIFICATION; ADANSONIAN APPROACH TO CLASSIFICATION; THE NEO-ADANSONIAN PRINCIPLES; NUMERICAL TAXONOMY; MERITS AND DEMERITS OF NUMERICAL TAXONOMY; COMPUTATIONAL APPROACHES TO SYSTEMATIC BIOLOGY.
INTRODUCTION TO MICROBIAL TAXONOMY: Taxonomy is an area of biological science that provides a consistent means to classify, name and identify organisms. This consistency allows biologists worldwide to use a common label for every organism they study within their particular disciplines. Taxonomy comprises three distinct but highly inter-related disciplines: CLASSIFICATION, NOMENCLATURE and IDENTIFICATION.
CLASSIFICATION: Classification is the organization of microorganisms that share similar morphologic, physiologic, and genetic traits into specific groups (taxa). This system is hierarchic and consists of the following taxon designations in descending order:
PRINCIPLES USED IN CLASSIFICATION : There is no universally accepted principle to classify bacteria. There are mainly three approaches, that is: 1. Phylogenetic. 2. Molecular. 3. Adansonian.
Phylogenetic classification: It implies an evolutionary arrangement of species based on various properties of an organism, such as morphology, staining characteristics, cultural characteristics, biochemical reactions and antigenic structure. Though convenient and user friendly, this method is not perfect, because the weighted characters used may not be valid all the time for a given bacterium.
Molecular classification: It is based on the degree of genetic relatedness of different organisms. Guanine + Cytosine content of bacteria is estimated. The nucleotide base composition and the base ratio vary widely among different groups of organisms, but for any one particular species, it is constant.
Adansonian Approach To Classification: Michel Adanson, a French botanist, put forward a plan for assigning numerical values to the similarity between organisms and proposed that equal weight should be given to all characters while classifying plants. He used as many characters as possible for the classification, and such classification came to be known as the Adansonian classification. It seems certain today that Adanson played a key role in the history of natural sciences and, with his empirical approach, achieved a perfect synthesis of “Linnaeism” and “Buffonism”.
The contribution of his book Familles des plantes (1763-1764), however, was appreciated only after a very long delay. It was precisely on the bicentenary of the publication of the first volume of Familles des plantes that the pioneering treatise on phenetics, Principles of Numerical Taxonomy, by Peter H.A. Sneath and Robert R. Sokal appeared. An expanded edition was published in 1973. Sneath and Sokal rediscovered the natural classification proposed by Adanson, adopted his idea of using combinatorial calculations in classification, and realized that this classification was particularly well suited to the application of electronic calculators in taxonomy.
NEO-ADANSONIAN PRINCIPLES: The fundamental principles of the Neo-Adansonian school are the following: The evaluation of similarities among taxa is based on the observed values of characters and not on phylogenetic speculations and interpretations. Hence taxonomic similarities are phenetic rather than phyletic. The similarities are evaluated on the basis of many characters, and all characters are considered to be of equal taxonomic value; hence no character is weighted more or less than any other character.
NUMERICAL TAXONOMY: It is the analysis of various types of taxonomic data by mathematical or computerized methods and the numerical evaluation of the similarities or affinities between taxonomic units, which are then arranged into taxa on the basis of their affinities. The techniques of numerical taxonomy are as follows: 1) Collecting the organisms, or groups of organisms, to be compared, which are known as Operational Taxonomic Units (OTUs). 2) Observing these OTUs for the presence, absence or quantity of a large set of characters. 3) Drawing up a table of OTUs versus characters.
Table of OTU versus Character: A character is usually defined as an attribute about which a single statement can be made, e.g., ‘present’ or ‘absent’ or some quantitative measurement. Before drawing the OTU × character table careful thought must be given to what comprises a single character. Some apparent characters may not be permissible because they are redundant. For instance, if an OTU ferments both glucose and sucrose with the production of acid and gas, this may generate three distinct characters, viz., acid from glucose; gas from glucose; sucrose fermented. Here it is redundant to score ‘gas from sucrose’ as a separate character if we know that the fermentation of sucrose involves an initial hydrolysis to glucose, which subsequently is fermented to acid and gas.
Furthermore, care must be taken to avoid making illogical comparisons. Suppose one OTU ferments glucose with acid and gas and a second OTU does not ferment glucose at all. For the second OTU it is illogical to score a result for ‘gas production’, because gas formation depends on the production of formic acid, which has already been noted as absent. We therefore score No Comparison (NC) for gas production when OTU number 2 is compared with any other OTU. Key: a, acid production; b, gas formation.
Methods of comparing similarities: After the OTU versus character table has been compiled, all possible pairs of OTUs are compared and their similarity computed. There are three basic methods by which measures of similarity may be computed: 1) Correlation coefficients. 2) Measures of taxonomic distance. 3) Similarity coefficients.
Correlation coefficients: The Pearson correlation coefficient, also referred to as Pearson's r, the Pearson product-moment correlation coefficient (PPMCC) or the bivariate correlation, is a measure of the linear correlation between two variables. It has a value between +1 and −1, where +1 is total positive linear correlation, 0 is no linear correlation, and −1 is total negative linear correlation.
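As an illustration only, Pearson's r between two OTUs can be sketched as below. The function name `pearson_r` and the three score lists are hypothetical and do not come from the presentation; each list stands for quantitative character scores recorded for one OTU.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical quantitative scores for three OTUs over four characters.
otu_1 = [1.0, 2.0, 3.0, 4.0]
otu_2 = [2.0, 4.0, 6.0, 8.0]   # rises with otu_1
otu_3 = [4.0, 3.0, 2.0, 1.0]   # falls as otu_1 rises

print(pearson_r(otu_1, otu_2))
print(pearson_r(otu_1, otu_3))
```

The first pair is perfectly positively correlated and the second perfectly negatively correlated, matching the two ends of the range described above.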
Measures of Taxonomic distance: Measures of taxonomic distance attempt to plot the relative position of the OTUs in multi-dimensional space i.e., one dimension for each character studied. If two OTUs are identical their mean taxonomic distance would be 0, whereas if they are absolutely dissimilar their mean taxonomic distance would be +1.
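A minimal sketch of one common form of mean taxonomic distance follows: the root of the mean squared difference over all characters. It assumes that every character has been scaled to the range 0 to 1 (binary scores satisfy this), which is what makes 1 the largest possible value. The function name and the example scores are hypothetical.

```python
import math

def mean_taxonomic_distance(x, y):
    """Root mean squared character difference between two OTUs.

    Assumes each character score lies between 0 and 1.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / n)

# Hypothetical binary scores for four characters.
identical = mean_taxonomic_distance([1, 0, 1, 0], [1, 0, 1, 0])
opposite = mean_taxonomic_distance([1, 0, 1, 0], [0, 1, 0, 1])
one_diff = mean_taxonomic_distance([1, 1, 0, 0], [1, 0, 0, 0])
print(identical, opposite, one_diff)
```

Identical OTUs give 0 and OTUs that differ in every character give 1; a single difference in four characters gives an intermediate value.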
Similarity coefficients: The similarity coefficients have found most application in studies of microbial classification mainly owing to the ease with which they can be computed and the results handled in subsequent stages of the classification. The character data must be broken down into a set of single characters and coded in binary form, i.e., 1/+ for the possession of a character; 0/- for the absence of a character; NC for ‘No Comparison’
The S value or similarity coefficient is defined as S = (a + d) ÷ (a + b + c + d), where a: sum of positive characters common to strains A and B; d: sum of negative characters common to strains A and B; b: sum of characters in which A is positive and B is negative; c: sum of characters in which A is negative and B is positive. The calculation yields values between 1 and 0; S = 1 means 100% similarity; S < 0.02 means complete unrelatedness.
There are two chief methods of breaking down quantitative data, viz., the additive and the non-additive methods. The additive method tends to over-emphasize differences that could be due to differences in growth rate, etc., and so biases the S value in the direction of dissimilarity; the non-additive approach is therefore generally preferred. Apart from the NC entries, it is usual to delete as redundant any character that is uniformly positive or uniformly negative for all OTUs under study; otherwise bias towards excess similarity would certainly occur.
When the S values have been calculated for all possible pairs of OTUs, they are tabulated in a similarity matrix. The similarity matrix is a table of OTUs × OTUs that is symmetrical about its principal diagonal, since the S value between OTUs A and B is the same as that between B and A. The similarity matrix is therefore usually recorded in triangular form, omitting these redundant entries.
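The S formula can be sketched as follows, with NC entries skipped so that they count in neither numerator nor denominator. The function name, the `NC` marker and the score vectors are hypothetical; the vectors were chosen to give a = 5, d = 4, b = 1, c = 0.

```python
NC = None  # marker for a 'No Comparison' entry (an assumption of this sketch)

def similarity_coefficient(strain_a, strain_b):
    """S = (a + d) / (a + b + c + d) for two binary-coded strains."""
    a = b = c = d = 0
    for x, y in zip(strain_a, strain_b):
        if x is NC or y is NC:
            continue            # NC characters are left out of the comparison
        if x == 1 and y == 1:
            a += 1              # positive in both strains
        elif x == 0 and y == 0:
            d += 1              # negative in both strains
        elif x == 1 and y == 0:
            b += 1              # A positive, B negative
        else:
            c += 1              # A negative, B positive
    total = a + b + c + d
    if total == 0:
        raise ValueError("no comparable characters")
    return (a + d) / total

# Hypothetical scores for ten characters.
strain_a = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
strain_b = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
strain_b_nc = [1, 1, 1, 1, 1, 0, 0, 0, 0, NC]  # last character not comparable

print(similarity_coefficient(strain_a, strain_b))      # (5 + 4) / 10
print(similarity_coefficient(strain_a, strain_b_nc))   # (5 + 3) / 9
```

Skipping an NC entry shrinks the denominator as well as the numerator, so an uncomparable character neither raises nor lowers S artificially.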
Example of an OTU versus character table: A B C D E I 1 1 II 1 1 III 1 IV 1 1 V 1 1 1 1 1 VI 1 1 1 VII 1 1 1 VIII 1 1 1 1 IX 1 1 X 1 1 In this table, A, B, C, D and E are the OTUs, and I, II, III, IV, V, VI, VII, VIII, IX and X are the characters under study. If the similarity between A and B is to be calculated, then as per the formula S = (a + d) ÷ (a + b + c + d), where a = 5, d = 4, b = 1 and c = 0, S_AB = (5 + 4) ÷ (5 + 4 + 1 + 0) = 0.9
Similarity matrix drawn from the OTU versus character table:
     A     B     C     D
B    0.9
C    0.2   0.3
D    0.4   0.5   0.6
E    0.3   0.4   0.7   0.7
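One way to hold a triangular similarity matrix in code is sketched below, using the S values from the matrix above. Only the n(n − 1)/2 non-redundant entries are stored, and the lookup function supplies the symmetric and diagonal values. The names `LOWER_TRIANGLE` and `similarity` are hypothetical.

```python
# Lower triangle of the similarity matrix, keyed by (row OTU, column OTU).
LOWER_TRIANGLE = {
    ("B", "A"): 0.9,
    ("C", "A"): 0.2, ("C", "B"): 0.3,
    ("D", "A"): 0.4, ("D", "B"): 0.5, ("D", "C"): 0.6,
    ("E", "A"): 0.3, ("E", "B"): 0.4, ("E", "C"): 0.7, ("E", "D"): 0.7,
}

def similarity(matrix, x, y):
    """Look up S for any ordered pair using only the stored triangle."""
    if x == y:
        return 1.0                    # an OTU is identical to itself
    if (x, y) in matrix:
        return matrix[(x, y)]
    return matrix[(y, x)]             # symmetry: S(x, y) = S(y, x)

print(similarity(LOWER_TRIANGLE, "A", "B"))
print(similarity(LOWER_TRIANGLE, "B", "A"))
print(len(LOWER_TRIANGLE))            # 5 OTUs -> 5 * 4 / 2 stored entries
```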
Cluster analysis: When using the S coefficient there are three main ways in which this operation, known as cluster analysis, can be tackled: single linkage, average linkage and total linkage. The method most applied to microbial classification is single linkage, owing to its ease of computation and manipulation.
Single Linkage: First the matrix is scanned at a high level of S. The pairs of OTUs with an S level ≥ the scan level are listed. Suppose we begin by scanning at a level of S = 1, i.e., absolute similarity; no such similarity exists in this example. The scan level is then decreased by an arbitrarily chosen value. Here a decrement of 0.2 would seem suitable. Therefore:
Level      OTU-pairs
S = 0.8    A,B
S = 0.6    A,B; C,D; C,E; D,E
Here we apply the principle of clustering by single linkage, viz., OTU-pairs are fused to form a single cluster if any one OTU of one pair has an S value ≥ the scan level with any one OTU of a second pair. Returning to the example:
Level      Clusters
S = 0.6    A,B; C,D,E
Similarly:
S = 0.4    A,B; C,D,E (new OTU-pairs: A,D; B,D; B,E)
Dendrogram: At S = 0.6, (A,B) and (C,D,E) form two different clusters. At S = 0.4, however, cluster A,B and cluster C,D,E are linked by the new OTU-pairs A,D; B,D; B,E. Therefore the five OTUs form a single group at S = 0.4 and the clustering process ends. The results are then depicted in a family tree or dendrogram, resembling that of the usual hierarchical classification.
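The scan-and-fuse procedure can be sketched in a few lines, using the S values of the example similarity matrix. This is a plain illustration of single linkage, not the procedure of any particular software package; the names `S` and `single_linkage_clusters` are hypothetical.

```python
from itertools import combinations

# S values from the example similarity matrix, keyed by unordered OTU pair.
S = {
    frozenset("AB"): 0.9, frozenset("AC"): 0.2, frozenset("BC"): 0.3,
    frozenset("AD"): 0.4, frozenset("BD"): 0.5, frozenset("CD"): 0.6,
    frozenset("AE"): 0.3, frozenset("BE"): 0.4, frozenset("CE"): 0.7,
    frozenset("DE"): 0.7,
}

def single_linkage_clusters(otus, sim, level):
    """Fuse clusters whenever any one cross-pair has S >= the scan level."""
    clusters = [{otu} for otu in otus]
    for x, y in combinations(otus, 2):
        if sim[frozenset((x, y))] >= level:
            cx = next(c for c in clusters if x in c)
            cy = next(c for c in clusters if y in c)
            if cx is not cy:
                cx |= cy                                   # merge cy into cx
                clusters = [c for c in clusters if c is not cy]
    return sorted(sorted(c) for c in clusters)

# Scan downwards in steps of 0.2, as in the worked example.
for level in (1.0, 0.8, 0.6, 0.4):
    print(level, single_linkage_clusters("ABCDE", S, level))
```

The printed clusters reproduce the example: A,B pair at S = 0.8, C,D,E join each other at S = 0.6, and all five OTUs fuse at S = 0.4. The successive fusion levels are what a dendrogram draws.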
To check for serious distortions due to single linkage, mean similarity is used. Mean similarity may be computed either between the members of a single cluster (within-cluster mean) or between two separate clusters (between-cluster mean).
Within-cluster mean: Two forms of within-cluster mean may be obtained: the square mean and the triangle mean. For the cluster C,D,E the square matrix of S values is:
     C     D     E
C    1     0.6   0.7
D    0.6   1     0.7
E    0.7   0.7   1
The square mean is the average of all 9 values in the square matrix: Γ = 7 ÷ 9 ≈ 0.78. The triangle mean ignores the redundant comparisons and the self comparisons: Δ = 2.0 ÷ 3 ≈ 0.67. Δ < Γ, but the two become similar as the number of OTUs in the cluster increases. If we compare the means obtained above with the clustering level, i.e., S = 0.6, we see that the means are ≥ the clustering level. A mean greater than the clustering level indicates that the cluster is homogeneous with respect to the mutual similarities between its individual members.
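The two within-cluster means for the cluster C,D,E can be checked with a short sketch. The function names are hypothetical; the S values are those of the example matrix.

```python
from itertools import combinations

# S values among the members of cluster C,D,E (from the example matrix).
S = {frozenset("CD"): 0.6, frozenset("CE"): 0.7, frozenset("DE"): 0.7}

def square_mean(members, sim):
    """Average over the full square matrix, self comparisons (S = 1) included."""
    values = [1.0 if x == y else sim[frozenset((x, y))]
              for x in members for y in members]
    return sum(values) / len(values)

def triangle_mean(members, sim):
    """Average over each distinct pair once, self comparisons excluded."""
    values = [sim[frozenset(pair)] for pair in combinations(members, 2)]
    return sum(values) / len(values)

gamma = square_mean("CDE", S)     # 7 / 9
delta = triangle_mean("CDE", S)   # 2.0 / 3
print(round(gamma, 3), round(delta, 3))
```

Both means exceed the clustering level of 0.6, and the triangle mean is the smaller of the two because it leaves out the three self comparisons of value 1.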
Between-cluster mean: The between-cluster mean is obtained from the rectangular matrix of S values. The essence of this approach is that fusion occurs only if the mean similarity between the OTU and its potential cluster, or the mean similarity between two clusters, is ≥ the chosen level of S. This approach largely removes the danger, inherent in the single linkage method, of creating clusters which appear to be more homogeneous than they really are.
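As a sketch under the same example data, the between-cluster mean for clusters A,B and C,D,E is the average of the six S values in the 2 × 3 rectangle that joins them. The function names `between_cluster_mean` and `should_fuse` are hypothetical.

```python
# S values between members of cluster A,B and cluster C,D,E (example matrix).
S = {
    frozenset("AC"): 0.2, frozenset("AD"): 0.4, frozenset("AE"): 0.3,
    frozenset("BC"): 0.3, frozenset("BD"): 0.5, frozenset("BE"): 0.4,
}

def between_cluster_mean(cluster_1, cluster_2, sim):
    """Mean of the rectangular block of S values linking two clusters."""
    values = [sim[frozenset((x, y))] for x in cluster_1 for y in cluster_2]
    return sum(values) / len(values)

def should_fuse(cluster_1, cluster_2, sim, level):
    """Fuse only if the between-cluster mean reaches the chosen level of S."""
    return between_cluster_mean(cluster_1, cluster_2, sim) >= level

mean_ab_cde = between_cluster_mean("AB", "CDE", S)   # 2.1 / 6
print(round(mean_ab_cde, 2))
print(should_fuse("AB", "CDE", S, 0.4))
print(should_fuse("AB", "CDE", S, 0.3))
```

Single linkage fuses the two clusters at S = 0.4 on the strength of the pairs A,D; B,D and B,E, whereas the between-cluster mean of about 0.35 falls short of 0.4, so under this criterion fusion would be postponed to a lower level.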
Merits of Numerical Taxonomy: Numerical taxonomy improves on the data of conventional taxonomy because it uses a larger number of better-described characters. As numerical methods are more sensitive in delimiting taxa, the data obtained can be used efficiently in the construction of better keys and classification systems with the help of electronic data processing systems. Numerical taxonomy has in fact suggested several fundamental changes in the conventional classification systems. A number of existing biological concepts have been reinterpreted in the light of numerical taxonomy. Numerical taxonomy allows more taxonomic work to be done by less highly skilled workers.
Demerits of Numerical Taxonomy: The numerical methods are useful in phenetic classifications and not in phylogenetic classifications. The proponents of the “biological” species concept may not accept the specific limits set by these methods. If the characters chosen for comparison are inadequate, the statistical methods may give a less satisfactory solution. Different taxometric procedures may yield different results.
Computational approaches to systematic biology: Identax; DELTA (DEscription Language for TAxonomy; development discontinued in 2000); Free DELTA; BioLink; Linnaeus II (developed by ETI Bioinformatics); Digital Automated Identification SYstem (DAISY); Automated Bee Identification System (ABIS); the Ribosomal Database Project (RDP); Ribosomal Differentiation of Microorganisms (RIDOM); Multi-locus sequence typing (MLST).
Issues in designing and implementing systematic biology applications: If all the data relevant to a classification are to be integrated, interoperability among databases and services is essential. The Global Biodiversity Information Facility (GBIF) has taken great care to develop data models and protocols that facilitate interoperability. An efficient, ongoing process for correcting and resolving missing entries, and for revising or extending the underlying data models, is an issue that all systematists who provide databases or analysis services face and need to deal with. The establishment of meaningful collaborations among database owners, application developers and organizations is another issue that affects the users of systematics applications and databases.
The nomenclature is not persistent because our understanding of the relationships among the taxa is constantly changing; indeed, even our ideas about what constitutes a species are under constant scrutiny. Nomenclature is frequently debated in cases where some species show heterogeneous association with various other species, and this leads to the establishment of competing, parallel classifications.
Steps towards inter-operability: In order to clear the hurdles of accurately identifying, cataloguing, updating and retrieving information pertinent to species, a system of persistent identifiers needs to be established. There are at least two proposals under consideration: Life Science Identifiers (LSIDs) and Digital Object Identifiers (DOIs). LSIDs are used in Page's Taxonomic Search Engine, while DOIs are being developed as unique identifiers of names, taxa, exemplars and associated data by the NamesforLife initiative. The aim of both is to allow researchers to reach data and metadata relevant to an organism, no matter what label the organism may bear.