Diversity Analyses 2

Presentation Transcript

Slide 1: 

Diversity Analyses
Ashok K. Chhabra and R.K. Behl, Department of Plant Breeding, CCS Haryana Agricultural University, Hisar

Slide 2: 

Biodiversity is the outcome of natural evolution, which has been going on for the last 3.5 billion years, since life first arose on this planet (Narian, 2000).

Slide 3: 

Genetic Variation: Sources
Wild and weedy progenitors; crop communities (domestication); natural selection; spontaneous mutations and chromosomal changes.
Primary GP, secondary GP, tertiary GP.
Allelic variation; variation in chromosome number and arrangement.
Diversity Analysis: phenotypic, molecular; quantitative, marker.
Karyomorphology / molecular markers: ancestry relationships and evolutionary patterns.

Slide 4: 

Germplasm collections are repositories of the available biodiversity and are a valuable source of useful genes for plant breeders

Slide 5: 

The variability present among different genotypes of a species is known as genetic diversity. It arises from geographical separation or from genetic barriers to crossability.

Slide 6: 

Phenotypic Level Genotypic Level

Slide 7: 

Diversity vs. Heterosis
P1 X P2; P1 X P2 >
Examples: alfalfa, cotton, maize, etc.

Population Structure vs. Genetic Gain : 

P1, P2, F2, F3, F4

Slide 9: 

METHODS FOR ASSESSMENT OF BIODIVERSITY
Phenotype based: i) Metroglyph analysis ii) D2 statistic iii) Canonical analysis iv) Cluster analysis v) PCA
Genotype based: i) Isozymes ii) DNA markers

Slide 10: 

Methods Based on Phenotypic Scoring

Slide 11: 

Metroglyph Analysis
Metroglyph analysis was developed for displaying a set of multivariate responses. In the graph, each genotype vector is represented by a circle of fixed radius (called a glyph) with rays emanating from its periphery. Each variable is assigned a ray position, and the length of the ray represents the index score of the variate (long ray, short ray and no ray for high, medium and low scores, respectively), based on the range of variability.

Slide 12: 

An index can be prepared for each glyph by assigning values to a long ray (say 2), a short ray (say 1) and no ray (say 0). The glyphs are positioned in the graph by selecting the two most variable characters and plotting the means of one character against those of the other. The rays corresponding to these two traits are deleted from each glyph.

Slide 13: 

Metroglyph and index score analysis
Each character was illustrated by a ray at a fixed position on each glyph, and the range of the character was represented by the length of the ray. The index scores were obtained by allotting numerical values (1, 2 or 3) to three grades of expression (low, medium and high) recognized for each character, and then summing the scores obtained by each variety over all characters under study.
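The index score calculation described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the grade boundaries (equal thirds of each character's observed range), the function names and the three strains with their values are all assumptions made for the example.

```python
def grade(value, low, high):
    """Return 1, 2 or 3 for low, medium or high expression of a character.

    Assumption: the grades split the observed range into three equal parts.
    """
    third = (high - low) / 3.0
    if value < low + third:
        return 1
    if value < low + 2 * third:
        return 2
    return 3


def index_scores(data):
    """data: {variety: [character values]} -> {variety: total index score}."""
    n_traits = len(next(iter(data.values())))
    ranges = []
    for j in range(n_traits):
        column = [values[j] for values in data.values()]
        ranges.append((min(column), max(column)))
    # Sum the grades of each variety over all characters.
    return {
        name: sum(grade(v, lo, hi) for v, (lo, hi) in zip(values, ranges))
        for name, values in data.items()
    }


# Hypothetical data: three strains scored for three characters.
strains = {
    "T1": [90.0, 30.0, 4.0],
    "T2": [110.0, 45.0, 6.0],
    "T3": [130.0, 60.0, 5.0],
}
scores = index_scores(strains)
print(scores)
```

A frequency distribution of the totals can then be drawn by counting how many varieties fall at each score.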

Slide 14: 

Frequency Distribution of 70 Triticale strains on the basis of total index score

Slide 15: 

Multivariate analysis can be defined as the application of techniques that deal with the relationships among a large number of variables in one or more sets simultaneously.

Slide 16: 

Uses
Quantitative estimation of genetic diversity.
Tracing evolutionary patterns.
Determining the relationship between genetic and non-genetic factors.
Classifying germplasm into different groups.
Selecting phenotypically stable genotypes under varying environments.
Choosing the most desirable parents for hybridization, thereby maximizing the frequency of transgressive segregants.

Slide 17: 

D2 Analysis
The multivariate analysis developed by Mahalanobis is known as Mahalanobis’ generalized distance. The main objective of applying D2 analysis is to reduce the number of comparisons among genotypes by classifying them into different clusters. The D2 analysis is carried out by taking all variables together from data based on multiple characters.

Slide 19: 

D2 estimation is complicated when the variables are large in number and the characters are correlated (negatively or positively). This requires transforming the correlated variables into uncorrelated ones using the pivotal condensation method. This method is used for the inversion of the error variance-covariance matrix and for the estimation of elements from the error variance-covariance matrix and the error + genotype matrix.
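As a rough numerical sketch of the idea, D2 between two genotypes can be written as the quadratic form of their mean difference with the inverse of the error variance-covariance matrix. The example below uses NumPy, and a Cholesky factorization stands in for pivotal condensation as the way of obtaining uncorrelated variables; the genotype means and the error matrix are made-up values.

```python
import numpy as np


def d2_matrix(means, error_cov):
    """Pairwise D2 = (x_i - x_j)' W^-1 (x_i - x_j) between rows of `means`."""
    inv_cov = np.linalg.inv(error_cov)
    diff = means[:, None, :] - means[None, :, :]
    return np.einsum("ijk,kl,ijl->ij", diff, inv_cov, diff)


# Hypothetical genotype means (rows) for two correlated characters (columns).
means = np.array([[10.0, 5.0], [12.0, 6.0], [15.0, 9.0]])
# Hypothetical error variance-covariance matrix W.
error_cov = np.array([[2.0, 0.5], [0.5, 1.0]])

d2 = d2_matrix(means, error_cov)

# Equivalent route: transform to uncorrelated variables, then take squared
# Euclidean distances. With W = L L', the transformed values are y = L^-1 x.
L = np.linalg.cholesky(error_cov)
uncorrelated = np.linalg.solve(L, means.T).T
diff_u = uncorrelated[:, None, :] - uncorrelated[None, :, :]
d2_from_uncorrelated = (diff_u ** 2).sum(axis=-1)

print(np.round(d2, 3))
```

Both routes give the same matrix, which is why the transformation to uncorrelated variables simplifies hand calculation without changing the distances.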

Slide 20: 

D2 analysis helps the plant breeder to cluster genotypes into distinct groups, which in turn helps in choosing potential genotypes for hybridization. The analysis provides specific information about the clusters:
Cluster means
Intra- and inter-cluster distances
Contribution of different traits

Slide 21: 

Using D2 statistics, 70 Triticale genotypes were grouped into 9 clusters by Tocher’s method. Cluster I, consisting of 32 genotypes, was the largest, followed by II (15 genotypes) and III (six genotypes). The mutual relationship among clusters was shown with the help of a cluster diagram, as shown in Fig. 1.

Slide 22: 

The D2 statistic measures the degree of diversification and determines the relative proportion of each component character to the total divergence.

Slide 23: 

Points that should be taken into consideration while selecting parents on the basis of D2 statistic are: The relative contribution of each character to the total divergence. The choice of cluster with the maximum statistical distance, and The selection of one or two genotypes from such clusters.

Slide 24: 

Geographic distribution in relation to genetic divergence.
Heterosis as a function of genetic divergence.
Probable ancestry as a measure of genetic divergence.
Environmental sensitivity of genetic diversity.
Other specific features.

Slide 25: 

Canonical Analysis
A multivariate statistical method for assessing the genetic diversity present in a large number of germplasm lines. The mean values of all the given canonical variates are computed; the two canonical variates/roots (1 and 2) that supply the best two linear functions are then selected and used to plot a two-dimensional diagram. Clusters are formed from genotypes with more or less similar values of 1 and 2.

Slide 26: 

Canonical Analysis
Seventy diverse genotypes of triticale were evaluated for eight agronomic traits. On the basis of canonical roots, 8 distinct clusters were obtained. The numbers of genotypes placed in the 8 clusters were 30, 22, 5, 7, 3, 1, 1 and 1, respectively.

Slide 27: 

Comparative efficiency between D2, Metroglyph and Canonical Analysis
The grouping pattern using canonical roots showed 65.7 per cent resemblance with that of Tocher’s method. The most divergent groups in Tocher’s method were apparent in canonical root analysis also. In contrast, the resemblance between Metroglyph analysis and Tocher’s method was 28.6 per cent, and that between Metroglyph analysis and canonical root analysis was 27.1 per cent. Thus, in general, Metroglyph analysis showed less resemblance with the other two methods. It therefore seems that the grouping patterns of the first two methods are mutually consistent because both are derived from multivariate analysis, whereas Metroglyph analysis is based on the average scores for different traits in different genotypes.

Slide 28: 

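Comparative efficiency between D2, Metroglyph and Canonical Analysis
[Figure: resemblance between grouping patterns. D2 (D) vs. Canonical (C): 65.7%; Metroglyph (M) vs. D2 (D): 28.6%; Metroglyph (M) vs. Canonical (C): 27.1%]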

Slide 29: 

Principal Component Analysis
Main objective: transformation of a set of correlated variables into another set of uncorrelated variables, and choosing a few of them (generally 2-3) based on the interest of the researcher. In statistical terms, PCA can be considered as the usual development of eigenvalues (or characteristic roots or latent roots) and mutually independent eigenvectors (or principal components), ranked in descending order of variance size.

Principal component and Principal factor analysis : 

Problem: including all the possible variables makes the data matrix large, complicated, unmanageable and hard to comprehend. Principal component analysis, basically a data reduction technique, solves this problem by transforming the original set of variables into a smaller set of linear combinations that account for most of the variability of the original set.

OBJECTIVE : 

The objective of principal component analysis is to identify the minimum number of components that can explain the maximum variation out of the total variance (Anderson, 1972; Morrison, 1978 and Dillon and Goldstein, 1984).

Features : 

The first principal component absorbs and accounts for the maximum proportion of total variability in the set of all variables, and the remaining components account for progressively smaller amounts of variation.

Procedure : 

Extraction of PCs: variance-covariance matrix or correlation matrix.
Number of PCs to be retained: eigen roots > 1 (Kaiser 1958).

Slide 34: 

Some general characteristics of the PCA are:
The number of PCs is equal to the number of variables.
The vector of PCs has as many elements as there are variables.
The sum of the variances of the PCs is equal to the sum of the variances of the original variables.
The first PC (the normalized linear combination) accounts for the maximum possible variance, followed by the second, third and so on.
Principal components can be computed from either the variance-covariance matrix or the correlation matrix. However, the two approaches will lead to different results, since PCA is not scale invariant.
PCA is most suitable if the variables are in the same units. If not, all the variates should be standardized by dividing each by its estimated standard deviation, i.e. by using a correlation matrix instead of the variance-covariance matrix.
If the variables are measured in the same units, the variance-covariance matrix gives more interpretable results.
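The extraction and retention steps can be illustrated with a small NumPy sketch. It assumes the correlation-matrix route (standardized variates) and the eigen root > 1 retention rule; the data are randomly generated for the example and the function name is hypothetical.

```python
import numpy as np


def pca_correlation(data):
    """PCA on the correlation matrix of `data` (rows = genotypes, columns = variables).

    Returns eigenvalues and eigenvectors sorted by descending variance.
    """
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)  # symmetric matrix
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]


# Hypothetical data: 50 genotypes, 4 variables, two of them strongly correlated.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 1))
data = np.hstack([
    base + 0.1 * rng.normal(size=(50, 1)),
    base + 0.1 * rng.normal(size=(50, 1)),
    rng.normal(size=(50, 2)),
])

eigenvalues, eigenvectors = pca_correlation(data)
retained = int((eigenvalues > 1.0).sum())    # eigen roots > 1
explained = eigenvalues / eigenvalues.sum()  # proportion of total variance per PC

print(retained, np.round(explained, 3))
```

With a correlation matrix the eigenvalues sum to the number of variables, which is the standardized form of the statement that the variances of the PCs sum to the variances of the original variables.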

PCA Vs PF : 

In PCA, total variation is considered. In PF analysis, interest centers only on that part of the variance which is shared by the common factors, leaving aside the unique factor (including error) of the variable. Moreover, in contrast to PCA, here the component axes are allowed to interact, resulting in distortion of mutual orthogonality. Thus differences between the results of PF analysis and those of PCA can be predominantly ascribed to the influence of environment and to interaction among the principal axes.

Slide 36: 

CLUSTERING OF GENOTYPES-PF

Procedure : 

The principal component method does not require the assumption of a multivariate normal distribution of the population, in contrast to other methods such as the maximum likelihood method (Jaiswal, 2000).
Initially the data are analysed without any rotation.
When no clear-cut picture emerges, varimax rotation (Kaiser 1958) is applied.
Different factors load different characters.

Plotting of Genotypes on Different factors : 

Principal factor scores are calculated for all the genotypes for all the principal factors using the Anderson-Rubin method, and are used to find genotypes superior for different factors, i.e. for all the characters cumulatively ascribed to that factor. A high principal factor score of a particular genotype on a particular principal factor denotes high values, in that genotype, for the variables that the factor represents.

CLUSTER ANALYSIS : 

STEPS
i) Obtain the data matrix
ii) Standardize the data matrix (optional)
iii) Compute the resemblance matrix
iv) Execute the clustering method
v) Rearrange the data and resemblance matrices
vi) Compute the cophenetic correlation coefficient

Data Matrix : 

Data Matrix

Standardize data matrix (optional) : 

$$Z_{ij} = \frac{X_{ij} - \bar{X}_i}{S_i}$$

where $\bar{X}_i$ and $S_i$ are the mean and standard deviation of the i-th character.
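A minimal sketch of this optional standardization step is given below. It assumes that each row of the data matrix holds one character measured on all entries, and that the sample standard deviation (n - 1 divisor) is used; the numbers are invented for the example.

```python
from statistics import mean, stdev


def standardize_rows(matrix):
    """Return Z with Z[i][j] = (X[i][j] - mean of row i) / SD of row i."""
    standardized = []
    for row in matrix:
        row_mean = mean(row)
        row_sd = stdev(row)  # sample SD; a constant row would give SD = 0
        standardized.append([(x - row_mean) / row_sd for x in row])
    return standardized


# Hypothetical data matrix: 2 characters (rows) x 3 entries (columns).
data_matrix = [
    [10.0, 20.0, 30.0],
    [1.0, 3.0, 8.0],
]
z = standardize_rows(data_matrix)
print(z)
```

After standardization every character has mean 0 and standard deviation 1, so characters measured in different units contribute on a comparable scale to the resemblance matrix.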

Compute the resemblance matrix : 

Compute the resemblance matrix

Unweighted Pair Group Method using Arithmetic averages (UPGMA) : 

Unweighted Pair Group Method using Arithmetic averages (UPGMA)

Revise the resemblance matrix : 

Revise the resemblance matrix

Revise further : 

Revise further

Revise further : 

Revise further

Tree Preparation : 

Tree Preparation

Rearranged : 

Rearranged

Compute cophenetic correlation coefficients= 0.93 : 

Compute cophenetic correlation coefficients= 0.93
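The cophenetic correlation coefficient is the ordinary correlation between the original pairwise distances and the cophenetic distances, i.e. the tree levels at which each pair of entries is first joined. The sketch below assumes both sets of distances are already available as dictionaries keyed by pair; the four entries and their distances are hypothetical and are not the data behind the value 0.93 quoted on the slide.

```python
from math import sqrt


def cophenetic_correlation(original, cophenetic):
    """Pearson correlation between original and cophenetic distances.

    Both arguments map a pair of entry names to a distance.
    """
    pairs = sorted(original)
    x = [original[p] for p in pairs]
    y = [cophenetic[p] for p in pairs]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical resemblance (distance) matrix for four entries A-D.
original = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("A", "D"): 10.0,
    ("B", "C"): 6.0, ("B", "D"): 10.0, ("C", "D"): 4.0,
}
# Levels at which each pair is first joined in an average-linkage tree:
# A-B join at 2, C-D at 4, and the two groups at 8.
cophenetic = {
    ("A", "B"): 2.0, ("A", "C"): 8.0, ("A", "D"): 8.0,
    ("B", "C"): 8.0, ("B", "D"): 8.0, ("C", "D"): 4.0,
}
r = cophenetic_correlation(original, cophenetic)
print(round(r, 3))
```

Values close to 1 indicate that the tree distorts the original resemblance matrix only slightly.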

HIERARCHICAL CLUSTER ANALYSIS : 

OBJECTIVE
To identify groups or variables that are similar among themselves, especially when they are closely related (Sneath and Sokal, 1973).

Advantages : 

Both quantitative and qualitative data can be used, which allows all the information available on the sample to be utilized.
Each entry is treated as an individual entity of equal weight.
It defines the degree of relatedness among the samples and can predict the degree of segregation of given samples, making it a powerful tool for precisely classifying the population.

Procedure : 

Clusters are formed by grouping cases into bigger and bigger clusters until all cases are members of a single cluster.
The UPGMA (unweighted pair-group method using arithmetic averages) method of hierarchical cluster analysis is important. UPGMA defines the distance between two clusters as the average of the distances between all pairs of cases in which one member of the pair is from each of the clusters.
Distance or similarity measures are generated by the Proximity procedure.
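The UPGMA definition of between-cluster distance can be sketched directly in plain Python. This is a deliberately naive illustration under assumed inputs (four cases with made-up distances), not the Proximity procedure or any particular package's implementation.

```python
from itertools import combinations


def average_distance(cluster_a, cluster_b, dist):
    """Average of the distances over all pairs with one case from each cluster."""
    total = sum(dist[frozenset((x, y))] for x in cluster_a for y in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def upgma(labels, dist):
    """Merge the two closest clusters repeatedly until one cluster remains.

    labels: list of case names
    dist: {frozenset({a, b}): distance} for every pair of cases
    Returns the merges as (cluster_a, cluster_b, average_distance) tuples.
    """
    clusters = [(name,) for name in labels]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(pair[0], pair[1], dist),
        )
        merges.append((a, b, average_distance(a, b, dist)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical distances among four cases.
dist = {
    frozenset(("A", "B")): 2.0, frozenset(("A", "C")): 6.0,
    frozenset(("A", "D")): 10.0, frozenset(("B", "C")): 6.0,
    frozenset(("B", "D")): 10.0, frozenset(("C", "D")): 4.0,
}
merges = upgma(["A", "B", "C", "D"], dist)
for a, b, height in merges:
    print(a, b, height)
```

The recorded merge levels are the heights at which branches join when the result is drawn as a tree.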

Procedure (Contd) : 

Actual formation of clusters.
A dendrogram is produced using rescaled distances. The dendrogram is the transformation of the proximity matrix into a tree. The tree makes it easy to see the similarities and dissimilarities between all pairs of objects.
Deciding the number of clusters: cut the tree at some point within a wide range of the resemblance co

Slide 54: 

Dendrogram of 71 accessions of O. glaberrima & O. barthii

Slide 55: 

Dendrogram of 127 O. nivara accessions

Slide 56: 

Level of genetic variation in cultivated germplasm
[Figure: axis label "Similarity Coefficient"]

Slide 57: 

Level of genetic variation in cultivated germplasm – PAU lines
[Figure labels: Zhong-You-Zao81, Peng-Shan-Tie-G]

Slide 58: 

384-entry GCP sorghum reference set
Selected from 3365 accessions based on amplification product diversity from 41 primer pairs detecting SSR loci distributed across all 10 linkage groups.
Represents the wild and landrace variability available to sorghum breeders.
Representative of the race x geographic origin distribution of this genetic variation.

New tools for sorghum : 

3365-entry GCP sorghum composite germplasm collection
East Asia, India, Middle East, Western Africa, Central Africa, Eastern Africa, Southern Africa, North America, Latin America, and Australia

Slide 60: 

Wild & weedy, Bicolor, Durra, Kafir, Caudatum, Guinea

New tools for sorghum : 

384-entry GCP sorghum reference set

Slide 62: 

New tools for sorghum 48-entry GCP sorghum micro-core collection

Slide 63: 

Hierarchical clustering (UPGMA) of 92 sorghum lines based on SSR allelic data for 46 primer pairs – radial representation. Dissimilarity index: simple matching as implemented in DARwin. Grain mold reaction: Susceptible, Resistant.

Slide 64: 

Hierarchical clustering (UPGMA) of 92 sorghum lines based on SSR allelic data for 46 primer pairs – hierarchical representation. Dissimilarity index: simple matching as implemented in DARwin. Grain mold reaction: Susceptible, Resistant.

Slide 65: 

Weighted Neighbor Joining-based clustering of 92 sorghum lines based on SSR allelic data for 46 primer pairs – hierarchical representation. Dissimilarity index: simple matching as implemented in DARwin. Grain mold reaction: Susceptible, Resistant.

Slide 66: 

Weighted Neighbor Joining-based clustering of 92 sorghum lines based on SSR allelic data for 46 primer pairs – radial representation. Dissimilarity index: simple matching as implemented in DARwin. Grain mold reaction: Susceptible, Resistant.

Slide 67: 

Weighted Neighbor Joining-based clustering of 92 sorghum lines based on SSR allelic data for 46 primer pairs – radial representation. Dissimilarity index: simple matching as implemented in DARwin. Cluster labels: 1, 2, 3, 4, 5, 6a, 6b. Grain mold reaction: Susceptible, Resistant.
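For readers unfamiliar with the dissimilarity index named on these slides, one common formulation of simple matching for allelic data is d = 1 - (1/L) Σ (m_l / ploidy), where m_l is the number of matching alleles at locus l and L is the number of loci. The sketch below implements that formula; whether it agrees in every detail with DARwin's implementation is an assumption, and the allele sizes are hypothetical.

```python
def simple_matching_dissimilarity(genotype_a, genotype_b, ploidy=2):
    """Simple-matching dissimilarity between two multi-locus genotypes.

    Each genotype is a list of loci; each locus is a tuple of `ploidy` alleles.
    """
    total = 0.0
    for alleles_a, alleles_b in zip(genotype_a, genotype_b):
        remaining = list(alleles_b)
        matches = 0
        for allele in alleles_a:
            if allele in remaining:
                remaining.remove(allele)  # each allele can be matched only once
                matches += 1
        total += matches / ploidy
    return 1.0 - total / len(genotype_a)


# Hypothetical SSR allele sizes (base pairs) at three loci for two lines.
line_a = [(150, 150), (200, 204), (120, 120)]
line_b = [(150, 150), (200, 200), (122, 122)]
d = simple_matching_dissimilarity(line_a, line_b)
print(d)
```

Computing this value for every pair of lines gives the dissimilarity matrix on which UPGMA or neighbor-joining trees are built.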

Slide 68: 

ICRISAT-Patancheru pearl millet hybrid parents: Clusters 1-3
[Figure labels: Cluster 3, Cluster 2, Cluster 1. Lines shown: ICMB 04999, ICMB 04888, ICMB 05444, 841B, 841B, 842B, ICMP 451, IPC 338, ICMB 02555, AIMP 92901, S1-296, ICMB 05222, ICMB 06444, GB 8735, S1-15, ICMB 05888, 81B, ICMB 04111, ICMB 04333, 843B, ICMB 91222, ICMB 00111]

Slide 69: 

Application of molecular markers for diversity analysis: Examples

Slide 70: 

CONCLUSIONS
Clustering of genotypes into different groups is aimed at the selection of parents for hybridization.
Metroglyph analysis is a graphical representation that provides clues about the diversity in the germplasm, particularly when the two characters used on the X and Y axes are distinctly variable. This method can be used for initial work.
D2 analysis is a potent multivariate analysis for classifying germplasm when a large number of correlated variables are to be considered.
Canonical analysis is useful where canonical roots 1 and 2 (based on z1 and z2) explain maximum variance.
PCA is useful where a few principal components, especially 1 and 2, account for maximum variability and the traits are measured in the same units. PCA can also be used when the traits are measured in different units, but that requires standardization of the variates.
Isozymes and DNA markers, being expensive and technologically intensive, should be used for validation of statistical methods and to precisely characterize genetic diversity at the protein and gene level, respectively.

Slide 71: 

THANKS