Linkage intro
Maria
Category: Education. License: All Rights Reserved. Added: January 28, 2008.

Presentation Transcript

Slide1: Linkage analysis: basic principles
Manuel Ferreira & Pak Sham
Boulder Advanced Course 2005

Slide2: Outline
1. Aim
2. The Human Genome
3. Principles of Linkage Analysis
4. Parametric Linkage Analysis
5. Nonparametric Linkage Analysis

Slide3: 1. Aim

Slide4: For a heritable trait...
Linkage: localizes region of the genome where a locus (loci) that regulates the trait is likely to be harboured.
Family-specific phenomenon: affected individuals in a family share the same ancestral predisposing DNA segment at a given trait locus.
Association: identifies a locus that regulates the trait.
Population-specific phenomenon: affected individuals in a population share the same ancestral predisposing DNA segment at a given trait locus.

Slide5: 2. Human Genome

Slide6: DNA structure
A DNA molecule is a linear backbone of alternating sugar residues and phosphate groups.
Attached to carbon atom 1’ of each sugar is a nitrogenous base: A, C, G or T.
Two DNA molecules are held together in anti-parallel fashion by hydrogen bonds between bases [Watson-Crick rules]: antiparallel double helix.
Only one strand is read during gene transcription.
Nucleotide: 1 phosphate group + 1 sugar + 1 base.
A gene is a segment of DNA which is transcribed to give a protein or RNA product.

Slide7: DNA polymorphisms
RFLPs
Minisatellites
Microsatellites: >100,000. Many alleles, (CA)n, very informative, even, easily automated.
SNPs: 10,054,521 (25 Jan ‘05). Most with 2 alleles (up to 4), not very informative, even, easily automated.

Slide8: DNA organization: Mitosis
[Figure: haploid gametes (22 + 1, ♀ and ♂) form a diploid zygote, 2 (22 + 1); chr1 with loci A and B shown through G1, S and M phases, from 1 cell to >1 cell.]

Slide9: DNA recombination: Meiosis
[Figure: diploid gamete precursor cell, 2 (22 + 1), giving haploid gamete precursors and haploid gametes, 22 + 1; for loci A and B on chr1 the four gametes are NR, NR, R, R.]

Slide10: DNA recombination between linked loci: Meiosis
[Figure: diploid gamete precursor, 2 (22 + 1), giving haploid gametes, 22 + 1; for linked loci A and B the four gametes are NR, NR, NR, NR.]

Slide11: Human Genome - summary
Recombination fraction between loci A and B (θ): proportion of gametes produced that are recombinant for A and B.
If A and B are very far apart: 50% R : 50% NR, θ = 0.5
If A and B are very close together: <50% R, 0 ≤ θ < 0.5
Recombination fraction (θ) can be converted to genetic distance (cM):
Haldane: e.g. θ=0.17, cM=20.8
Kosambi: e.g. θ=0.17, cM=17.7
DNA is a linear sequence of nucleotides partitioned into 23 chromosomes.
Two copies of each chromosome (2x22 autosomes + XY), from paternal and maternal origins.
During meiosis in gamete precursors, recombination can occur between maternal and paternal homologs.

Slide12: 3. Principles of Linkage Analysis

Slide13: Linkage Analysis requires genetic markers
[Figure: markers M1, M2, ..., Mn along a chromosome with trait locus Q; θ values shown per marker, e.g. 0.5 0.5 .4 .3 .15 .3 .4 0.5]

Slide14: Linkage Analysis: Parametric vs. Nonparametric
[Figure: diagram relating chromosome, gene (Q), marker (M) and phenotype (Phe), with genetic and environmental factors (A, D, C, E); labelled links: mode of inheritance, recombination, correlation. Adapted from Weiss & Terwilliger 2000.]

Slide15: 4. Parametric Linkage Analysis

Slide16: Linkage with informative phase known meiosis
Chromosome: marker M1..6. Gene: Q1,2. Autosomal dominant, Q1 predisposing allele.
Grandparents: M2M5Q2Q2 and M1M6Q1Q?
Parents: M1M2Q1Q2 (haplotypes M1Q1/M2Q2; informative, phase known) and M3M4Q2Q2
Offspring: M1Q1/M3Q2, M2Q2/M3Q2, M1Q1/M4Q2, M1Q1/M4Q2, M2Q2/M4Q2, M2Q1/M3Q2
Gametes of the informative parent: NR: M1Q1, NR: M2Q2, R: M1Q2, R: M2Q1
θMQ = 1/6 = 0.17 (~20.8 cM)

Slide17: Linkage with informative phase unknown meiosis
Grandparents: Q2Q2 and Q1Q?
Parents: M1M2Q1Q2 (informative, phase unknown) and M3M4Q2Q2
Offspring: M1Q1/M3Q2, M2Q2/M3Q2, M1Q1/M4Q2, M1Q1/M4Q2, M2Q2/M4Q2, M2Q1/M3Q2

Gamete   N   P if phase M1Q1/M2Q2   P if phase M1Q2/M2Q1
M1Q1     3   NR: 1-θ                R: θ
M2Q2     2   NR: 1-θ                R: θ
M1Q2     0   R: θ                   NR: 1-θ
M2Q1     1   R: θ                   NR: 1-θ

Slide18: Parametric LOD score calculation
Overall LOD score for a given θ is the sum of all family LOD scores at θ, e.g. LOD=3 for θ=0.28

Slide19:
[Figure: markers M1, M2, ..., Mn and trait locus Q; θ per marker: 0.5 0.5 .4 .3 .1 .3 .4 0.5]
For each marker, estimate the θ that yields highest LOD score across all families.
Markers with a significant parametric LOD score (>3) are said to be linked to the trait locus with recombination fraction θ.
This θ (and the LOD) will depend upon the mode of inheritance assumed.
MOI determines the genotype at the trait locus Q and thus determines the number of meioses which are recombinant or nonrecombinant. Limited to Mendelian diseases.
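The θ-to-cM conversions on Slide11 and the gamete counts on Slide16 and Slide17 can be tied together in a short calculation. What follows is a minimal sketch, not the course's own code: it assumes the standard Haldane and Kosambi map functions and a simple likelihood built from counts of recombinant (R) and nonrecombinant (NR) gametes, with the null hypothesis θ = 0.5. All function names are illustrative.

```python
import math

def haldane_cm(theta):
    """Haldane map distance in cM (assumes no interference); needs theta < 0.5."""
    return -50.0 * math.log(1.0 - 2.0 * theta)

def kosambi_cm(theta):
    """Kosambi map distance in cM (allows some interference); needs theta < 0.5."""
    return 25.0 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))

def lod_phase_known(theta, n_rec, n_nonrec):
    """log10 of L(theta) / L(0.5) when every gamete can be scored as R or NR."""
    n = n_rec + n_nonrec
    likelihood = theta ** n_rec * (1.0 - theta) ** n_nonrec
    return math.log10(likelihood / 0.5 ** n)

def lod_phase_unknown(theta, n_a, n_b):
    """Average the likelihood over the two equally likely phases.

    n_a gametes are NR under phase 1 and R under phase 2; n_b the reverse.
    """
    n = n_a + n_b
    phase1 = theta ** n_b * (1.0 - theta) ** n_a
    phase2 = theta ** n_a * (1.0 - theta) ** n_b
    return math.log10(0.5 * (phase1 + phase2) / 0.5 ** n)

# Slide11 example values
print(round(haldane_cm(0.17), 1))  # 20.8
print(round(kosambi_cm(0.17), 1))  # 17.7

# Slide16 family: 1 R and 5 NR gametes, theta estimated as 1/6
theta_hat = 1 / 6
print(lod_phase_known(theta_hat, n_rec=1, n_nonrec=5))

# Slide17: same offspring, phase unknown (counts 3 + 2 versus 0 + 1)
print(lod_phase_unknown(theta_hat, n_a=5, n_b=1))
```

Under these assumptions the phase-unknown LOD for the same family comes out lower than the phase-known one, because half of the likelihood weight goes to the phase in which five of the six gametes would be recombinant. Summing such per-family LODs at each θ gives the overall LOD described on Slide18.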
Parametric Linkage Analysis - summary

Slide20: Practical
Grandparents: Q1Q1 and Q2Q?
Parents: M1M2Q1Q1 and M3M4Q1Q2
Offspring: M2M3Q1Q1, M1M4Q1Q2, M1M4Q1Q1, M2M4Q1Q2

Gamete   N   P if phase M3Q1/M4Q2   P if phase M3Q2/M4Q1
M3Q1     1   NR: 1-θ                R: θ
M4Q2     2   NR: 1-θ                R: θ
M3Q2     0   R: θ                   NR: 1-θ
M4Q1     1   R: θ                   NR: 1-θ

1. Identify informative individual(s)
2. Reconstruct possible phase(s)
3. Classify gametes as R or NR
4. Count R and NR gametes
5. Express
6. Express LOD score

Slide21: Practical II
Talk example
Practical example
Graph each…

Slide22: Outline
1. Aim
2. The Human Genome
3. Principles of Linkage Analysis
4. Parametric Linkage Analysis
5. Nonparametric Linkage Analysis

Slide23: 5. Nonparametric Linkage Analysis

Slide24: Approach
Parametric: genotype marker locus & genotype trait locus (latter inferred from phenotype according to a specific disease model).
Parameter of interest: θ between marker and trait loci.
Nonparametric: genotype marker locus & phenotype.
If a trait locus truly regulates the expression of a phenotype, then two relatives with similar phenotypes should have similar genotypes at a marker in the vicinity of the trait locus, and vice-versa.
Interest: correlation between phenotypic similarity and marker genotypic similarity.
No need to specify mode of inheritance, allele frequencies, etc...

Slide25: Phenotypic similarity between relatives
Squared trait differences
Squared trait sums
Trait cross-product
Trait variance-covariance matrix
Affection concordance
[Figure: trait values of relative 1 (T1) against relative 2 (T2)]

Slide26: Genotypic similarity between relatives
IBS: alleles shared Identical By State “look the same”, may have the same DNA sequence but they are not necessarily derived from a known common ancestor.
IBD: alleles shared Identical By Descent are a copy of the same ancestor allele.
[Figure: pedigree with parental haplotypes M1 Q1, M2 Q2 and M3 Q3, M3 Q4; offspring M1 Q1 / M3 Q3 and M1 Q1 / M3 Q4 share 2 alleles IBS and 1 allele IBD. Inheritance vector (M): 0 0 0 1 1]

Slide27: Genotypic similarity between relatives
Relative 1      Relative 2      Number of alleles IBD   Proportion of alleles IBD   Inheritance vector (M)
M1 Q1, M3 Q3    M2 Q2, M3 Q4    0                       0                           0 0 1 1
M1 Q1, M3 Q3    M1 Q1, M3 Q4    1                       0.5                         0 0 0 1
M1 Q1, M3 Q3    M1 Q1, M3 Q3    2                       1                           0 0 0 0

Slide28: Genotypic similarity between relatives
2^(2n)

Slide29: Statistics that incorporate both phenotypic and genotypic similarities
[Figure: phenotypic similarity against genotypic similarity ( ), with values 0, 0.5, 1]

Slide30: Haseman-Elston regression – Quantitative traits
Phenotypic dissimilarity = b × Genotypic similarity + c
[Figure: genotypic similarity axis with values 0, 0.5, 1]

Slide31: VC ML – Quantitative & Categorical traits method
[Figure: axis with values 0, 0.5, 1]
H1:
H0:
e.g. LOD=3

Slide32: Individual LOD scores can be expressed as P values (Pointwise)
Genome-wide linkage analysis (e.g. VC)
LOD    Chi-sq (n-df)    P value
2.1    9.67 (x4.6)      0.0009

Slide33: Statistics for selected samples
Mean IBD sharing statistics (Risch & Zhang 1995, 1996)
[Figure: T1 against T2]
H0 (No linkage): Mean
H1 (Linkage): Mean
H0 (No linkage): Mean
H1 (Linkage): Mean

Slide34: Other Linkage statistics
Dependent variable: Phenotypes. Independent variable:
Dependent variable: . Independent variable: Phenotypes
Extensions to Haseman Elston (Wright 1997, Drigalenko 1998, Elston et al. 2000, Forrest 2001, Visscher & Hopper 2001, Xu et al. 2000, Sham & Purcell 2001)
VC ML with mixture distribution (Eaves et al. 1996)
Pedwide-regression Analysis (“reverse HE”) (Sham et al. 2002)
Reverse VC ML (Sham et al. 2000)
Statistics for affection traits
Based on IBD scoring functions, e.g. Sall (Whittemore & Halpern 1994, Kong & Cox 1997)
Mixed statistic (Forrest & Feingold 2000)

Slide35: Nonparametric Linkage Analysis - summary
No need to specify mode of inheritance
Models phenotypic and genotypic similarity of relatives
Expression of phenotypic similarity, calculation of IBD
HE and VC are the most popular statistics used for linkage of quantitative traits
Other statistics available, especially for affection traits
Type I error? Power?

Slide36: Type I error
[Figure: distribution of LOD scores with threshold k, showing Type I error and True positive regions]
Theoretical (Lander & Kruglyak 1995)
Empirical

Slide37: Theoretical genome-wide thresholds
Genome-wide threshold for suggestive linkage: LOD score that occurs by chance alone on average once per scan.
LOD = 2.2, Chi-sq = 10.1, Pointwise P = 0.00074
Genome-wide threshold for significant linkage: LOD score that occurs by chance alone on average once per 20 scans.
LOD = 3.6, Chi-sq = 16.7, Pointwise P = 0.000022

Slide38: Empirical genome-wide thresholds
Genome-wide threshold for suggestive linkage: LOD score that occurs by chance alone on average once per scan.
Genome-wide threshold for significant linkage: LOD score that occurs by chance alone on average once per 20 scans.
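The LOD, chi-square and pointwise P values quoted on Slide32 and Slide37 can be reproduced with a short calculation. This is a hedged sketch rather than the presenters' method: it assumes the chi-square is 2 ln(10) × LOD (the "x4.6" on Slide32) and that the pointwise P value is one-sided on 1 degree of freedom, i.e. half the usual chi-square tail. That assumption is not stated on the slides; it is used here because it matches their numbers. Function names are illustrative.

```python
import math

def lod_to_chisq(lod):
    """Convert a LOD score to a likelihood-ratio chi-square (factor 2*ln(10), about 4.6)."""
    return 2.0 * math.log(10.0) * lod

def chisq_to_pointwise_p(chisq):
    """One-sided pointwise P value, ASSUMING 1 df with the tail probability halved.

    For 1 df, P(chi-square > x) = erfc(sqrt(x / 2)); the factor 0.5 is the
    one-sided assumption described in the lead-in.
    """
    return 0.5 * math.erfc(math.sqrt(chisq / 2.0))

for lod in (2.1, 2.2, 3.6):
    chisq = lod_to_chisq(lod)
    p = chisq_to_pointwise_p(chisq)
    print(f"LOD={lod}  chi-sq={chisq:.2f}  pointwise P={p:.2g}")
```

Small differences from the slide values are expected, since the slides round the LOD before or after conversion (for example, a chi-square of 16.7 corresponds to a LOD slightly above 3.6).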