Class10 Columbia
Added: November 16, 2007

Presentation Transcript

Computational Human Genetics
  Itsik Pe'er
  Department of Computer Science, Columbia University
  Fall 2006

Reminder
  Structural variants:
    Detected by analysis
    Detected by technology
  Also:
    Project outline presentations
    Model organisms
    What can we do to improve technology?

Meeting #10
  Genotyping technologies

Genotyping Technologies
  Basic techniques in molecular biology
  Genotyping technologies: Overview + examples
  Focused examination: Affymetrix
  Future of sequencing
  Model organisms: Mouse, Dog

How can we examine DNA?
  Cut up DNA
  Paste DNA
  Copy DNA
  Observe presence of DNA
  Measure DNA
  Detect DNA
  Sequence DNA

Cutting DNA with Enzymes
  Non-specific digestion
  Specific digestion: restriction enzymes

Pasting DNA to Sticky Ends
  Complementary DNA sticky ends hybridize and are ligated
  Allows insertion of DNA

Copying DNA by PCR
  Polymerase Chain Reaction: a technique to create many copies of a short genome segment
  Requires: synthesizing flanking sequences (primers)
  [Figure labels: 2-stranded DNA; Region of interest; Primer sequences; Primers; Polymerase]
  Exponential increase in the number of amplicon molecules

Observing Presence of DNA
  Radioactive phosphate in nucleotides
  Fluorescent labels (attached molecules)

Sizing DNA with Electrophoresis
  (-)-charged DNA molecules are lined up at the cathode
  Travel to the anode through buffer
  Longer molecules are slower
  Photograph labeled DNAs
  [Figure labels: Buffer; Starting line]

Detect DNA by Hybridization
  Probes: short, single-strand DNA molecules
  Apply mixture to array of probes, wash, photograph
  Only probes whose reverse complements are present in the mixture light up

Sequencing DNA
  A polymerase mix with A-stop bases creates all A-terminating prefixes
  Run electrophoresis
  Repeat for all bases
  [Figure labels: A; ATTA; ATTATGCTA; TAGCATAAT; ACGT; A]

Principles
  Allele-dependent chemical event
    Hybridization
    Extension
    Ligation
  Reading of signal from the event
    Detecting probe/extension/ligand
  Considerations:
    Robustness
    Throughput
    Cost
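
The chain-termination idea on the "Sequencing DNA" slide can be sketched in a few lines. This is an illustrative toy, not the lecture's own code: the function names (reverse_complement, terminated_prefixes, read_sequence) are assumptions made for this sketch, and real reactions and gels are noisy in ways the sketch ignores. It uses the strings shown on the slide, treating TAGCATAAT as the template and ATTATGCTA as the synthesized strand.

```python
# Toy model of chain-termination sequencing (names are hypothetical).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def terminated_prefixes(strand, stop_base):
    """All prefixes of `strand` that end in `stop_base`.

    Models one reaction: synthesis halts wherever a stop version
    of `stop_base` happens to be incorporated.
    """
    return [strand[: i + 1] for i, base in enumerate(strand) if base == stop_base]


def read_sequence(strand):
    """Recover the strand from the four per-base reactions.

    Electrophoresis orders fragments by length, so sorting the
    (length, base) pairs and reading the bases yields the sequence.
    """
    fragments = []
    for base in "ACGT":
        for fragment in terminated_prefixes(strand, base):
            fragments.append((len(fragment), base))
    fragments.sort()
    return "".join(base for _, base in fragments)


template = "TAGCATAAT"
synthesized = reverse_complement(template)
a_fragments = terminated_prefixes(synthesized, "A")
recovered = read_sequence(synthesized)

print(synthesized)
print(a_fragments)
print(recovered)
```

The A-reaction fragments correspond to the A-terminating prefixes listed on the slide; repeating the reaction for C, G, and T and ordering all fragments by length gives one base per position.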
Example: MassArray (Sequenom)
  Event: Extension of SNP-specific primer (amplified)
  Detection: Mass spectrometry
  Specs: Up to ~20 SNPs x ~400 samples at a time
    $0.10 per call; requires SNP-specific PCR + probe
  Computation: Design primers of different weights

Molecular Inversion Probes
  Design a probe with hybridizing flanks
  Event: Allele-dependent ligation; PCR
  Detection: "bar-code" tag hybridizes to array
  Specs: Up to 500k SNPs x ~100 samples at a time
    $0.02 per call; SNP-specific probe; single PCR
  Computation: Choose tagging sequences

Example: BeadArray (Illumina)
  [Figure labels: A/C; T; G]
  Event: Allele-specific ligation; PCR
  Detection: "bar-code" tag hybridizes to array
  Specs: Up to 500k SNPs x ~100 samples at a time
    $0.002 per call; SNP-specific probe; per-SNP PCR
  Computation: Make calls (clustering in polar coordinates)

Example: Affymetrix GeneChip
  [Figure label: Genomic DNA with SNPs]
  Event: Hybridization to array
  Detection: multiple probes; fluorescent target
  Specs: Up to 500k SNPs x ~100 samples at a time
    $0.001 per call; SNP-specific probe; single PCR
    Inflexible
  Computation: Genotype calls

Calling Affymetrix Genotypes
  "Dynamic Model":
    For each quartet: rank hypotheses (AA/AB/BB/Null)
    Score the rankings being the same
  Ranking hypotheses:
    Log-likelihoods: L(AA), L(AB), L(BB), L(Null)
    Assume: Normal signal; different means for true/noise signal

Clustering Approach
  Given signals from many individuals:
    Better estimates of mean signal
    Allows clustering in high dimension: Bayesian Robust Linear Model with Mahalanobis distance

Which Probes Really Type?

The X-Prize for Genomics
  Announced: Oct 2006
  $10,000,000 cash
  Sequence 100 humans
    <10 days
    <0.00001 error rate
    <2% missing data
    <$10,000 recurrent cost
  Semi-annual competition, till 2013

How far are we?
  Genotyping?
    If made to work everywhere
    If we know all SNPs
    If we type all structural variants
    If we shave half an order of magnitude off cost
    If the X-Prize committee accepts that
  Standard sequencing:
    Human Genome: $0.09/finished bp
    Today: ~$5M/genome

Candidate: 454
  ~$1M/genome

Candidate: Helicos
  http://helicosbio.com/B38AD5C6BCE640D9B97A44977D5E1CEF.asp?ie_key=4546BA83C53E4D988C4F5E1B5CE5E2CE
  Probably $100k/genome

Candidate: Solexa
  http://www.solexa.com/technology/demo.html
  Probably $100k/genome

Summary
  Diverse technologies, diverse problems
  An affordable personal genome in our time

Model Organisms
  + Controlled breeding, controlled environment
  - Controlled past breeding, relevance to human disease

Mouse History

Genetic Archaeology

SNP Rate: Same or Different?
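
Looking back at the "Clustering Approach" slide, the role of the Mahalanobis distance in genotype calling can be illustrated with a minimal sketch. This is not the Bayesian Robust Linear Model itself: the cluster means, covariances, the two-dimensional (A-signal, B-signal) representation, the no-call threshold, and the names mahalanobis_sq and call_genotype are all assumptions made up for the example. In practice the cluster parameters would be estimated from the signals of many individuals.

```python
# Toy genotype caller: assign a 2-D (A-signal, B-signal) point to the
# nearest genotype cluster by squared Mahalanobis distance.
# All numbers below are hypothetical, chosen only for illustration.


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance for a 2x2 covariance matrix."""
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Inverse of [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


# genotype -> (mean signal, covariance); hypothetical values
CLUSTERS = {
    "AA": ((2.0, 0.2), ((0.04, 0.0), (0.0, 0.01))),
    "AB": ((1.0, 1.0), ((0.03, 0.01), (0.01, 0.03))),
    "BB": ((0.2, 2.0), ((0.01, 0.0), (0.0, 0.04))),
}


def call_genotype(signal, clusters=CLUSTERS, max_dist_sq=9.0):
    """Return the closest genotype, or "NoCall" if every cluster is too far."""
    best_genotype, best_dist = min(
        ((g, mahalanobis_sq(signal, mean, cov)) for g, (mean, cov) in clusters.items()),
        key=lambda pair: pair[1],
    )
    return best_genotype if best_dist <= max_dist_sq else "NoCall"


for signal in [(1.9, 0.25), (1.0, 1.1), (0.1, 0.1)]:
    print(signal, call_genotype(signal))
```

Using a covariance-scaled distance rather than plain Euclidean distance lets elongated clusters (for example a homozygote cluster that varies mostly along one allele's intensity) claim points along their long axis, while a low-signal point far from every cluster is left uncalled.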
Mosaic Structure

Segmental Phylogenies

Dogs

Sequencing an Inbred Boxer

Linkage Disequilibrium in Dogs
  Larger Ne (ancestral)
  Strong bottleneck

Dog History
  Two bottleneck events
  Suggests 2-stage disease mapping

Further Reading
  Di X, Matsuzaki H, Webster TA, Hubbell E, Liu G, Dong S, Bartell D, Huang J, Chiles R, Yang G, Shen MM, Kulp D, Kennedy GC, Mei R, Jones KW, Cawley S. Bioinformatics. 2005 May 1;21(9):1958-63.
  Rabbee N and Speed TP. Bioinformatics. 2006 Jan 1;22(1):7-12.
  Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, Dewell SB, Du L, Fierro JM, Gomes XV, Godwin BC, He W, Helgesen S, Ho CH, Irzyk GP, Jando SC, Alenquer ML, Jarvie TP, Jirage KB, Kim JB, Knight JR, Lanza JR, Leamon JH, Lefkowitz SM, Lei M, Li J, Lohman KL, Lu H, Makhijani VB, McDade KE, McKenna MP, Myers EW, Nickerson E, Nobile JR, Plant R, Puc BP, Ronan MT, Roth GT, Sarkis GJ, Simons JF, Simpson JW, Srinivasan M, Tartaro KR, Tomasz A, Vogt KA, Volkmer GA, Wang SH, Wang Y, Weiner MP, Yu P, Begley RF, Rothberg JM. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005 Sep 15;437(7057):376-80.
  Lindblad-Toh K, Wade CM, Mikkelsen TS, Karlsson EK, …, Lander ES. Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature. 2005 Dec 8;438(7069):803-19.
  Frazer KA, Wade CM, Hinds DA, Patil N, Cox DR, Daly MJ. Segmental phylogenetic relationships of inbred mouse strains revealed by fine-scale analysis of sequence variation across 4.6 mb of mouse genome. Genome Res. 2004 Aug;14(8):1493-500.
  Wade CM, Daly MJ. Genetic variation in laboratory mice. Nat Genet. 2005 Nov;37(11):1175-80.
  Wade CM, Kulbokas EJ 3rd, Kirby AW, Zody MC, Mullikin JC, Lander ES, Lindblad-Toh K, Daly. The mosaic structure of variation in the laboratory mouse genome. Nature. 2002 Dec 5;420(6915):574-8.