Class10

Added: November 16, 2007
Presentation Transcript

Computational Human Genetics: Itsik Pe'er, Department of Computer Science, Columbia University, Fall 2006


Reminder: Structural variants: detected by analysis; detected by technology. Also: project outline presentations; model organisms; what can we do to improve technology?


Meeting #10: Genotyping technologies


Genotyping Technologies: Basic techniques in molecular biology. Genotyping technologies: overview + examples. Focused examination: Affymetrix. Future of sequencing. Model organisms: mouse, dog.


How can we examine DNA?: Cut up DNA; paste DNA; copy DNA; observe presence of DNA; measure DNA; detect DNA; sequence DNA.


Cutting DNA with Enzymes: Non-specific digestion. Specific digestion: restriction enzymes.
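A minimal sketch of specific digestion, assuming the well-known EcoRI recognition site GAATTC, which is cut between G and A. The function name `digest`, its parameters, and the example sequence are illustrative and not from the slides. Only the top strand is modeled, so the sticky-end overhang on the complementary strand is not represented.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut `sequence` at every occurrence of `site`, `cut_offset` bases into the site."""
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset          # EcoRI cuts after the G of GAATTC
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # whatever remains after the last cut
    return fragments


# Hypothetical top-strand sequence containing two EcoRI sites.
example = "TTGAATTCAAGGGAATTCTT"
fragments = digest(example)
print(fragments)
```

Joining the fragments back together reproduces the input, which is a convenient sanity check for any digestion routine.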


Pasting DNA to Sticky Ends: Complementary DNA sticky ends hybridize and are ligated. This allows insertion of DNA.


Copying DNA by PCR: The polymerase chain reaction (PCR) is a technique to create many copies of a short genome segment. Requires: synthesizing flanking sequences (primers). Figure labels: 2-stranded DNA; region of interest; primer sequences.


Copying DNA by PCR: Primers; polymerase.


Copying DNA by PCR: Exponential increase in the number of amplicon molecules.
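The exponential increase can be made concrete with a small calculation. This is an idealized sketch: it assumes every amplicon is copied in every cycle unless a per-cycle `efficiency` below 1.0 is given. The function name and the `efficiency` parameter are illustrative and not from the slides.

```python
def amplicon_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected amplicon count after `cycles` rounds of PCR.

    efficiency = 1.0 means perfect doubling per cycle; real reactions
    fall short of that and eventually plateau, which is not modeled here.
    """
    return initial_copies * (1.0 + efficiency) ** cycles


ideal = amplicon_copies(1, 30)          # perfect doubling for 30 cycles
lossy = amplicon_copies(1, 30, 0.9)     # hypothetical 90% per-cycle efficiency
print(f"ideal: {ideal:.3g} copies, 90% efficiency: {lossy:.3g} copies")
```

Under perfect doubling, a single template yields 2^30 (about a billion) copies after 30 cycles.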


How can we examine DNA?: Cut up DNA; paste DNA; copy DNA; observe presence of DNA; measure DNA; detect DNA; sequence DNA.


Observing Presence of DNA: Radioactive phosphate in nucleotides. Fluorescent labels (attached molecules).


Sizing DNA with Electrophoresis: Negatively charged DNA molecules are lined up at the cathode and travel to the anode through buffer. Longer molecules are slower. Photograph the labeled DNAs. Figure labels: buffer; starting line.
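Because longer molecules are slower, migration distance can be turned into a size estimate by comparing against fragments of known length. The sketch below assumes the common approximation that log10(size) is roughly linear in migration distance over the resolving range. The ladder sizes, distances, and function names are hypothetical and not from the slides.

```python
import math


def fit_ladder(sizes_bp, distances):
    """Least-squares fit of log10(size) = a + b * distance; returns (a, b)."""
    logs = [math.log10(s) for s in sizes_bp]
    n = len(logs)
    mean_d = sum(distances) / n
    mean_y = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in zip(distances, logs))
    b = sxy / sxx
    a = mean_y - b * mean_d
    return a, b


def estimate_size(distance, a, b):
    """Predict fragment size (bp) from migration distance using the fitted line."""
    return 10 ** (a + b * distance)


# Hypothetical size ladder: larger fragments travel a shorter distance (mm).
a, b = fit_ladder([100, 1000, 10000], [30.0, 20.0, 10.0])
unknown_bp = estimate_size(25.0, a, b)
print(f"slope {b:.3f} per mm, estimated size {unknown_bp:.0f} bp")
```

The fitted slope is negative, which is the numerical form of "longer molecules are slower".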


Detect DNA by Hybridization: Probes are short, single-stranded DNA molecules. Apply the mixture to an array of probes, wash, and photograph. Only probes whose reverse complements are present in the mixture light up.
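The "light up" rule can be sketched as a reverse-complement lookup. This assumes perfect-match hybridization only (no mismatches, no partial binding), which is a simplification. The function names and the probe and target sequences are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence over the alphabet ACGT."""
    return seq.translate(_COMPLEMENT)[::-1]


def lit_probes(probes, targets):
    """Return probes whose reverse complement occurs in at least one target strand."""
    return [
        p for p in probes
        if any(reverse_complement(p) in t for t in targets)
    ]


# Hypothetical array with two probes and a mixture with one single-stranded target.
probes = ["TAGCAT", "GGGCCC"]
targets = ["ATTATGCTA"]
lit = lit_probes(probes, targets)
print(lit)
```

Only the first probe lights up here, because its reverse complement (ATGCTA) occurs in the target.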


Sequencing DNA: A polymerase mix with A-stop bases creates all A-terminating prefixes. Run electrophoresis. Repeat for all bases. Figure example: template TAGCATAAT; synthesized strand ATTATGCTA, with A-terminating prefixes A, ATTA, ATTATGCTA; lanes A, C, G, T.
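The chain-termination idea above can be sketched in a few lines: each of the four reactions yields the prefix lengths ending in its base, and sorting all lengths across the four lanes reads off the sequence. The function names are illustrative. The example strand ATTATGCTA is the one shown on the slide.

```python
def terminated_lengths(synthesized: str, base: str) -> list[int]:
    """Lengths of all prefixes of `synthesized` that end in `base` (one lane)."""
    return [i + 1 for i, b in enumerate(synthesized) if b == base]


def read_gel(lanes: dict[str, list[int]]) -> str:
    """Reconstruct the sequence by reading bands from shortest to longest."""
    by_length = {
        length: base
        for base, lengths in lanes.items()
        for length in lengths
    }
    return "".join(by_length[n] for n in sorted(by_length))


synthesized = "ATTATGCTA"
lanes = {base: terminated_lengths(synthesized, base) for base in "ACGT"}
print(lanes["A"])          # band positions in the A lane
print(read_gel(lanes))     # sequence recovered from all four lanes
```

The A lane holds bands at lengths 1, 4 and 9, matching the prefixes A, ATTA and ATTATGCTA.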


Genotyping Technologies: Basic techniques in molecular biology. Genotyping technologies: overview + examples. Focused examination: Affymetrix. Future of sequencing. Model organisms: mouse, dog.


Principles: Allele-dependent chemical event: hybridization; extension; ligation. Reading of signal from the event: detecting probe/extension/ligand. Considerations: robustness; throughput; cost.


Example: MassArray (Sequenom): Event: extension of SNP-specific primer (amplified). Detection: mass spectrometry. Specs: up to ~20 SNPs x ~400 samples at a time; $0.10 per call; requires SNP-specific PCR + probe. Computation: design primers of different weights.


Molecular Inversion Probes: Design a probe with hybridizing flanks.


Molecular Inversion Probes: Event: allele-dependent ligation, PCR. Detection: “bar-code” tag hybridizes to array. Specs: up to 500k SNPs x ~100 samples at a time; $0.02 per call; SNP-specific probe; single PCR. Computation: choose tagging sequences.


Example: BeadArray (Illumina): [Figure; labels: A/C, T, G]


Example: BeadArray (Illumina): Event: allele-specific ligation, PCR. Detection: “bar-code” tag hybridizes to array. Specs: up to 500k SNPs x ~100 samples at a time; $0.002 per call; SNP-specific probe; per-SNP PCR. Computation: make calls (clustering in polar coordinates).
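The slide only names the idea of making calls in polar coordinates, so the following is a rough sketch under stated assumptions, not the vendor's algorithm. It converts the two allele intensities (x for allele A, y for allele B) to a total intensity r = x + y and a normalized angle theta = (2/pi) * atan2(y, x), then calls genotypes with fixed theta thresholds. The thresholds stand in for a real clustering step, and all names and cutoff values are hypothetical.

```python
import math


def to_polar(x: float, y: float) -> tuple[float, float]:
    """Map allele intensities to (r, theta), with theta scaled to [0, 1]."""
    r = x + y
    theta = (2.0 / math.pi) * math.atan2(y, x)
    return r, theta


def call_genotype(x: float, y: float, min_r: float = 0.2,
                  low: float = 1 / 3, high: float = 2 / 3) -> str:
    """Threshold-based call; a stand-in for per-SNP clustering of many samples."""
    r, theta = to_polar(x, y)
    if r < min_r:
        return "NoCall"     # too little signal to trust either allele
    if theta < low:
        return "AA"         # signal dominated by allele A
    if theta > high:
        return "BB"         # signal dominated by allele B
    return "AB"


# Hypothetical normalized intensities for four samples at one SNP.
samples = [(1.0, 0.05), (0.5, 0.5), (0.05, 1.0), (0.05, 0.05)]
calls = [call_genotype(x, y) for x, y in samples]
print(calls)
```

In practice the cluster boundaries would be estimated per SNP from many samples rather than fixed in advance, which is what makes this a clustering problem.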


Genotyping Technologies: Basic techniques in molecular biology. Genotyping technologies: overview. Focused examination: Affymetrix. Future of sequencing. Model organisms: mouse, dog.


Example: Affymetrix GeneChip: Genomic DNA with SNPs


Example: Affymetrix GeneChip: Event: hybridization to array. Detection: multiple probes; fluorescent target. Specs: up to 500k SNPs x ~100 samples at a time; $0.001 per call; SNP-specific probe; single PCR; inflexible. Computation: genotype calls.


Calling Affymetrix Genotypes: “Dynamic Model”: for each quartet, rank the hypotheses (AA/AB/BB/Null), then score how consistently the rankings agree across quartets. Ranking hypotheses: log-likelihoods L(AA), L(AB), L(BB), L(Null). Assume: normal signal; different means for true/noise signal.
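A simplified sketch of ranking hypotheses for one quartet, not the published Dynamic Model. It assumes a quartet holds perfect-match and mismatch probes for alleles A and B, that mismatch probes always carry noise, that signal and noise share one standard deviation, and that the means are known. The probe names, parameter values, and function names are hypothetical.

```python
import math


def normal_logpdf(x: float, mean: float, sd: float) -> float:
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


# Which probes are expected to carry true signal under each hypothesis.
SIGNAL_PROBES = {
    "AA": {"PM_A"},
    "AB": {"PM_A", "PM_B"},
    "BB": {"PM_B"},
    "Null": set(),
}


def rank_hypotheses(quartet: dict, mu_signal: float, mu_noise: float, sd: float) -> list:
    """Return hypotheses sorted from highest to lowest log-likelihood."""
    scores = {}
    for hypothesis, bright in SIGNAL_PROBES.items():
        scores[hypothesis] = sum(
            normal_logpdf(value, mu_signal if probe in bright else mu_noise, sd)
            for probe, value in quartet.items()
        )
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical intensities: both perfect-match probes are bright.
quartet = {"PM_A": 1000.0, "MM_A": 120.0, "PM_B": 950.0, "MM_B": 100.0}
ranking = rank_hypotheses(quartet, mu_signal=1000.0, mu_noise=100.0, sd=100.0)
print(ranking)
```

Repeating this for every quartet of a SNP and checking whether the top-ranked hypothesis is the same each time corresponds to the agreement-scoring step.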


Clustering Approach: Given signals from many individuals: better estimates of mean signal; allows clustering in high dimension. Bayesian Robust Linear Model with Mahalanobis distance.
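The Mahalanobis distance measures how far a point lies from a cluster center in units of that cluster's spread. The sketch below assigns a two-dimensional signal to the nearest genotype cluster. It is only the distance step, not the full Bayesian robust linear model, and the cluster means, covariances, and function names are hypothetical.

```python
import math


def mahalanobis_2d(point, mean, cov) -> float:
    """Mahalanobis distance of a 2-D point from a cluster with the given covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))   # inverse of a 2x2 matrix
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    q = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(q)


def nearest_cluster(point, clusters) -> str:
    """Name of the cluster with the smallest Mahalanobis distance to `point`."""
    return min(clusters, key=lambda name: mahalanobis_2d(point, *clusters[name]))


# Hypothetical (mean, covariance) per genotype cluster, estimated from many individuals.
spread = ((0.04, 0.0), (0.0, 0.04))
clusters = {
    "AA": ((2.0, 0.0), spread),
    "AB": ((1.0, 1.0), spread),
    "BB": ((0.0, 2.0), spread),
}
call = nearest_cluster((1.1, 0.9), clusters)
print(call)
```

With identity covariance the measure reduces to ordinary Euclidean distance, which is a quick way to check an implementation.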


Which Probes Really Type?


Genotyping Technologies: Basic techniques in molecular biology. Genotyping technologies: overview. Focused examination: Affymetrix. Future of sequencing. Model organisms: mouse, dog.


The X-Prize for Genomics: Announced Oct 2006. $10,000,000 cash. Sequence 100 humans in < 10 days, with < 0.00001 error rate, < 2% missing data, and < $10,000 recurrent cost. Semi-annual competition, till 2013.
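The prize thresholds translate into concrete budgets with a little arithmetic. The genome size of 3 billion base pairs is an assumption (a round haploid figure), not something stated on the slide, and the variable names are illustrative.

```python
GENOME_BP = 3_000_000_000   # assumed haploid genome size; not stated on the slide

genomes, days = 100, 10
genomes_per_day = genomes / days                  # required throughput

# Error rate < 0.00001 means fewer than 1 error per 100,000 bases.
max_errors_per_genome = GENOME_BP // 100_000

# Missing data < 2% of the genome.
max_missing_bp = GENOME_BP * 2 // 100

print(f"{genomes_per_day:.0f} genomes/day")
print(f"fewer than {max_errors_per_genome:,} errors per genome")
print(f"fewer than {max_missing_bp:,} missing bases per genome")
```

Even at the required accuracy, a genome of this assumed size could still contain tens of thousands of erroneous bases.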


How far are we?: Genotyping? If made to work everywhere; if we know all SNPs; if we type all structural variants; if we shave half an order of magnitude off cost; if the X-Prize committee accepts that. Standard sequencing: Human Genome: $0.09/finished bp; today: ~$5M/genome.
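A back-of-the-envelope comparison of the cost figures quoted in these slides against the $10,000 X-Prize cost threshold. The 3 billion base pair genome size and the reading of the threshold as a per-genome cost are assumptions. The per-genome estimates are the ones given on the slides, and the variable names are illustrative.

```python
GENOME_BP = 3_000_000_000     # assumed genome size; not stated on the slides
HGP_CENTS_PER_BP = 9          # $0.09 per finished bp, kept in cents to stay in integers

hgp_cost_usd = GENOME_BP * HGP_CENTS_PER_BP // 100

XPRIZE_TARGET_USD = 10_000    # assumed to be a per-genome cost
estimates_usd = {
    "standard sequencing": 5_000_000,
    "454": 1_000_000,
    "Helicos": 100_000,
    "Solexa": 100_000,
}
fold_over_target = {name: cost // XPRIZE_TARGET_USD for name, cost in estimates_usd.items()}

print(f"Human Genome at $0.09/bp: ${hgp_cost_usd:,}")
for name, fold in fold_over_target.items():
    print(f"{name}: {fold}x the target cost")
```

On these numbers, standard sequencing would need to become 500 times cheaper to meet the threshold, while the two lowest estimates are one order of magnitude away.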


Candidate: 454: ~$1M/genome


Candidate: Helicos: http://helicosbio.com/B38AD5C6BCE640D9B97A44977D5E1CEF.asp?ie_key=4546BA83C53E4D988C4F5E1B5CE5E2CE Probably $100k/genome


Candidate: Solexa: http://www.solexa.com/technology/demo.html Probably $100k/genome


Summary: Diverse technologies, diverse problems. An affordable personal genome in our time.


Model Organisms: (+) Controlled breeding; controlled environment. (-) Controlled past breeding; relevance to human disease.


Mouse History


Genetic Archaeology


SNP Rate: Same or Different?


Mosaic Structure


Segmental Phylogenies


Dogs


Sequencing an Inbred Boxer


Linkage Disequilibrium in Dogs: Larger Ne (ancestral). Strong bottleneck.


Dog History: Two bottleneck events. Suggests 2-stage disease mapping.


Further Reading:
Di X, Matsuzaki H, Webster TA, Hubbell E, Liu G, Dong S, Bartell D, Huang J, Chiles R, Yang G, Shen MM, Kulp D, Kennedy GC, Mei R, Jones KW, Cawley S. Bioinformatics. 2005 May 1;21(9):1958-63.
Rabbee N, Speed TP. Bioinformatics. 2006 Jan 1;22(1):7-12.
Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, Dewell SB, Du L, Fierro JM, Gomes XV, Godwin BC, He W, Helgesen S, Ho CH, Irzyk GP, Jando SC, Alenquer ML, Jarvie TP, Jirage KB, Kim JB, Knight JR, Lanza JR, Leamon JH, Lefkowitz SM, Lei M, Li J, Lohman KL, Lu H, Makhijani VB, McDade KE, McKenna MP, Myers EW, Nickerson E, Nobile JR, Plant R, Puc BP, Ronan MT, Roth GT, Sarkis GJ, Simons JF, Simpson JW, Srinivasan M, Tartaro KR, Tomasz A, Vogt KA, Volkmer GA, Wang SH, Wang Y, Weiner MP, Yu P, Begley RF, Rothberg JM. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005 Sep 15;437(7057):376-80.
Lindblad-Toh K, Wade CM, Mikkelsen TS, Karlsson EK, …, Lander ES. Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature. 2005 Dec 8;438(7069):803-19.
Frazer KA, Wade CM, Hinds DA, Patil N, Cox DR, Daly MJ. Segmental phylogenetic relationships of inbred mouse strains revealed by fine-scale analysis of sequence variation across 4.6 mb of mouse genome. Genome Res. 2004 Aug;14(8):1493-500.
Wade CM, Daly MJ. Genetic variation in laboratory mice. Nat Genet. 2005 Nov;37(11):1175-80.
Wade CM, Kulbokas EJ 3rd, Kirby AW, Zody MC, Mullikin JC, Lander ES, Lindblad-Toh K, Daly MJ. The mosaic structure of variation in the laboratory mouse genome. Nature. 2002 Dec 5;420(6915):574-8.