Presentation Transcript
Genomics and Bioinformatics: The "new" biology :6/10/2009 URC,Allahabad 1
Brijesh Singh Yadav
Bioinformatics Research Cell
United Research Center
Allahabad, India.
What is genomics
Genome
All the DNA contained in the cell of an organism
Genomics
The comprehensive study of the interactions and functional dynamics of whole sets of genes and their products. (NIAAA, NIH)
A "scaled-up" version of genetics research in which scientists can look at all of the genes in a living creature at the same time. (NIGMS, NIH)
Genome sequencing chronology
Genome sequencing projects (as of 1/26/2007)
Genome sequencing helps in:
Identifying new genes (“gene discovery”)
Looking at chromosome organization and structure
Finding gene regulatory sequences
Comparative genomics
These in turn lead to advances in:
Medicine
Agriculture
Biotechnology
Understanding evolution and other basic science questions
Information contents in a genome
Gene
Protein coding genes
RNA genes
Regulatory elements
Gene expression control
Chromatin remodeling
Matrix attachment sites
“Non-functional” elements
Selfish elements
“Junk” DNA
??
The “central dogma” of molecular biology
Central dogma: DNA → RNA → Protein
Replication: DNA → DNA
Transcription: DNA → RNA
Translation: RNA → Protein
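The three arrows of the central dogma can be illustrated with a few lines of Python. This is a minimal sketch, not part of the original slides: the function names (`replicate`, `transcribe`, `translate`) and the example sequence are hypothetical, and the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption of
# this sketch: organisms with variant codes are not handled).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def replicate(dna):
    """Replication: return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def transcribe(coding_dna):
    """Transcription: the mRNA matches the coding strand with U in place of T."""
    return coding_dna.replace("T", "U")


def translate(mrna):
    """Translation: read codons from the first base until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy gene: start codon, one alanine codon, stop codon.
gene = "ATGGCCTAA"
print(replicate(gene))
print(transcribe(gene))
print(translate(transcribe(gene)))
```

Real genes need more care (introns, reading-frame detection, alternative start codons); the sketch only shows the direction of information flow.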
Expanded “central dogma” of molecular biology
A more comprehensive view: DNA → RNA → Protein → Metabolite → Phenotype
(Replication, transcription, and translation as in the central dogma)
New disciplines due to the advance in genomics
Omics: DNA → RNA → Protein → Metabolite → Phenotype
Structural genomics (DNA)
Genomic DNA sequences
Transcriptomics (RNA)
Transcript seq
Microarray data
Cis-elements
TF binding sites
Epigenetic regulation
Proteomics (Protein)
Shotgun protein seq
Subcellular location
Post-translational mod
Protein interaction
Protein structure
Metabolomics (Metabolite)
Metabolite concn
Metabolic flux
Phenotype
Genetic interactions
Systematic KO
Disease information
Transcription factors, binding sites, and target genes
Goals shown on the slide:
find all motifs in genome
identify transcription factors
identify binding motif
identify target genes
Approaches shown on the slide:
computational searching
ChIP-chip
microarrays
genetic screens
bioinformatics (e.g., Gibbs sampling on microarray data)
molecular biology using purified protein or protein extracts
one-hybrid assays
sequence motifs/homology
Nature omics gateway
Three perspectives of our biological world
The cellular level, the individual, the tree of life
Further complications
Cell-cell interactions
Cell types
Environmental conditions
Developmental programming
Interactions at the organismal level
Interactions at the population, ecosystem level
Impact of Genomics on Medicine
How to characterize new diseases?
What new treatments can be discovered?
How do we treat individual patients? Tailoring treatments?
Bioinformatics
Conceptualizing biology in terms of molecules and then applying “informatics” techniques from math, computer science, and statistics to understand and organize the information associated with these molecules on a large scale
How do we use Bioinformatics?
Store/retrieve biological information (databases)
Retrieve/compare gene sequences
Predict function of unknown genes/proteins
Search for previously known functions of a gene
Compare data with other researchers
Compile/distribute data for other researchers
Example: Sequence alignment
Align retinol-binding protein and β-lactoglobulin
1 MKWVWALLLLAAWAAAERDCRVSSFRVKENFDKARFSGTWYAMAKKDPEG 50 RBP
. ||| | . |. . . | : .||||.:| :
1 ...MKCLLLALALTCGAQALIVT..QTMKGLDIQKVAGTWYSLAMAASD. 44 lactoglobulin
51 LFLQDNIVAEFSVDETGQMSATAKGRVR.LLNNWD..VCADMVGTFTDTE 97 RBP
: | | | | :: | .| . || |: || |.
45 ISLLDAQSAPLRV.YVEELKPTPEGDLEILLQKWENGECAQKKIIAEKTK 93 lactoglobulin
98 DPAKFKMKYWGVASFLQKGNDDHWIVDTDYDTYAV...........QYSC 136 RBP
|| ||. | :.|||| | . .|
94 IPAVFKIDALNENKVL........VLDTDYKKYLLFCMENSAEPEQSLAC 135 lactoglobulin
137 RLLNLDGTCADSYSFVFSRDPNGLPPEAQKIVRQRQ.EELCLARQYRLIV 185 RBP
. | | | : || . | || |
136 QCLVRTPEVDDEALEKFDKALKALPMHIRLSFNPTQLEEQCHI....... 178 lactoglobulin
>RBP
MKWVWALLLLAAWAAAERDCRVSSFRVKENFDKARFSGTWYAMAKKDPEGLFLQDNIVAEFSVDETGQMSATAKGRVRLLNNWDVCADMVGTFTDTEDPAKFKMKYWGVASFLQKGNDDHWIVDTDYDTYAVQYSCRLLNLDGTCADSYSFVFSRDPNGLPPEAQKIVRQRQEELCLARQYRLIV
>lactoglobulin
MKCLLLALALTCGAQALIVTQTMKGLDIQKVAGTWYSLAMAASDISLLDAQSAPLRVYVEELKPTPEGDLEILLQKWENGECAQKKIIAEKTKIPAVFKIDALNENKVLVLDTDYKKYLLFCMENSAEPEQSLACQCLVRTPEVDDEALEKFDKALKALPMHIRLSFNPTQLEEQCHI
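A gapped global alignment like the one above can be computed by dynamic programming. The sketch below uses the Needleman-Wunsch algorithm with a deliberately simple scoring scheme (match +1, mismatch -1, gap -1). That scoring is an assumption chosen for illustration: the slide does not say which program or substitution matrix produced its alignment, so this code will not necessarily reproduce it. The function name `needleman_wunsch` is likewise hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def substitution(x, y):
        return match if x == y else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(a[i - 1], b[j - 1]),  # align pair
                score[i - 1][j] + gap,                                   # gap in b
                score[i][j - 1] + gap,                                   # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and score[i][j] == score[i - 1][j - 1] + substitution(a[i - 1], b[j - 1])):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


# First residues of the two sequences from the slide.
rbp = "MKWVWALLLLAAWAAAERDCRV"
blg = "MKCLLLALALTCGAQALIVT"
alignment_score, aligned_rbp, aligned_blg = needleman_wunsch(rbp, blg)
print(alignment_score)
print(aligned_rbp)
print(aligned_blg)
```

For real protein comparisons one would normally use a substitution matrix such as BLOSUM62 and affine gap penalties rather than the flat scores assumed here.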
Microarray data analysis
A simplified pipeline
Example: Microarray
A solid support (e.g. a membrane or glass slide) on which DNA of known sequence is deposited in a grid-like fashion
Example: Identification of cis-elements
The on-off switches and rheostats of a cell operating at the gene level.
They control whether and how vigorously genes will be transcribed into RNAs.
Motif model: Position Frequency Matrix (PFM)
f_{b,i}: frequency of base b occurring at the i-th position
D’haeseleer (2006) Nature Biotech. 24:423
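As a rough illustration of the definition above, f_{b,i} can be computed from a set of aligned binding sites of equal length by counting each base at each position and dividing by the number of sites. The function name `position_frequency_matrix` and the example sites are hypothetical, not data from the slides.

```python
def position_frequency_matrix(sites, alphabet="ACGT"):
    """Return {base: [f_{b,0}, f_{b,1}, ...]} for aligned sites of equal length."""
    if not sites:
        raise ValueError("need at least one site")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")

    # Count how often each base occurs at each position.
    counts = {base: [0] * length for base in alphabet}
    for site in sites:
        for i, base in enumerate(site):
            counts[base][i] += 1

    # Convert counts to frequencies by dividing by the number of sites.
    total = len(sites)
    return {base: [c / total for c in column] for base, column in counts.items()}


# Hypothetical aligned binding sites (made up for this sketch).
example_sites = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
pfm = position_frequency_matrix(example_sites)
for base in "ACGT":
    print(base, pfm[base])
```

In practice, pseudocounts are often added before taking frequencies so that bases unseen at a position do not get a frequency of exactly zero; the sketch omits that step.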
Final example: Relationships between sequences
Sanger and colleagues (1950s): 1st sequence
Insulin from various mammals
The END