DATA MINING ON GENOMICS

Presentation Transcript

DATA MINING ON GENOMICS:

By K.SAI KOUMUDI 12311A1269

What is Genomics and Why Data Mining?:

Whereas ‘gene’ refers specifically to a protein-coding sequence of DNA, the word ‘genome’ encompasses the entire set of genetic information, including sequences involved in gene regulation and sequences with unknown functions.
Example: Parkinson’s disease, replacing with stem cells.

How is DNA stored?:

DNA is stored as a code made of four chemical bases: adenine (A), guanine (G), cytosine (C), and thymine (T). Human DNA consists of about 3 billion bases, and more than 99% of those bases are the same in all people.
Darwin Theory; Neo Darwin Theory
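Because the code has only four symbols, each base can be represented in two bits, so about 3 billion bases would fit in roughly 750 MB. The following is a minimal sketch of that idea, not something taken from the slides: the function names and the particular bit assignment are assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical 2-bit code per base; the assignment is arbitrary.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def count_bases(sequence: str) -> Counter:
    """Count how often each base occurs in the sequence."""
    return Counter(sequence.upper())


def pack(sequence: str) -> int:
    """Pack a base string into a single integer, two bits per base."""
    value = 0
    for base in sequence.upper():
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def unpack(value: int, length: int) -> str:
    """Recover the base string; the length is needed to restore leading A's."""
    bases = []
    for _ in range(length):
        bases.append(BITS_TO_BASE[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))


seq = "GATTACA"
print(count_bases(seq))
print(unpack(pack(seq), len(seq)))
```

The length is passed to `unpack` separately because adenine is encoded as zero, so leading A's would otherwise be lost.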

DNA: Deoxyribonucleic acid:

Phosphate + sugar molecule + base = nucleotide. Nucleotides are joined into two long strands.

DATA MINING:

Data cleaning, data integration, reference reconciliation, classification, and clustering facilitate the integration of biological data and the construction of data warehouses for biological data analysis. Tools like BLAST and FASTA are used for systematic analysis of genomic data. They help in aligning, indexing, similarity search, and comparative analysis of multiple nucleotide sequences.
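To make the idea of indexing and similarity search concrete, here is a small sketch that indexes sequences by their short substrings (k-mers) and ranks database entries by how many k-mers they share with a query. It is a toy illustration only, not the algorithm used by BLAST or FASTA; the function names and the three example sequences are invented for the example.

```python
from collections import defaultdict


def build_kmer_index(sequences: dict, k: int) -> dict:
    """Map every length-k substring to the names of sequences containing it."""
    index = defaultdict(set)
    for name, seq in sequences.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def search(query: str, index: dict, k: int) -> list:
    """Rank indexed sequences by the number of distinct query k-mers they share."""
    hits = defaultdict(int)
    kmers = {query[i:i + k] for i in range(len(query) - k + 1)}
    for kmer in kmers:
        for name in index.get(kmer, ()):
            hits[name] += 1
    # Highest shared-k-mer count first; ties broken by name.
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))


# Made-up toy sequences, for illustration only.
database = {
    "seq1": "ACGTACGTGA",
    "seq2": "TTGACCGTAA",
    "seq3": "GGGGCCCCAA",
}
index = build_kmer_index(database, k=3)
results = search("ACGTGA", index, k=3)
print(results)
```

Sequences that share no k-mer with the query never appear in the result, which is what makes an index faster than comparing the query against every sequence in full.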

Slide6:

Gene expression databases contain a wealth of information, but current data mining tools are limited in their speed and effectiveness in extracting meaningful biological knowledge from them. Online analytical processing (OLAP) can be used as a supplement to cluster analysis for fast and effective data mining of gene expression databases. Compared to traditional cluster analysis of gene expression data, OLAP was more effective and faster in finding biologically meaningful information. OLAP is available from a number of vendors and can work with any relational database management system.
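The core OLAP operation is aggregating a measure along chosen dimensions, for example rolling up from a fine-grained view to a coarser one. A minimal sketch of a roll-up over a relational table, using SQLite from the Python standard library, might look like this. The table layout, column names, and expression values are all hypothetical and are not taken from any real gene expression database.

```python
import sqlite3

# Hypothetical schema: one expression level per gene, tissue, and treatment.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE expression (gene TEXT, tissue TEXT, treatment TEXT, level REAL)"
)
rows = [  # made-up toy values
    ("geneA", "liver", "control", 2.0),
    ("geneA", "liver", "treated", 6.0),
    ("geneA", "brain", "control", 1.0),
    ("geneA", "brain", "treated", 3.0),
    ("geneB", "liver", "control", 4.0),
    ("geneB", "liver", "treated", 4.0),
]
conn.executemany("INSERT INTO expression VALUES (?, ?, ?, ?)", rows)

ALLOWED_DIMENSIONS = {"gene", "tissue", "treatment"}


def roll_up(dimensions):
    """Average expression level grouped by the given dimensions."""
    if not dimensions or not set(dimensions) <= ALLOWED_DIMENSIONS:
        raise ValueError("unknown dimension")
    cols = ", ".join(dimensions)
    query = (
        f"SELECT {cols}, AVG(level) FROM expression "
        f"GROUP BY {cols} ORDER BY {cols}"
    )
    return conn.execute(query).fetchall()


print(roll_up(["gene", "treatment"]))  # finer view
print(roll_up(["gene"]))               # rolled up over tissue and treatment
```

Dedicated OLAP products precompute and cache such aggregates; the sketch only shows the shape of the query, not how a vendor implements it.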

Few Data mining systems:

Microsoft SQL Server
SAS Enterprise Miner
IBM Intelligent Miner
SGI Mineset
Multiple data mining algorithms and advanced statistics are used. Huge data warehouses are present in the US, UK, Canada, Israel, and Australia, each of which stores about 200-600 gigabytes of data in one repository.

Visualization:

Alignments among genomic sequences and interactions between them can be expressed in graphic forms and transformed into various kinds of easy-to-understand visual displays. They facilitate pattern understanding, knowledge discovery, and interactive data exploration.
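One simple visual display of how two sequences relate is a dot plot, where a mark is placed wherever a base in one sequence matches a base in the other, so that shared stretches show up as diagonals. The text-based sketch below is only an illustration of that idea; the slides do not say which displays they had in mind, and the function name is an assumption.

```python
def dot_plot(seq_a: str, seq_b: str) -> str:
    """Return a text grid with '*' where seq_a[i] equals seq_b[j], '.' elsewhere."""
    lines = ["  " + seq_b]  # header row showing the second sequence
    for base_a in seq_a:
        row = "".join("*" if base_a == base_b else "." for base_b in seq_b)
        lines.append(base_a + " " + row)
    return "\n".join(lines)


print(dot_plot("GATTACA", "GATCACA"))
```

Matching regions appear as runs of `*` along a diagonal, and a break or shift in the diagonal points to a substitution or an insertion.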

Sample patterns:

Sample patterns:

THANK YOU.

ANY QUESTIONS?
