Relational DB technology for Data warehouse


Slide 2:

Speed up: the ability to execute the same request on the same amount of data in less time.
Scale up: the ability to obtain the same performance on the same request as the database size increases.
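
The two measures can be expressed as simple ratios of elapsed times. The sketch below assumes the commonly used ratio definitions, which the slides do not spell out; the function names and timings are hypothetical and chosen only for illustration.

```python
def speedup(baseline_seconds, parallel_seconds):
    """Same request, same data: how many times faster the bigger system is."""
    return baseline_seconds / parallel_seconds


def scaleup(small_seconds, large_seconds):
    """Data and hardware grow together: 1.0 means performance was preserved."""
    return small_seconds / large_seconds


# Hypothetical measurements, not taken from any real system.
print(speedup(120.0, 30.0))  # same query, more processors
print(scaleup(60.0, 75.0))   # larger database on proportionally larger hardware
```

A speed up equal to the number of added processors, or a scale up of 1.0, would be the ideal (linear) case; real systems usually fall somewhat short of both.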

Types of parallelism:

Interquery parallelism: different server threads or processors handle multiple requests at the same time. It increases throughput and supports more concurrent users.

Slide 4:

Intraquery parallelism: a single query is decomposed into parts that run in parallel. It can be done in two ways.
Horizontal: the database is partitioned across multiple disks, and the same task is performed concurrently on different processors, each working on its own partition.
Vertical: different tasks run in parallel as a pipeline, with the output of one task becoming the input of the next.
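
The two forms of intraquery parallelism can be contrasted in a small sketch. Everything here is an assumption made for illustration: the partition contents, the function names, and the use of a thread pool to stand in for separate processors. The generator pipeline only shows the dataflow of vertical parallelism; Python generators do not run the stages concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: one list of order amounts per disk partition.
partitions = [[10, 250, 40], [300, 5], [120, 80, 500]]


def scan_partition(rows):
    """One task applied to one partition: sum the amounts over 100."""
    return sum(amount for amount in rows if amount > 100)


def horizontal_total(parts):
    """Horizontal: run the same task on every partition at once, then merge."""
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        return sum(pool.map(scan_partition, parts))


def scan(parts):
    """Pipeline stage 1: emit rows one at a time."""
    for rows in parts:
        yield from rows


def select_large(rows):
    """Pipeline stage 2: consume the scan output as it arrives."""
    return (amount for amount in rows if amount > 100)


def vertical_total(parts):
    """Vertical: the output of each stage is the input of the next."""
    return sum(select_large(scan(parts)))


print(horizontal_total(partitions))
print(vertical_total(partitions))
```

Both functions compute the same answer; what differs is whether the work is split by data (horizontal) or by operation (vertical).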

Partitioning of database:

A partition is a division of a logical database or its constituent elements into distinct, independent parts. Database partitioning is normally done for manageability, performance, or availability reasons.

Data partitioning:

Data partitioning is a key requirement for effective parallel execution of database operations. It is the process of logically and/or physically dividing data into segments that are easier to maintain or access. Current RDBMSs provide this kind of distribution functionality. Partitioning helps both query performance and utility processing, because I/O operations such as reads and writes can be performed in parallel. Data can be partitioned either randomly or intelligently.

Random data partitioning:

Random partitioning includes random data striping across multiple disks on a single server. Another option is round-robin partitioning, in which each new record is placed on the next disk assigned to the database.
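
Round-robin placement is easy to sketch. The function name, the use of in-memory lists as "disks", and the record ids below are all hypothetical; a real DBMS does this inside its storage layer.

```python
from itertools import cycle


def round_robin_partition(records, disk_count):
    """Place each new record on the next disk in turn."""
    disks = [[] for _ in range(disk_count)]
    next_disk = cycle(range(disk_count))
    for record in records:
        disks[next(next_disk)].append(record)
    return disks


# Hypothetical record ids 1..7 spread over three disks.
layout = round_robin_partition(range(1, 8), 3)
print(layout)  # [[1, 4, 7], [2, 5], [3, 6]]
```

The data ends up evenly spread, but nothing about a record tells the DBMS which disk holds it, so a lookup for one record may have to search every disk.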

Intelligent data partitioning:

Intelligent partitioning assumes that the DBMS knows where a specific record is located and does not waste time searching for it across all disks. The techniques include hash partitioning, key range partitioning, schema partitioning, and user-defined partitioning.
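
Hash and key range partitioning both map a record's key directly to a disk, which is what lets the DBMS skip the search. The sketch below is a minimal illustration under assumed names and boundaries; `crc32` is used only because it gives the same value on every run, not because any particular DBMS uses it.

```python
from bisect import bisect_right
from zlib import crc32


def hash_partition(key, disk_count):
    """Hash partitioning: a stable hash of the key picks the disk."""
    return crc32(key.encode("utf-8")) % disk_count


def range_partition(key, boundaries):
    """Key range partitioning.

    boundaries is a sorted list of split points. Keys below boundaries[0]
    go to disk 0, keys from boundaries[0] up to (but not including)
    boundaries[1] go to disk 1, and so on.
    """
    return bisect_right(boundaries, key)


# Hypothetical split points: three disks holding ids <1000, 1000-1999, >=2000.
boundaries = [1000, 2000]
print(range_partition(999, boundaries))
print(range_partition(1000, boundaries))
print(range_partition(2500, boundaries))
print(hash_partition("customer-42", 4))
```

Range partitioning keeps neighboring keys together, which suits range scans; hash partitioning tends to spread keys more evenly but scatters neighboring keys.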

DB architecture for parallel processing:

The main database architectures for parallel processing are shared memory, shared disk, shared nothing, and combined architectures.

Shared memory architecture:

Also known as shared everything. It is the traditional approach to implementing an RDBMS on SMP hardware. A single RDBMS server can potentially utilize all processors, access all memory, and access the entire database. (The term Shared Memory Architecture (SMA) is also used in PC hardware for a design in which the graphics chip has no dedicated memory and shares main system RAM with the CPU and other components; that sense is unrelated to the database architecture described here.)

Shared memory architecture:

[Diagram: four processing units (PU) connected through an interconnection network to a global shared memory.]

Shared disk architecture:

It implements the concept of shared ownership of the entire database among RDBMS servers. A shared-disk parallel machine is one in which all processors can access the same disks with about the same performance, but cannot access each other's RAM. The failure of a single DBMS processing node does not affect the other nodes' ability to access the full database. There is no partitioning of the data in a shared disk system; data can be copied into RAM and modified on multiple machines.

Slide 14:

[Diagram: shared disk architecture. Four processing units (PU), each with its own local memory, are connected through an interconnection network to a global shared disk subsystem.]

Shared nothing architecture:

In a shared nothing environment, the data is partitioned across all disks and the DBMS is partitioned across the co-servers. It offers near-linear scalability. A shared nothing architecture (SNA) is a distributed computing architecture in which each node is independent and self-sufficient, with its own private memory, disks, and input/output devices, so there is no single point of contention across the system.
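
A toy model may help make the shared nothing idea concrete. The `Node` and `Coordinator` classes, the modulo routing rule, and the sample rows are all hypothetical; the sketch only shows that each node works on data it alone owns and that a query is answered by merging per-node partial results.

```python
class Node:
    """One shared-nothing node: private rows, no access to other nodes."""

    def __init__(self, name):
        self.name = name
        self.rows = {}  # private storage, keyed by customer id

    def insert(self, key, amount):
        self.rows[key] = amount

    def local_sum(self):
        """Work only on the rows this node owns."""
        return sum(self.rows.values())


class Coordinator:
    """Routes each row to its owning node and merges partial results."""

    def __init__(self, nodes):
        self.nodes = nodes

    def owner(self, key):
        # Assumed routing rule: simple modulo on an integer key.
        return self.nodes[key % len(self.nodes)]

    def insert(self, key, amount):
        self.owner(key).insert(key, amount)

    def total(self):
        # Each node computes its partial sum; only the results are combined.
        return sum(node.local_sum() for node in self.nodes)


cluster = Coordinator([Node("n0"), Node("n1"), Node("n2")])
for key, amount in [(1, 100), (2, 50), (3, 25), (4, 10)]:
    cluster.insert(key, amount)

print(cluster.total())  # 185
```

Because no node touches another node's rows, adding nodes adds both storage and processing capacity without adding contention for a shared resource.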

Shared nothing architecture:

[Diagram: four processing units (PU), each with its own local memory, connected through an interconnection network.]

Combined architecture:

A combined architecture supports both kinds of parallelism.
Interserver parallelism: each query is parallelized across multiple servers.
Intraserver parallelism: a query is parallelized within a single server.

Parallel RDBMS features:

Parallel RDBMS features include the scope and techniques of parallel DBMS operations, optimizer implementation, application transparency, the parallel environment, DBMS management tools, and price/performance.