DATA WAREHOUSING (1)

Views:
 
Category: Entertainment
     
 

Presentation Description

No description available.

Comments

Presentation Transcript

SEMINAR ON DATA WAREHOUSING: 

SEMINAR ON DATA WAREHOUSING PREPARED BY: Naoneet Prakash B.Tech ( 7 th Sem. ) Institute of Technical Education And Research Guided By: Mrs.Binita Kumari

OutLines: 

OutLines Definition Need for Data Warehousing Creating and maintaining a warehouse Advantages of warehousing Data warehouse architecture Physical structure of data warehouse Conceptual modeling of data warehouse Application areas Conclusion

Data, Data everywhere yet ...: 

Data, Data everywhere yet ... I can’t find the data I need data is scattered over the network many versions, subtle differences need an expert to get the data I can’t understand the data I found available data poorly documented I can’t use the data I found results are unexpected data needs to be transformed from one form to other

What is a Data Warehouse?: 

What is a Data Warehouse? A single, complete and consistent store of data obtained from a variety of different sources made available to end users in a what they can understand and use in a business context.

Why Data Warehousing?: 

Which are our lowest/highest margin customers ? Who are my customers and what products are they buying? Which customers are most likely to go to the competition ? What impact will new products/services have on revenue and margins? What product prom- otions have the biggest impact on revenue? What is the most effective distribution channel? Why Data Warehousing?

Need for Data Warehousing: 

Need for Data Warehousing Integrated, company-wide view of high-quality information from disparate databases. Separation of operational and informational systems and data, for improved performance .

Creating and Maintaining a Warehouse: 

Creating and Maintaining a Warehouse Data warehouse needs several tools that automate or support tasks such as: Data extraction from different external data sources, operational databases, files of standard applications (e.g. Excel, COBOL applications), and other word documents. Data cleaning (finding and resolving inconsistency in the source data). Integration and transformation of data (between different data formats, languages, etc.).

Contd. …: 

Contd. … Data loading (loading the data into the data warehouse). Data replication (replicating source database into the data warehouse). Data archiving. Checking for data quality. Analyzing metadata.

Data Warehouse Architecture: 

Relational Databases Legacy Data Purchased Data ERP Systems Data Warehouse Architecture Data Warehouse Engine Optimized Loader Extraction Cleansing Metadata Repository Analyze Query

Physical Structure of Data Warehouse: 

Physical Structure of Data Warehouse There are three basic architectures for constructing a data warehouse : Centralized Federated Tiered

Centralized Architecture: 

Centralized Architecture Central Data Warehouse Client Client Client Source Source Client Client

Physical Structure of Data Warehouse: 

Physical Structure of Data Warehouse Federated architecture : The logical data warehouse is only virtual. Tiered architecture : The central data warehouse is physical. There exist local data marts on different tiers which store copies or summarization of the previous tier.

Federated Architecture: 

Federated Architecture Logical data Warehouse Source Source Local Data Marts End Users Marketing financial distribution

Tiered Architecture: 

Tiered Architecture Physical data Warehouse Source Source Workstations Local Data Marts

Conceptual Modeling of Data Warehouses: 

Conceptual Modeling of Data Warehouses The three basic conceptual schemas: Star schema Snowflake schema Fact constellations

Star schema: 

Star schema In this schema a single object (fact table) in the middle is connected to a number of dimension tables . Fact Table Time Customer No. Product No. . City

Example of Star schema: 

Example of Star schema Date Month Year Date CustId CustName CustCity CustCountry Customer Sales Fact Table Date Product Store Customer unit_sales dollar_sales schilling_sales Measurements ProductNo ProdName ProdDesc Category Product StoreID City State Country Region Store

Snowflake schema: 

Snowflake schema A snowflake schema is a refinement of star schema where the dimensional hierarchy is represented explicitly by normalizing the dimension tables. Fact Table Time Customer No. Product No. . City Region

Example of Snowflake schema: 

Example of Snowflake schema CustId CustName CustCity CustCountry Cust Date Month Date Month Year Month Year Year City State City Country Region Country State Country State StoreID City Store Measurements ProductNo ProdName ProdDesc Category Product Sales Fact Table Date Product Store Customer unit_sales dollar_sales schilling_sales

Fact Constellations: 

Fact Constellations In fact constellations, multiple fact tables are present that share many dimension tables. For example : Booking and Checkout may share many dimension tables in a hotel . Hotels Travel Agents Promotion Room Type Customer Booking Checkout

Application areas: 

Application areas Industry Application Finance Credit Card Analysis Insurance Claims, Fraud Analysis Telecommunication Call record analysis Transport Logistics management Consumer goods promotion analysis Data Service providers Value added data Utilities Power usage analysis

Conclusion: 

Conclusion Data warehouse is not a software product or application - it is an important information processing system architecture for decision making!

References: 

References Inmon , Bill (1992). Building the Data Warehouse . Kimball, Ralph et al. (1998); The Data Warehouse Lifecycle Toolkit , p17. Han Jiawei,Kamber Micheline (2008), Data Mining-Concepts And Techniques

Any Queries ?: 

Any Queries ? THANK YOU