ISQS 6339 Data Management & Business Intelligence: Introduction
Zhangxi Lin
Outline:
Definitions of BI
Categorizations of BI
BI products for the class
BI cases
About data mining
What is Business Intelligence:
A simple definition: the applications and technologies that transform business data into action.
Business intelligence (BI) is a business management term that refers to applications and technologies used to gather, provide access to, and analyze data and information about a company's operations.
Business intelligence systems can help companies gain more comprehensive knowledge of the factors affecting their business, and help companies to make better business decisions.
BI Problems:
Structured
Detecting Credit card fraud
Setting Loan parameters
Market segmentation/Mass customization
Deciding Marketing mix
Customer Churn
Reducing employee turnover
Improving Quality/Efficiency
…
Unstructured
Data exploration
Utilization of resources (stored knowledge) to maximum effectiveness
…
BI Applications:
Customer Analytics
Customer profiling
Targeted marketing
Personalization
Collaborative filtering
Customer satisfaction
Customer lifetime value
Customer loyalty
Sales Channel Analytics
Marketing
Sales performance and pipeline
BI Applications (2):
Supply Chain Analytics
Supplier and vendor management
Shipping
Inventory control
Distribution analysis
Behavior Analysis
Purchasing trends
Web activity
Fraud and abuse detection
Customer attrition
Social network analysis
BI Market:
Tools providers
Microsoft
IBM
Oracle
Main adopters
Google
Yahoo
Tencent
Best Buy
Wal-Mart
Why is BI getting hot?
Demand for processing rapidly growing volumes of information
MIS/ERP
Internet
Gartner Says Business Intelligence Software Market to Reach $3 Billion in 2009
Gartner's CIO Survey ranked BI as number one technology priority for 2006
London, UK, 7 February 2006 - New license revenue in the worldwide business intelligence (BI) software market is poised for constant growth through 2009, when the market is projected to reach $3 billion, according to the latest forecasts by Gartner Inc. In 2006, the market is estimated to reach $2.5 billion, a six percent increase from 2005.
The process of BI:
Data -> information -> knowledge -> actionable plans
Data -> information: the process of determining what data is to be collected and managed and in what context
Information -> knowledge: The process involving the analytical components, such as data warehousing, online analytical processing, data quality, data profiling, business rule analysis, and data mining
Knowledge -> actionable plans: The most important aspect in a BI process
Actionable Knowledge:
An information asset retains its value only if the converted knowledge is actionable.
Need some methods for extracting value from knowledge
This is not a technical issue but an organizational one – need empowered individuals in the organization to take the action
There is an issue of Return on Investment (ROI)
BI Methods and Techniques:
Data warehousing – Making historical data available for analytics
Data preparation – Extraction, transformation and loading
Query - a collection of specifications that enables you to focus on a particular set of data.
Online Analytical Processing (OLAP) - a capability of information systems that supports interactive examination of large amounts of data from many perspectives.
Reporting - generates aggregated views of data to keep the management informed about the state of their business.
Data mining - extraction of knowledge by utilizing software that can isolate and identify previously unknown patterns or trends in large amounts of data.
Available Tools:
Microsoft SQL Server 2005
SAS Enterprise Guide v4.1
Other SAS BI Products
SAS Enterprise BI Server
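The data preparation, query, and OLAP-style methods listed on this slide can be sketched in a few lines. This is a minimal illustration only: it uses Python's built-in `sqlite3` module as a stand-in for a real warehouse product such as those named above, and the `RAW_SALES` records, the `sales` table, and the `transform` and `load` helpers are hypothetical names invented for the example.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from an operational system:
# inconsistent capitalization and amounts stored as text.
RAW_SALES = [
    ("2008-01-15", "east", "Widget", "120.50"),
    ("2008-01-20", "West", "widget", "80.00"),
    ("2008-02-03", "EAST", "Gadget", "200.00"),
    ("2008-02-11", "west", "Gadget", "50.25"),
]

def transform(row):
    """Transform step: normalize text fields and derive a month column."""
    date, region, product, amount = row
    return (date[:7], region.strip().title(), product.strip().title(), float(amount))

def load(rows):
    """Load step: put cleaned rows into an in-memory table (the toy 'warehouse')."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (month TEXT, region TEXT, product TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn

conn = load([transform(r) for r in RAW_SALES])

# Query: focus on one particular set of data (the East region only).
east_rows = conn.execute(
    "SELECT month, product, amount FROM sales WHERE region = ? ORDER BY month",
    ("East",),
).fetchall()

# OLAP-style examination: the same facts aggregated from two perspectives.
by_region = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()
)
by_month = dict(
    conn.execute("SELECT month, SUM(amount) FROM sales GROUP BY month").fetchall()
)

print(east_rows)
print(by_region)
print(by_month)
```

A reporting layer would typically present aggregates such as `by_region` and `by_month` to management; a production OLAP server precomputes many such views rather than running one query per perspective.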
Microsoft SQL Server:
SQL Server is a client-server, relational database engine. That puts it head-to-head with the likes of IBM’s DB2 and Oracle’s Oracle… or so Microsoft dearly wants us to believe.
The problem is that, while DB2 and Oracle are unquestionably enterprise-level products, SQL Server has for years been dogged by the suspicion that it can’t really cut the mustard.
SQL Server Products
Microsoft SQL Server 2000
Microsoft SQL Server 2005
Microsoft SQL Server 2008
SQL Server 2005 Editions
SQL Server Express
SQL Server Workgroup
SQL Server Developer
SQL Server Standard
SQL Server Enterprise
SQL Server Compact
Microsoft Gains Momentum in BI Market as It Prepares to Launch Major BI Offerings - Microsoft Office PerformancePoint Server 2007 and Microsoft SQL Server 2008:
REDMOND, Wash., Aug. 21, 2007 - Microsoft Corp. today announced key milestones achieved within the business intelligence (BI) marketplace, including IDC’s recognizing Microsoft as one of the fastest-growing BI vendors in 2006. In IDC’s report, “Worldwide Business Intelligence Tools 2006 Vendor Shares,” analysts found that Microsoft had a growth rate of 28 percent, the highest among the top 10 industry vendors. In addition, Microsoft® SQL Server™ 2005 was acknowledged by The OLAP Report as the No. 1 online analytical processing (OLAP) server on the market.
SQL Server has established itself as an enterprise-class data platform. A recent BZ Research study found that 74.7% of enterprises use SQL Server, compared with 54.5% for the nearest competitor.
Microsoft will launch Microsoft Office PerformancePoint™ Server 2007 on Sept. 19, and SQL Server 2008 is scheduled to ship in the second quarter of 2008.
PerformancePoint Server 2007 helps organizations align their processes by streamlining into a single application the monitoring, analysis and planning activities needed to improve business performance.
SQL Server 2008 will help organizations deliver a more secure, reliable data platform for storing business-critical information and delivering the right information to all users, while reducing the time and cost of managing data.
Case 1 - NASDAQ:
Because this information is needed in real time, NASDAQ needed a solution that would offer:
Enterprise-ready availability and performance.
Agility to enable NASDAQ’s internal developers to swiftly react to customer need.
Lower total cost of ownership to help NASDAQ provide the best value to its members.
Solution
NASDAQ created its Market Data Dissemination System (MDDS) using a Microsoft® SQL Server™ 2005 database running on the Microsoft Windows Server™ 2003 Enterprise Edition operating system. Both Windows Server 2003 and SQL Server 2005 are part of Microsoft Windows Server System™ integrated server software.
MDDS receives direct feeds from NASDAQ’s trade reporting system, and collects the data, storing it in SQL Server 2005. It is then available in real time for queries by market participants, including those using the NASDAQ Workstation, a Web-based tool that connects to NASDAQ trading systems.
Case 2 - Brazilian Stock Exchange - BOVESPA:
Starting in the late 1970s, the Brazilian São Paulo Stock Exchange (BOVESPA) ran its mission-critical applications on ever-larger, ever-more costly mainframe platforms. By the early 2000s, with daily trading volume growing by 50 percent each year, BOVESPA IT executives decided to evaluate their technology investments and infrastructure.
Solution
The initial phase of the project, to migrate the existing mainframe-based applications, ran from 2003 to 2005. It involved a team of more than 200 people—about 80 from BOVESPA, 100 from partner companies, and 30 from Microsoft Services and other Microsoft organizations—who generated upwards of 4 million lines of code. Technical experts from HP, the hardware provider, were also heavily involved.
With the help of Microsoft® Visual Studio® .NET, Microsoft Visual Studio 2005, COM+, and other technologies in the Microsoft .NET Framework, Microsoft Services helped the team utilize a service-oriented architecture, including XML and Microsoft Distributed Transaction Coordinator, for strengthening security and integration among applications that reside on opposite sides of multiple firewalls.
Other Leading BI Solution Provider:
Business Objects
A French company that develops enterprise software, specifically software that provides business intelligence (BI) to businesses. The company claims more than 42,000 customers worldwide. Their flagship product is BusinessObjects™ XI, with components that provide performance management, planning, reporting, query and analysis, and enterprise information management. Like many enterprise software companies, Business Objects also offers consulting and education services to help customers deploy their business intelligence projects.
Business Objects has dual headquarters in San Jose, California, and Paris, France. The company's stock is traded on both the Nasdaq and Euronext Paris (BOB) stock exchanges.
http://en.wikipedia.org/wiki/Business_Objects_(company)
Business Dimensional Lifecycle:
[Diagram: Project Planning; Business Requirements Definition; Technical Architecture Design; Product Selection & Installation; Dimensional Modeling; Physical Design; ETL Design & Development; BI Application Specification; BI Application Development; Deployment; Maintenance; Growth; Project Management]
The BI Team:
CC: Core Competence
Data, information, and knowledge:
Data – a collection of raw value elements or facts used for calculating, reasoning, or measuring.
Information – the result of collecting and organizing data in a way that establishes relationships between data items, thereby providing context and meaning.
Knowledge – the concept of understanding information based on recognized patterns in a way that provides insight to information.
Data – the focus:
Data is related to business processes
Data processing
Data management
Database, data mart, and data warehousing
Data preparation for business intelligence applications
What is Data Mining?
Many Definitions
Non-trivial extraction of implicit, previously unknown and potentially useful information from data
Exploration & analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns
Origins of Data Mining:
Draws ideas from machine learning/AI, pattern recognition, statistics, and database systems
Traditional techniques may be unsuitable due to
Enormity of data
High dimensionality of data
Heterogeneous, distributed nature of data
[Figure: Origins of Data Mining. Labels: Machine Learning/Pattern Recognition, Statistics/AI, Database Systems, Data Mining]
Data Mining Tasks:
Classification [Predictive]
Clustering [Descriptive]
Association Rule Discovery [Descriptive]
Sequential Pattern Discovery [Descriptive]
Regression [Predictive]
Deviation Detection [Predictive]
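As one concrete instance of the descriptive tasks above, association rule discovery can be illustrated with support and confidence computed over a handful of transactions. This is a minimal sketch, not an efficient algorithm such as Apriori: it enumerates only item pairs by brute force, and the `TRANSACTIONS` data, the `count` and `pair_rules` helpers, and the thresholds are all hypothetical choices made for the example.

```python
from itertools import combinations

# Hypothetical market-basket transactions (one set of items per purchase).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def pair_rules(transactions, min_support, min_confidence):
    """Return rules lhs -> rhs over item pairs that pass both thresholds.

    support(lhs, rhs)     = fraction of transactions containing both items
    confidence(lhs -> rhs) = count(both) / count(lhs)
    """
    items = sorted(set().union(*transactions))
    n = len(transactions)
    rules = []
    for a, b in combinations(items, 2):
        both = count({a, b}, transactions)
        support = both / n
        if support < min_support:
            continue  # the pair is too rare to be interesting
        for lhs, rhs in ((a, b), (b, a)):
            confidence = both / count({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules

rules = pair_rules(TRANSACTIONS, min_support=0.5, min_confidence=0.7)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

With these thresholds only the bread and milk pair survives, in both directions; lowering `min_support` admits more pairs, which is the usual trade-off between coverage and noise.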
Text Mining Tasks:
Exploratory Data Analysis
Using text to form hypotheses about diseases (Swanson and Smalheiser, 1997).
Information Extraction
(Semi)automatically create (domain specific) knowledge bases, and then use standard data-mining techniques.
Bootstrapping methods (Riloff and Jones, 1999).
Text Classification
Useful intermediary step for information extraction
Bootstrapping method using EM (Nigam et al., 2000).
Text Mining General Application Areas:
Information Retrieval
Indexing and retrieval of textual documents
Finding a set of (ranked) documents that are relevant to the query
Information Extraction
Extraction of partial knowledge in the text
Web Mining
Indexing and retrieval of textual documents and extraction of partial knowledge using the web
Classification
Predict a class for each text document
Clustering
Generating collections of similar text documents
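The information retrieval area above, finding a ranked set of documents relevant to a query, can be sketched with a simple TF-IDF score. This is one common weighting scheme chosen for illustration, not something the slides prescribe; the `DOCS` collection and the `tokenize`, `build_index`, and `rank` helpers are hypothetical names, and tokenization is a bare whitespace split.

```python
import math
from collections import Counter

# Hypothetical document collection.
DOCS = {
    "d1": "data mining finds patterns in large data sets",
    "d2": "business intelligence turns business data into action",
    "d3": "text mining extracts knowledge from documents",
}

def tokenize(text):
    """Lowercase and split on whitespace (no stemming or stop-word removal)."""
    return text.lower().split()

def build_index(docs):
    """Indexing: per-document term counts plus document frequency per term."""
    term_counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    return term_counts, doc_freq

def rank(query, docs):
    """Retrieval: return ids of matching documents, best TF-IDF score first."""
    term_counts, doc_freq = build_index(docs)
    n_docs = len(docs)
    scores = {}
    for doc_id, counts in term_counts.items():
        score = 0.0
        for term in tokenize(query):
            if term in counts:
                # Terms that appear in fewer documents carry more weight.
                idf = math.log(n_docs / doc_freq[term])
                score += counts[term] * idf
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)

print(rank("text mining", DOCS))
print(rank("data", DOCS))
```

Note that with this plain `log(N/df)` weighting, a term that occurs in every document scores zero, so such a document would not be returned for that term alone.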
Data Mining Process:
Slide27:
[Figure: course schedule across Fall 1, Spring 1, Summer 1, Fall 2, Spring 2, Summer 2; legend: Core, Tool, Elective]
ISQS 6339 / ISQS 3358 BI
ISQS 6347 Data Mining
ISQS 6338 Database
ISQS 5349 Regression Analysis
ISQS 5359 Project Mgmt
ISQS 7338 System Anal & Design
ISQS 7342 Business Analytics
ISQS 5343
ISQS 5345 or ISQS 5347
ISQS 5338 Info Tech & E-Business
ISQS 6349 Adv Business Forecasting
MKT 5355 Research Method
Slide28:
For those who have taken undergrad database courses – Speedy schedule
[Figure: course schedule across Fall 1, Spring 1, Summer 1, Fall 2, Spring 2, Summer 2; legend: Core, Tool, Elective]
ISQS 6347 Data Mining
ISQS 6338 Database
ISQS 5359 Project Mgmt
ISQS 7338 System Anal & Design
ISQS 6339 Data Mgmt & BI
ISQS 7342 Business Analytics
ISQS 5349 Regression Analysis
ISQS 5343
ISQS 5345 or ISQS 5347
ISQS 5338 Info Tech & E-Business
ISQS 6349 Adv Business Forecasting
MKT 5355 Research Method
CAABI:
Center for Advanced Analytics and Business Intelligence, started in 2004 by Dr. Peter Westfall, ISQS, Rawls College of Business.
Looking to offer support to companies in developing BI capabilities.
Lots of technical expertise.