Data Analysis
Presented By:
Shankar Kumud
Kushwaha Sikander
Data Analysis is
Data analysis is a process of:
Gathering;
Modeling; and
Transforming data,
with the goal of:
Highlighting useful information;
Suggesting conclusions; and
Supporting decision making.
11/6/2009 XIDAS, Jabalpur
Thank You
Major Data Analysis Techniques
Correlation Analysis;
Regression Analysis;
Factor Analysis;
Cluster Analysis;
Correspondence Analysis (Brand Mapping);
Conjoint Analysis;
CHAID Analysis;
Discriminant /Logistic Regression Analysis;
Multidimensional Scaling; and
Structural Equation Modeling.
CORRELATION ANALYSIS
Correlation analysis, expressed by correlation coefficients, measures the degree of linear relationship between two variables.
Features of the correlation coefficient:
It lies between -1 and +1;
The sign of the correlation coefficient (+ or -) defines the direction of the relationship, positive or negative;
A positive correlation coefficient means that as the value of one variable increases, the value of the other variable also increases; as one decreases, the other decreases; and
A negative correlation coefficient indicates that as one variable increases, the other decreases, and vice versa.
Cont..
The absolute value of the correlation coefficient measures the strength of the relationship.
A correlation coefficient of r=0.50 indicates a stronger degree of linear relationship than one of r=0.40.
A correlation coefficient of zero (r=0.0) indicates the absence of a linear relationship.
Correlation coefficients of r=+1.0 and r=-1.0 indicate a perfect linear relationship.
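The properties listed above can be checked numerically. The following is a minimal sketch in plain Python, assuming small made-up data sets; `pearson_r` is a hypothetical helper name introduced here for illustration and does not come from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # y rises with x along a straight line
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # y falls as x rises along a straight line
r_none = pearson_r(x, [1, 2, 3, 2, 1])   # symmetric rise and fall: no linear trend
print(r_pos, r_neg, r_none)
```

The third case shows that r close to zero only rules out a linear relationship: the two variables there are clearly related, just not along a straight line.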
Diagrammatic presentation of “r”
[Figure: panels labelled r = 0.5, r = 1 and r = -0.5]
Regression analysis
Regression analysis measures the strength of the relationship between:
a dependent variable (e.g. overall customer satisfaction); and
one or more explanatory variables (e.g. satisfaction with product quality and price).
Correlation provides a single numeric summary of a relationship (the correlation coefficient), while regression analysis results in a "prediction" equation.
The regression equation describes the relationship between the variables. If the relationship is strong (as expressed by the R-squared value), the equation can be used to predict values of one variable given known values of the other variables.
Cont..
For example: how will the overall satisfaction score change if satisfaction with product quality goes up from 6 to 7?
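The question above can be answered from a fitted regression equation. The sketch below assumes a single explanatory variable and invented scores (the slides mention several explanatory variables but give no data); `fit_simple_regression`, `quality` and `overall` are hypothetical names used only for illustration.

```python
def fit_simple_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plus R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y explained by the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical survey scores, for illustration only.
quality = [4, 5, 6, 7, 8, 9]               # satisfaction with product quality
overall = [4.5, 5.0, 6.0, 6.5, 7.5, 8.0]   # overall satisfaction

slope, intercept, r_squared = fit_simple_regression(quality, overall)

# Predicted change in overall satisfaction when quality moves from 6 to 7.
change = (intercept + slope * 7) - (intercept + slope * 6)
print(slope, intercept, r_squared, change)
```

In a linear model the predicted change for a one-point increase equals the slope, whatever the starting value.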
Factor Analysis
Factor analysis aims to describe a large number of variables or questions using only a reduced set of underlying variables, called factors.
It explains a pattern of similarity between observed variables. Questions which belong to one factor are highly correlated with each other.
[Figure: Types of Factor Analysis]
Use of Factor Analysis
Factor analysis is often used in customer satisfaction studies to identify underlying service dimensions, and in profiling studies to determine core attitudes.
For example, as part of a national survey on political opinions, respondents may answer three separate questions regarding environmental policy, reflecting issues at the local, regional and national level.
Factor analysis can be used to establish whether the three measures do, in fact, measure the same thing.
It can also prove useful when a lengthy questionnaire needs to be shortened while still retaining key questions.
Factor analysis will indicate which questions can be omitted without losing too much information.
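The environmental-policy example above can be sketched as follows. This is a simplified stand-in for a full factor analysis: it extracts only the first principal component of the correlation matrix by power iteration, and all answers are invented. The function and variable names are hypothetical.

```python
import math

def correlation_matrix(columns):
    """Pearson correlations between every pair of columns."""
    def corr(a, b):
        n = len(a)
        mean_a, mean_b = sum(a) / n, sum(b) / n
        sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
        saa = sum((x - mean_a) ** 2 for x in a)
        sbb = sum((y - mean_b) ** 2 for y in b)
        return sab / math.sqrt(saa * sbb)
    return [[corr(a, b) for b in columns] for a in columns]

def first_factor(matrix, iterations=200):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    k = len(matrix)
    v = [1.0 / math.sqrt(k)] * k
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        eigenvalue = math.sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    return eigenvalue, v

# Hypothetical answers of eight respondents on a 1-5 scale.
q_local = [1, 2, 2, 3, 4, 4, 5, 5]
q_regional = [1, 1, 2, 3, 3, 4, 5, 5]
q_national = [2, 2, 3, 3, 4, 5, 4, 5]

r = correlation_matrix([q_local, q_regional, q_national])
eigenvalue, vector = first_factor(r)
loadings = [math.sqrt(eigenvalue) * x for x in vector]  # question-factor correlations
share = eigenvalue / len(r)  # share of total variance carried by the first factor
print(loadings, share)
```

When one factor carries most of the variance and every question loads highly on it, the three questions are plausibly measuring the same thing, which is the situation in which some of them could be dropped from a shortened questionnaire.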
CLUSTER ANALYSIS
Cluster analysis is an exploratory tool designed to reveal natural groupings within a large group of observations. Cluster analysis segments the survey sample, i.e. respondents or companies, into a small number of groups.
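The slides do not name a particular clustering algorithm. As one common choice, the sketch below uses k-means on invented two-dimensional respondent scores; the starting centroids are picked by hand, and `k_means` is a hypothetical helper name.

```python
import math

def k_means(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then move each centroid
    to the mean of the points assigned to it; repeat a fixed number of times."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        centroids = new_centroids
    return labels, centroids

# Hypothetical respondents described by two scores, for illustration only.
respondents = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 9), (8.5, 9.5)]

# Assumption: start from the first and fourth respondent as initial centroids.
labels, centroids = k_means(respondents, [respondents[0], respondents[3]])
print(labels, centroids)
```

Real survey data rarely separates this cleanly, and the result of k-means depends on the number of clusters and the starting centroids, so both are usually varied and the solutions compared.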
BRAND MAPPING (CORRESPONDENCE ANALYSIS)
Correspondence analysis is a technique which allows the rows and columns of a data matrix (e.g. average satisfaction scores for several products) to be displayed as points in a two-dimensional space or map. It reduces a complicated set of data to a graphical display which is immediately and easily interpretable. Brand maps are based on correspondence analysis.
Brand maps are often used to illustrate customers' images of the market by placing products and attributes together on a map. This allows perceptions of companies to be interpreted against a variety of product and service attributes simultaneously.
CONJOINT ANALYSIS
Conjoint analysis is a technique for measuring respondent preferences about the attributes of a product or service.
It is well suited to the development of new or improved products.
The conjoint analysis task asks the respondents to make choices in the same fashion as consumers normally do, by trading off features one against the other, either by ranking or choosing one of several product combinations.
E.g. a task could be: do you prefer a "flight that is cramped, costs £250 and has one stop" or a "flight that is spacious, costs £500 and is direct"?
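To show how such trade-offs turn into numbers, the sketch below takes the flight example and assumes one respondent has rated all eight attribute combinations on a 1-7 scale. The ratings are invented, and the part-worths are computed as simple level means minus the grand mean, which is only valid for a balanced full-factorial design like this one. All names are hypothetical.

```python
# Attribute order inside each profile tuple: (space, price_gbp, stops).
ATTRIBUTES = ["space", "price_gbp", "stops"]

# Hypothetical ratings from one respondent, for illustration only.
ratings = {
    ("cramped", "250", "one stop"): 4,
    ("cramped", "250", "direct"): 5,
    ("cramped", "500", "one stop"): 1,
    ("cramped", "500", "direct"): 2,
    ("spacious", "250", "one stop"): 6,
    ("spacious", "250", "direct"): 7,
    ("spacious", "500", "one stop"): 3,
    ("spacious", "500", "direct"): 4,
}

def part_worths(attribute_names, ratings):
    """Part-worth of a level = mean rating of profiles with that level
    minus the grand mean (assumes a balanced full-factorial design)."""
    grand_mean = sum(ratings.values()) / len(ratings)
    worths = {}
    for idx, name in enumerate(attribute_names):
        worths[name] = {}
        for level in sorted({profile[idx] for profile in ratings}):
            scores = [r for profile, r in ratings.items() if profile[idx] == level]
            worths[name][level] = sum(scores) / len(scores) - grand_mean
    return worths

def importances(worths):
    """Relative importance = range of an attribute's part-worths / sum of ranges."""
    ranges = {n: max(lv.values()) - min(lv.values()) for n, lv in worths.items()}
    total = sum(ranges.values())
    return {n: rng / total for n, rng in ranges.items()}

worths = part_worths(ATTRIBUTES, ratings)
importance = importances(worths)
print(worths)
print(importance)
```

For these invented ratings, price carries half of the total importance, space a third and the number of stops the rest. Commercial conjoint studies usually rely on fractional designs and regression or choice models rather than plain level means.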
Example of Conjoint Analysis
[Figure: Importance of Printer Features, Plus Simulator]
CHAID ANALYSIS
CHAID (Chi-squared Automatic Interaction Detection) is used to build a predictive model based on a classification system.
The analysis subdivides the sample into a series of subgroups that:
1) share similar characteristics with respect to a specific response variable; and
2) maximise our ability to predict the values of the response variable.
The output is a tree whose branches are the predictor variables that split the sample into discriminating groups.
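The core of a single tree split can be sketched as follows. This is a deliberately reduced version of CHAID: it only computes the chi-squared statistic of each candidate predictor against the response and picks the largest, leaving out the category merging and adjusted significance tests of the full procedure. The counts and predictor names are invented.

```python
def chi_square(table):
    """Pearson chi-squared statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if predictor and response were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical cross-tabulations: rows are predictor categories,
# columns are the response (bought, did not buy).
candidate_tables = {
    "gender": [[30, 20], [28, 22]],
    "age_group": [[45, 5], [13, 37]],
}

scores = {name: chi_square(table) for name, table in candidate_tables.items()}

# Both tables are 2x2, so the larger statistic is also the more significant one.
best_split = max(scores, key=scores.get)
print(scores, best_split)
```

Here the sample would be split on `age_group` first, and the same comparison would then be repeated inside each resulting subgroup to grow the tree.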
DISCRIMINANT/LOGISTIC REGRESSION ANALYSIS
Discriminant analysis and logistic regression are statistical techniques that point out the differences between two or more groups based on several characteristics. Discriminant analysis most often works with rating scales, while logistic regression can handle any type of variable.
They are often used:
to determine which customers are likely to buy a company's product;
to decide whether a bank should offer a loan to a new company; or
to identify patients who may be at high risk of medical problems.
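For the first use case, a logistic regression with one predictor might be sketched as below. The data are invented, the model is fitted by plain gradient descent rather than by the routines a statistics package would use, and `fit_logistic` and `previous_purchases` are hypothetical names.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, iterations=5000):
    """Fit P(y=1) = sigmoid(weight * x + bias) by gradient descent on the mean log-loss."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(weight * x + bias) - y
            grad_w += error * x
            grad_b += error
        weight -= learning_rate * grad_w / n
        bias -= learning_rate * grad_b / n
    return weight, bias

# Hypothetical customers: number of previous purchases and whether they bought (1) or not (0).
previous_purchases = [1, 2, 3, 4, 5, 6, 7, 8]
bought = [0, 0, 0, 1, 0, 1, 1, 1]

weight, bias = fit_logistic(previous_purchases, bought)
p_low = sigmoid(weight * 1 + bias)    # estimated probability for a customer with 1 purchase
p_high = sigmoid(weight * 8 + bias)   # estimated probability for a customer with 8 purchases
print(weight, bias, p_low, p_high)
```

Customers whose estimated probability exceeds a chosen cut-off (0.5 is a common default) would be classified as likely buyers.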
[Figure: Diagrammatic Presentation]
MULTIDIMENSIONAL SCALING
Multidimensional scaling (MDS) can be considered an alternative to factor analysis.
In general, the goal of the analysis is to detect meaningful underlying dimensions that allow the researcher to explain observed similarities or dissimilarities between the investigated objects. In factor analysis, the similarities between objects (e.g. variables) are expressed in the correlation matrix.
With MDS one may analyse any kind of similarity or dissimilarity matrix, in addition to correlation matrices.
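As an illustration of recovering an underlying dimension from dissimilarities alone, the sketch below applies classical (metric) MDS to an invented, symmetric distance matrix and extracts a single dimension by power iteration. The slides do not say which MDS variant is meant, so this choice is an assumption; `classical_mds_1d` is a hypothetical name.

```python
import math

def classical_mds_1d(distances, iterations=100):
    """One-dimensional classical MDS for a symmetric distance matrix."""
    n = len(distances)
    squared = [[d * d for d in row] for row in distances]
    row_means = [sum(row) / n for row in squared]
    grand_mean = sum(row_means) / n
    # Double centering turns squared distances into inner products.
    b = [
        [-0.5 * (squared[i][j] - row_means[i] - row_means[j] + grand_mean)
         for j in range(n)]
        for i in range(n)
    ]
    # Power iteration for the leading eigenvector. Start from a non-constant
    # vector, because the centered matrix maps constant vectors to zero.
    v = [float(i + 1) for i in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = math.sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    # Coordinates are the eigenvector scaled by the square root of the eigenvalue.
    return [math.sqrt(eigenvalue) * x for x in v]

# Hypothetical dissimilarities between four objects, for illustration only.
distances = [
    [0, 1, 3, 6],
    [1, 0, 2, 5],
    [3, 2, 0, 3],
    [6, 5, 3, 0],
]

coords = classical_mds_1d(distances)
print(coords)
```

The recovered coordinates reproduce the input distances, but their origin and sign are arbitrary: MDS only determines the configuration up to shifts and reflections, so interpreting the dimension is left to the researcher.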
MDS methods are applicable to a wide variety of research designs.
STRUCTURAL EQUATION MODELING
Structural Equation Modeling (SEM) is a very general, very powerful multivariate analysis technique that includes a number of other traditional analysis methods as special cases.
It effectively includes a whole range of standard multivariate analysis methods, such as regression, factor analysis and analysis of variance.
A structural equation model can consist of several regression and factor analysis models, which are estimated simultaneously.
[Figure: E.g. of a CRM model using survey]
Thank You