# Power point presentation

## Presentation Transcript

### Slide 1:

Introduction to Biostatistics

Dr. M. H. Rahbar
Professor of Biostatistics, Department of Epidemiology
Director, Data Coordinating Center
College of Human Medicine, Michigan State University

### What does “STATISTICS” mean? :

The word “Statistics” has several meanings:

- It is frequently used to refer to recorded data.
- It also denotes characteristics calculated for a set of data, for example, the sample mean.
- It also refers to statistical methodology: the techniques and procedures dealing with the design of experiments and with the collection, organization, and analysis of the information contained in a data set, in order to make inferences about the population parameters.

### What do statisticians do? :

- To guide the design of an experiment or survey prior to the data collection
- To analyze data using proper statistical procedures and techniques
- To present and interpret results to the researchers and other decision makers including the government and industries

### WHY STUDY STATISTICS? :

- Knowledge of statistics is essential for people going into research, management or graduate study
- A basic understanding of statistics is useful for conducting investigations and presenting results effectively
- An understanding of statistics can help anyone discriminate between fact and fancy in daily life
- A course in statistics should help one know when, and for what, a statistician should be consulted

### Definition of Population & Sample :

A population is a set of measurements of interest to the researcher. Examples:

1. Income of households living in Karachi
2. The number of children in families living in Pakistan
3. The health status of adults in a community

A subset of the population is called a sample. A sample is usually selected such that it is representative of the population.

### Descriptive & Inferential Statistics :

1. Descriptive statistics deal with the enumeration, organization and graphical representation of data.
2. Inferential statistics are concerned with reaching conclusions from incomplete information, that is, generalizing from the specific sample.

An example of inferential statistics is using available information about the health status of people in a sample to draw inferences about the underlying population from which the sample is selected.

### INFERENTIAL STATISTICS :

The objective of inferential statistics is to make inferences about the population parameters based on the information contained in the sample.

- Estimation (e.g., estimating the prevalence of hypertension among adults living in Karachi)
- Hypothesis testing (e.g., testing the effectiveness of a new drug for reducing cholesterol levels)

### Sources of Data :

Data may come from different sources:

- Surveillance systems (e.g., NIH)
- Planned surveys (Government, Universities, NGOs)
- Experiments (Pharmaceutical Companies)
- Health Organizations (Administrative Data sets)
- Private sector (Banks, Companies, etc.)
- Government (All government agencies)

Here we will focus on surveys and experiments. What is the difference between a survey and an experiment?

### Difference between Surveys & Experiments :

Survey data represent observations of events or phenomena over which few, if any, controls are imposed (e.g., assessing the association between different lifestyles and heart disease).

In an experiment, we design a research plan purposely to impose controls over the amount of exposure (treatment) to a drug (e.g., clinical trials).

### Sampling Methods :

- Random Sampling (Simple)
- Systematic Sampling
- Stratified Sampling
- Cluster Sampling
- Convenience Sampling
- More complex sampling

### Some Epidemiologic Studies :

- Retrospective studies gather past data from selected cases and controls to determine differences, if any, in the exposure to a suspected factor. They are commonly referred to as case-control studies.
- Prospective studies are usually cohort studies in which one enrolls a group of healthy people and follows them over a certain period to determine the frequency with which a disease develops.

### Qualitative and Quantitative Variables :

Examples of qualitative variables are occupation, sex, and marital status. Variables that yield observations that can be measured are considered to be quantitative variables. Examples of quantitative variables are weight, height, and age. Quantitative variables can further be classified as discrete or continuous.

### Slide 13:

Variable types:

- Categorical variables (e.g., sex, marital status, income category)
- Continuous variables (e.g., age, income, weight, height, time to achieve an outcome)
- Discrete variables (e.g., number of children in a family)
- Binary or dichotomous variables (e.g., response to all Yes or No type of questions)

### Slide 14:

Scale of variable:

- Nominal Scale
- Ordinal Scale
- Interval Scale
- Interval Ratio Scale

### Scale of Data :

1. Nominal: These data do not represent an amount or quantity (e.g., marital status, sex).
2. Ordinal: These data represent an ordered series of relationships (e.g., level of education).
3. Interval: These data are measured on an interval scale having equal units but an arbitrary zero point (e.g., temperature in Fahrenheit).
4. Interval Ratio (commonly called the ratio scale): A variable such as weight, for which we can meaningfully compare one value with another (say, 100 kg is twice 50 kg).

### VARIABLES IN THE PROTOCOL :

Types of variable:

- Independent
- Dependent
- Intermediate
- Confounding

### Independent Variable :

The characteristic being observed and/or measured that is hypothesized to influence an event or outcome (dependent variable).

Note: The independent variable is not influenced by the event or outcome, but may cause it or contribute to its variation.

### Dependent Variable :

A variable whose value is dependent on the effect of other variables (i.e., “independent variables”) in the relationship being studied. Synonyms: outcome or response variable.

Note: It is an event or outcome whose variation we seek to explain or account for by the influence of independent variables.

### Intermediate Variable :

A variable that occurs in a causal pathway from an independent to a dependent variable. Synonyms: intervening, mediating.

Notes:

- It produces variation in the dependent variable, and is caused to vary by the independent variable.
- Such a variable is “associated” with both the dependent and independent variables.

### Confounding Variable :

A factor (that is itself a determinant of the outcome) that distorts the apparent effect of a study variable on the outcome.

Note: Such a factor may be unequally distributed among the exposed and the unexposed, and thereby influence the apparent magnitude and even the direction of the effect.

### Organizing Data :

- Frequency table
- Frequency histogram
- Relative frequency histogram
- Frequency polygon
- Relative frequency polygon
- Bar chart
- Pie chart
- Stem-and-leaf display
- Box plot

### Frequency Table :

Suppose we are interested in studying the number of children in the families living in a community. The following data have been collected based on a random sample of n = 30 families from the community:

2, 2, 5, 3, 0, 1, 3, 2, 3, 4, 1, 3, 4, 5, 7, 3, 2, 4, 1, 0, 5, 8, 6, 5, 4, 2, 4, 4, 7, 6

Organize these data in a frequency table.
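One way to carry out this exercise is to count how often each value occurs. The sketch below uses Python's `collections.Counter`; the variable names are illustrative, and the relative-frequency column is an extra that the slide does not ask for.

```python
from collections import Counter

# Number of children in each of the n = 30 sampled families (from the slide).
children = [2, 2, 5, 3, 0, 1, 3, 2, 3, 4, 1, 3, 4, 5, 7,
            3, 2, 4, 1, 0, 5, 8, 6, 5, 4, 2, 4, 4, 7, 6]

n = len(children)
freq = Counter(children)

# One row per distinct value: (value, frequency, relative frequency).
table = [(value, freq[value], freq[value] / n) for value in sorted(freq)]

print(f"{'Children':>8} {'Frequency':>9} {'Relative':>8}")
for value, count, rel in table:
    print(f"{value:>8} {count:>9} {rel:>8.3f}")
```

The frequencies should add up to the sample size and the relative frequencies to 1, which is a quick check that no observation was dropped.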

### Frequency Table :

Now suppose we need to construct a similar frequency table for the age of patients with heart-related problems in a clinic. The following data have been collected based on a random sample of n = 30 patients who went to the emergency room of the clinic for heart-related problems. The measurements are:

42, 38, 51, 53, 40, 68, 62, 36, 32, 45, 51, 67, 53, 59, 47, 63, 52, 64, 61, 43, 56, 58, 66, 54, 56, 52, 40, 55, 72, 69
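Because age takes many distinct values, the observations are usually grouped into class intervals before counting. The sketch below assumes ten-year classes starting at 30 (30-39, 40-49, and so on); the slide does not specify a class width, so that choice and the variable names are assumptions.

```python
from collections import Counter

# Ages of the n = 30 sampled patients (from the slide).
ages = [42, 38, 51, 53, 40, 68, 62, 36, 32, 45, 51, 67, 53, 59, 47,
        63, 52, 64, 61, 43, 56, 58, 66, 54, 56, 52, 40, 55, 72, 69]

WIDTH = 10  # assumed class width in years; not given on the slide

# Map each age to the lower bound of its class, then count per class.
class_counts = Counter((age // WIDTH) * WIDTH for age in ages)

# One row per class: (label, frequency, relative frequency).
grouped = [
    (f"{lower}-{lower + WIDTH - 1}", class_counts[lower], class_counts[lower] / len(ages))
    for lower in sorted(class_counts)
]

print(f"{'Age class':>9} {'Frequency':>9} {'Relative':>8}")
for label, count, rel in grouped:
    print(f"{label:>9} {count:>9} {rel:>8.3f}")
```

A different class width would give a different table, so the width should be chosen to show the shape of the distribution without too many sparse classes.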

### Measures of Central Tendency :

Where is the heart of the distribution?

1. Mean
2. Median
3. Mode

### Sample Mean :

The arithmetic mean (or, simply, mean) is computed by summing all the observations in the sample and dividing the sum by the number of observations. For a sample of five household incomes, 6,000, 10,000, 10,000, 14,000, and 50,000, the sample mean is

$$\bar{x} = \frac{6{,}000 + 10{,}000 + 10{,}000 + 14{,}000 + 50{,}000}{5} = \frac{90{,}000}{5} = 18{,}000$$

### Sample Median :

In a list ranked from the smallest measurement to the largest, the median is the middle value. In our example of five household incomes, first we rank the measurements:

6,000, 10,000, 10,000, 14,000, 50,000

The sample median is 10,000.
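The three measures of central tendency can be checked for the five household incomes with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative.

```python
import statistics

# The five household incomes used in the slides.
incomes = [6000, 10000, 10000, 14000, 50000]

mean_income = statistics.mean(incomes)      # sum of observations / number of observations
median_income = statistics.median(incomes)  # middle value of the ranked list
mode_income = statistics.mode(incomes)      # most frequently occurring value

print("mean:", mean_income)
print("median:", median_income)
print("mode:", mode_income)
```

The single income of 50,000 pulls the mean well above the median, which is why the median is often preferred for skewed data such as income.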

### Measures of Dispersion or Variability :

- Range
- Variance
- Standard deviation

### Formula for Sample Variance & Standard deviation S :

Standard deviation = S
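For a sample $x_1, \dots, x_n$ with sample mean $\bar{x}$, the usual textbook definitions are given below. They assume the conventional $n - 1$ divisor for a sample variance, which the transcript itself does not show.

$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}, \qquad S = \sqrt{S^2}$$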

### Calculation of Variance and Standard deviation :

Calculation of Variance and Standard deviation
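As an illustration, the measures of dispersion can be computed step by step for the five household incomes from the Sample Mean slide. Reusing that data set here is an assumption, as is the $n - 1$ divisor for the sample variance; the variable names are illustrative.

```python
import math

# The five household incomes from the Sample Mean slide (assumed example data).
incomes = [6000, 10000, 10000, 14000, 50000]

n = len(incomes)
mean = sum(incomes) / n

# Range: largest observation minus smallest observation.
data_range = max(incomes) - min(incomes)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
squared_devs = [(x - mean) ** 2 for x in incomes]
variance = sum(squared_devs) / (n - 1)

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print("range:", data_range)
print("variance:", variance)
print("standard deviation:", round(std_dev, 2))
```

The standard deviation is in the same units as the data, which makes it easier to interpret than the variance.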

### Empirical Rule :

For a Normal distribution, approximately:

- a) 68% of the measurements fall within one standard deviation around the mean
- b) 95% of the measurements fall within two standard deviations around the mean
- c) 99.7% of the measurements fall within three standard deviations around the mean

### Suppose the reaction time of a particular drug has a Normal distribution with a mean of 10 minutes and a standard deviation of 2 minutes :

Approximately:

- a) 68% of the subjects taking the drug will have a reaction time between 8 and 12 minutes
- b) 95% of the subjects taking the drug will have a reaction time between 6 and 14 minutes
- c) 99.7% of the subjects taking the drug will have a reaction time between 4 and 16 minutes
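These percentages can be checked against the exact Normal probabilities. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) with the mean of 10 minutes and standard deviation of 2 minutes from the example; the variable names are illustrative.

```python
from statistics import NormalDist

# Reaction time modeled as Normal with mean 10 minutes and SD 2 minutes.
reaction_time = NormalDist(mu=10, sigma=2)

intervals = {}
for k in (1, 2, 3):
    low = reaction_time.mean - k * reaction_time.stdev
    high = reaction_time.mean + k * reaction_time.stdev
    # Probability that a reaction time falls within k standard deviations of the mean.
    prob = reaction_time.cdf(high) - reaction_time.cdf(low)
    intervals[k] = (low, high, prob)
    print(f"within {k} SD: {low:.0f} to {high:.0f} minutes, probability {prob:.4f}")
```

The computed probabilities are close to 0.68, 0.95, and 0.997, so the empirical rule is a rounded statement of these values.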