BASICS, Biostats: Dr. Naveen Danesh, Department of Env. Science, Gulbarga
STATISTICS: VARIABLE: A characteristic of a person, object, or phenomenon that is observed or manipulated and can take on different values. DATA: The set of values of one or more variables recorded on one or more individuals. Primary data: Data collected first-hand from the population; a census is an example. Secondary data: Already existing data about the problem or population, for example hospital records or previously collected census data.
Slide 3: TYPES OF DATA. QUALITATIVE DATA: Data whose individual values fall into separate classes; these classes may have no numerical relationship with one another. Example: hair color, severity of disease. QUANTITATIVE DATA: Data that take numerical values. Example: family size, height, weight.
Populations: A population is the group from which a sample is drawn, e.g., headache patients in an office or automobile crash victims in an emergency room. In research, it is usually not practical to include all members of a population. Thus, a sample (a subset of a population) is taken.
STATISTICS: Statistics is the science of collecting, summarizing, analyzing, interpreting, and presenting data. Biostatistics is the branch of statistics that deals with the application of statistical methods to information related to the biological sciences.
Slide 6: Environmental statistics is the application of statistical methods to environmental science. It covers procedures for dealing with questions concerning both the natural environment in its undisturbed state and the interaction of humans with the environment. Thus weather, climate, air and water quality are included, as are studies of plant and animal populations. Environmental statistics covers a number of types of study: Baseline studies to document the present state of an environment to provide background in case of unknown changes in the future; Targeted studies to describe the likely impact of changes being planned or of accidental occurrences; Regular monitoring to attempt to detect changes in the environment
TYPES: DESCRIPTIVE BIOSTATISTICS deals with the enumeration, organization, and graphical representation of data. INFERENTIAL BIOSTATISTICS is concerned with reaching conclusions from incomplete information, that is, generalizing from a specific sample to the population.
Applications: Statistics is a helpful way to study different situations, and data are easier to interpret after statistical analysis. Organizing data: frequency table, frequency histogram, relative frequency histogram, box plot. Presentation of data: tabulation (simple table, frequency distribution table), charts and diagrams.
TOPICS: The mean (or average) is found by taking the sum of the numbers and then dividing by how many numbers you added together. The number that occurs most frequently is the mode. When the numbers are arranged in numerical order, the middle one is the median.
Formula to calculate the mean: Mean of a sample: $\bar{X} = \frac{\sum X}{n}$. Mean of a population: $\mu = \frac{\sum X}{N}$. $\bar{X}$ refers to the mean of a sample and $\mu$ refers to the mean of a population. $\sum X$ is a command that adds all of the X values. n is the total number of values in the series of a sample and N is the same for a population.
Topic One: The mean (or average) is found by adding all the numbers and then dividing by how many numbers you added together. Example: 3, 4, 5, 6, 7. 3 + 4 + 5 + 6 + 7 = T. T divided by n = ? The mean is ?
Measures of central tendency (cont.): Mode: The most frequently occurring value in a series. The modal value is the highest bar in a histogram.
Topic Two: The number that occurs most frequently is the mode. Example: 2, 2, 2, 4, 5, 6, 7, 7, 7, 7, 8.
Measures of central tendency (cont.): Median: The value that divides a series of values in half when they are all listed in order. The median is the middle value.
Topic Three: When numbers are arranged in numerical order, the middle one is the median. Example: 3, 6, 2, 5, 7. Arrange in order …………. The number in the middle is ? The median is ?
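The three definitions in Topics One to Three can be sketched directly in Python. This is a minimal illustration, not part of the original slides: the function names are hypothetical, and the handling of an even number of values (averaging the two middle values) and of ties in the mode (first value encountered wins) are assumptions the slides do not address.

```python
from collections import Counter

def simple_mean(values):
    """Sum of the values divided by how many values there are."""
    return sum(values) / len(values)

def simple_median(values):
    """Middle value once the values are arranged in numerical order."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    # Assumption: with an even count, average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

def simple_mode(values):
    """Most frequently occurring value (first encountered if tied)."""
    return Counter(values).most_common(1)[0][0]

print(simple_mean([3, 4, 5, 6, 7]))                       # Topic One
print(simple_mode([2, 2, 2, 4, 5, 6, 7, 7, 7, 7, 8]))     # Topic Two
print(simple_median([3, 6, 2, 5, 7]))                     # Topic Three
```

Running it on the three slide examples gives the values to compare against answers worked out by hand.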
Averaging Grades: Lowest 55 60 75 80 80 80 83 83 93 93 93 93 93 Highest
Find the mean of the following set of grades: First add all the grades together. The total (T) equals ……. Now divide T by n (the total number of grades). The answer is ….. The mean is ….. Lowest 55 60 75 80 80 80 83 83 93 93 93 93 93 Highest
Find the median of the following numbers: The median is the number in the middle of numbers which are in order from least to greatest. If we count in from both sides, the number in the middle is ……. The median is …… Lowest 55 60 75 80 80 80 83 83 93 93 93 93 93 Highest
Find the mode of the following grades: The mode is the number which occurs most often. The number which occurs most often is … The mode is … Lowest 55 60 75 80 80 80 83 83 93 93 93 93 93 Highest
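The grade exercises above can be checked with Python's standard `statistics` module. This is a small sketch rather than the method used in the slides; the variable names are chosen here for illustration only.

```python
import statistics

# Grades from the slides, listed from lowest to highest.
grades = [55, 60, 75, 80, 80, 80, 83, 83, 93, 93, 93, 93, 93]

total = sum(grades)                        # T, the sum of all grades
n = len(grades)                            # n, the number of grades
grade_mean = statistics.mean(grades)       # T divided by n
grade_median = statistics.median(grades)   # middle value of the ordered list
grade_mode = statistics.mode(grades)       # most frequent grade

print("T =", total, "n =", n)
print("mean =", round(grade_mean, 2))
print("median =", grade_median)
print("mode =", grade_mode)
```

Because there are 13 grades (an odd count), the median is simply the 7th value in the ordered list.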
Applications: If these were your math grades, what would you learn by analysing them? The mean was about 81.6. In order to raise your average, you would have to score higher than 81.6 on the rest of your assignments. The mode was 93, which was also your highest grade. You could look at these papers to see why you made this grade most often. The median is 83. Because the median is above the mean, more than half of your grades were higher than your average. Find your weak area and try to improve.
Applications (cont.): Knowing the mean, median, and mode will help you better understand the scores on your report card. By analyzing the data you can find your average, the grade you received most often, and the grade in the middle of your scores. Better understanding your grades may lead to better study habits.
Applications of Statistics: Statistics provides simple and quick information on the matter it centers on. Statistical methods are useful tools in aiding research and studies in different fields such as economics, social sciences, business, medicine and many others. Statistics provides a vivid presentation of collected and organized data through the use of figures, charts, diagrams and graphs. It helps provide more critical analyses of information.
Statistics in Science: Endangered species of wildlife can be protected through regulations and laws developed using statistics. Epidemics and diseases are monitored with the aid of statistics. Statistics helps in the evaluation of certain medical practices and the effectiveness of drugs.
Applications (cont.): Statistical methods are explored and applied to population growth, disease detection and treatment, genetic and genomic research, drug development, clinical trials, screening and prevention, and the assessment of rehabilitation, recovery, and quality of life. These topics are explored in contributions written by more than 100 leading academics, researchers, and practitioners who utilize various statistical practices, such as selection bias, survival analysis, missing data techniques, and cluster analysis for handling the wide array of modern issues in the life and health sciences.
Applications (cont.): The field of environmental statistics is rapidly growing in sophistication due to a combination of advances in methods, data collection systems, and interactive software packages for analysis of these data. Methods and Applications of Statistics in the Atmospheric and Earth Sciences explores the commonly used statistical methods, from time series analysis and plateau modeling to sampling, and provides valuable insight into the application of these techniques to understanding research in various fields, such as:
Slide 26: Geography, Geology, Animal science, Agriculture, Global warming, Air quality and hazardous-waste remediation
Measure of Dispersion: Gauges the variability that exists in the data. Helps form a judgment about how well the average represents the data. Helps to learn the scatter, so that the existing variation can be controlled. Measures of dispersion include the range, mean absolute deviation (MAD), and standard deviation (SD).
Range: The simple difference between the highest and lowest values. Used to report the movement of prices over a period of time. Weather reports typically give the high and low temperatures for a 24-hour period. RANGE = L - S (largest value minus smallest value). MAD: The average of the absolute deviations of the individual items about their mean. The difference between each data item in a set and the mean is called its deviation.
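As a rough sketch of the two measures on this slide, the range and the mean absolute deviation can be computed as follows. The function names are hypothetical, and the data are the grades used earlier in the slides.

```python
def value_range(values):
    """RANGE = L - S: largest value minus smallest value."""
    return max(values) - min(values)

def mean_absolute_deviation(values):
    """Average of the absolute deviations of the items about their mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)

grades = [55, 60, 75, 80, 80, 80, 83, 83, 93, 93, 93, 93, 93]
print("range =", value_range(grades))
print("MAD =", round(mean_absolute_deviation(grades), 2))
```

The range uses only the two extreme values, while the MAD uses every item, which is why the two can give quite different impressions of the same data.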
Standard deviation (SD): SD is a measure of the variability of a set of data. The mean represents the average of a group of scores, with some of the scores being above the mean and some below. This scatter of scores around the mean is referred to as variability or spread. Variance ($S^2$) is another measure of spread.
SD (cont.): In effect, SD is the average amount of spread in a distribution of scores. The next slide shows a group of 10 patients whose mean age is 40 years. Some are older than 40 and some younger.
SD (cont.): Ages are spread out along an X axis. The amount by which ages are spread out is known as dispersion or spread.
Distances ages deviate above and below the mean: The deviations above and below the mean always add up to zero.
Calculating $S^2$: To find the average deviation, one would normally total the deviations above and below the mean and then divide by the number of values. However, the total always equals zero. The deviations must first be squared, which removes the negative signs.
Calculating $S^2$ (cont.): The symbol for SD is S for a sample and $\sigma$ for a population. $S^2$ is not in the same units as the data (age), but SD is.
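Written as formulas, the calculation described on these slides looks as follows. This assumes the common convention of dividing by n - 1 (the degrees of freedom) for a sample and by N for a population; $\bar{X}$ is the sample mean and $\mu$ is the population mean.

```latex
S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}, \qquad S = \sqrt{S^2}

\sigma^2 = \frac{\sum (X - \mu)^2}{N}, \qquad \sigma = \sqrt{\sigma^2}
```

Squaring puts the variance in squared units (for example, years squared), and taking the square root returns SD to the original units.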
SD: SD is used with the mean and is generally the most important and useful measure of dispersion. The SD is the square root of the average of the squared deviations of the individual data items about their mean. Steps: (1) The arithmetic mean (AM) is computed. (2) The mean is subtracted from each individual item. (3) The deviations of the individual items about the mean are squared, to eliminate the negative signs. (4) The squared deviations are then totaled. (5) The total is divided by its degrees of freedom, which gives the variance. (6) Taking the square root of the variance gives the SD.
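The steps for computing SD can be sketched in Python as below. This is an illustration under one stated assumption: the degrees of freedom are taken as n - 1, the usual choice for a sample, which the slide does not spell out. The function name is hypothetical, and the data are the five values from Topic One.

```python
import math

def sample_variance_and_sd(values):
    n = len(values)
    mean = sum(values) / n                      # compute the arithmetic mean
    deviations = [x - mean for x in values]     # subtract the mean from each item
    squared = [d ** 2 for d in deviations]      # square to remove negative signs
    total = sum(squared)                        # total the squared deviations
    variance = total / (n - 1)                  # divide by degrees of freedom (assumed n - 1)
    sd = math.sqrt(variance)                    # square root of the variance
    return deviations, variance, sd

deviations, variance, sd = sample_variance_and_sd([3, 4, 5, 6, 7])
print(sum(deviations))   # 0.0
print(variance)          # 2.5
print(round(sd, 4))      # 1.5811
```

The first printed value shows why squaring is needed: the raw deviations cancel out. With other data the sum may differ from zero by a tiny floating-point rounding error.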
Calculating SD with Excel: Evidence-based Chiropractic. Enter the values in a column.
SD with Excel (cont.): Click Data Analysis on the Tools menu.
SD with Excel (cont.): Select Descriptive Statistics and click OK.
SD with Excel (cont.): Click the Input Range icon.
SD with Excel (cont.): Highlight all the values in the column.
SD with Excel (cont.): Check the labels box if labels are in the first row. Check Summary Statistics. Click OK.
SD with Excel (cont.): SD is calculated precisely, along with several other descriptive statistics.
Standard error of mean SE: A measure of variability among the means of samples selected from a given population. $SE(\text{Mean}) = \frac{S}{\sqrt{n}}$
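A minimal sketch of the standard error formula, assuming S is the sample standard deviation (computed with n - 1 in the denominator, which is what `statistics.stdev` does). The function name is hypothetical.

```python
import math
import statistics

def standard_error(values):
    """SE of the mean: sample SD divided by the square root of n."""
    s = statistics.stdev(values)       # sample standard deviation S
    return s / math.sqrt(len(values))  # S / sqrt(n)

se = standard_error([3, 4, 5, 6, 7])
print(round(se, 4))
```

Because n appears under a square root, quadrupling the sample size only halves the standard error.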
Frequency: In statistics, a frequency distribution is an arrangement of the values that one or more variables take in a sample. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval, and in this way, the table summarizes the distribution of values in the sample.
FD: A frequency distribution shows a summarized grouping of data divided into mutually exclusive classes and the number of occurrences in each class. It is a way of presenting otherwise unorganized data, e.g. to show results of an election, income of people for a certain region, sales of a product within a certain period, student loan amounts of graduates, etc. Some of the graphs that can be used with frequency distributions are histograms, line graphs, bar charts and pie charts. Frequency distributions are used for both qualitative and quantitative data.
AFD: Managing and operating on frequency-tabulated data is much simpler than operating on raw data. There are simple algorithms to calculate the median, mean, standard deviation, etc. from these tables. Statistical hypothesis testing is founded on the assessment of differences and similarities between frequency distributions. This assessment involves measures of central tendency or averages, such as the mean and median, and measures of variability or statistical dispersion, such as the standard deviation or variance.
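A frequency distribution of this kind can be built with `collections.Counter`. The sketch below uses the grades from the earlier slides, with each distinct grade as its own class; the function name and the percentage column are illustrative choices.

```python
from collections import Counter

def frequency_table(values):
    """Return (value, frequency, percent) rows, one per distinct value."""
    counts = Counter(values)
    n = len(values)
    return [(v, counts[v], 100 * counts[v] / n) for v in sorted(counts)]

grades = [55, 60, 75, 80, 80, 80, 83, 83, 93, 93, 93, 93, 93]
table = frequency_table(grades)

print(f"{'Grade':>5} {'Frequency':>9} {'%':>6}")
for value, freq, pct in table:
    print(f"{value:>5} {freq:>9} {pct:>6.1f}")
print(f"{'Total':>5} {len(grades):>9} {100.0:>6.1f}")
```

The classes are mutually exclusive because each grade falls under exactly one distinct value; for continuous data one would group values into intervals instead.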
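One of the simple algorithms mentioned above, computing the mean and standard deviation directly from a frequency table, might look like this. It is a sketch assuming a table of (value, frequency) pairs and a sample SD with n - 1 in the denominator; the function name and the table (tallied from the grades in the earlier slides) are illustrative.

```python
import math

def mean_sd_from_frequency_table(table):
    """table is a list of (value, frequency) pairs."""
    n = sum(f for _, f in table)                        # total number of observations
    mean = sum(v * f for v, f in table) / n             # frequency-weighted mean
    ss = sum(f * (v - mean) ** 2 for v, f in table)     # weighted sum of squared deviations
    sd = math.sqrt(ss / (n - 1))                        # sample SD (assumes n - 1)
    return mean, sd

grade_table = [(55, 1), (60, 1), (75, 1), (80, 3), (83, 2), (93, 5)]
table_mean, table_sd = mean_sd_from_frequency_table(grade_table)
print(round(table_mean, 2), round(table_sd, 2))
```

Each value is multiplied by its frequency instead of being repeated in a raw list, which gives the same result as working on the raw data with fewer operations.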
Numerical presentation: Tabular presentation (simple to complex). Simple frequency distribution table:
Table title
Name of variable (units of variable) | Frequency | %
Categories                           | -         | -
Total                                | -         | -