lecture 03
Dante
Category: Education License: All Rights Reserved Added: January 16, 2008

Presentation Transcript

Statistics 201 - Lecture 3

Slide2:
Last class: histograms, measures of center, percentiles and measures of spread…well, we shall finish these today
Will have completed Chapters 1 and 2
Assignment #1: 1.6, 1.14, 1.24, 1.26, 2.4, 2.6, 2.16, 2.24, 2.34
Due: Wednesday, January 16
Suggested problem: 1.34 (do this one!!!)

Measures of Spread (cont.):
5 number summary often reported: Min, Q1, Q2 (Median), Q3, and Max
Summarizes both center and spread
What proportion of data lie between Q1 and Q3?

Box-Plot:
Displays 5-number summary graphically
Box drawn spanning quartiles
Line drawn in box for median
Lines extend from box to max. and min. values
Some programs draw whiskers only to 1.5*IQR above and below the quartiles

Slide8:
Can compare distributions using side-by-side box-plots
What can you see from the plot?
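
The 5-number summary and the 1.5*IQR whisker limits described above can be sketched in a few lines of Python. This is only an illustration: quartile conventions differ between textbooks and programs, and the "inclusive" method used here is an assumption, not necessarily the course's definition. The function names and the toy values are made up for the example.

```python
import statistics

def five_number_summary(xs):
    """Return (min, Q1, median, Q3, max) for a list of numbers.

    Quartiles use statistics.quantiles with method="inclusive";
    other conventions can give slightly different Q1 and Q3.
    """
    q1, q2, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return min(xs), q1, q2, q3, max(xs)

def whisker_fences(xs):
    """Return the limits Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    _, q1, _, q3, _ = five_number_summary(xs)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Made-up values for illustration only (not course data).
toy = [1, 2, 3, 4, 5, 6, 7, 8, 30]

summary = five_number_summary(toy)
low, high = whisker_fences(toy)
beyond_whiskers = [x for x in toy if x < low or x > high]

print(summary)          # (1, 3.0, 5.0, 7.0, 30)
print(low, high)        # -3.0 13.0
print(beyond_whiskers)  # [30]
```

With these toy values the box spans 3 to 7, and a program that limits whiskers to 1.5*IQR would plot 30 as a separate point rather than extend the upper whisker to it.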

Other Common Measure of Spread: Sample Variance:
Sample variance of n observations: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
Can be viewed as roughly the average squared deviation of observations from the sample mean
Units are in squared units of data

Sample Standard Deviation:
Sample standard deviation of n observations: $s = \sqrt{s^2} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
Can be viewed as roughly the average deviation of observations from the sample mean
Has same units as data

Exercise:
Compute the sample standard deviation and variance for the Muzzle Velocity Example

Slide12:
Variance and standard deviation are most useful when the measure of center is the mean
As observations become more spread out, s: increases or decreases?
Both measures are sensitive to outliers
5 number summary is better than the mean and standard deviation for describing (i) skewed distributions; (ii) distributions with outliers

Empirical Rule for Bell-Shaped Distributions:
Approximately
68% of the data lie in the interval $(\bar{x} - s,\ \bar{x} + s)$
95% of the data lie in the interval $(\bar{x} - 2s,\ \bar{x} + 2s)$
99.7% of the data lie in the interval $(\bar{x} - 3s,\ \bar{x} + 3s)$
Can use these to help determine range of typical values or to identify potential outliers

Example…Putting this all together:
A geyser is a hot spring that becomes unstable and erupts hot water and steam into the air.
Perhaps the most famous of these is Wyoming's Old Faithful Geyser.
Visitors to Yellowstone National Park often come to Old Faithful to see it erupt.
Consequently, it is of great interest to be able to predict the time until the next eruption.

Example…Putting this all together:
Consider a sample of 222 interval times between eruptions (Weisberg, 1985).
The first few lines of the available data are:
[data table not included in the transcript]
Goal: Help predict the interval between eruptions.
Consider a variety of plots that may shed some light upon the nature of the intervals between eruptions

Example…Putting this all together:
Goal: Help predict the interval between eruptions
Consider a histogram to shed some light upon the nature of the intervals between eruptions

Example…Putting this all together:
[figure not included in the transcript]

Example…Putting this all together:
What does the box-plot show?
Is a box-plot useful at showing the main features of these data?
What does the empirical rule tell us about 95% of the data? Is this useful?
We will come back to this in a minute…

Scatter-Plots:
Help assess whether there is a relationship between 2 continuous variables
Data are paired (x1, y1), (x2, y2), ... (xn, yn)
Plot X versus Y
If there is no natural pairing…probably not a good idea!
What sort of relationships might we see?

Example…Putting this all together:
What does this plot reveal?

Example…Putting this all together:
[figure not included in the transcript]

Example…Putting this all together:
Suppose an eruption of 2.5 minutes had just taken place. What would you estimate the length of the next interval to be?
Suppose an eruption of 3.5 minutes had just taken place. What would you estimate the length of the next interval to be?
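
The sample variance, the sample standard deviation, and the empirical-rule intervals built from them can be sketched as below. This is a minimal illustration that assumes the usual n - 1 divisor for the sample variance. The function names are hypothetical, and the toy values are made up; they are neither the Old Faithful intervals nor the Muzzle Velocity data.

```python
import math

def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_sd(xs):
    """Square root of the sample variance; same units as the data."""
    return math.sqrt(sample_variance(xs))

def empirical_rule_intervals(xs):
    """Return {k: (mean - k*s, mean + k*s)} for k = 1, 2, 3.

    For bell-shaped data these hold roughly 68%, 95% and 99.7%
    of the observations; for other shapes the percentages need not apply.
    """
    mean = sum(xs) / len(xs)
    s = sample_sd(xs)
    return {k: (mean - k * s, mean + k * s) for k in (1, 2, 3)}

# Made-up values for illustration only.
toy = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(sample_variance(toy))
print(sample_sd(toy))
for k, (lo, hi) in empirical_rule_intervals(toy).items():
    print(k, lo, hi)
```

The intervals are only as informative as the bell-shape assumption behind them, which is the point of the slide question about whether the empirical rule is useful for the eruption intervals.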