lecture10, Matild. Added: February 14, 2008. Category: Education. License: All Rights Reserved.

Presentation Transcript

The Art of Graphical Presentation:
- Types of Variables
- Guidelines for Good Graphics Charts
- Common Mistakes in Graphics
- Pictorial Games
- Special-Purpose Charts

Types of Variables:
- Qualitative
  - Ordered (e.g., modem, Ethernet, satellite)
  - Unordered (e.g., CS, math, literature)
- Quantitative
  - Discrete (e.g., number of terminals)
  - Continuous (e.g., time)

Charting Based on Variable Types:
- Qualitative variables usually work best with bar charts or Kiviat graphs
  - If ordered, use bar charts to show order
- Quantitative variables work well in X-Y graphs
  - Use points if discrete, lines if continuous
  - Bar charts sometimes work well for discrete

Guidelines for Good Graphics Charts:
- Principles of graphical excellence
- Principles of good graphics
- Specific hints for specific situations
- Aesthetics
- Friendliness

Principles of Graphical Excellence:
- Graphical excellence is the well-designed presentation of interesting data:
  - Substance
  - Statistics
  - Design

Graphical Excellence (2):
- Complex ideas get communicated with:
  - Clarity
  - Precision
  - Efficiency

Graphical Excellence (3):
- Viewer gets:
  - Greatest number of ideas
  - In the shortest time
  - With the least ink
  - In the smallest space

Graphical Excellence (4):
- Is nearly always multivariate
- Requires telling truth about data

Principles of Good Graphics:
- Above all else show the data
- Maximize the data-ink ratio
- Erase non-data ink
- Erase redundant data ink
- Revise and edit

Above All Else Show the Data: [two example figures]

Maximize the Data-Ink Ratio: [two example figures]

Erase Non-Data Ink: [two example figures; legend: East, West, North]

Erase Redundant Data Ink: [two example figures; legend: East, West, North]

Revise and Edit: [seven example figures]

Specific Things to Do:
- Give information the reader needs
- Limit complexity and confusion
- Have a point
- Show statistics graphically
- Don’t always use graphics
- Discuss it in the text

Give Information the Reader Needs:
- Show informative axes
  - Use axes to indicate range
- Label things fully and intelligently
- Highlight important points on the graph

Giving Information the Reader Needs: [two example figures]

Limit Complexity and Confusion:
- Not too many curves
- Single scale for all curves
- No “extra” curves
- No pointless decoration (“ducks”)

Limiting Complexity and Confusion: [two example figures]

Have a Point:
- Graphs should add information not otherwise available to reader
- Don’t plot data just because you collected it
- Know what you’re trying to show, and make sure the graph shows it

Having a Point:
- Sales were up 15% this quarter: [figure]

Having a Point: [three further example figures]

Show Statistics Graphically:
- Put bars in a reasonable order
  - Geographical
  - Best to worst
  - Even alphabetic
- Make bar widths reflect interval widths
  - Hard to do with most graphing software
- Show confidence intervals on the graph
  - Examples will be shown later

Don’t Always Use Graphics:
- Tables are best for small sets of numbers
  - e.g., 20 or fewer
- Also best for certain arrangements of data
  - e.g., 10 graphs of 3 points each
- Sometimes a simple sentence will do
- Always ask whether the chart is the best way to present the information
  - And whether it brings out your message

Text Would Have Been Better: [figure]

Discuss It in the Text:
- Figures should be self-explanatory
  - Many people scan papers, just look at graphs
  - Good graphs build interest, “hook” readers
- But text should highlight and aid figures
  - Tell readers when to look at figures
  - Point out what figure is telling them
  - Expand on what figure has to say

Aesthetics:
- Not everyone is an artist
- But figures should be visually pleasing
- Elegance is found in
  - Simplicity of design
  - Complexity of data

Principles of Aesthetics:
- Use appropriate format and design
- Use words, numbers, drawings together
- Reflect balance, proportion, relevant scale
- Keep detail and complexity accessible
- Have a story about the data (narrative quality)
- Do a professional job of drawing
- Avoid decoration and chartjunk

Use Appropriate Format and Design:
- Don’t automatically draw a graph
  - We’ve covered this before
- Choose graphical format carefully
- Sometimes a “text graphic” works best
  - Use text placement to communicate numbers
  - Very close to being a table

Using Text as a Graphic:
- About a year ago, eight forecasters were asked for their predictions on some key economic indicators. Here’s how the forecasts stack up against the probable 1978 results (shown in the black panel). (New York Times, Jan. 2, 1979) [figure]

The Stem-and-Leaf Plot:
- From Tukey, via Tufte, heights of volcanoes in feet:

  0|98766562
  1|97719630
  2|99987766544422211009850
  3|876655412099551426
  4|9998844331929433361107
  5|97666666554422210097731
  6|898665441077761065
  7|98855431100652108073
  8|653322122937

Choosing a Graphical Format:
- Many options, more being invented all the time
  - Examples will be given later
  - See Jain for some commonly useful ones
  - Tufte shows ways to get creative
- Choose a format that reflects your data
  - Or that helps you analyze it yourself

Use Words, Numbers, Drawings Together:
- Put graphics near or in text that discusses them
  - Even if you have to murder your word processor
- Integrate text into graphics
- Tufte: “Data graphics are paragraphs about data and should be treated as such”

Reflect Balance, Proportion, Relevant Scale:
- Much of this boils down to “artistic sense”
- Make sure things are big enough to read
  - Tiny type is OK only for young people!
- Keep lines thin
  - But use heavier lines to indicate important information
- Keep horizontal larger than vertical
  - About 50% larger works well

Poor Balance and Proportion: [figure]
- Sales in the North and West districts were steady through all quarters
- East sales varied widely, significantly outperforming the other districts in the third quarter

Better Proportion: [figure]
- Sales in the North and West districts were steady through all quarters
- East sales varied widely, significantly outperforming the other districts in the third quarter

Keep Detail and Complexity Accessible:
- Make your graphics friendly:
  - Avoid abbreviations and encodings
  - Run words left-to-right
  - Explain data with little messages
  - Label graphic, don’t use elaborate shadings and a complex legend
  - Avoid red/green distinctions
  - Use clean, serif fonts in mixed case

An Unfriendly Graph: [figure]

A Friendly Version: [figure]

Even Friendlier: [figure]

Have a Story About the Data (Narrative Quality):
- May be difficult in technical papers
- But think about why you are drawing graph
- Example:
  - Performance is controlled by network speed
  - But it tops out at the high end
  - And that’s because we hit a CPU bottleneck

Showing a Story About the Data: [figure]

Do a Professional Job of Drawing:
- This is easy with modern tools
  - But take the time to do it right
- Align things carefully
- Check the final version in the format you will use
  - I.e., print the PostScript one last time before submission
  - Or look at your slides on the projection screen

Avoid Decoration and Chartjunk:
- PowerPoint, etc. make chartjunk easy
  - Avoid clip art, automatic backgrounds, etc.
- Remember: the data is the story
  - Statistics aren’t boring
  - Uninterested readers aren’t drawn by cartoons
  - Interested readers are distracted
- Does removing it change the message?
  - If not, leave it out

Examples of Chartjunk: [figure with callouts]
- Gridlines!
- Vibration
- Pointless Fake 3-D Effects
- Filled “Floor”
- Clip Art
- In or out?
- Filled “Walls”
- Borders and Fills Galore
- Unintentional Heavy or Double Lines
- Filled Labels

Common Mistakes in Graphics:
- Excess information
- Multiple scales
- Using symbols in place of text
- Poor scales
- Using lines incorrectly

Excess Information:
- Sneaky trick to meet length limits
- Rules of thumb (upper limits):
  - 6 curves on line chart
  - 10 bars on bar chart
  - 8 slices on pie chart
- Extract essence, don’t cram things in

Way Too Much Information: [figure]

What’s Important About That Chart?:
- Times for cp and rcp rise with number of replicas
- Most other benchmarks are near constant
  - Exactly constant for rm

The Right Amount of Information: [figure]

Multiple Scales:
- Another way to meet length limits
- Basically, two graphs overlaid on each other
- Confuses reader (which line goes with which scale?)
- Misstates relationships
  - Implies equality of magnitude that doesn’t exist

Some Especially Bad Multiple Scales: [figure]

Using Symbols in Place of Text:
- Graphics should be self-explanatory
  - Remember that the graphs often draw the reader in
- So use explanatory text, not symbols
- This means no Greek letters!
  - Unless your conference is in Athens...

It’s All Greek To Me...: [figure]

Explanation is Easy: [figure]

Poor Scales:
- Plotting programs love non-zero origins
  - But people are used to zero
- Fiddle with axis ranges (and logarithms) to get your message across
  - But don’t lie or cheat
- Sometimes trimming off high ends makes things clearer
  - Brings out low-end detail

Nonzero Origins (Chosen by Microsoft): [figure]

Proper Origins: [figure]

A Poor Axis Range: [figure]

A Logarithmic Range: [figure]

A Truncated Range: [figure]

Using Lines Incorrectly:
- Don’t connect points unless interpolation is meaningful
- Don’t smooth lines that are based on samples
  - Exception: fitted non-linear curves

Incorrect Line Usage: [figure]

Pictorial Games:
- Non-zero origins and broken scales
- Double-whammy graphs
- Omitting confidence intervals
- Scaling by height, not area
- Poor histogram cell size

Non-Zero Origins and Broken Scales:
- People expect (0,0) origins
  - Subconsciously
- So non-zero origins are a great way to lie
  - More common than not in popular press
- Also very common to cheat by omitting part of scale
  - “Really, Your Honor, I included (0,0)”

Non-Zero Origins: [figure]

The Three-Quarters Rule:
- Highest point should be 3/4 of scale or more

Double-Whammy Graphs:
- Put two related measures on same graph
  - One is (almost) function of other
- Hits reader twice with same information
  - And thus overstates impact

Omitting Confidence Intervals:
- Statistical data is inherently fuzzy
- But means appear precise
- Giving confidence intervals can make it clear there’s no real difference
  - So liars and fools leave them out

Graph Without Confidence Intervals: [figure]

Graph With Confidence Intervals: [figure]

Confidence Intervals:
- Sample mean value is only an estimate of the true population mean
- Bounds c1 and c2 such that there is a high probability, 1 - α, that the population mean is in the interval (c1, c2):
  Prob{c1 < μ < c2} = 1 - α
  where μ is the population mean, α is the significance level, and 100(1 - α) is the confidence level
- Overlapping confidence intervals are interpreted as “not statistically different”

Graph With Confidence Intervals: [figure]

Scaling by Height Instead of Area:
- Clip art is popular with illustrators: [figure: Women in the Workforce]

The Trouble with Height Scaling:
- Previous graph had heights of 2:1
- But people perceive areas, not heights
  - So areas should be what’s proportional to data
- Tufte defines a lie factor: size of effect in graphic divided by size of effect in data
  - Not limited to area scaling
  - But especially insidious there (quadratic effect)

Scaling by Area:
- Here’s the same graph with 2:1 area: [figure: Women in the Workforce]

Poor Histogram Cell Size:
- Picking bucket size is always a problem
- Prefer 5 or more observations per bucket
- Choice of bucket size can affect results: [figure]

Principles of Graphics Integrity (Tufte):
- Proportional representation of numbers
- Clear, detailed, thorough labeling
- Show data variation, not design variation
- Use deflated money units
- Don’t have more dimensions than data has
- Don’t quote data out of context

Proportional Representation of Numbers:
- Maintain a lie factor of 1.0
  - Use areas, not heights, with clip art
- Avoiding “decorative” graphs will do wonders
- This isn’t too hard for most engineers

Clear, Detailed, Thorough Labeling:
- Goal is to defeat distortion and ambiguity
- Write explanations on graphic itself
- Label important events in the data

Show Data Variation, Not Design Variation:
- Use one design for the entire graphic
- In papers, try to use one design for all graphs
- Again, artistic license is the big culprit

Use Deflated Money Units:
- Often necessary to show money over time
  - Even in computer science
  - E.g., price/performance over time
  - Or expected future cost of a disk
- Nominal dollars are meaningless
- Derate by some standard inflation measure
  - That’s what the WWW is for!

Don’t Have More Dimensions Than Data Has:
- This gets back to the Lie Factor
- 1-D data (e.g., money) should occupy one dimension on the graph [figures contrasting a correct and an incorrect version; labels: $1.00, $2.00]
- Clip art is prohibited by this rule
  - But if you have to, use an area measure

Don’t Quote Data Out of Context:
- Tufte’s example: [figure]

The Same Data in Context: [figure]

Special-Purpose Charts:
- Histograms
- Scatter plots
- Gantt charts
- Kiviat graphs

Tukey’s Box Plot:
- Shows range, median, quartiles all in one: [figure labeled minimum, quartile, median, quartile, maximum]
- Tufte can’t resist improvements: [three alternative figures]

Histograms: [figure]

Scatter Plots:
- Useful in statistical analysis
- Also excellent for huge quantities of data
- Can show patterns otherwise invisible

Better Scatter Plots:
- Again, Tufte improves the standard
  - But it can be a pain with automated tools
- Can use modified Tukey box plot for axes

Gantt Charts:
- Shows relative duration of Boolean conditions
- Arranged to make lines continuous
  - Each level after first follows FTTF pattern

Kiviat Graphs:
- Also called “star charts” or “radar plots”
- Useful for looking at balance between HB (higher is better) and LB (lower is better) metrics

Useful Reference Works:
Edward R. Tufte, The Visual Display of Quantitative Information, Graphics Press, Cheshire, Connecticut, 1983.
Edward R. Tufte, Envisioning Information, Graphics Press, Cheshire, Connecticut, 1990.
Edward R. Tufte, Visual Explanations, Graphics Press, Cheshire, Connecticut, 1997.
Darrell Huff, How to Lie With Statistics, W.W. Norton & Co., New York, 1954.
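
The Confidence Intervals slide defines bounds c1 and c2 that contain the population mean with probability one minus the significance level. A minimal Python sketch of that computation follows. It is not from the lecture: the function names and the sample data are made up for illustration, and it assumes a normal approximation, which is reasonable only for fairly large samples (small samples usually call for a t-quantile instead).

```python
import math
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, significance=0.05):
    """Return (c1, c2), an approximate 100*(1 - significance)% interval for the mean.

    Assumption: normal approximation (large sample); not taken from the slides.
    """
    n = len(sample)
    center = mean(sample)
    std_error = stdev(sample) / math.sqrt(n)  # stdev() is the sample standard deviation
    z = NormalDist().inv_cdf(1 - significance / 2)  # two-sided normal quantile
    return center - z * std_error, center + z * std_error


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    # Hypothetical response times (ms) for two systems.
    system_a = [102, 98, 105, 110, 95, 101, 99, 104, 97, 103]
    system_b = [100, 107, 96, 109, 94, 108, 101, 99, 106, 102]
    ci_a = confidence_interval(system_a)
    ci_b = confidence_interval(system_b)
    print("A:", ci_a)
    print("B:", ci_b)
    print("overlap:", intervals_overlap(ci_a, ci_b))
```

Following the slide's rule of thumb, two systems whose intervals overlap would be read as "not statistically different", and the intervals themselves are what would be drawn as error bars on the graph.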
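
Tufte's lie factor from the Trouble with Height Scaling slide (size of effect in graphic divided by size of effect in data) can be worked through for the 2:1 clip-art example. The sketch below is illustrative only: the function names are hypothetical, it assumes "size of effect" means the relative change from the first value to the second, and it assumes the clip-art picture is scaled in both width and height.

```python
def effect_size(first, second):
    """Relative change from first to second (assumed reading of 'size of effect')."""
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)


if __name__ == "__main__":
    data = (1.0, 2.0)  # the data doubles

    # Height scaling: a picture twice as tall is also twice as wide,
    # so the area the viewer perceives grows by a factor of four.
    height_scaled_areas = (1.0 * 1.0, 2.0 * 2.0)
    print("height scaling:", lie_factor(*height_scaled_areas, *data))

    # Area scaling: each linear dimension grows by sqrt(2), so the area doubles.
    side = 2.0 ** 0.5
    area_scaled_areas = (1.0, side * side)
    print("area scaling:", lie_factor(*area_scaled_areas, *data))
```

Under these assumptions the height-scaled picture has a lie factor of 3, while the area-scaled version comes out at about 1, which is the target named on the Proportional Representation of Numbers slide.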