What’s on Wikipedia, and What’s Not…? Completeness of Information on the Online Collaborative Encyclopedia

Cindy Royal, Ph.D., Assistant Professor, Texas State University, School of Journalism and Mass Communication
Deepina Kapila, Graduate Student, Texas State University, School of Journalism and Mass Communication

Added: October 23, 2007

Introduction - Wikipedia
- Wikipedia (www.wikipedia.com), deemed “the free encyclopedia,” was launched on the web in 2001.
- Since then, it has become the Web’s 3rd most popular news and information source.
- It uses the Wiki software format, which allows a community of users to develop and monitor content.
- Wikipedia operates under the assumption that the public will act as a policing force, keeping content reliable and up to date.

Introduction - Research
- Denning et al. (2005) listed the risks inherent in Wikipedia’s model: accuracy, motives, uncertain expertise, volatility, coverage, sources.
- Bopp and Smith (2001) state that coverage in an encyclopedia should be “Even across all subjects.”
- Shoemaker and Reese (1995) identified the individual as a news influencer. Web users and content creators tend to be young.
- Tankard/Royal (2005): inherent biases in Web content, based on systematic searches.

Research Questions
This project measures the content of Wikipedia against various indexes or standards of completeness to identify and uncover potential inherent biases. We are asking:
1. Are there some systematic gaps or biases in the overall presentation of information made available on Wikipedia?
2. Is recency (or currency) a predictor of amount of information on Wikipedia?
3. Is importance of information a predictor of amount of information on Wikipedia?
4. Is population a predictor of amount of information about particular countries on Wikipedia?
5. Is economic power a predictor of amount of information about individual corporations on Wikipedia?

Method
- Using predictors of recency, importance, country population, and economic power, several systematic searches on Wikipedia were conducted.
- Each article for each topic was visited, the relevant content highlighted, and the selection’s words were counted.
- Word counts were captured in a spreadsheet, and items were plotted on charts:
  - Ascending order
  - Predictor variable

Topics Covered
- Years (1900-2010)
- Academy Award Winning Films
- Time Magazine’s Person of the Year
- #1 Song on Billboard Top 100 (1940-2006)
- Encyclopedia Terms
- Countries in the United Nations
- Fortune 1000 companies

Results - Years
Charts: Ascending Order; Chronological Order
- Backward L-shaped curve
- Clear progression of length of article with year; dramatic increase in years after 2001
- Years in the future displayed understandably shorter word counts
- Spearman correlation between variables: .79

Results - Films
Charts: Ascending Order; Chronological Order
- Backward L-shaped curve is apparent.
- With few exceptions (e.g., Gone with the Wind, 1939, and Casablanca, 1943), the results show progression favoring more current films.
Recency is important, but certain films transcend time and are deemed important for other reasons.
- Average word count for films since 2001 was 80% higher than word count before 2001.
- Spearman correlation between variables: .49; increased to .62 simply by removing 2 outliers

Results - Person of the Year
Charts: Ascending Order; Chronological Order
- Softer backward L-shaped curve
- Even distribution shows bias is unrelated to recency, measured by another variable of importance
- Spearman correlation between variables: 0; there was no relationship with time.

Results - Billboard Top 100
Charts: Ascending Order; Chronological Order
- Backward L-shaped curve
- Although average word count was 32% higher for artists since 1990, the distribution shows a trend similar to movies in that some artists transcend time.
- Spearman correlation between variables: .40 (after eliminating 2 outliers)

Encyclopedia Terms
Chart: Ascending Order
- Comparison between Encyclopedia Britannica and Wikipedia articles
- Backward L-shaped distribution apparent
- Spearman correlation used to compare inches of content in Encyclopedia Britannica with word count in Wikipedia: .26
- Of 100 terms, 14 were not represented in Wikipedia

Results - UN Countries
Charts: Ordered by Population; Ascending Order
- Backward L-shaped curve: although fairly evenly distributed, a sharp increase appears for the top 22 countries.
- Gradual upward curve in 2nd chart shows that as population increases, so does word count
- Average word count for top 10% of countries was 63% higher than the rest on the list
- Spearman correlation between variables: .55

Results - Fortune 1000
Charts: Ascending Order; Ordered by Revenue
- Backward L-shaped curve
- Sharp increase for top 10% of companies by revenue
- Top 10% of companies by revenue accounted for 30% of total word count on companies
- Spearman correlation between variables: .49

Conclusion
- Information on Wikipedia is volatile, dynamic and constantly changing over time
- Wikipedia’s purpose is to serve as a general reference source, but the content is weighted due to its contributors’ demographics
- In each search performed for the dimensions, strong biases were evident and strong correlations were observed:
  - Currency/Recency: the more current topics were covered the most
  - Random Selection: Encyclopedia terms showed clear bias towards more common or popular terms
  - Relevancy: Wikipedia’s word count correlates to inches in a traditional encyclopedia, showing a strong agenda by each publication
  - Population: the larger the country and the larger its population, the higher the word count
  - Revenue: the larger the revenue, the higher the word count
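
The slides report a Spearman correlation and, for some topics, the share of word count held by the top 10% of items, but they do not show how either was computed (the word counts were kept in a spreadsheet). As a minimal sketch of one way such figures could be reproduced, assuming each item is a (predictor, word count) pair, the Python below ranks both variables (ties get the mean rank) and correlates the ranks. The function names and the sample numbers are illustrative assumptions, not the authors' tooling or data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: the Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)  # undefined if either variable is constant


def top_share(predictor, word_counts, fraction=0.10):
    """Share of total word count held by the top `fraction` of items,
    where "top" means the largest predictor values (e.g. revenue)."""
    paired = sorted(zip(predictor, word_counts), reverse=True)
    k = max(1, round(len(paired) * fraction))
    return sum(w for _, w in paired[:k]) / sum(word_counts)


# Made-up sample: article year vs. word count (not the study's data).
years = [1900, 1950, 1990, 2001, 2005]
words = [300, 450, 400, 900, 1500]
print(round(spearman(years, words), 2))
print(round(top_share(years, words, fraction=0.2), 2))
```

Because Spearman works on ranks rather than raw values, a handful of very long articles cannot dominate the coefficient the way they would in a raw-value correlation, which suits the skewed, backward L-shaped distributions described in the results.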