Carnegie Mellon: Signals from Twitter Mimic Traditional Public Opinion Polls

Twitter could become a new mechanism for pollsters to gauge public opinion as natural language processing improves, according to research by Carnegie Mellon. A team from the School of Computer Science applied simple text analysis to a billion microblog messages posted to Twitter during 2008 and 2009 (posts averaging 11 words in length). The team first identified messages about the economy or politics, then looked for words within the text that indicated positive or negative sentiments.
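
The two-step approach described above (filter messages by topic, then look for sentiment words) can be sketched roughly as follows. This is a minimal illustration, not the researchers' code: the word lists, the `tokenize` helper, and the `daily_sentiment_counts` function are all hypothetical, and a real sentiment lexicon would be far larger.

```python
from collections import defaultdict

# Hypothetical, tiny word lists used only for illustration.
TOPIC_WORDS = {"economy", "job", "jobs"}
POSITIVE_WORDS = {"good", "great", "hope", "improving"}
NEGATIVE_WORDS = {"bad", "worse", "fear", "lost"}


def tokenize(text):
    """Split on whitespace, strip surrounding punctuation, lowercase."""
    return [w.strip(".,!?;:\"'()").lower() for w in text.split()]


def daily_sentiment_counts(messages):
    """Count positive and negative on-topic messages per day.

    messages: iterable of (day, text) pairs.
    Returns {day: (positive_count, negative_count)}.
    """
    counts = defaultdict(lambda: [0, 0])
    for day, text in messages:
        words = set(tokenize(text))
        if not words & TOPIC_WORDS:
            continue  # step 1: keep only messages about the topic
        if words & POSITIVE_WORDS:
            counts[day][0] += 1  # step 2a: contains a positive word
        if words & NEGATIVE_WORDS:
            counts[day][1] += 1  # step 2b: contains a negative word
    return {day: tuple(c) for day, c in counts.items()}
```

A message containing both a positive and a negative word is counted on both sides in this sketch. How the counts are combined into a single daily score is a separate design choice that the sketch leaves open.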

The sentiments identified by the computer analysis were fairly similar to the results of well-established public opinion polls, such as the Index of Consumer Sentiment (ICS) from the Reuters/University of Michigan Surveys of Consumers, Pollster.com, and the Gallup Organization's Economic Confidence Index.

"The findings suggest that analyzing the text found in streams of tweets could become a cheap, rapid means of gauging public opinion on at least some subjects," the university reported in a statement.

The measurement of opinions derived from Twitter was much more volatile day to day than the polling data. But when the researchers "smoothed" the results by averaging them over a period of days, the results often correlated closely with the polling data, said Brendan O'Connor, a graduate student in Carnegie Mellon's Language Technologies Institute and one of the authors of the study. As an example, the Twitter-derived measure of consumer confidence followed the same general slide through 2008 and the same rebound in February/March of 2009 as was seen in the poll data. The researchers said the ICS and Gallup data had a correlation of 86 percent over the period; the Twitter-derived sentiment measure had a correlation of between 72 percent and 79 percent with the Gallup data, depending on the number of days averaged to smooth the data.
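
The smoothing and comparison steps described above can be illustrated with a short sketch. It assumes a trailing moving average as the smoothing method and the Pearson coefficient as the correlation measure; the article specifies only "averaging them over a period of days" and "correlation," so both choices, along with the function names, are assumptions made here for illustration.

```python
from math import sqrt


def moving_average(values, window):
    """Trailing mean over the most recent `window` values.

    Early positions, where fewer than `window` values exist,
    average over whatever is available so far.
    """
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

In use, a noisy daily sentiment series would be passed through `moving_average` with several candidate window sizes, and each smoothed series compared against the poll series with `pearson`. That mirrors the article's remark that the correlation depended on the number of days averaged.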

"With 7 million or more messages being tweeted each day, this data stream potentially allows us to take the temperature of the population very quickly," said Noah Smith, assistant professor of language technologies and machine learning in the School of Computer Science. "The results are noisy, as are the results of polls. Opinion pollsters have learned to compensate for these distortions, while we're still trying to identify and understand the noise in our data. Given that, I'm excited that we get any signal at all from social media that correlates with the polls."

"The Web is so mainstream now that there's no question that the Web is representative somehow of the population," O'Connor said. But pinning down Web demographics is still difficult, he noted, pointing out that Twitter traffic alone increased by a factor of 50 during the two-year span of the study.

Improved natural language processing tools, as well as query-driven analysis and use of demographic and time stamp data available on some social media sites, could increase the sophistication and reliability of microblog analysis, the researchers reported.

A paper on the topic, "From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series," will be presented at the International Conference on Weblogs and Social Media in Washington, DC, in late May.

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.
