This is a guest blog by Brian Southwell from our collaborators, RTI.
Much of the world, it seems, has been atwitter about social media in recent years. Researchers are no exception. Rather than needing to solicit insight from people with telephone calls during dinner or mailing surveys that largely end up in the trash, social scientists now have readily available tools to observe people’s thoughts and ideas, posted publicly. We also can now easily track, at least in the aggregate, what information people are seeking. As Google has emerged as a close confidant to many of us, we collectively can track concerns about the flu, interest in political party conventions, and what questions people have about nutrition. All of these developments suggest a veritable gold mine for social science.
Researchers have responded in earnest. As Senior Editor for Health Communication, I have noticed a distinct uptick in the percentage of submitted papers that rely in some fashion on electronic surveillance rather than formal solicitation of survey respondents. I have even joined the party myself a number of times. A few years ago, for example, former graduate student Brian Weeks and I looked at search interest in (completely unsubstantiated) rumors about Barack Obama, as measured by Google search data, and its direct (if ephemeral) correspondence to television and print news coverage. Whether we should rush headlong toward this research approach without caveats, though, is an open question.
Mounting empirical evidence suggests that we vary substantially in our engagement with social media, and yet the exact nature of that variation is not fully understood or appreciated by researchers. Much has been made of the so-called digital divide, which suggests the role of socioeconomic factors in explaining Internet use and a gap between those with access to technology and those without. The electronic media landscape has changed since the 1990s, however, and economic factors may no longer be the most powerful predictor of social media technology use. Spokespeople from IBM have forecast the imminent closure of the digital divide as more and more people from a range of socioeconomic backgrounds adopt mobile technology that allows ready access to the Internet. Despite these changes, we cannot assume that people use social media in fundamentally similar ways. A recent paper suggests that our basic personality is evident in our pattern of engagement with Facebook, for example.
What we also know is that the public display of information and information sharing between people vary as a function of topic, circumstance, and even available social network ties. A few years ago, collaborators and I found that viral marketing for a free mammography program was constrained by the social ties available in one’s immediate community. In a different example, colleagues and I recently found in a study of household energy tip sharing that relatively few people opted to post such information via social media (as opposed to other means of interpersonal communication). As I outline in a new book – Sharing Disparities: Social Networks and Popular Understanding of Science and Health – to be published this year by RTI Press, information itself is not equal in its tendency to be shared. Emotionally provocative information or information that addresses a pressing situation of uncertainty, for example, seems more prone to sharing than other types of information (hence the proliferation of rumors relative to dry expository information). Moreover, Pew recently reported a substantial discrepancy between sentiment on Twitter and sentiment assessed through other public opinion measures.
What does all of this mean for social scientists interested in leveraging our electronic forays as evidence of generalizable thoughts and sentiments? It does not suggest that there is no utility in such data; far from it. Research using such datasets is noteworthy and has proven useful in detecting the emergence of urgent concern, e.g., searches for flu symptoms. Nonetheless, we need to be cautious in suggesting that the only generalizability limitation for Internet-based research involves socioeconomic disparity. Who publicly posts, what and when they post, who forwards content and to whom they forward, and even who searches are all constrained by fluctuation in individual circumstance, topical salience, social norms, and the availability of technology and social network resources. We need more research regarding these constraints to better understand when, and how much of, the glittering mine of big data from social media is actually valuable and what and whom it represents.