SYDNEY: Is the ‘news cycle’ just our perception of what’s going on in the media, or is it a real phenomenon? For the first time, researchers – who tracked 90 million articles over a three-month period – have an answer.
Led by Jon Kleinberg, a professor of computer science at Cornell University in New York State, the researchers tracked 1.6 million online news sites, including 20,000 mainstream media sites and a vast array of blogs, over the three-month period leading up to the 2008 presidential election – a total of 90 million articles, one of the largest analyses anywhere of online news.
They found a consistent rhythm as stories rose into prominence and then fell off over just a few days, with a ‘heartbeat’ pattern of hand-offs between blogs and mainstream media. But the analysis also showed that the mainstream media lead and blogs follow.
In mainstream media, they found, a story rises to prominence slowly then dies quickly. But in the blogosphere, stories rise in popularity very quickly but then stay around longer, as discussion goes back and forth. Eventually though, almost every story is pushed aside by something newer.
“The movement of news to the Internet makes it possible to quantify something that was otherwise very hard to measure – the temporal dynamics of the news,” said Kleinberg.
“We want to understand the full news ecosystem, and online news is now an accurate enough reflection of the full ecosystem to make this possible. This is one [very early] step toward creating tools that would help people understand the news, where it’s coming from and how it’s arising from the confluence of many sources.”
The researchers also say their work suggests an answer to a longstanding question: Is the “news cycle” just a way to describe our perception of what’s going on in the media, or is it a real phenomenon that can be measured? They opt for the latter, and offer a mathematical explanation of how it works.
The research was presented at the Association for Computing Machinery’s Conference on Knowledge Discovery and Data Mining in Paris earlier this month.
The ideal, Kleinberg said, would be to track ‘memes’, or ideas, through cyberspace, but deciding what an article is about is still a major challenge for computing. The researchers sidestepped that obstacle by tracking quotations that appear in news stories, since quotes remain fairly consistent even though the overall story may be presented in very different ways by different writers.
Even quotes may change slightly or ‘mutate’ as they pass from one article to another, so the researchers developed an algorithm that could identify and group similar but slightly different phrases.
In simple terms, the computer identified short phrases that were part of longer phrases, using those connections to create ‘phrase clusters’. The researchers then tracked the volume of posts in each phrase cluster over time.
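The grouping step described above can be sketched in a few lines of Python. This is a simplified illustration, not the researchers' actual algorithm: it assumes phrases are linked only when one appears word-for-word inside a longer one, merges linked phrases with a basic union-find, and ignores small wording changes. The function names `contains`, `cluster_phrases` and `daily_volume` are hypothetical.

```python
from collections import Counter, defaultdict

def contains(longer, shorter):
    """True if the words of `shorter` appear contiguously inside `longer`."""
    m = len(shorter)
    return any(longer[i:i + m] == shorter for i in range(len(longer) - m + 1))

def cluster_phrases(phrases):
    """Group phrases so that a short phrase joins any longer phrase containing it."""
    words = [tuple(p.lower().split()) for p in phrases]
    parent = list(range(len(phrases)))

    def find(i):
        # Follow parent links to the cluster representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair: quadratic, fine for a sketch but not for 90 million articles.
    for i, a in enumerate(words):
        for j, b in enumerate(words):
            if len(b) < len(a) and contains(a, b):
                parent[find(j)] = find(i)

    clusters = defaultdict(list)
    for i, p in enumerate(phrases):
        clusters[find(i)].append(p)
    return sorted(sorted(c) for c in clusters.values())

def daily_volume(posts, clusters):
    """Count posts per cluster per day; `posts` is a list of (day, phrase) pairs."""
    cluster_of = {p: k for k, c in enumerate(clusters) for p in c}
    volume = defaultdict(Counter)
    for day, phrase in posts:
        if phrase in cluster_of:
            volume[cluster_of[phrase]][day] += 1
    return volume
```

Given "lipstick on a pig" and "you can put lipstick on a pig", the sketch places both in one cluster, and `daily_volume` then counts how many posts quote either version on each day.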
In the August and September 2008 data, they found threads rising and falling on a more or less weekly basis, with major peaks corresponding to the Democratic and Republican conventions, the “lipstick on a pig” discussion, rising concern over the financial crisis and discussions of a bank bailout plan.
By far the biggest spike was the “lipstick on a pig” comment during the U.S. presidential campaign. On September 9, then Democratic contender Barack Obama likened his Republican rival John McCain’s attempt to distance himself from the Republican Bush Administration to “putting lipstick on a pig”, a rhetorical expression meant to imply that cosmetic changes do not disguise an item’s essential nature. Republicans accused Obama of a sexist attack on McCain’s female vice-presidential running mate, Sarah Palin. Obama denied the charge.
The slow rise of a new story in the mainstream, the researchers suggest, results from imitation: as more sites carried a story, other sites were more likely to pick it up. But the life of a story is limited, as new stories quickly push out the old. A mathematical model based on the interaction of imitation and how recently a story appeared predicted the pattern fairly well, the researchers said. Predictions based on either imitation or ‘recency’ alone couldn’t come close.
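The interplay of imitation and recency can be illustrated with a toy simulation. The sketch below is an assumption-laden illustration rather than the researchers' model: the weighting formula in `attention_weight`, the `decay` parameter and the function names are all invented here. One new story appears at each time step, and each news source then picks a story to cover with probability proportional to how often it has already been covered (imitation), discounted by its age (recency).

```python
import random

def attention_weight(mentions, age, decay=0.5):
    """Assumed form: more prior mentions raise the weight, greater age lowers it."""
    return (mentions + 1) * decay ** age

def simulate(steps=30, sources_per_step=50, decay=0.5, seed=0):
    """Return history[t][j]: mentions of story j during step t (story j is born at step j)."""
    rng = random.Random(seed)
    counts = []   # running total of mentions for each story
    history = []
    for t in range(steps):
        counts.append(0)  # a new story enters at every step
        weights = [attention_weight(counts[j], t - j, decay)
                   for j in range(len(counts))]
        # Each source independently chooses one story to write about.
        picks = rng.choices(range(len(counts)), weights=weights, k=sources_per_step)
        step_counts = [0] * len(counts)
        for j in picks:
            step_counts[j] += 1
        for j, c in enumerate(step_counts):
            counts[j] += c
        history.append(step_counts)
    return history
```

Setting `decay=1.0` removes the recency effect and leaves imitation alone, which gives a rough sense of why the researchers needed both ingredients to reproduce the rise and fall of stories.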
Watching how stories moved between mainstream media and blogs revealed a sharp dip and rise the researchers described as a “heartbeat”. When a story first appears, there is a small rise in activity in both spheres; as mainstream activity increases, the proportion blogs contribute becomes small; but soon the blog activity shoots up, peaking an average of 2.5 hours after the mainstream peak.
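The two measurements behind the ‘heartbeat’ description, the share of activity coming from blogs and the lag between the two peaks, can be sketched as follows. The function names and the hourly counts in the usage note are invented for illustration and are not taken from the study.

```python
def blog_share(mainstream, blogs):
    """Fraction of all mentions in each time bin that come from blogs."""
    return [b / (m + b) if (m + b) else 0.0
            for m, b in zip(mainstream, blogs)]

def peak_lag(mainstream, blogs, hours_per_bin=1.0):
    """Hours from the mainstream peak to the blog peak (positive if blogs peak later)."""
    m_peak = max(range(len(mainstream)), key=mainstream.__getitem__)
    b_peak = max(range(len(blogs)), key=blogs.__getitem__)
    return (b_peak - m_peak) * hours_per_bin
```

With made-up hourly counts such as `mainstream = [1, 4, 9, 5, 2, 1, 0, 0]` and `blogs = [1, 1, 2, 4, 7, 9, 6, 3]`, the blog share dips while mainstream coverage peaks and then climbs as blogs take over, and `peak_lag` reports that the blog peak comes three hours after the mainstream one.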
Almost all stories started in the mainstream media. Only 3.5 per cent of the stories tracked first appeared predominantly in the blogosphere and then moved to the mainstream.
The mathematical model needs to be refined, the researchers said, and they suggested further study of how stories move between sites with opposing political orientation.
“It will be useful to further understand the roles different participants play in the process,” the researchers concluded, “as their collective behaviour leads directly to the ways in which all of us experience news and its consequences.”
Kleinberg’s team included postdoctoral researcher Jure Leskovec and graduate student Lars Backstrom.