
Following the social media crowd to discover news stories

With Janette Lehmann (UPF), Mounia Lalmas (Yahoo!) and Ethan Zuckerman (MIT Civic Media), we developed an automatic method (pdf, blog post) that groups together all the users who tweet a particular news item, and later detects new content posted by them that is related to the original news item.

We call each such group a transient news crowd. The beauty of this approach, in addition to being fully automatic, is that there is no need to pre-define topics and the crowd becomes available immediately, allowing journalists to cover news beats incorporating the shifts of interest of their audiences.
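The two-step process described above (group the users who shared a story, then watch for related follow-up posts from that same crowd) can be sketched as follows. This is a minimal illustration, not the actual system: the tweet tuples, URL-based story matching, and keyword filter are all simplifying assumptions made for the example.

```python
from collections import defaultdict

def build_crowds(tweets):
    """Group users by the news URL they shared: one transient crowd per story."""
    crowds = defaultdict(set)
    for user, url, text in tweets:
        crowds[url].add(user)
    return crowds

def follow_up_candidates(crowds, story_url, story_keywords, later_tweets):
    """Return later tweets by crowd members that mention the story's keywords."""
    crowd = crowds.get(story_url, set())
    return [(user, text) for user, text in later_tweets
            if user in crowd
            and any(k in text.lower() for k in story_keywords)]

# Toy data: (user, shared URL, tweet text)
tweets = [
    ("alice", "news.example/earthquake", "Earthquake hits coast"),
    ("bob", "news.example/earthquake", "Big quake reported"),
    ("carol", "news.example/election", "Polls open today"),
]
crowds = build_crowds(tweets)

later = [
    ("alice", "Aftershocks continue near the epicenter"),
    ("carol", "Nice weather today"),
]
print(follow_up_candidates(crowds, "news.example/earthquake",
                           ["earthquake", "aftershock"], later))
```

Because the crowd is defined by shared behavior rather than a predefined topic, it exists as soon as the first tweets about a story appear, which is what makes the approach usable for live news beats.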

Continue reading at crowdresearch.org »

Best paper award @ ISCRAM 2013

Congratulations to my colleagues M. Imran (QCRI), S. Elbassuoni (Beirut University), F. Diaz (Microsoft) and P. Meier (QCRI) on receiving the best paper award at the ISCRAM conference. ISCRAM is the main international conference on systems for crisis response and management.

Our work, described in the two papers below (especially the first one), presents a method to extract information nuggets from tweets related to emergencies. For instance, we can go beyond detecting that a tweet is about a donation and identify the specific item being donated (e.g. clothes, money, etc.).
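To make the "information nugget" idea concrete, here is a deliberately naive sketch: first classify a tweet as donation-related, then pull out the donated item. The keyword matching and the item list are placeholder assumptions for this example only; the actual papers use machine learning over annotated tweets, not hand-written rules.

```python
import re

# Hypothetical item vocabulary for the example; not from the papers.
DONATION_ITEMS = ["clothes", "money", "blood", "food", "shelter"]

def extract_donation_item(tweet):
    """Two-stage nugget extraction: detect a donation tweet, then name the item.

    Returns the donated item, or None if the tweet is not donation-related
    or no known item is mentioned.
    """
    text = tweet.lower()
    # Stage 1: is this tweet about a donation at all?
    if not re.search(r"\bdonat\w*", text):
        return None
    # Stage 2: which item is being donated?
    for item in DONATION_ITEMS:
        if item in text:
            return item
    return None

print(extract_donation_item("Please donate clothes for flood victims"))  # clothes
print(extract_donation_item("Traffic is terrible downtown"))             # None
```

The point of the sketch is the two-stage structure (coarse category, then fine-grained nugget), which is the distinction the paragraph above draws.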

Official announcement at QCRI website.

Nowhere to hide: The next manhunt will be crowdsourced

New Scientist 2914, 23 April 2013 (free registration required) describes our project Veri.ly:

... A big problem with theories floated on social media is that information can go viral simply because it is popular, whether or not it is true. Patrick Meier of the Qatar Computing Research Institute (QCRI) in Doha is building Verily, a system that allows users to submit verification requests for information they are interested in. Each request prompts a crowd of online workers to set off into their networks to figure it out. The system gathers evidence for and against the claim, though it won't pass judgement.

...By training machine learning algorithms on huge data sets, Meier is building up profiles of the classes of digital evidence that tend to be credible, and those that are not.

As an example, Meier points to a recent study of misinformation on Twitter after the 2010 Chilean earthquake. Carlos Castillo of the QCRI and colleagues showed that non-credible tweets tend to spark responses that question or rebuke them – a trait software can be trained to recognise. "Non-credible information propagates across the twittersphere leaving very specific ripples behind," says Meier. "You could absolutely start having a probability – a percentage chance that particular tweets are not credible."

Full article in New Scientist (free registration required) »

Signal or Noise? Credibility and Quality Issues on the Web and Social Media

I am glad to announce the third edition of the Web Quality workshop, to be held on May 13th, 2013 in Rio de Janeiro, Brazil. The workshop is co-located with the World Wide Web conference.

This year's theme is the question: Signal or Noise? The Web and social media keep growing and playing an ever-increasing role in our lives. In this context, finding relevant, timely and trustworthy content in a sea of seemingly irrelevant chatter remains a challenging research issue.

The workshop will bring together practitioners and researchers working on key problem areas such as modelling trust and author reputation, detecting abuse and spam, finding high-quality content, and uncovering plagiarism, among other topics.

Website: WebQuality 2013 »

Social media hoaxes [Slate]

Slate.com writes about our upcoming study in Internet Research, which extends our findings presented in "Information Credibility on Twitter" [pdf].

Social media hoaxes: Could machine-learning algorithms help debunk Twitter rumors before they spread?

...

In a new paper, to be published in the journal Internet Research next month, the authors of the Chile earthquake study—Carlos Castillo, Marcelo Mendoza, and Barbara Poblete—test out their algorithm on fresh data sets and find that it works pretty well. According to Meier, their machine-learning classifier had an AUC, or “area under the curve,” of 0.86. That means that, when presented with a random false tweet and a random true tweet, it would assess the true tweet as more credible 86 percent of the time. (An AUC of 1 is perfect; an AUC of 0.5 is no better than random chance.)
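The pairwise interpretation of AUC quoted above can be computed directly: score every (credible, non-credible) pair and count how often the credible tweet gets the higher score, with ties counting half. The scores below are made-up toy values, not results from the paper; only the AUC definition itself is standard.

```python
def pairwise_auc(pos_scores, neg_scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win, matching the standard rank-based definition.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Toy classifier scores (higher = judged more credible); illustrative only.
credible = [0.9, 0.8, 0.7, 0.6]
non_credible = [0.5, 0.4, 0.65]

print(round(pairwise_auc(credible, non_credible), 2))  # 0.92
```

With this reading, an AUC of 0.86 means that in 86 percent of such random pairs the truly credible tweet is ranked as more credible, while 1.0 would be a perfect ranking and 0.5 a coin flip, exactly as the article states.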

My guess is that a knowledgeable and experienced human Twitter user could do better than that under most circumstances. And of course, if a given algorithm became widespread, committed trolls like the Hurricane Sandy villain @ComfortablySmug could find ways to game it. Still, an algorithm has the potential to work much faster than a human, and as it improves, it could evolve into an invaluable "first opinion" for flagging news items on Twitter that might not be true.

...

Source: Slate.com
