Profession and academia

Decoding our Twitter chatter [WSJ]

Want to monitor an earthquake, track political activity or predict the ups and downs of the stock market? Researchers have found a bonanza of real-time data in the torrential flow of Twitter feeds.

When the magnitude 8.8 Chilean earthquake hit last year, researchers found that on Twitter the truth often won out over misinformation. "When a rumor is true, it spreads faster," said computer analyst Barbara Poblete at the University of Chile in Santiago.

Ms. Poblete and her colleagues analyzed how survivors of the earthquake used the messaging service in lieu of more conventional communications that had been knocked out. They discovered that in the crisis, Twitter crowds reflexively sorted facts from falsehoods, exercising a collective wisdom on the fly. She found enough measurable differences in language, citations and posting patterns to devise a way to assess the credibility of Twitter texts automatically, with an accuracy of about 70%.

"The network itself can provide a filter for valid information," Ms. Poblete said.

Full article: Decoding Our Twitter Chatter by Robert Lee Hotz at WSJ, 2011-10-01.

Image: Ars Technica.

"Viscous Democracy" for Social Networks

Decision-making procedures in online social networks should reflect participants' political influence within the network.

Direct-democracy voting in large online communities may not be the best choice. The degree of commitment of different participants in online communities and collaboration systems varies greatly. In a community in which there are a few core members with long-term commitments to the project, and many other members joining and leaving the project rapidly, egalitarian democracy is neither expected nor appropriate. Thus, the decision-making mechanism is often meritocratic.

In this work, we propose a middle ground between direct democracy (citizens vote on every issue) and representative democracy (citizens elect representatives who decide on their behalf on every issue). Our proposal, a type of delegative democracy, allows participants to express their opinion directly or to delegate their voting power to a proxy.

Proxy delegation can be transitive: a proxy can in turn delegate to another proxy. However, as a vote travels farther along a delegation chain, we would like to introduce some reluctance in the way power is transferred to people the voter may not know directly. To that end, we include a damping factor (as PageRank does) that reduces the amount of power delegated through long chains. Technically, our system of viscous democracy is a system of transitive proxy voting with exponential damping.
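To make the damping concrete, here is a minimal sketch in Python. It is an illustration under simplifying assumptions, not the implementation from the paper: each participant delegates to at most one proxy, a vote that has travelled k delegation hops counts as alpha^k, and the names `viscous_scores`, `delegate_to` and `alpha` are hypothetical.

```python
def viscous_scores(delegate_to, alpha=0.5):
    """Accumulate damped voting power along delegation chains.

    delegate_to maps each participant to their proxy, or to None when
    they vote directly. alpha (assumed to lie between 0 and 1) is the
    damping factor applied at every delegation hop.
    """
    scores = {person: 0.0 for person in delegate_to}
    for voter in delegate_to:
        weight = 1.0          # a vote starts at full strength
        current = voter
        seen = set()          # guards against delegation cycles
        while current is not None and current not in seen:
            scores[current] = scores.get(current, 0.0) + weight
            seen.add(current)
            current = delegate_to.get(current)
            weight *= alpha   # exponential damping per hop
    return scores


# cat delegates to bob, bob delegates to ann, ann votes directly.
example = viscous_scores({"ann": None, "bob": "ann", "cat": "bob"}, alpha=0.5)
print(example)
```

With alpha = 0.5, ann ends up with 1 + 0.5 + 0.25 = 1.75: her own vote, bob's vote after one hop, and cat's vote after two hops. With alpha = 1 every delegated vote would arrive at full strength, and as alpha approaches 0 only direct votes would matter.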

Details appear in the virtual extension of the June 2011 issue of Communications of the ACM: Viscous Democracy for Social Networks, by Paolo Boldi, Francesco Bonchi, Carlos Castillo and Sebastiano Vigna.

Read the authors' copy [pdf] »

Information Credibility on Twitter (presentation)

Here is the presentation I gave on this paper:

C. Castillo, M. Mendoza, B. Poblete: "Information Credibility on Twitter". Proc. of WWW 2011, Hyderabad, India. ACM Press.

TAXOMO sequence-mining tool available

I am glad to announce that today we released the TAXOMO sequence mining software under a BSD license.

TAXOMO is a data-mining tool for sequences. It takes as input a set of sequences and a taxonomy, and generates a succinct description of the sequences (specifically, a Markov chain with lumped states).

The input sequences may represent any kind of data, e.g., trajectories on a map or web pages visited by a user. The taxonomy should be defined over the states that appear in the sequences. In the case of a map, for instance, the taxonomy can consist of regions and sub-regions containing the points in the map. In the case of a web site, it can consist of categories and sub-categories for the pages.
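As a rough illustration of what a Markov chain with lumped states looks like, the sketch below maps each state to a taxonomy category and estimates transition probabilities between categories by counting. This is not TAXOMO's algorithm or API (the tool itself decides how to lump states using the taxonomy); the function name, the flat state-to-category mapping `lump`, and the sample data are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def lumped_transition_probs(sequences, lump):
    """Estimate transition probabilities between lumped states.

    sequences: iterable of lists of states.
    lump: dict mapping every state to its (assumed, fixed) category.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        lumped = [lump[state] for state in seq]
        # Count each consecutive pair of lumped states.
        for src, dst in zip(lumped, lumped[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(outgoing.values()) for dst, n in outgoing.items()}
        for src, outgoing in counts.items()
    }


# Hypothetical page visits on a web site, lumped by top-level category.
visits = [
    ["home", "sports/football", "sports/tennis"],
    ["home", "news/world"],
]
category = {
    "home": "home",
    "sports/football": "sports",
    "sports/tennis": "sports",
    "news/world": "news",
}
print(lumped_transition_probs(visits, category))
```

In this toy input, half of the transitions out of "home" lead to "sports" and half to "news", while every transition out of "sports" stays within "sports".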

TAXOMO was developed at Yahoo! Research Barcelona, and it is described in:

Francesco Bonchi, Carlos Castillo, Debora Donato, Aristides Gionis: "Taxonomy-driven lumping for sequence mining". Data Mining and Knowledge Discovery, Springer, Volume 19, Issue 2, pp. 227-244 (2009).

For more information and download, see: http://taxomo.sourceforge.net/

Finding the "best" and the "worst" on the Web and Social Media

Call for papers: Workshop on Web Quality (joint WICOW/AIRWeb workshop)

In conjunction with the 20th International World Wide Web Conference in Hyderabad, India. DEADLINE: 31/Jan/2011

The objective of the workshop is to provide the research communities working on web spam, abuse, credibility, and reputation topics with a survey of current problems and potential solutions. It will present an opportunity for close interaction between practitioners who may have focused on more isolated sub-areas previously. We also want to gather crucial feedback for the academic community from participants representing major industry players on how web content quality research can contribute to practice.

On the one hand, the joint workshop will cover the more blatant and malicious attempts to degrade web quality, such as spam, plagiarism, and various forms of abuse, along with ways to prevent them or neutralize their impact on information retrieval. On the other hand, it will also provide a venue for exchanging ideas on quantifying finer-grained issues of content credibility and author reputation, and on modeling them in web information retrieval.

See the workshop topics and more information »»
