TAXOMO sequence-mining tool available

I am glad to announce that today we released the TAXOMO sequence-mining software under a BSD license.

TAXOMO is a data-mining tool for sequences. It takes as input a set of sequences and a taxonomy, and generates a succinct description of the sequences (specifically, a Markov chain with lumped states).

The input sequences may represent any kind of data, for example trajectories on a map or the web pages visited by a user. The taxonomy should be defined over the states that appear in the sequences. For a map, the taxonomy can consist of regions and sub-regions that contain the points on the map. For a web site, it can consist of categories and sub-categories of pages.
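To make the idea of a Markov chain with lumped states more concrete, here is a minimal Python sketch. It is not TAXOMO's code and does not reproduce how the tool decides which states to lump. It assumes a hand-picked lumping (every page is mapped to its top-level category) and estimates first-order transition probabilities between the lumped states. The page names, the `taxonomy_parent` mapping and the function `lumped_transition_probs` are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def lumped_transition_probs(sequences, lump_of):
    """Estimate first-order transition probabilities between lumped states.

    sequences: list of lists of original states
    lump_of:   dict mapping each original state to its lumped state
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        # Replace every state by the taxonomy node it is lumped into.
        lumped = [lump_of[state] for state in seq]
        # Count transitions between consecutive lumped states.
        for src, dst in zip(lumped, lumped[1:]):
            counts[src][dst] += 1

    # Normalize each row of counts into probabilities.
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        probs[src] = {dst: n / total for dst, n in row.items()}
    return probs


# Hypothetical two-level taxonomy for a web site: page -> category.
taxonomy_parent = {
    "news/politics": "news",
    "news/sports": "news",
    "shop/books": "shop",
    "shop/music": "shop",
}

# Hypothetical browsing sessions (sequences of visited pages).
sequences = [
    ["news/politics", "news/sports", "shop/books"],
    ["news/sports", "shop/music", "shop/books"],
    ["shop/books", "news/politics"],
]

chain = lumped_transition_probs(sequences, taxonomy_parent)
for src in sorted(chain):
    print(src, chain[src])
```

With four pages lumped into two categories, the resulting chain has only two states, which is the kind of succinct description the tool aims for. The sketch fixes the lumping by hand, whereas choosing a good lumping from the taxonomy is the problem TAXOMO addresses.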

TAXOMO was developed at Yahoo! Research Barcelona and is described in:

Francesco Bonchi, Carlos Castillo, Debora Donato, Aristides Gionis: "Taxonomy-driven lumping for sequence mining". Data Mining and Knowledge Discovery, Springer, Volume 19, Issue 2, pp. 227-244 (2009).

For more information and download, see: http://taxomo.sourceforge.net/