Publications on Web Mining and Search
Submitted or in preparation
David Laniado, Andreas Kaltenbrunner, Carlos Castillo, Mayo Fuster-Morell: "Emotions and dialogue in a peer-production community: the case of Wikipedia". [request by mail]
Robert West, Ingmar Weber, Carlos Castillo: "Smart but Fun: A Data-Driven Portrait of Wikipedia Editors".
Carlos Castillo, Marcelo Mendoza, Barbara Poblete: "Information Credibility in Time-Sensitive Social Media". Submitted for publication. [request by mail]
Published, or to be published in 2012 (2)
Conference articles
- Aris Anagnostopoulos, Carlos Castillo, Aristides Gionis, Luca Becchetti, Stefano Leonardi: "Online Team Formation in Social Networks". In Proc. of WWW, pp. 839-848. Lyon, France, 2012. ACM Press. [acm|bib|www]
- Eduardo Ruiz, Vagelis Hristidis, Carlos Castillo, Aristides Gionis and Alejandro Jaimes: "Correlating Financial Time Series with Micro-Blogging Data". In WSDM, Seattle, Washington. pp. 513-522, ACM Press. 2012. [bib|slides|y!|acm]
Poster
Robert West, Ingmar Weber, Carlos Castillo: A Data-Driven Sketch of Wikipedia Editors. WWW Posters, 2012 [photo].
Published in 2011 (6)
Monograph
- Carlos Castillo and Brian D. Davison: "Adversarial Web Search". In Foundations and Trends in Information Retrieval, Vol. 4, No. 5, pp. 377-486. Now Publishers. 2010. [now|amazon|amazon uk|bib]
Journal articles
- Paolo Boldi, Francesco Bonchi, Carlos Castillo, and Sebastiano Vigna: "Viscous Democracy for Social Networks". In Communications of the ACM, No. 6, June 2011 [slides|acm|y|bib|cacm].
- Francesco Bonchi, Carlos Castillo, Aristides Gionis and Alejandro Jaimes: "Social Network Analysis and Mining for Business Applications". ACM Transactions on Intelligent Systems and Technology (TIST), Vol. 2 Issue 3, April 2011. [acm|y|bib]
- Dino Ienco, Francesco Bonchi, Carlos Castillo: "Meme Ranking to Maximize Posts Virality in Microblogging Platforms". Journal of Intelligent Information Systems. October 2011. [springer]
Conference articles
- Michael Mathioudakis, Francesco Bonchi, Carlos Castillo, Aristides Gionis, Antti Ukkonen: "Sparsification of Influence Networks". In Proceedings of KDD, pp. 529-537. San Diego, CA, USA. 2011. [acm|y!|slides]
- Carlos Castillo, Marcelo Mendoza, Barbara Poblete: "Information Credibility on Twitter". In Proceedings of WWW conference, pp. 675-684. Hyderabad, India. 2011. [bib|slides (complete, prezi)|slides (partial, pdf)|acm|ars technica|wsj]. The labels used are available on request: [request by mail]
Workshop report
Carlos Castillo, Zoltán Gyöngyi, Adam Jatowt, Katsumi Tanaka: Joint WICOW/AIRWeb workshop on web quality (WebQuality 2011). WWW (Companion Volume), pp. 313-314, 2011. [acm]
Published in 2010 (9)
Book chapter
- Carlos Castillo, Ricardo Baeza-Yates, Berthier Ribeiro-Neto: "Web Crawling". Chapter 12 in Ricardo Baeza-Yates and Berthier Ribeiro-Neto, "Modern Information Retrieval, Second Edition". 2010.
Journal articles
- Paolo Boldi, Francesco Bonchi, Carlos Castillo, Sebastiano Vigna: "Query Reformulation Mining: Models, Patterns and Applications". In Information Retrieval, Springer. 2010. [springer|bib]
- Jacob Abernethy, Olivier Chapelle, Carlos Castillo: "Graph Regularization Methods for Web Spam Detection". In Machine Learning Journal, vol. 81, no. 2, pp. 207-225. Springer. Was: "WITCH: A New Approach to Web Spam Detection", Yahoo! Technical Report 2008-01. [VIDEO|bib|springer]
- Luca Becchetti, Paolo Boldi, Carlos Castillo, Aristides Gionis: "Efficient Algorithms for Large-Scale Local Triangle Counting". ACM Transactions on Knowledge Discovery from Data, Volume 4, Issue 3. ACM Press. [acm|bib]
Conference articles
- Aris Anagnostopoulos, Carlos Castillo, Aristides Gionis, Luca Becchetti, Stefano Leonardi: "Power in Unity: Forming Teams in Large-Scale Community Systems". Proc. of CIKM 2010, pp. 599-608. Toronto, Canada. ACM Press. [bib|acm|slides]
- Ilija Subasic, Carlos Castillo: "The Effects of Query Bursts on Web Search Results". In Proc. of ACM/IEEE Web Intelligence 2010, pp. 374-381. Toronto, Canada. Best student paper award [ieee|bib|y!]
- Ingmar Weber, Carlos Castillo: "The Demographics of Web Search". In SIGIR, pp. 523-530. Geneva, Switzerland, 2010. ACM Press. [errata|slides|bib|y!|acm|new scientist|slashdot|the economist]
- Ilaria Bordino, Carlos Castillo, Debora Donato and Aristides Gionis: "Query Similarity by Projections on the Query-Flow Graph". In SIGIR, pp. 515-522. Geneva, Switzerland, 2010. ACM Press. [bib|acm|slides]
- Aristides Anagnostopoulos, Luca Becchetti, Carlos Castillo and Aristides Gionis: "An Optimization Framework for Query Recommendation". In Proceedings of Web Search and Data Mining (WSDM), pp. 161-170, New York, USA. 2010. [acm|slides|bib|talk blogpost]
Workshop articles and talks
Dino Ienco, Francesco Bonchi, Carlos Castillo: "The Meme Ranking Problem: Maximizing Microblogging Virality". In SIASP workshop. Sydney, Australia. [ieee|bib]
Marcelo Mendoza, Barbara Poblete, Carlos Castillo: "Twitter Under Crisis: Can we trust what we RT?". In SOMA 2010: KDD Workshop on Social Media Analytics, Washington, DC. July 2010. [acm|bib|soma|VIDEO|wall street journal|scientific american]
Ranieri Baraglia, Carlos Castillo, Debora Donato, Franco Maria Nardini, Raffaele Perego and Fabrizio Silvestri: "The Effects of Time on Query Flow Graph-based Models for Query Suggestion". In Proceedings of RIAO. Paris, France, 2010. [slides]
Carlos Castillo, Aristides Gionis, Ronny Lempel, Yoelle Maarek: "When no clicks are good news". Industry track, SIGIR 2010. Geneva, Switzerland. [slides|video (teaser)]
Encyclopedia Entry
Carlos Castillo and Ricardo Baeza-Yates: "Web Retrieval and Mining". In Encyclopedia of Library and Information Sciences, Third Edition. Taylor & Francis, pp. 5615-5622, 2010. [bib|request by mail]
Course Materials (in Spanish)
Mari Carmen Marcos. Entrevista a Carlos Castillo [on line]. "Hipertext.net", núm. 8, 2010.
Published in 2009 (5)
Journal articles
- Francesco Bonchi, Carlos Castillo, Debora Donato and Aristides Gionis: "Taxonomy-driven lumping for sequence mining". Data Mining and Knowledge Discovery Journal, vol. 19, no. 2, pp. 227-244, 2009. Springer. [TAXOMO|VIDEO|bib|springer|slides|abstract]
Conference articles
- Paolo Boldi, Francesco Bonchi, Carlos Castillo, Sebastiano Vigna: "Voting in social networks". In CIKM 2009, pp. 777-786. ACM Press. Was TR RI 327-09 Università degli Studi di Milano. [slides|bib|acm]
- Michalis Potamias, Francesco Bonchi, Carlos Castillo, Aristides Gionis: "Fast shortest path distance estimation in large networks". In CIKM 2009, pp. 867-876. ACM Press. Best student paper award [bib|slides|y!|acm]
- Paolo Boldi, Francesco Bonchi, Carlos Castillo and Sebastiano Vigna: "From 'dango' to 'japanese cakes': Query Reformulation Models and Patterns". In IEEE/ACM Web Intelligence, 2009. IEEE CS Press. Best paper award. [slides|bib|ieee|y!]
- Ricardo Baeza-Yates, Christian Middleton, Carlos Castillo: "The Geographical Life of Search". To appear in IEEE/ACM Web Intelligence, 2009. IEEE CS Press. [bib|ieee|slides]
Conference article (short paper)
Ranieri Baraglia, Carlos Castillo, Debora Donato, Franco Maria Nardini, Raffaele Perego, Fabrizio Silvestri: "Aging effects on Query Flow Graph for Query Suggestion" (short paper). In CIKM 2009, pp. 1947-1950. ACM Press. [bib|poster|acm]
Workshop articles
Paolo Boldi, Francesco Bonchi, Carlos Castillo, Debora Donato, Sebastiano Vigna: "Query Suggestions Using Query-Flow Graphs". Workshop on Web Search Click Data (WSCD), pp. 56-63, 2009. [acm|slides|bib]
Marcin Sydow, Francesco Bonchi, Carlos Castillo, Debora Donato: "Optimising Topical Query Decomposition". Workshop on Web Search Click Data (WSCD), pp. 43-47, 2009. [acm|bib]
Talks
Video: Minería de logs de consulta (in Spanish). Universidad de Oviedo, 2009-05-27
Video: Análisis de enlaces y detección de spam en la Web (in Spanish). Universidad de Oviedo, 2009-05-28. Press Coverage @ La Nueva España
Query-log Mining. Universidade Federal de Minas Gerais, 2009-03-19
Published in 2008 (8)
Journal Articles
- Patrizia Andronico, Marina Buzzi, Carlos Castillo and Barbara Leporini: "Evaluating a Modified Google User Interface Via Screen Reader". Journal of Universal Access in the Information Society, Vol. 7, No. 3, pp. 155-177. 2008. Springer. [bib|springer] (Extends "Testing Google interfaces modified for the blind" poster in WWW2006)
- Luca Becchetti, Carlos Castillo, Debora Donato, Ricardo Baeza-Yates, Stefano Leonardi: "Link Analysis for Web Spam Detection". ACM Transactions on the Web, Vol. 2, No. 1, Art. 2, 2008. ACM Press. [bib|acm] (extends "Link-based characterization ..." in AIRWeb'06 and "Using rank propagation..." in WebKDD'06).
- Josiane Xavier-Parreira, Carlos Castillo, Debora Donato, Sebastian Michel, Gerhard Weikum: "The JXP Method for Robust PageRank Approximation in a Peer-to-Peer Web Search Network". The VLDB Journal, Vol. 17, No. 2, pp. 291-313, 2008. (extends Parreira et al.'s JXP algorithm in VLDB'06 and "Computing trusted authority..." in AIRWeb'06). Springer. [bib|springer]
Conference Articles
- Barbara Poblete, Aristides Gionis, Carlos Castillo: "Dr. Searcher and Mr. Browser: a unified hyperlink-click graph". Proceedings of CIKM, pp. 1123-1132. Napa Valley, CA, USA, October 2008. ACM Press. [slides|acm]
- Paolo Boldi, Francesco Bonchi, Carlos Castillo, Debora Donato, Aristides Gionis, Sebastiano Vigna: "The query-flow graph: model and applications". Proceedings of CIKM, pp. 609-618. Napa Valley, CA, USA, October 2008. ACM Press. [slides|bib|acm]
- Francesco Bonchi, Carlos Castillo, Debora Donato, Aristides Gionis: "Topical query decomposition". In Proceedings of ACM KDD, pp. 52-60. Las Vegas, USA, August 2008. [bib|slides|acm]
- Luca Becchetti, Paolo Boldi, Carlos Castillo, Aristides Gionis: "Efficient Semi-Streaming Algorithms for Local Triangle Counting in Massive Graphs". In Proceedings of ACM KDD, pp. 16-24. Las Vegas, USA, August 2008. ACM Press. [bib|slides|acm] Was tech. report RI 316-07, Dipartimento di Scienze dell'Informazione, Università degli Studi di Milano.
- Eugene Agichtein, Carlos Castillo, Debora Donato, Aristides Gionis, Gilad Mishne: "Finding high quality content in social media, with an application to community-based question answering". Proceedings of Web Search and Data Mining (WSDM), pp. 183-194. Stanford, California, USA, 2008. ACM Press. [bib|mirror|y!|slides|acm|VIDEO]. Was Technical Report YR-2007-005, Yahoo! Research.
Workshop articles
Carlos Castillo, Claudio Corsi, Debora Donato, Paolo Ferragina, Aristides Gionis: "Query log mining for detecting polysemy and spam". To appear in Proceedings of WebKDD, Las Vegas, USA, 2008. Springer. [bib]
Carlos Castillo, Claudio Corsi, Debora Donato, Paolo Ferragina and Aristides Gionis: "Query-log mining for detecting spam". Proceedings of AIRWeb 2008, pp. 17-20. Beijing, China. ACM Press. [bib|acm]
Jacob Abernethy, Olivier Chapelle and Carlos Castillo: "Webspam Identification Through Content and Hyperlinks". Proceedings of AIRWeb 2008, pp. 41-44. Beijing, China. [bib|acm]
Workshop/Project Report
Carlos Castillo, Kumar Chellapilla, Brian Davison, "AIRWeb'07 Workshop report". SIGIR Forum, June 2008, pp. 68-72. [bib|acm|sigirf]
Carlos Castillo, Kumar Chellapilla, Dennis Fetterly, "Fourth international workshop on Adversarial Information Retrieval on the Web (AIRWeb 2008)". In WWW Workshops, April 2008. [bib|acm]
Luca Becchetti, Carlos Castillo, Debora Donato, Stefano Leonardi and Ricardo Baeza-Yates: "Web spam detection: Link-based and content-based techniques". In Friedhelm Meyer (Ed.), The European Integrated Project Dynamically Evolving, Large Scale Information Systems (DELIS): proceedings of the final workshop, pp. 99-113. Heinz-Nixdorf Institut, Universität Paderborn. [bib]
Poster
Antti Ukkonen, Carlos Castillo, Debora Donato, Aristides Gionis: "Searching the Wikipedia with contextual information". Proceedings of CIKM, pp. 1351-1352. Napa Valley, CA, USA, October 2008. ACM Press. [bib|acm]
Book Chapter
Marcin Sydow, Jakub Piskorski, Dawid Weiss, Carlos Castillo: "Fighting Web Spam". In F. Fogelman-Soulié et al. (eds.): Mining Massive Data Sets for Security, Vol. 19 of NATO SPSS Series D., pp. 134-153. IOS Press, 2008. [VIDEO|bib|request by mail]
Invited Column
Carlos Castillo, Yiyu Yao: "EvalWare: Granular Computing for Web Applications". IEEE Signal Processing Magazine, Vol. 25, No. 2, pp. 142-143, March 2008. [ieee|bib]
Published in 2007 (6)
Journal Articles
- Ricardo Baeza-Yates, Carlos Castillo and Efthimis N. Efthimiadis: "Characterization of National Web Domains". ACM Transactions on Internet Technology, Vol. 7, No. 2, Art. 9. May 2007. ACM Press. [bib|acm]
- Ricardo Baeza-Yates and Carlos Castillo: "Crawling the Infinite Web". Journal of Web Engineering, Vol. 6, No. 1, pp. 49-72. February 2007. Rinton Press (Extends our paper in WAW'04) [bib|rinton]
- Gabriel Tolosa, Fernando Bordignon, Ricardo Baeza-Yates, Carlos Castillo: "Characterization of the Argentinian Web". Cybermetrics, Vol. 11, No. 1, P. 7. July 2007. [bib]
Conference Articles
- Carlos Castillo, Debora Donato, Aristides Gionis, Vanessa Murdock, Fabrizio Silvestri: "Know your Neighbors: Web Spam Detection using the Web Topology". In Proceedings of SIGIR, pp. 423-430. Amsterdam, Netherlands, 2007. ACM Press. [acm|y!|bib|delis|talk in spanish at ojobuscador] Was DELIS technical report DELIS-TR-0458.
- Carlos Castillo, Debora Donato, Aristides Gionis: "Estimating the Number of Citations using Author Reputation". String Processing and Information Retrieval Symposium (SPIRE), pp. 107-117. Santiago, Chile, 2007. Springer. [y!|bib|springer]
- Gabriel H. Tolosa, Fernando R. A. Bordignon, Ricardo Baeza-Yates, Carlos Castillo: "Distinctive Features of the Argentinian Web". To appear in Proceedings of LA-WEB. Santiago, Chile, 2007. IEEE CS Press.
Workshop Articles
Josiane-Xavier Parreira, Debora Donato, Carlos Castillo, Gerhard Weikum: "Computing Trusted Authority Scores in Peer-to-Peer Networks". Workshop on Adversarial Information Retrieval on the Web (AIRWeb), pp. 73-80. Banff, Canada. 2007. [bib|y!|airweb|acm]
Debora Donato, Mario Paniccia, Maddalena Selis, Carlos Castillo, Giovanni Cortese, Stefano Leonardi: "New Metrics for Reputation Management in P2P Networks". Workshop on Adversarial Information Retrieval on the Web (AIRWeb), pp. 65-72. Banff, Canada. 2007. [bib|y!|airweb|acm]
Invited Paper
Ricardo Baeza-Yates, Carlos Castillo, Flavio Junqueira, Vassilis Plachouras, Fabrizio Silvestri: "Challenges on Distributed Information Retrieval" (Invited Paper). International Conference on Data Engineering (ICDE). Istanbul, Turkey, April 2007. IEEE CS Press. [bib|talk|y!|ieee]
Workshop Proceedings
Carlos Castillo, Kumar Chellapilla, Brian D. Davison (chairs/editors): "Proceedings of the 3rd international workshop on Adversarial information retrieval on the web". ACM ICPS, Vol. 215. 2007. [bib|acm]
National Journal
Carlos Castillo, Bartlomiej Starosta, Marcin Sydow: "Crawl.pl: Measuring Statistical and Structural Properties of the Polish Web". Studia Informatica, 1(8), pp. 43-73, PL ISSN: 1731-2264, Academy of Podlasie Press, 2007. [bib]
Regional Conference
Gabriel H. Tolosa, Fernando R. A. Bordignon, Ricardo Baeza-Yates, Carlos Castillo: "Caracterización del Espacio Web de Argentina" (in Spanish). To be presented in CLEI. Costa Rica, 2007.
Published in 2006 (7)
Journal Articles
- Ricardo Baeza-Yates, Paolo Boldi, Carlos Castillo: "Generic Damping Functions for Propagating Importance in Link-Based Ranking". Journal of Internet Mathematics, Vol. 3, No. 4, pp. 445-478, 2006. A K Peters. [bib|jim] (extends "Generalizing PageRank ..." in SIGIR'06)
- Patrizia Andronico, Marina Buzzi, Carlos Castillo and Barbara Leporini: "Improving Search Engine Interfaces for Blind Users: a Case Study". Journal of Universal Access in the Information Society, special issue on Information Systems Accessibility. Vol. 5, No. 1, pp. 23-40, June 2006. Springer. [bib|springer]
- Ricardo Baeza-Yates, Carlos Castillo and Vicente López: "Características de la Web de España" (in Spanish). El Profesional de la Información, Vol. 15, No. 1. January-February, pp. 6-17, 2006. [bib|metapress]
Conference Articles
- Luciana Buriol, Carlos Castillo, Debora Donato, Stefano Leonardi and Stefano Millozzi: "Temporal Analysis of the Wikigraph". In Proceedings of the Web Intelligence Conference, pp. 45-51. Hong Kong, December 2006. IEEE CS Press. [bib|acm]
- Carlos Castillo, Alberto Nelli and Alessandro Panconesi: "A Memory-Efficient Strategy for Exploring the Web". In Proceedings of the Web Intelligence Conference, pp. 680-686. Hong Kong, December 2006. IEEE CS Press. [bib]
- Ricardo Baeza-Yates, Paolo Boldi and Carlos Castillo: "Generalizing PageRank: Damping Functions for Link-Based Ranking Algorithms". In Proceedings of ACM SIGIR, pp. 308-315. Seattle, Washington, USA, August 2006. [acm|bib|talk@pisa] (See also TR N. 305-05 Univ. of Milano, 2005).
Encyclopedic Article
- Ricardo Baeza-Yates and Carlos Castillo: "Web Searching". In Keith Brown, (Editor-in-Chief), Encyclopedia of Language and Linguistics, Second Edition, Vol. 13, pp. 527-537. Oxford: Elsevier, 2006.
Workshop Articles
Luca Becchetti, Carlos Castillo, Debora Donato and Adriano Fazzone: "A Comparison of Sampling Techniques for Web Characterization". In Proceedings of the Workshop on Link Analysis (LinkKDD). Philadelphia, USA, August 2006. ACM Press. [bib|linkkdd]
Luca Becchetti, Carlos Castillo, Debora Donato, Stefano Leonardi, Ricardo Baeza-Yates: "Using Rank Propagation and Probabilistic Counting for Link-Based Spam Detection". In Proceedings of the Workshop on Web Mining and Web Usage Analysis (WebKDD). Philadelphia, USA, August 2006. ACM Press. [bib|webkdd|acm|VIDEO] (See also DELIS TR-0341).
Luca Becchetti, Carlos Castillo, Debora Donato, Stefano Leonardi, Ricardo Baeza-Yates: "Link-Based Characterization and Detection of Web Spam". Workshop on Adversarial Information Retrieval on the Web (AIRWeb). Seattle, USA, August 2006. [bib|airweb|talk@bcn]
Gemma Boleda, Stefan Bott, Carlos Castillo, Rodrigo Meza, Toni Badia, Vicente López: "CUCWeb: a Catalan corpus built from the Web". 2nd Workshop on the Web as a Corpus at EACL'06. Trento, Italy, April 2006. [bib|eacl]
Newsletter
Carlos Castillo, Debora Donato, Luca Becchetti, Paolo Boldi, Massimo Santini, Sebastiano Vigna: "A Reference Collection for Web Spam". SIGIR Forum, Vol. 40, No. 2, December 2006. [www|sigirf|bib|y!|acm]. DELIS technical report DELIS-TR-0405.
Posters
Luca Becchetti and Carlos Castillo: "The Distribution of PageRank Follows a Power-Law only for Particular Values of the Damping Factor". World Wide Web Conference (posters), pp. 941-942. Edinburgh, Scotland, May 2006. [www2006|acm]
Ricardo Baeza-Yates and Carlos Castillo: "Relationship between Links and Trade". World Wide Web Conference (posters), pp. 927-928. Edinburgh, Scotland, May 2006. [delis-tr-0253|www2006|acm]
Patrizia Andronico, Marina Buzzi, Carlos Castillo and Barbara Leporini: "Testing Google Interfaces Modified for the Blind". World Wide Web Conference (posters), pp. 873-874. Edinburgh, Scotland, May 2006. [www2006|acm]
Published in 2005 (2)
Journal Article
- Ricardo Baeza-Yates, Carlos Castillo and Vicente López: "Characteristics of the Web of Spain". Cybermetrics, Vol. 9, No. 1, 2005. [cybermetrics|website|bib]
Conference Article
- Ricardo Baeza-Yates, Carlos Castillo, Mauricio Marin and Andrea Rodriguez: "Crawling a Country: Better Strategies than Breadth-First for Web Page Ordering". WWW Conference / Industrial Track, ACM, pp. 864-872. Chiba, Japan, 2005. [talk|bib|acm]
Workshop Articles
Ricardo Baeza-Yates, Carlos Castillo and Vicente López: "Pagerank Increase under Different Collusion Topologies". Workshop on Adversarial Information Retrieval on the Web (AIRWeb). Chiba, Japan, 2005. [airweb|talk|bib]
Ricardo Baeza-Yates and Carlos Castillo: "Link Analysis in National Web Domains". Workshop on Open Source Web Information Retrieval (OSWIR), pp. 15-18. Compiegne, France, September 2005. [bibtex|oswir|talk] (extended in "Characterization of National Web Domains" 2007)
Carlos Castillo and Ricardo Baeza-Yates: "WIRE: an Open-Source Web Information Retrieval Environment". Workshop on Open Source Web Information Retrieval (OSWIR), pp. 27-30. Compiegne, France, September 2005. [bib|oswir|website|talk]
Albert Bifet, Carlos Castillo, Paul-Alexandru Chirita and Ingmar Weber: "An Analysis of Factors Used in a Search Engine's Ranking". Workshop on Adversarial Information Retrieval on the Web (AIRWeb), synopsis. Chiba, Japan, 2005. [bib]. Reprinted in 2007 as a chapter of the book "Internet Search Engines -- An Introduction" edited by Ravi Kumar Jain B.; Chapter 5, pp. 76-95, ICFAI University Press.
National Conference
Marco Modesto, Álvaro R. Pereira Jr., Nivio Ziviani, Carlos Castillo and Ricardo Baeza-Yates: "Um Novo Retrato da Web Brasileira" (in Portuguese). SEMISH Symposium, pp. 2005-2017. São Leopoldo, Brazil. July 2005. [bib]
Abstract
Carlos Castillo: "Effective Web Crawling (Doctoral Abstract)". ACM SIGIR Forum, Vol. 39, No. 1, pp. 55-56. June 2005. [acm]
Technical Reports
Carlos Castillo and Ricardo Baeza-Yates: "Practical Web Crawling". Technical Report, 2005.
Carlos Castillo and Ricardo Baeza-Yates: "Visualizing the European Trade Graph". Technical report DELIS-TR-0252, DELIS (Dynamically Evolving Large-scale Information Systems), 2005. [delis]
Ricardo Baeza-Yates, Paolo Boldi and Carlos Castillo: "The Choice of a Damping Factor for Propagating Importance in Link-Based Ranking". Technical report RI-DSI N. 305-05, Dipartimento di Scienze dell'Informazione, Università degli Studi di Milano, September 2005. [bib|unimi|talk@pisa] (reviewed and published in 2006 in SIGIR)
Ricardo Baeza-Yates and Carlos Castillo: "Caracterización de la Web Chilena" (in Spanish). Technical report, Center for Web Research, Universidad de Chile, 2005. [website]
Patrizia Andronico, Marina Buzzi, Carlos Castillo and Barbara Leporini: "Search Engine UIs: remote usability test with blind persons". Technical report TR-15/2005, Istituto di Informatica e Telematica (IIT), Consiglio Nazionale delle Ricerche (CNR). Pisa, Italy, 2005. [request by e-mail]
Published in 2004 (5)
Book Chapter
- Ricardo Baeza-Yates, Carlos Castillo and Felipe Saint-Jean: "Web Dynamics, Structure and Page Quality". In M. Levene and A. Poulovassilis (eds.) "Web Dynamics", Springer, pp. 93-109. 2004. [bib|springer]
Journal Article
- A. Jaimes, J. Ruiz-del-Solar, R. Verschae, R. Baeza-Yates, C. Castillo, D. Yaksic and E. Davis: "On the Image Content of a Web Segment: Chile as a Case Study". Journal of Web Engineering, Vol. 3, No. 2, pp. 153-168. 2004. [bib|rinton]
Conferences and Workshops with Proceedings
- Carlos Castillo, Mauricio Marin, Andrea Rodriguez and Ricardo Baeza-Yates: "Scheduling Algorithms for Web Crawling". WebMedia/LA-WEB 2004, IEEE CS Press, pp. 10-17. Ribeirão Preto-SP, Brazil, 2004. [talk|bib|ieee]
- R. Baeza-Yates, J. Ruiz-del-Solar, R. Verschae, C. Castillo and C. Hurtado: "Content-based Image Retrieval and Characterization on Specific Web Collections". Conference on Image and Video Retrieval (CIVR), Springer LNCS, pp. 189-198. Dublin, Ireland, 2004. [bib|springer]
- Ricardo Baeza-Yates and Carlos Castillo: "Crawling the Infinite Web: Five Levels are Enough". Workshop of Algorithms on Web Graphs (WAW), Springer LNCS, pp. 156-167. Rome, Italy, 2004. [talk|bib|springer] (extended version available, see year 2005)
National Conferences
G. Boleda, S. Bott, B. Poblete, C. Castillo, M.E. Fuenmayor, T. Badia, V. López: "CuCWeb, un corpus del català construït a partir de la web" (in Catalan). Congrés Societat del Coneixement. Barcelona, Spain, 2004. [html]
Poster
Efthimis N. Efthimiadis, Carlos Castillo: "Charting the Greek Web". ASIST Conference (Poster), Providence, Rhode Island, USA, 2004. [bibtex]
Thesis
Carlos Castillo: "Efficient Web Crawling". PhD Thesis. Universidad de Chile, 2004. [bib]
Technical Reports
Ricardo Baeza-Yates, Felipe Lalanne, Carlos Castillo, Georges Dupret: "Comparing the characteristics of the Korean and the Chilean Web". Technical report, ITCC, DCC, University of Chile, 2004.
Ricardo Baeza-Yates, Carlos Castillo and Efthimis Efthimiadis: "Comparing the characteristics of the Chilean and the Greek Web". Technical report, Universidad de Chile, 2004.
Published in 2003 (1)
Conferences
- A. Jaimes, J. Ruiz-del-Solar, R. Verschae, D. Yaksic, R. Baeza-Yates, E. Davis and C. Castillo: "On the Image Content of the Chilean Web". Latin American Web Conference (LA-WEB), IEEE CS Press, pp. 72-83. Santiago, Chile, 2003. [bib|ieee]
Poster
Carlos Castillo: "Cooperation schemes between a Web server and a Web search engine". Latin American Web Conference LA-WEB (Extended Poster), IEEE CS Press, pp. 31a-35a. Santiago, Chile, 2003. [bib|ieee]
Technical Reports
Vicente López, Carlos Castillo and Joan Codina: "Information Retrieval in Mail Archives". Technical report, Cátedra Telefónica de Producción Multimedia, Universitat Pompeu Fabra, 2003.
Carlos Castillo: "Estudio de idiomas en las páginas Web españolas (dominio .ES)" (in Spanish). Technical report, Cátedra Telefónica de Producción Multimedia, Universitat Pompeu Fabra, 2003.
Published in 2002 (3)
Journal Article
- Ricardo Baeza-Yates and Carlos Castillo: "Balancing Volume, Quality and Freshness in Web Crawling". In A. Abraham, J. Ruiz-del-Solar, M. Köppen (Eds.), Soft-Computing Systems: Design, Management and Applications, Frontiers in Artificial Intelligence and Applications 97, IOS Press, pp. 565-572, 2002. [talk|bib|ios]
Conferences
- Ricardo Baeza-Yates, Felipe Saint-Jean and Carlos Castillo: "Web Structure, Age, and Page Quality". Proceedings of String Processing and Information Retrieval (SPIRE), Springer LNCS, pp. 117-130, 2002. Lisbon, Portugal. Also presented in 2nd Web Dynamics Workshop, Hawaii, 2002. [bib|springer] (see also year 2004)
- Carlos Castillo: "A Model for the Design and Implementation of Web Sites". IADIS International WWW/Internet Conference (ICWI), pp. 452-460. Lisbon, Portugal, 2002. [bib]
Poster
Carlos Castillo and Ricardo Baeza-Yates: "A New Model for Web Crawling". World Wide Web Conference (Poster). Honolulu, USA, 2002. [bib]
Published in 2001 (1)
Conference
- Ricardo Baeza-Yates and Carlos Castillo: "Relating Web characteristics with link based Web page ranking". Proceedings of String Processing and Information Retrieval (SPIRE), IEEE CS Press, pp. 21-32. Laguna San Rafael, Chile, 2001. [talk|bib|ieee] (see also year 2003)
National Conference
Carlos Castillo: "Newtenberg: Un Modelo e Implementación de un sistema de Publicaciones Digitales en la Web" (in Spanish). Encuentro Chileno de Ciencias de la Computación. Punta Arenas, Chile. 2001.
Poster
Ricardo Baeza-Yates and Carlos Castillo: "Relating Web Structure and User Behavior". World Wide Web Conference (Poster). Hong Kong, 2001.
Technical Report
Ricardo Baeza-Yates and Carlos Castillo: "Analysis of Link-Based Ranking for the Web". Technical report, University of Chile, 2001.
Published in 2000
National Conference
Ricardo Baeza-Yates and Carlos Castillo: "Caracterizando la Web Chilena" (in Spanish). Encuentro Chileno de Ciencias de la Computación, 2000. [bib]
Thesis
Carlos Castillo: "Características de la Web Chilena y Extensiones a un Buscador Web" (in Spanish). Memoria de título, Universidad de Chile, 2000.
See also: DBLP - Google Scholar - CSB - PubZone - Microsoft Academic Search - ACM - CiteULike - DBLife.
Notes: (i) articles published by ACM/IEEE/Springer are the author's version and can be downloaded from this page for personal use, but may not be posted on other web sites or mailing lists; (ii) the numbers in parentheses are the number of peer-reviewed works published that year in extenso in books, international journals, or conferences with proceedings; (iii) key papers are in boldface.

