Watch the video teaser for our KDD'18 paper on Algorithms for Hiring and Outsourcing

PhD Opportunities in Crisis Informatics and Algorithmic Discrimination (UPF/Barcelona/2018)

The Web Science and Social Computing Group at Universitat Pompeu Fabra in Barcelona, which I lead, is inviting applications from prospective PhD students interested in crisis informatics and algorithmic discrimination, and more generally in social computing applications that address issues of social significance.

The crisis informatics student will seek to deepen our understanding of how community-generated content in social platforms can be used to improve emergency/disaster response and the resilience of societies. A student wishing to pursue this topic must have an orientation toward practical problems, excellent programming skills, and the motivation to create new applications that use time-sensitive, sometimes life-saving data.

The algorithmic discrimination student will seek to deepen our understanding of how algorithms can embody, and sometimes exacerbate, biases against less advantaged groups in society. A student wishing to pursue this topic must have an excellent background in statistics, data mining, or machine learning, good data management skills, and the motivation to work on applications in education, justice, medicine, and other areas in which algorithms are used in the public sector.

Both positions are funded through DTIC fellowships.

For more information, visit Web Science and Social Computing » PhD Opportunities.

Improving disaster response efforts through data

Extreme weather events put the most vulnerable communities at high risk.

How can data analytics strengthen early warning systems and support relief efforts for communities in need?

The size and frequency of natural disasters are growing at an alarming pace. In 2016, earthquakes, wildfires and other natural events caused US$210bn in global economic losses, according to the UK-based insurance broker Aon. The year 2017 may tally an even higher figure, as a series of floods, earthquakes and hurricanes struck various areas of the world.

Developing economies, especially those located closer to the equator, are expected to bear the greatest toll from extreme weather events. These countries are the most vulnerable and least equipped to withstand these types of events, as they have fewer resources to prevent damage and protect citizens who are at risk.

Data and analytics can support relief and response initiatives for communities in need. From satellite images and crowd-sourced mapping tools that help predict and prepare for disasters, to on-the-ground reports and drone footage, emergency responders, governments and non-governmental organisations (NGOs) are adopting data analytics as a critical tool to strengthen early warning systems and aid relief efforts in the aftermath of a disaster.

Full article »

Java Developer sought for developing Open Source ElasticSearch plug-ins


Skills required:

  1. Java development experience of 3+ years.
  2. Familiarity with search engines such as Apache SOLR or ElasticSearch, a significant plus.
  3. Experience in a research environment, a plus.

Description:

  • Our research team has been awarded a prestigious grant from the Data Transparency Lab. The grant is for "FA*IR: A tool for fair rankings in search," which implements a new ranking method proposed by our team to avoid discrimination by gender, race, or other protected characteristics. The team includes researchers from Universitat Pompeu Fabra in Barcelona (Dr. C. Castillo, Dr. R. Baeza-Yates), TU Berlin (Mrs. M. Zehlike), NTENT Hispania (Dr. R. Baeza-Yates, Dr. Sara Hajian), and ISI Torino (Dr. F. Bonchi).
  • Within this grant, we are searching for a Java developer to write a series of plug-ins for ElasticSearch (or, alternatively, for SOLR) and to interact with our research team. The plug-ins will implement re-ranking strategies for queries in which the documents correspond to descriptions of people (e.g., resumes). We have two groups of plug-ins that will implement algorithms parametrized by a configuration file.
    1. The first group of plug-ins will implement a series of criteria that must be fulfilled by every response to a query (e.g., that for every query, the resulting list of documents must contain a minimum proportion of women in the first positions). These criteria are based on the paper by Zehlike et al. at CIKM 2017; a sketch of the criterion is given after this list.
    2. The second group of plug-ins will implement a learning-to-rank re-ranking strategy. They will receive a set of training documents, in which the ranking has been manually established, and will learn how to rank new, unseen documents, based on these training documents and criteria of fairness to be established during the research.
  • In both cases, the plug-ins should not be detrimental to the performance of the search engine, i.e., they may incur at most a small extra latency. We expect efficient fair ranking plug-ins to be a significant contribution to ElasticSearch and, since they will be released as Open Source software, to have a significant impact on ElasticSearch's large user base.
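
To illustrate the criterion the first group of plug-ins will enforce, here is a minimal Python sketch of the ranked group fairness test of Zehlike et al. (CIKM 2017): every prefix of the ranking must contain at least as many protected candidates as a binomial test at level alpha requires. It is a simplification (it omits the multiple-testing adjustment of the full FA*IR algorithm), the function names are ours, and the plug-ins themselves will be written in Java; Python is used here only for brevity.

```python
# Simplified sketch of the FA*IR ranked group fairness criterion
# (Zehlike et al., CIKM 2017). Omits the multiple-testing adjustment
# of the full algorithm: alpha is applied unadjusted at every prefix.
from scipy.stats import binom

def minimum_protected(k, p, alpha):
    """Minimum number of protected candidates required in the top k
    positions, given a target minimum proportion p, so that the
    prefix passes a binomial test at significance level alpha."""
    return int(binom.ppf(alpha, k, p))

def satisfies_criterion(ranking, p, alpha=0.1):
    """ranking[i] is True iff the candidate at position i belongs
    to the protected group (e.g., women in the example above)."""
    protected_so_far = 0
    for k, is_protected in enumerate(ranking, start=1):
        protected_so_far += int(is_protected)
        if protected_so_far < minimum_protected(k, p, alpha):
            return False  # too few protected candidates in the top k
    return True

# Hypothetical ranking of six candidates; True marks the protected group.
print(satisfies_criterion([False, True, False, True, True, False], p=0.5))
```

A re-ranking plug-in can apply the same test constructively: whenever a prefix would fall below the required minimum, it promotes the best-scoring protected candidate not yet placed, which is essentially the greedy construction used by FA*IR.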

Location:

  • The developer will meet a team member once per week to report progress. At least half of these meetings should be in person; the rest can be remote. The team is based in Barcelona and Berlin, so the developer should be able to attend the in-person meetings in one of these cities. A developer based in the Barcelona or Berlin area is preferred, but a developer located elsewhere is also acceptable.

Timing:

  • The project will start in February or March and end in July 2018 (5-6 months). The first group of plug-ins can be implemented immediately; the second group can be implemented from April 2018, as the team's research progresses.
  • Bids will be reviewed starting February 1st, 2018, until a suitable developer is found.

What we offer:

  • Interaction with a team of international researchers.
  • Working on an application for social good, to mitigate or remove discrimination.
  • Contributing to Open Source software.

How to bid:

  • Questions may be asked by e-mail to Carlos Castillo carlos.castillo@upf.edu; please include the word "FA*IR" in the subject.
  • To bid, use this form.
    1. Include your CV with 2-3 recent relevant projects and your role in them.
    2. Include your bid: a work plan with 2-3 phases for the project, and for each phase the estimated number of work hours, the timeline, and the cost. A payment will be issued after the completion of each phase.
  • Contracting will be done directly between the developer and the Technical University of Berlin.

Salary and expenses: the total project cost should not exceed 24,000€, including all applicable taxes or deductions.

To bid, use this form.

Fairness-Measures.org: a new resource of data and code for algorithmic fairness

Decisions that are partially or completely based on the analysis of large datasets are becoming more common every day. Data-driven decisions can bring multiple benefits, including increased efficiency and scale. Decisions made by algorithms and based on data also carry an implicit promise of "neutrality." However, this supposed algorithmic neutrality has been brought into question by both researchers and practitioners.

Algorithms are not really "neutral." They embody many design choices and, in the case of data-driven algorithms, decisions about which datasets to use and how to use them. One particular area of concern is datasets containing patterns of past and present discrimination against disadvantaged groups, such as records of past hiring decisions that reflect subtle or not-so-subtle discriminatory practices against women or racial minorities, to name just two concerns. When used to train new machine-learning based algorithms, these datasets can deepen and perpetuate these disadvantages. There are potentially many sources of bias, including platform affordances, written and unwritten norms, different demographics, and external events, among many others.

The study of algorithmic fairness can be understood as two interrelated efforts: first, to detect discriminatory situations and practices, and second, to mitigate discrimination. Detection is necessary for mitigation, and hence a number of methodologies and metrics have been proposed to find and measure discrimination. As these methodologies and metrics multiply, comparing results across works is becoming increasingly difficult.

We have created a new website where we would like to collaborate with others to create benchmarks for algorithmic fairness. To start, we have implemented a number of basic statistical measures in Python, and prepared several example datasets so that the same measurements can be extracted from all of them.
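
As an example of the kind of measure implemented there, the sketch below computes the mean difference: the gap in average outcomes between the non-protected and the protected group, with zero indicating parity. The function name and data layout are ours, chosen for illustration, and are not necessarily those used in the fairness-measures.org code.

```python
import pandas as pd

def mean_difference(df, outcome, protected):
    """Average outcome of the non-protected group (protected == 0)
    minus that of the protected group (protected == 1); 0 indicates
    parity, and positive values indicate that the protected group
    receives the favorable outcome less often on average."""
    means = df.groupby(protected)[outcome].mean()
    return means.loc[0] - means.loc[1]

# Hypothetical data: binary hiring decisions, with protected = 1
# marking the disadvantaged group. Here the gap is 0.75 - 0.25 = 0.5.
data = pd.DataFrame({
    "hired":     [1, 0, 1, 1, 0, 0, 1, 0],
    "protected": [0, 0, 0, 1, 1, 1, 0, 1],
})
print(mean_difference(data, outcome="hired", protected="protected"))
```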

We invite you to check the data and code available on this website and let us know what you think. We would love to hear your feedback: http://fairness-measures.org/.

Contact e-mail: Meike Zehlike, TU Berlin meike.zehlike@tu-berlin.de.

Meike Zehlike, Carlos Castillo, Francesco Bonchi, Ricardo Baeza-Yates, Sara Hajian, Mohamed Megahed (2017): Fairness Measures: Datasets and software for detecting algorithmic discrimination. http://fairness-measures.org/
