
Learning from Crowds with Variational Gaussian Processes

REFERENCE

Pablo Ruiz, Pablo Morales-Álvarez, Rafael Molina, and Aggelos K. Katsaggelos, “Learning from Crowds with Variational Gaussian Processes”, Pattern Recognition, vol. 88, pp. 298–311, 2019.

DOI: 10.1016/j.patcog.2018.11.021

ABSTRACT

Solving a supervised learning problem requires labeling a training set. This task is traditionally performed by an expert, who provides a label for each sample. The proliferation of social web services (e.g., Amazon Mechanical Turk) has introduced an alternative crowdsourcing approach: anybody with a computer can register in one of these services and label, either partially or completely, a dataset. The labeling effort is thus shared among a large number of annotators. However, this approach introduces scientifically challenging problems, such as combining the unknown expertise of the annotators, handling disagreements on the annotated samples, or detecting spammer and adversarial annotators. All these problems require probabilistically sound solutions that go beyond the naive use of majority voting plus classical classification methods. In this work we introduce a new crowdsourcing model and inference procedure that trains a Gaussian process classifier using the noisy labels provided by the annotators. Variational Bayes inference is used to estimate all unknowns. The proposed model can predict the class of new samples and assess the expertise of the involved annotators. Moreover, the Bayesian treatment allows for solid uncertainty quantification. Since we might have access to some annotations for a new sample when predicting its class, we also show how our method can naturally incorporate this additional information. A comprehensive experimental section evaluates the proposed method on synthetic and real experiments, showing that it consistently outperforms other state-of-the-art crowdsourcing approaches.

HIGHLIGHTS

  • Gaussian Processes are used to address the crowdsourcing problem.
  • Variational inference is used for the first time to train the model.
  • Annotations provided for test instances can be integrated into the prediction.
  • We provide an experimental comparison with state-of-the-art crowdsourcing methods in both synthetic and real datasets.
  • The proposed method outperforms all state-of-the-art methods it was compared against.

A SYNTHETIC EXAMPLE

We introduce a controlled one-dimensional example to show the behavior of the proposed method. Figure 1(a) shows the underlying synthetic classification dataset. The features are uniformly sampled in the interval [-π, π], and the true labels are assigned according to the sign of the cosine function at each sample: class C1 (resp. class C0) if the cosine is positive (resp. negative).
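As a minimal sketch, this dataset can be generated in MATLAB as follows (the sample size N and all variable names are our assumptions, not taken from the released code):

    % Sketch of the synthetic dataset described above.
    N = 200;                    % number of training samples (assumed value)
    x = -pi + 2*pi*rand(N,1);   % features uniformly sampled in [-pi, pi]
    z = double(cos(x) > 0);     % true labels: 1 for class C1, 0 for class C0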

[Figure 1] (a) Original dataset labeled using the sign of the cosine function. (b)–(f) Labels provided by annotators 1, 2, 3, 4, and 5, respectively.

Our goal is to learn an automatic classifier that distinguishes between samples belonging to class C1 and samples belonging to class C0. The first step is to label a training set. Unlike a classical classification problem, where only one or two experts annotate the whole training set, in this example we assume that this effort is shared by 5 annotators with different levels of expertise.

The 5 annotators are simulated by fixing their sensitivity and specificity values (α and β in Fig. 1(b-f)). That is, if the true label of a given sample is 1 (resp. 0), the annotator assigns it to class C1 (resp. C0) with probability α (resp. β). Fig. 1(b-f) shows the labels assigned by each annotator. As expected from the sensitivity and specificity values, annotators 1, 2, 3, and 5 make fewer mistakes than annotator 4, who assigns most samples to the opposite class (an adversarial behavior).
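Continuing the sketch above, one annotator with hypothetical sensitivity α = 0.9 and specificity β = 0.8 can be simulated from the true labels z as follows (an adversarial annotator like annotator 4 would simply use values below 0.5):

    % Simulate one annotator's noisy labels from the true labels z.
    % alpha and beta below are hypothetical values, not taken from the paper.
    alpha = 0.9; beta = 0.8;              % sensitivity and specificity
    y = zeros(N,1);                       % noisy labels from this annotator
    pos = (z == 1);
    y(pos)  = rand(nnz(pos),1)  < alpha;  % true 1 kept as 1 with prob. alpha
    y(~pos) = rand(nnz(~pos),1) >= beta;  % true 0 flipped to 1 with prob. 1-beta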

During the training step, the proposed method learns the underlying probabilistic model using only the noisy labels provided by the annotators. As a result:

  • Given a new sample, the proposed method can predict its label, as well as the uncertainty about this prediction.
  • The unknown true labels of the training set are estimated during training.
  • The proposed method estimates the unknown sensitivity and specificity values of each annotator, detecting spammer and adversarial behaviors.
  • If one or several annotators provide labels for a test sample, the proposed method incorporates this additional information in a natural way to produce a combined human-machine prediction (see the sketch after this list).
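As a hedged illustration of this last point, under the sensitivity/specificity noise model the Gaussian process predictive probability can be combined with test-time annotations via Bayes' rule. The sketch below uses hypothetical values throughout and mirrors, but does not reproduce, the exact predictive equation in the paper:

    % Combine the GP predictive probability with test-time annotations.
    % All numbers below are hypothetical, for illustration only.
    p1 = 0.6;            % GP predictive probability p(z*=1 | x*)
    a  = [0.9 0.8];      % estimated sensitivities of two annotators
    b  = [0.85 0.9];     % estimated specificities of the same annotators
    yr = [1 1];          % labels these annotators provided for x*
    lik1 = prod(a.^yr .* (1-a).^(1-yr));       % p(annotations | z* = 1)
    lik0 = prod((1-b).^yr .* b.^(1-yr));       % p(annotations | z* = 0)
    post1 = p1*lik1 / (p1*lik1 + (1-p1)*lik0)  % p(z*=1 | x*, annotations)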

DATASETS

The proposed method is evaluated on three types of datasets: synthetic (samples and crowdsourcing annotations are synthetically generated), semi-synthetic (samples are real but crowdsourcing annotations are synthetic), and real (both samples and crowdsourcing annotations are real).

  • Synthetic data can be downloaded here.
  • Semi-synthetic data was obtained from the UCI Machine Learning Repository: Heart and Sonar. The processed data and the annotations generated for our experiments can be downloaded here.
  • The real examples can be downloaded from the author’s website [1].

MATLAB CODE

A MATLAB implementation of the proposed method can be downloaded here. In the experimental section of the paper, the proposed method is compared against the following state-of-the-art methods: Rodrigues [1], Raykar [2], and Yan [3]. We provide our own MATLAB implementations of Raykar and Yan at these links. A MATLAB implementation of Rodrigues can be downloaded from the author's website.

The software is also available on GitHub: https://github.com/pablomorales92/VGPCR

REFERENCES

[1] F. Rodrigues, F. Pereira, B. Ribeiro, Gaussian process classification and active learning with multiple annotators, in: ICML, 2014, pp. 433–441.

[2] V. Raykar, S. Yu, L. Zhao, G. Hermosillo-Valadez, C. Florin, L. Bogoni, L. Moy, Learning from crowds, J. Mach. Learn. Res. 11 (2010), pp. 1297–1322.

[3] Y. Yan, R. Rosales, G. Fung, M. Schmidt, G. Hermosillo-Valadez, L. Bogoni, L. Moy, J. Dy, Modeling annotator expertise: Learning when everybody knows a bit of something, in: AISTATS, 2010, pp. 932–939.