5.2 Turning Tweets into Knowledge: An Introduction to Text Analytics | 5 Text Analytics | The Analytics Edge | Sloan School of Management

Quick Question

For each tweet, we computed an overall score by averaging the five scores assigned by the Amazon Mechanical Turk workers. However, workers might make significant mistakes when labeling a tweet, and because the mean uses every value, a single badly mistaken label can shift it substantially.

Which of the three alternative metrics below would best capture the typical opinion of the five Amazon Mechanical Turk workers, would be less affected by mistakes, and is well-defined regardless of the five labels?

An overall score equal to the median (middle) score

An overall score equal to the majority score

An overall score equal to the minimum score

Explanation

The correct answer is the first one: the median captures the typical opinion of the workers and tends to be less affected by significant mistakes. The majority score is not well-defined for every tweet, because some tweets have no majority score (consider a tweet with scores 0, 0, 1, 1, and 2). The minimum score does not necessarily capture the typical opinion and can be highly affected by a single mistake (consider a tweet with scores -2, 1, 1, 1, and 1, where the minimum is -2 even though four of the five workers chose 1).
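
The two example tweets from the explanation can be checked with a short Python sketch. The helper names `majority` and `summarize` are hypothetical, and "majority" is assumed here to mean a score chosen by more than half of the five workers; the course may define it differently.

```python
from collections import Counter
from statistics import mean, median


def majority(scores):
    """Return the score given by more than half of the workers, or None.

    Assumption: "majority" means a strict majority (more than half).
    """
    label, count = Counter(scores).most_common(1)[0]
    return label if count > len(scores) / 2 else None


def summarize(scores):
    """Compute the mean and the three alternative metrics for one tweet."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "majority": majority(scores),
        "minimum": min(scores),
    }


# The two example tweets used in the explanation.
no_majority = [0, 0, 1, 1, 2]
one_mistake = [-2, 1, 1, 1, 1]

print(summarize(no_majority))
print(summarize(one_mistake))
```

For the first tweet the majority score comes back as `None`, since no score was chosen by more than half of the workers, while the median is still 1. For the second tweet the single -2 label pulls the mean down to 0.4 and sets the minimum to -2, but the median stays at 1.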