Identifying emotions in email with human-level accuracy

As part of the Master’s degree in Business Analytics at the VU Amsterdam, I completed my master’s thesis at Anchormen: “Identifying effective affective email responses; Predicting customer affect after email conversation”.

When customers contact a company with queries or complaints, they often prefer to use email. Handling these emails is a massive task for the Customer Support department. Automating email handling can help improve quality, reduce costs and shorten response time. However, awareness of the customer’s emotion during the conversation is an important aspect of effective email handling.

In my research, I used sentiment analysis on incoming customer emails to determine the initial emotion of a customer. Furthermore, affect analysis was applied to predict the customer’s emotion after the response email from Customer Support. Both analyses were executed using supervised machine learning, which trains computer models on labelled data. This required manually labelling a set of emails with sentiment (None, Neg, Pos, Mix) and emotions (Anger, Disgust, Fear, Joy, Sadness).
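
To make the idea of supervised learning on labelled emails concrete, here is a minimal sketch of a text classifier. It is an illustration only: the thesis summary does not say which algorithm was used, so the Naive Bayes model, the function names `train` and `predict`, and the four example emails are all assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Very naive tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(labelled_emails):
    """Count classes and words from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labelled_emails:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest Naive Bayes log-probability."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labelled emails, invented for this sketch.
labelled_emails = [
    ("my order arrived broken and nobody helps", "Neg"),
    ("this is the second time my invoice is wrong", "Neg"),
    ("thank you for the quick and friendly help", "Pos"),
    ("great service thank you", "Pos"),
]

model = train(labelled_emails)
prediction_pos = predict(model, "thank you for the great help")
prediction_neg = predict(model, "my invoice is wrong")
```

With a real label set, the same train-then-predict pattern applies; only the features and the model would be more sophisticated.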

Manual labelling revealed that humans find it very difficult to determine emotions in email. Still, using majority vote, a reliable label set could be determined. Applying machine learning to the labelled data resulted in human-level accuracy for Sentiment, Anger and Joy. For Disgust, the model even significantly outperformed human annotation. For both sentiment and emotions, the domain-specific models trained on a small set of 742 emails outperformed a commercial model that was trained on millions of news sources.
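
The majority vote over annotators can be sketched as follows. This is an assumption about how such a vote might be implemented, not the procedure from the thesis: the function name `majority_label`, the rule that a label needs more than half of the votes, and the example annotations are all hypothetical.

```python
from collections import Counter


def majority_label(votes):
    """Return the label picked by a strict majority of annotators.

    Returns the Python value None when no label has more than half
    of the votes (an assumed tie-handling rule for this sketch).
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


# Hypothetical annotations: three annotators per email.
annotations = {
    "email_1": ["Neg", "Neg", "Mix"],
    "email_2": ["Pos", "Mix", "Neg"],
    "email_3": ["Pos", "Pos", "Pos"],
}

labels = {
    email_id: majority_label(votes)
    for email_id, votes in annotations.items()
}
```

Emails without a majority, such as `email_2` here, could be dropped or sent back for another round of labelling; which option is appropriate depends on how much labelled data is available.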

The machine learning models for predicting customer affect performed poorly, although their results were still significantly better than the benchmarks. A more direct measurement of customer affect might, however, drastically improve performance.

The full thesis is available for download here. The presentation is available here.