Computer Science > Human-Computer Interaction
[Submitted on 1 Aug 2018]
Title:How Does Tweet Difficulty Affect Labeling Performance of Annotators?
View PDFAbstract:Crowdsourcing is a popular means to obtain labeled data at moderate costs, for example for tweets, which can then be used in text mining tasks. To alleviate the problem of low-quality labels in this context, multiple human factors have been analyzed to identify and deal with workers who provide such labels. However, one aspect that has been rarely considered is the inherent difficulty of tweets to be labeled and how this affects the reliability of the labels that annotators assign to such tweets. Therefore, we investigate in this preliminary study this connection using a hierarchical sentiment labeling task on Twitter. We find that there is indeed a relationship between both factors, assuming that annotators have labeled some tweets before: labels assigned to easy tweets are more reliable than those assigned to difficult tweets. Therefore, training predictors on easy tweets enhances the performance by up to 6% in our experiment. This implies potential improvements for active learning techniques and crowdsourcing.
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.