Mining Twitter as a First Step toward Assessing the Adequacy of Gender Identification Terms on Intake Forms

AMIA Annu Symp Proc. 2015 Nov 5;2015:611-20. eCollection 2015.

Abstract

The Institute of Medicine (IOM) recommends that health care providers collect data on gender identity. For these data to be useful, they should use terms that characterize gender identity in a manner that is 1) sensitive to transgender and gender non-binary individuals (trans* people) and 2) semantically structured so that the associated data are meaningful to health care professionals. We developed a set of tools and approaches for analyzing Twitter data as a basis for generating hypotheses about the language used to identify gender and discuss gender-related issues across regions and population groups. We offer sample hypotheses regarding regional variations in the usage of terms such as 'genderqueer', 'genderfluid', and 'neutrois' and their usefulness as terms on intake forms. While these hypotheses cannot be directly validated with Twitter data alone, our data and tools help to formulate testable hypotheses and design future studies regarding the adequacy of gender identification terms on intake forms.
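
The abstract does not describe how the tools work. As a purely illustrative sketch of what tallying gender identification terms by region might look like, the Python below assumes tweets have already been collected as (region, text) pairs. The function name `count_terms_by_region`, the region labels, and the sample tweets are hypothetical and are not taken from the paper.

```python
import re
from collections import Counter, defaultdict

# Terms named in the abstract; a real study would likely need a fuller vocabulary.
TERMS = ("genderqueer", "genderfluid", "neutrois")

# Hypothetical sample records (region label, tweet text). Not real data.
SAMPLE_TWEETS = [
    ("Region A", "Proud to be genderqueer. #genderqueer"),
    ("Region A", "My friend identifies as neutrois"),
    ("Region B", "Genderfluid and happy"),
    ("Region B", "Nothing relevant here"),
]


def count_terms_by_region(tweets, terms=TERMS):
    """Count case-insensitive whole-word occurrences of each term per region."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    counts = defaultdict(Counter)
    for region, text in tweets:
        for match in pattern.findall(text):
            counts[region][match.lower()] += 1
    return counts


if __name__ == "__main__":
    for region, counter in sorted(count_terms_by_region(SAMPLE_TWEETS).items()):
        print(region, dict(counter))
```

Raw counts like these could at most suggest hypotheses about regional usage. As the abstract notes, such hypotheses cannot be validated with Twitter data alone.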

MeSH terms

  • Data Mining / methods*
  • Female
  • Gender Identity*
  • Humans
  • Language
  • Male
  • Sexual and Gender Minorities
  • Social Media*
  • Transgender Persons*