What Students Consider to be Healthy: Focus Group Data Tracking the Language of Pre-Diabetic Populations


Health and the Human Condition

We are investigating a social issue that ties together two UTA strategic themes, Health and the Human Condition and Data-Driven Discovery: how the concept of “being healthy” changes over time and can be construed differently in multiple settings, whether institutional or private, formal or conversational. Such an investigation is vital, since health is not simply a physical property measurable by scientific indices, such as body mass or cholesterol count, but is also a linguistic and anthropological product, forged in conversations about what is healthy and what is not. By building on our research team's overlapping expertise in tracking interpretations of cultural meaning and behaviors in different contexts, our project will use data analysis to uncover word patterns that reveal trends in public health behavior and, from these, make concrete recommendations for at-risk groups in the larger UT Arlington community.

Two examples associated with health and body weight demonstrate how language choice and public health are intimately connected. First, previous analyses have shown that people conflate the adjective fat, used to describe bodies, with the food component fat. As a consequence, utterances reveal that people assume they cannot get a fat body if they eat fat-free food, while disregarding the possibility of becoming fat by eating other high-calorie ingredients. Likewise, for sugar, people tracking blood sugar levels often focus only on the sugar that they eat, and not on carbohydrates with other names, which can equally affect blood sugar levels. Each of these attested senses leads to the framing of very different public health problems. Unrecognized, conflated word senses can also lead individuals to very different preventive health behaviors.

Our goal is to create and analyze a large, shareable corpus of health discourses, focusing in this project on materials relating to those at risk for diabetes, a growing population due to 21st-century increases in childhood obesity, which is prevalent in the DFW Metroplex. From this corpus, we will track the most frequent terms used when speakers and writers discuss improving or causing a risk to health. To examine the nature of health discourses, our larger project will document how those who are not medical specialists communicate about health, capturing conversations in, for example, exercise and sports settings; online forums for targeted illnesses; recommendations for home cures; dining conversations among friends and family; TV talk shows on health and cooking; magazine advice columns; and blogs and Twitter discussions. Building this corpus of material will allow us to run data analyses to reveal trends in the way health is currently understood and discussed.

Previous corpus-based work on health topics has focused on speech produced by medical providers or terms used in medical research, e.g., to create software for teaching terminology to medical students, to study the structure of complex medical terms for translators, or to extract medical information from reports. This project will fill a crucial gap by examining the ways that health information is conveyed among non-specialists, both to the public and among the public. Stvan’s pilot collection of texts, CADOH (the Corpus of American Discourses on Health), has yielded case studies of meaning conflation for individual word pairs (Stvan 2007, 2008, 2013), giving us the starting point for data collection and the metadata types to capture. Garner’s work with community-based assessment of diabetes risk will enable an understanding of the communities involved (Garner et al. 2015). Unlike previous studies that employed bibliometric techniques to explicate the conceptual structure of medical informatics (e.g., Raghupathi and Nerur, 2008, 2010), our work will involve a systematic analysis of key terms and their relationships using a combination of open source tools (e.g., Python) and commercially available software such as SAS and Leximancer.
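As a rough illustration of what an analysis of key terms and their relationships might look like in Python, the sketch below counts term frequencies and the words that occur near a target term such as "sugar" or "fat". It is a minimal sketch under stated assumptions, not the project's actual pipeline: the sample utterances are invented for illustration and are not drawn from CADOH, and the function names, stopword list, and window size are all hypothetical.

```python
import re
from collections import Counter

# Invented sample utterances for illustration only; NOT taken from CADOH.
SAMPLE_UTTERANCES = [
    "I only buy fat-free yogurt so I won't get fat.",
    "She checks her blood sugar but never counts the carbs in bread.",
    "Is sugar worse for you than fat?",
]

# A tiny, hypothetical stopword list; a real study would use a fuller one.
STOPWORDS = {"i", "so", "the", "in", "but", "her", "she", "is",
             "for", "you", "than", "won", "t"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(utterances):
    """Count how often each non-stopword token occurs across utterances."""
    counts = Counter()
    for utterance in utterances:
        counts.update(t for t in tokenize(utterance) if t not in STOPWORDS)
    return counts


def collocates(utterances, target, window=2):
    """Count non-stopword tokens within `window` tokens of `target`."""
    counts = Counter()
    for utterance in utterances:
        tokens = tokenize(utterance)
        for i, token in enumerate(tokens):
            if token == target:
                start = max(0, i - window)
                neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
                counts.update(n for n in neighbours if n not in STOPWORDS)
    return counts


if __name__ == "__main__":
    print(term_frequencies(SAMPLE_UTTERANCES).most_common(5))
    print(collocates(SAMPLE_UTTERANCES, "sugar"))
```

Counting neighbours of a target word is one simple way to begin separating senses: "sugar" next to "blood" points to the physiological sense, while "sugar" near food words points to the dietary one. A fuller analysis would need proper tokenization, sense annotation, and a much larger sample.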


Our deliverables for this first stage are papers describing the student speech in the corpus material. We anticipate presenting papers derived from this project at the following venues:

The 61st Annual Conference of the International Linguistic Association (Culinary Linguistics theme), at Hofstra University, Hempstead, NY, March 12, 2016.

Texas Digital Humanities Conference, University of Texas at Austin, May 27, 2016.

The Texas Public Health Association Annual Conference, Fort Worth, TX, Spring 2017.

The American Public Health Association (APHA) 145th Annual Meeting, Atlanta, GA, November 4-8, 2017.

People involved in this study

For in-depth study details or to participate in this study, please contact the Principal Investigator listed below.

Principal Investigator: Laurel Stvan, stvan@uta.edu
Co-Investigator: Sridhar Nerur, snerur@uta.edu
Co-Investigator: Rebecca Garner, beckyg@uta.edu