Sexual and gender minorities (SGM) in the United States experience a number of health disparities, along with unique factors that contribute to them. From a research perspective, survey design and implementation challenges, such as the lack of effective items for assessing SGM status and inadequate sampling methods, remain barriers to studying SGM. The purpose of this commentary is to describe, using the National Adult Tobacco Survey (NATS) survey items and datasets, the primary limitations we encountered when trying to describe SGM tobacco use. Our intent is to demonstrate, through a national dataset focused on a specific health disparity, the imperative for researchers to change their data collection strategies and practices around tobacco use and other healthcare priorities. Our team used the 2009 and 2012 NATS datasets to highlight significant changes between them regarding demographics, tobacco use, and access to healthcare, in addition to methodological concerns regarding sampling strategies. It is critical that researchers strive to use survey items that accurately capture data on marginalized groups. Additionally, careful consideration is warranted regarding strategies to identify members of these populations, changes in item wording, and changes in the questions asked, so that changes in behavior can be tracked over time.
As the visibility of Lesbian, Gay, Bisexual, and Transgender (LGBT) persons, broadly referred to as sexual and gender minorities (SGM), has increased within the United States, so too has our understanding of the magnitude and impact of health disparities experienced across and within these communities. Studies over the past two decades, in particular, have described such health inequities, including higher risk for disability [
Tobacco is not only the leading cause of preventable and premature death in the United States [
Even though smoking rates have declined to 17% in 2014 for US [
Although there is a solid research foundation on SGM tobacco-related health disparities, there is still much to flesh out nationally, especially in subgroups for which collecting representative samples of data can be arduous, such as transgender individuals. As mentioned above, our research team recently confronted these issues and others when investigating SGM-related health disparities and tobacco economics using the NATS from 2009-2010 [
The NATS gauges the extent of tobacco use in adults, evaluates how much tobacco use varies as a function of demographics, and estimates the achievement of key short-, intermediate-, and long-term tobacco prevention and control outcome indicators. The NATS administrations were conducted using stratified sampling by state via landline and cell phone numbers. The 2009-2010 sample contained 118,581 adults and the 2012-2013 sample contained 60,192 adults. Demographics including age, gender, marital status, income, education level, state, sexual orientation, and race/ethnicity are collected and used for statistical purposes.
The 2009-2010 survey identified current and former smokers by asking whether respondents had smoked 100 cigarettes in their lifetime. It also asked about multiple tobacco product use, cessation and chronic condition information, use of counseling services, and lifetime quit attempts. The 2012-2013 survey assessed tobacco use in many of the categories listed for 2009-2010 as well as for e-cigarettes. Changes from one dataset to the next, as they relate to our experiences with data analyses, are discussed in further detail hereafter.
In describing our team’s SGM-related data challenges with the NATS, it is crucial to underscore that our struggles are by no means unique to this survey; similar and additional challenges are evident in many national, publicly available health datasets, such as the National Youth Tobacco Survey (NYTS) and the Global Adult Tobacco Survey. In fact, the NATS included items about SGM status in both survey administrations, and did so before many other non-SGM-specific national health surveys. Still, because these challenges emerged specifically from our work with the NATS data, we can most clearly illustrate why and how changes in survey content and sampling methodology can be advantageous for clarifying and reducing SGM health disparities.
We review four primary ways in which our group confronted difficulties in trying to capture LGBT health disparities through the NATS: 1) categorization of sexual and gender identity; 2) significant changes in survey items between administrations; 3) sampling methodology; and 4) participant response. Suggested alternative approaches, applicable to other national and international datasets, are offered herein.
In survey research, determining which identity characteristics to collect such that they function as meaningful constructs to a study’s purpose is not a new challenge [
SGM invisibility occurs when no SGM items are asked in surveys, such as in the Youth Tobacco Survey and the NYTS. Conversely, the NATS collected SGM data in the 2009-2010 and 2012-2013 administrations, though not in the same way across years. In 2009-2010, participants were asked “Do you consider yourself to be” with the response options “1) heterosexual or straight; 2) gay or lesbian; 3) bisexual; 4) transgender; 5) respondent does not understand responses; 6) don’t know/not sure; and 7) refused?” Particularly problematic for transgender individuals, this survey strategy forces a choice between sexual orientation and gender identity, which ignores the reality that individuals have both.
In 2012-2013, “transgender” is no longer presented in the first set of response options. Instead, it is presented subsequently if “something else” is selected as the response choice for sexual orientation. If a transgender participant identified first as gay, straight, or bisexual, there is no further opportunity to provide gender identity. Should the participant state “something else,” transgender is grouped with other sexual orientation items, such as “you are not straight, but identify with another label such as queer, trisexual, omnisexual or pansexual.” Again, this approach conflates sexual orientation with gender identity, making it all but impossible to acquire data on tobacco use in transgender individuals.
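The information loss described above can be made concrete with a minimal Python sketch. It is an illustration under simplifying assumptions, not the NATS instrument: the function name `routed_record`, the field names, and the flag `leads_with_gender_identity` are hypothetical, and the respondent's orientation is assumed to be one of the three labels offered in the first item.

```python
from typing import Optional, TypedDict


class Record(TypedDict):
    orientation: Optional[str]
    transgender: Optional[bool]


# Labels offered in the first item, paraphrased from the text
# (an assumption for illustration, not the NATS codebook).
FIRST_ITEM_LABELS = {"straight", "gay or lesbian", "bisexual"}


def routed_record(orientation: str, is_transgender: bool,
                  leads_with_gender_identity: bool) -> Record:
    """Return what a single routed item could record for one respondent.

    A value of None means the survey never observed that characteristic.
    """
    if orientation not in FIRST_ITEM_LABELS:
        raise ValueError(f"unsupported orientation label: {orientation!r}")
    if is_transgender and leads_with_gender_identity:
        # Respondent answers "something else", then selects "transgender":
        # sexual orientation is never recorded.
        return {"orientation": None, "transgender": True}
    # Respondent answers with an orientation label: the follow-up item is
    # skipped, so gender identity is never asked.
    return {"orientation": orientation, "transgender": None}


# The same hypothetical respondent, answering the first item two ways.
for leads in (False, True):
    print(routed_record("bisexual", True, leads))
```

Whichever path a transgender respondent takes in this sketch, one of the two characteristics is left unobserved, which is the conflation of sexual orientation and gender identity at issue.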
One resolution has been to assess transgender identity within the gender/sex item. The NATS does not do this and instead largely reinforces the binary notion of gender. In either case, the long-standing awareness of gender identity and biological sex as discrete [
The term LGBT itself may also unintentionally reinforce the practice of collecting these data within a single survey item, a practice we ourselves have used and that was once thought acceptable and progressive relative to asking nothing about sexual orientation and gender identity. The shorthand of LGBT (and similar acronyms) serves these communities collectively by bringing needed societal awareness and visibility; however, the utility of these labels unravels when trying to understand the unique and sometimes very different health disparities experienced within SGM subpopulations, and it becomes all but obsolete at the individual level.
Support for better data collection has already begun. For example, the American Lung Association (ALA) has called on the Health and Human Services Secretary to incorporate the proposed “Data Standards for Sex” by looking at sex from a social perspective rather than a genetic and/or biological one. They also stated that the new standard for public health surveys should include items on sexual and gender identity as part of core demographics [
The NATS would benefit from independent questions about birth sex, gender identity, and sexual orientation. We agree with the two-step assigned sex and gender identity protocol developed in 1997 by the Transgender Health Advocacy Coalition [
In terms of sexual orientation, the Williams Institute [
Though the evolution of SGM assessment will continue, we believe the above to be a low-burden solution to collection of SGM identity data for understanding tobacco health disparities. Logically, survey context matters, so additional items about sexual behavior and attraction among others may be appropriate dependent on the purpose of the survey or the target population. Similarly, studies specific to SGM individuals may require in-depth data collection about these communities that is impractical in national general population studies.
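One way to picture the independent items recommended above is the following minimal Python sketch. It is not a validated instrument: the class name `Responses`, the function `gender_minority`, and the response labels are hypothetical, and the cross-classification rule (gender identity differing from sex assigned at birth) is our assumption about how a two-step design is typically scored.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical response labels: the gender identity that is concordant
# with each sex assigned at birth (an assumption for illustration).
CONCORDANT = {"female": "woman", "male": "man"}


@dataclass(frozen=True)
class Responses:
    sex_assigned_at_birth: Optional[str]  # "female", "male", or None if missing
    gender_identity: Optional[str]        # e.g. "woman", "man", "transgender"
    sexual_orientation: Optional[str]     # asked independently of the two above


def gender_minority(r: Responses) -> Optional[bool]:
    """Cross-classify the two gender items; None if either is missing."""
    if r.sex_assigned_at_birth is None or r.gender_identity is None:
        return None
    if r.sex_assigned_at_birth not in CONCORDANT:
        raise ValueError(f"unexpected value: {r.sex_assigned_at_birth!r}")
    # Any identity other than the concordant one counts as gender minority.
    return r.gender_identity != CONCORDANT[r.sex_assigned_at_birth]


respondent = Responses("male", "woman", "bisexual")
print(gender_minority(respondent), respondent.sexual_orientation)
```

Because sexual orientation is stored in its own field, a respondent can be counted in both the gender minority and sexual minority subsamples, which a single combined item does not allow.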
As access to healthcare increases in the United States, it is imperative to track how individuals utilize these resources over time. This is particularly true for SGM individuals, who are historically underserved. The passage of the Affordable Care Act (ACA) into law in 2010, together with increased awareness of SGM health disparities, makes it crucial to assess SGM healthcare in order to provide interventions specific to their needs [
Focusing specifically on tobacco use, the 2009-2010 NATS dataset asked several questions about access to healthcare and how care providers may assist in attempts to quit smoking, via counseling, prescriptions for nicotine replacement, or appropriate referrals to cessation programs. However, these items, or reasonable approximations of these items, are notably absent from the 2012-2013 survey, making trends unfeasible to track, particularly in light of federal changes to healthcare access. Further, as one of our goals was to assess potential increases in healthcare utilization for SGM following the ACA and disparities or changes in the role of care providers in cessation for SGM, the removal of these items eliminated opportunities for such inquiry.
Items are also significantly altered from one dataset to the next. For instance, in 2009-2010, participants were asked “How old were you when you smoked a whole cigarette for the first time?” while participants in 2012-2013 were asked about “part or all of a cigarette.” Though the latter captures more data, it also confounds comparisons of the age of tobacco use initiation between surveys. This seemingly benign change in survey wording has large ramifications; research suggests that smoking during adolescence increases sensitivity to the rewarding aspects of nicotine, increasing the likelihood of nicotine addiction [
Not all item changes are negative; in fact, many positive changes are observed in the 2012-2013 NATS survey. A necessary addition, for example, was the inclusion of questions about e-cigarettes in the 2012-2013 dataset, reflecting the growing contingent of the population who use these as an alternative, or complement, to cigarettes. Further, though we note the inconsistency in the sexual orientation items, we do commend the NATS research team for the inclusion of items that attempt to capture data on these marginalized groups. Still, we caution that decisions to change items be made mindfully and with consideration for the impact that they may have on analyses that track trends over time.
Probability sampling methods tend to generate very small SGM samples. Data reported in one of the largest single-study surveys of SGM in the US [
Historically, sampling of minority groups has been a challenge, particularly when the goal of the research is to provide a fair representation of the population. Therefore, oversampling strategies that aim to compensate for small sample sizes are often utilized. The simplest oversampling approach [
Additional practical approaches for oversampling SGM can be network, or snowball sampling, and location sampling. The former asks the sampled persons to identify others who are of a certain demographic, while the latter samples persons in specific community locations where these individuals usually congregate. A more detailed discussion on various techniques and their advantages and limitations is provided in Kalton [
Finally, investigators may elect to oversample at block level. Blocks are small geographic areas that are known to be “rich” in the demographics of interest. For example, previous national surveys or polls, such as the recent Gallup poll [
Given that research suggests that respondents are becoming more open to providing SGM status information in surveys [
In an effort to capture data on sexual and gender identity, investigators may take for granted participants’ understanding of terms like “LGBT,” “heterosexual,” “transgender,” and other phrases used to identify SGM. For example, in the 2009-2010 NATS database, an item assessing sexual orientation is presented to participants. One potential response, “Respondent does not understand responses,” was selected by 0.52% of the unweighted sample (n = 610). By its inclusion, the questionnaire developers acknowledge there may be a contingent of participants for whom this terminology is unclear; however, it does not appear that any further effort is made to define these terms in order to obtain more specific data. This particular issue carries over into the 2012-2013 NATS database, in which a larger proportion of participants (2.3%; n = 1361) reported not understanding the potential responses.
Item wording further complicates participant response. For instance, in the 2012-2013 NATS database, participants may select “something else” as a response to the question on sexual orientation; 0.42% (n = 254) of participants selected this. They are then presented with a follow-up item allowing them to clarify what they meant by “something else”; it is here that participants are presented with “transgender” as a response choice. Given that participants do not have the opportunity to see forthcoming items and options, they may not realize that a response choice better reflecting their self-identity is nested within “something else.” Consequently, they may default to a response from the first set of options that does not “fit.”
An additional consideration is allowing participants to choose their own description. Open-ended responses, while giving participants an opportunity to provide a response that may not be included elsewhere, can often lead to redundant or unusable data. For example, though provided with ample opportunity to decline to respond (“refused” is a viable selection, 2.69%; n = 1619), several individuals responded with variations of refusal (“does not want to explain,” “do not want to answer,” etc.). Also mixed in with these responses are options that were offered previously but were not selected, such as “heterosexual,” gender identifiers such as “man” or “female,” and responses that were irrelevant, such as “alien” and “flying unicorn.”
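As a rough arithmetic check on the 2012-2013 figures quoted above, the short Python sketch below recomputes each unweighted share from the reported counts and the reported sample of 60,192 adults. The names `SAMPLE_SIZE`, `counts`, and `percent` are ours, and the combined figure rests on an assumption the text does not confirm: that the three responses are mutually exclusive answers to the same sexual orientation item.

```python
# Reported 2012-2013 unweighted sample size and response counts (from the text).
SAMPLE_SIZE = 60_192
counts = {
    "does not understand responses": 1_361,
    "something else": 254,
    "refused": 1_619,
}


def percent(count: int, total: int) -> float:
    """Unweighted share of the sample, as a percentage with two decimals."""
    return round(100 * count / total, 2)


shares = {label: percent(n, SAMPLE_SIZE) for label, n in counts.items()}

# Assumption: the three responses are mutually exclusive options of one item,
# so their counts can be summed.
combined = percent(sum(counts.values()), SAMPLE_SIZE)

for label, share in shares.items():
    print(f"{label}: {share}%")
print(f"combined: {combined}%")
```

Under that assumption, roughly one in twenty unweighted respondents did not provide a response that places them in a sexual orientation category, before any open-ended answers are cleaned.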
The Williams Institute [
During the development of this piece, our team recognized our advocacy of clearer, more distinct categories for SGM populations, and anticipated a potential critique regarding the limitation of participant identification, specifically in light of the fluidity of both language and identity. For example, we highlight labels that participants use to self-identify that may not be the most helpful in the discussion of sexual and gender identity, such as “alien” and “flying unicorn,” and only serve to remove people from specific subsamples or create new subsamples with incredibly small numbers. We recognize that labels such as gay, lesbian, and transgender are socially constructed and ultimately mean little outside of their use within our flawed classification systems of individuals. Still, these groups, as we culturally understand them at this historical time point, face specific challenges, have unique experiences specific to their identities, and deal with specific health disparities. As such, while we recognize that these labels often pose limits, they also provide opportunities to learn more about these specific groups and reduce health disparities.
As researchers continue to investigate the health disparities and health behaviors of SGM, our approach to conceptualizing constructs, asking meaningful questions, identifying target individuals, and collecting and analyzing data must shift accordingly. As has been mentioned, adopting language that can be used across survey samples would help to ensure that we are measuring the same SGM constructs [
Stepleman, L.M., Lopez, E.J., Rawlins, W. and Heboyan, V. (2017) Smoking out Health Disparities in Sexual and Gender Minorities: Lessons from the National Adult Tobacco Survey. Open Journal of Social Sciences, 5, 1-13. https://doi.org/10.4236/jss.2017.58001