Stereotype threat and gender differences in statistics

Stereotype threat (ST) has been extensively explored as an explanation for gender disparities in achievement and participation in mathematics. However, there is a lack of research evaluating ST in statistics. The present study evaluated the impact of ST on gender differences in student performance, self-efficacy, and anxiety in statistics using a four-group, quasi-experimental design. Specifically, 102 elementary statistics students at a university in the Southeast United States were randomly assigned to one of four ST conditions including an explicit ST condition, an implicit ST condition, a reverse ST condition, and a nullified ST condition. Results indicated that there were no gender differences by ST condition in statistics self-efficacy, test anxiety, and performance. Analyses of student responses to open-ended questions indicated that females were more likely than males to report that they had fewer opportunities to achieve in statistics. Implications of our findings and suggestions for future research are discussed.


INTRODUCTION
There is a large amount of research that attempts to explain gender disparities in K-16 mathematics achievement and participation (Pilotti, 2021). Explanations for why outcomes favor males have included motivation, strategy use, metacognition, social support, socioeconomic status, and stereotype threat (ST) (Brown et al., 2016; Carr & Jessup, 1997; Jacobs, 2005; Pilotti, 2021; Witherspoon & Schunn, 2020). Much less research has evaluated gender disparities in performance and participation in statistics, biostatistics, and data science despite reports of gender disparities in participation in statistics-related occupations. For example, a recent report documented that less than 17% of data analytics jobs are filled by women (Boas, 2020). This lack of research in statistics is, in part, because students typically do not get the opportunity to take a statistics course until late in high school and/or in college (Yates et al., 2021). A second explanation is that in many progress reports on degrees awarded, mathematics and statistics outcomes are combined (NSF, 2017). However, statistics is different from mathematics in its approach and application and should be evaluated as its own area of study (Rossman et al., 2006). Specifically, mathematics follows a rigid theorem and proof structure whereas statistics is a discipline in which real-life data is handled. Further, in mathematics, space, measures, and structures in their rudimentary form are considered, whereas in statistics raw data is collected, sorted, interpreted, and represented. Finally, mathematics is a subject of more absolute conclusions whereas statistics predictions are uncertain and vary with context (Mathnasium, 2022).
Understanding the role of gender in statistics performance and participation is essential given that, in the last decade, there has been a great need for talented individuals who can organize and analyze the enormous amount of data that is being collected (Legaki et al., 2020). This continuous inflow of data has put statistics, biostatistics, and data science at the forefront of desired majors and careers and has increased the need for more analysts (Stern et al., 2020). For example, Forbes (2020) listed statistics as one of the 15 most valuable college majors, and Glassdoor (2020) listed data scientist as one of the best jobs in America. In addition, there is a need for diversity in analysts. In her book, Invisible women: Exposing data bias in a world designed for men, Criado-Perez (2019) describes the disadvantages to women that result from biased data analyses conducted from a primarily male perspective. Criado-Perez (2019) explains that because data is mainly analyzed by men and for men, this creates systematic bias and discrimination against women, causing inequities in industry, medical, and technological data analyses and applications. When data is unbalanced and excludes certain groups, models are biased. To support bringing more women into analytics-related occupations, the Women in Data Science (WiDS) organization has put forth the initiative of "30 by 30," or having 30 percent of data scientists be women by the year 2030 (up from the 17% documented in 2020). While there is a clear gap in women's participation in statistics and data science related occupations, little is known about why these disparities in participation exist. It is unclear whether the same causes for gender disparities in mathematics, such as differences in motivation, hold true in statistics. We cannot assume this to be the case. Research is needed to study the contributing factors to gender disparities in statistics, specifically.

STEREOTYPE THREAT
ST research developed from studies of racial differences in performance on standardized tests (Steele & Aronson, 1995). ST is the notion that beliefs held about a particular group may cause the confirmation of the judgment about one's group, and in turn, impact learning and performance (Johnson et al., 2012).
Much of what we know about gender and ST comes from the research in mathematics (Spencer et al., 1999). Different combinations of experimental or quasi-experimental conditions have been implemented to evaluate the impact of ST on gender disparities in mathematics. ST has been studied in conditions where the threat is made implicit (e.g., just being in a typical mathematics testing situation), explicit (e.g., students are told men perform better than women on a test), or nullified (e.g., equating the groups by telling students there are no gender differences) (Smith & White, 2001, 2002). Usually, implicit or explicit ST conditions are compared with a nullified condition (e.g., O'Brien & Crandall, 2003). For example, Spencer et al. (1999) compared college-level men and women's mathematics performance across two ST conditions. Students were told, prior to taking a mathematics test, that there were no gender differences on the test (a nullified condition) or were given no information regarding gender differences on the test (implicit ST condition). Results indicated that the men outperformed the women in the implicit ST condition, but that gender differences disappeared in the nullified ST condition. The authors concluded that women underperformed in the implicit condition because of the existing implicit stereotype that women are less competent than men in mathematics. This is a consistent finding and interpretation across the research on ST in mathematics (O'Brien & Crandall, 2003; Quinn & Spencer, 2001).
We found one study that evaluated the impact of ST on women's performance and anxiety in statistics. Kapitanoff and Pandey (2017) examined whether gender of instructor was related to college elementary statistics students' anxiety, performance, and ST endorsement. For men, anxiety and performance were not linked to gender of their instructor; also, the men's anxiety was not linked to their ST endorsement. For women, having a female instructor initially resulted in worsened performance that disappeared over the term. The women's anxiety was linked to their ST endorsement. A limitation of the study was that the authors measured ST endorsement in mathematics rather than statistics. In addition, the anxiety questions measured anxiety in general rather than statistics-specific anxiety.
The research on ST conditions on gender differences in mathematics has tended to focus almost exclusively on performance as an outcome. ST researchers have urged the study of how ST may impact motivation (Fogliati & Bussey, 2013). Davies et al. (2002) found that women exposed to ST reported less interest in pursuing college majors and careers in quantitative domains. Fogliati and Bussey (2013) found that college women in a stereotyped condition (students were told men outperform women on a test) were less motivated than women in a nullified condition (students were told men and women perform equally well on the test) to use feedback to revise and improve their mathematics work. There is a dearth of research evaluating the impact of ST on specific motivational constructs such as self-efficacy and anxiety, both of which have been shown to have a significant impact on mathematics achievement and participation (Hiller et al., 2021). There is no research on this in statistics. Below, we review the extant research on gender differences in statistics achievement and motivation.

GENDER DIFFERENCES IN STATISTICS ACHIEVEMENT AND MOTIVATION
A search for peer-reviewed journal articles on gender differences in statistics achievement yielded conflicting findings. At the college level, some research documented no gender differences in course grade among students taking undergraduate statistics (Buck, 1985; Es & Weaver, 2018; Woehlke & Leitner, 1980), others found results in favor of women (Charles, 1987), while others found results in favor of men (Susbiyanto et al., 2019). A meta-analysis of 18 empirical studies showed that men outperformed women on college statistics course examinations, but women outperformed men in the course overall (Schram, 1996). Similarly mixed findings have been reported for high school statistics students. For instance, Saidi and Siew (2019) found that men outperformed women on a test of central tendency; Batanero et al. (2003), in contrast, found no gender differences in high school students' understanding of central tendency.
The research on gender differences in motivation, and specifically self-efficacy and anxiety, in statistics is limited, dated, and has shown varying findings. Onwuegbuzie (1995) and Stroup and Jordan (1982) found that college women experienced higher levels of statistics anxiety than men. However, other studies did not find significant differences between college men and women in their statistics anxiety (Baloglu, 2001; Cruise et al., 1980). There is a lack of research evaluating gender differences in self-efficacy in statistics. This is an area that needs exploration given the significant impact of self-efficacy on achievement and participation in mathematics (Evans et al., 2021).

PRESENT STUDY
The present study is the first to examine the impact of ST on performance in statistics using four experimental conditions including an explicit ST condition (students are told men outperform women on the statistics test), an implicit ST condition (students are not provided any information about the effect of gender on performance, but are in a traditional testing situation), a reverse ST condition (students are told women outperform men on the statistics test), and a nullified condition (students are told that no gender differences in performance have been found on the statistics test). This study is unique in that it considers a reverse ST condition. In addition, it is the first to study gender differences in self-efficacy and anxiety in statistics under ST conditions.

METHOD
Participants
A total of 102 introductory-level college statistics students at a Southeastern university in the United States participated in the study. Thirty-nine students identified as male and 63 identified as female; approximately 54% were Caucasian, 25% were African American, 9% were Asian, 8% were Hispanic, and 4% identified as Other. The gender and race distribution for the statistics classes is equivalent to that of the undergraduate population at the university. Four classes, taught by two different instructors, each with approximately 35 students enrolled, participated. Students were randomly assigned to the four study conditions based on a cluster approach using the classroom as the unit of assignment. The random assignment of ST conditions to classrooms resulted in 27 students in the explicit ST group, 33 students in the reverse ST group, 20 students in the nullified ST group, and 22 students in the implicit ST group. An a priori power analysis using G*Power (Faul et al., 2009) indicated that our sample size was larger than what G*Power recommended for a 2×4 multivariate analysis of variance (MANOVA) with three dependent variables, an effect size f² of .15, an alpha of .05, and a power of .80 (the total sample size recommended by G*Power was n=56).
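The cluster assignment described above, with the classroom as the unit of assignment, can be sketched in a few lines of Python. The classroom labels and the seed are hypothetical and purely illustrative; this is not the authors' actual procedure, only one simple way of pairing four intact classes with the four ST conditions at random.

```python
import random

# The four ST conditions named in the study.
CONDITIONS = ["explicit ST", "implicit ST", "reverse ST", "nullified ST"]


def assign_conditions(classrooms, seed=None):
    """Randomly pair each classroom (cluster) with exactly one ST condition."""
    if len(classrooms) != len(CONDITIONS):
        raise ValueError("this sketch expects exactly one classroom per condition")
    rng = random.Random(seed)      # seeded only so the illustration is repeatable
    shuffled = CONDITIONS[:]       # copy, so the module-level list is untouched
    rng.shuffle(shuffled)
    return dict(zip(classrooms, shuffled))


# Hypothetical classroom labels; every student in a class shares its condition.
assignment = assign_conditions(["class A", "class B", "class C", "class D"], seed=0)
for classroom, condition in assignment.items():
    print(classroom, "->", condition)
```

Because whole classes rather than individual students are randomized, group sizes depend on class attendance, which is consistent with the unequal group sizes reported above.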

Statistics achievement
Students were asked to solve four statistics questions (Table 1). The questions, which were textbook problems (Navidi & Monk, 2019) with the values altered, tested descriptive statistics, z-scores, and the normal curve. The two professors teaching the courses confirmed that the students had not seen the problems previously, that the problems were at the appropriate level, and that the students had learned the material assessed by the problems.
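For readers who want to check the arithmetic behind problems of this kind, the sketch below works through three calculations using the values that appear in the problem text (an SD of 19.3; a mean of 44 with an SD of 3.7; a data value of 48). The source does not state whether the intended answer for the middle 95% uses the empirical rule (mean ± 2 SD) or the exact normal quantile (mean ± 1.96 SD), so both are shown. The variable names are ours.

```python
from statistics import NormalDist

# Variance is the square of the standard deviation.
sample_sd = 19.3
sample_variance = sample_sd ** 2            # about 372.49

# Middle 95% of a bell-shaped distribution with mean 44 and SD 3.7.
mu, sigma = 44.0, 3.7

# Empirical-rule version: mean plus or minus two standard deviations.
empirical_low = mu - 2 * sigma              # about 36.6
empirical_high = mu + 2 * sigma             # about 51.4

# Exact normal-quantile version: 2.5th and 97.5th percentiles.
dist = NormalDist(mu, sigma)
exact_low = dist.inv_cdf(0.025)             # about 36.75
exact_high = dist.inv_cdf(0.975)            # about 51.25

# z-score for the data value 48 in the same distribution.
z = (48 - mu) / sigma                       # about 1.08

print(f"variance = {sample_variance:.2f}")
print(f"empirical rule: {empirical_low:.2f} to {empirical_high:.2f}")
print(f"exact quantiles: {exact_low:.2f} to {exact_high:.2f}")
print(f"z = {z:.2f}")
```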

Self-efficacy
The following seven items were derived from the Motivated Strategies for Learning Questionnaire (MSLQ) (Pintrich et al., 1993), revised to focus on statistics, and were given to students to assess their study-condition specific self-efficacy:
• Even if the test is hard, I can do it.
• I believe I can get an excellent grade on the test.
• I believe I have the skills to do well on the test.
• I expect to do well on the test.
• I'm certain I can figure out how to do the most difficult problem on the test.
• I can do the problems on this test if I don't give up.
• I can do even the hardest problem on this test if I try.
Students responded to the items using a 7-point Likert scale that ranged from 1="not at all true of me" to 7="very true of me." The MSLQ has extensive evidence of reliability and construct validity. For our students, reliability as assessed by Cronbach's alpha was .94.
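As a reminder of what the reliability coefficient summarizes, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k−1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name and the response matrix are hypothetical illustrations; they are not the study's data and will not reproduce the reported value.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows, one row of item scores per respondent."""
    k = len(responses[0])                      # number of items
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 7-point Likert responses: five respondents, three items.
example = [
    [7, 6, 7],
    [5, 5, 6],
    [3, 4, 3],
    [6, 6, 5],
    [2, 3, 2],
]
print(round(cronbach_alpha(example), 3))
```

Values close to 1 indicate that the items vary together across respondents, which is the sense in which the scale is described as internally consistent.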

Test anxiety
The following three items were derived from the MSLQ (Pintrich et al., 1993), revised to focus on statistics, and were given to students to assess their study-condition specific test anxiety:
• I am worried about failing this test.
• I have an uneasy, upset feeling about taking this test.
• I am nervous about how I will perform on this test.
Students responded to the items using a 7-point Likert scale that ranged from 1="not at all true of me" to 7="very true of me." The MSLQ has extensive evidence of reliability and construct validity. For our students, reliability as assessed by Cronbach's alpha was .92.

Procedures
Study materials were administered during the seventh week of classes to ensure that students were exposed to the material on the test and to give them full exposure to the statistics learning environment. Students were given a packet with the surveys and statistics questions. The instructions varied across the four conditions in just the following way:
1. Implicit ST condition: You will be given four statistics problems to solve. These problems are based on statistics material that you may have already covered.
2. Explicit ST condition: You will be given four statistics problems to solve. These problems are based on statistics material that you may have already covered. This test has shown gender differences with males outperforming females on the problems.
3. Nullified ST condition: You will be given four statistics problems to solve. These problems are based on statistics material that you may have already covered. No gender differences in performance have been found on this test.
4. Reverse ST condition: You will be given four statistics problems to solve. These problems are based on statistics material that you may have already covered. This test has shown gender differences with females outperforming males on the problems.
Before solving the statistics problems and after the ST instructions, students were given the self-efficacy and test-anxiety items with the instructions, "In order to better understand how you feel about this upcoming statistics test, please respond to each of the following statements." Students had approximately one hour to complete the survey questions and statistics problems. After students completed all the surveys and statistics questions, we asked them the following questions:
1. Do you plan to pursue a college major or a career in statistics? (Yes/No)
2. Do you feel that men and women have the same mental capacity to achieve in statistics? Please explain. How about the same opportunities? Please explain.
Our research was conducted in accordance with the principles embodied in the Declaration of Helsinki and in accordance with local statutory requirements. All Institutional Review Board protocols were followed including the collection of signed consent forms, voluntary and confidential participation, and debriefing after the study materials were collected.

Analyses of Stereotype Threat Conditions by Gender
A 2×4 MANOVA was used to determine whether student performance, test-specific statistics self-efficacy, and test-specific statistics test anxiety differed by gender (two groups: male and female) and ST condition (four groups: implicit ST, explicit ST, nullified ST, and reverse ST). No significant differences were found for the main effect of gender (F[3, 91], Pillai's trace=.08, p=.07) or for the gender-by-condition interaction (F[9, 279], Pillai's trace=.13, p=.19). Descriptive statistics are reported in Table 2. Six students marked that they were interested in pursuing a college major or career in statistics; five of those students were male.

Analysis of Open-Ended Items
The free-response questions included in the survey provided an opportunity to measure the students' perceptions of the role of gender on mental capacity and opportunities in statistics.
• Do you feel that men and women have the same mental capacity to achieve in statistics? How about the same opportunities?
A vast majority of the participants (91.55%) stated that gender does not influence one's mental capacity, while 8.45% stated that men and women do not have the same mental capacity. When broken down by gender, 67% of the students who stated that men and women do have the same mental capacity were women and 5.63% of the students who stated that men and women do not have the same mental capacity were women. A chi-square test of independence was not significant for gender by yes/no response for mental capacity, χ²(1, N=71)=.24, p=.89. When asked about opportunities in statistics, 49.30% reported that men and women have equal opportunities in statistics and 50.70% reported that men tend to be favored with better opportunities. No one reported that women have more opportunities to achieve in statistics. When broken down by gender, 28.17% of the women stated that men and women have the same opportunities in statistics; 45.07% of the women stated that men have greater opportunities in statistics. A chi-square test of independence was significant for gender by yes/no response for opportunity, χ²(1, N=71)=16.49, p<.001. Fewer women believed that there were equal opportunities for them to achieve in statistics.
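The logic of a chi-square test of independence on a 2×2 table (gender by yes/no response) can be sketched as follows. The counts are hypothetical and chosen only for illustration; they are not the study's data and are not intended to reproduce the statistics reported above. The sketch uses the uncorrected Pearson statistic (no continuity correction), and the p-value relies on the fact that, with one degree of freedom, the chi-square tail probability equals erfc(√(χ²/2)).

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (chi2, p) with one degree of freedom and no continuity correction.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts. Rows: women, men. Columns: "equal", "men favored".
hypothetical_counts = [[20, 30], [15, 5]]
chi2, p = chi_square_2x2(hypothetical_counts)
print(f"chi2(1) = {chi2:.2f}, p = {p:.4f}")
```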

DISCUSSION
This is the first study to examine ST in statistics. Our results indicated that ST instructions did not impact the students' performance on statistics problems. Thus, it appears that the negative effects of ST on females that are found in mathematics are not an issue with this sample of undergraduate statistics students. We also failed to find differences for self-efficacy and test anxiety by gender and ST condition.
Null findings are an important part of the research process and important for filling gaps in research because they tell us what is not important or effective. Unfortunately, it is well documented that research with significant results is more likely to be published, which results in a publication bias (Fanelli, 2010). However, this is at the expense of the principles of the scientific method (Campbell et al., 2020). Studies that "show null results despite sufficient statistical power are part of the research process and are fundamental to informing the next research question to be tested" (Campbell et al., 2020, p. 1).
The larger number of women than men enrolled in the elementary statistics course may have attenuated the impact of ST instructions. The problems given to the students may have also been too simple for them. The mean score for the four problems was M=2.75, SD=1.38. More complicated problems may have allowed for the effect of ST and anxiety to appear. It also may be that motivation and performance are not contributing factors to the gender disparities in statistics participation, but that other factors, such as the perception of statistics being a White and male dominated field (Taasoobshirazi et al., 2022), are the contributing factors. Indications of this perception that males have greater opportunities to achieve in statistics were seen in our open-ended items.
Despite the lack of ST effects related to gender in these statistics courses, it is possible that educational and social contexts could still create hostile conditions by which gender differences may be triggered. For example, persistent negative gender stereotyped messages from an instructor may lead to the development of a threat. Our study only looked at a single time point, and the ST instructions were delivered by a researcher with whom students did not have a relationship and who was unrelated to the course.
Although most of the students stated that men and women had the same mental capacity for statistics, a large percentage of students, particularly women, commented that men have more opportunities to succeed in statistics. The most common reason women gave for men having more opportunities in statistics was sexism in the field, specifically a bias against women stemming from a belief that women are inferior in science and mathematics (Allen et al., 2022). About 57% of the free-response answers to the question about opportunities cited sexism against women and noted that society doubts their capabilities. About 43% of the responses claimed that men are more desired in the STEM field or that society views women in STEM as unconventional.
Much more research is needed to understand the cognitive, motivational, and social variables that may contribute to gender disparities in statistics participation. Longitudinal research with statistics students at different levels of study can help pinpoint when these disparities begin. There is little information regarding when and why gender disparities in statistics achievement, if any, may begin and how they are linked to participation in statistics classes and programs later down the road. Multilevel and structural equation models can help researchers understand mediating, moderating, and contextual factors. Finally, in-depth interviews can help researchers better understand students' thoughts about ST conditions, their perceptions of statisticians, and their perceptions of the welcomeness of the field.

Table 1. Statistics problems
1. The standard deviation of a given sample is 19.3. What is the sample variance?
2. Population parameters of some bell-shaped distribution include a mean of 44 & an SD of 3.7. What are cut-off values that define middle 95% of data values?
3. Given the same distribution in question 2, what is the z-score for the data value 48?
4.

Table 2. Descriptive statistics by stereotype threat condition and gender