Running head: TOWARDS DISRUPTING THE STATUS QUO 1 · Running head: TOWARDS DISRUPTING THE STATUS...
Transcript of Running head: TOWARDS DISRUPTING THE STATUS QUO 1 · Running head: TOWARDS DISRUPTING THE STATUS...
Running head: TOWARDS DISRUPTING THE STATUS QUO
1
Towards Disrupting the Status Quo of Exhibit Design:
A Data Reanalysis to Support Future Research
Toni Dancstep (née Dancu) and Joshua P. Gutwill
Exploratorium, San Francisco, California
2019
Note:
This unpublished manuscript is aimed solely at researchers to inform future studies rather than at practitioners to improve exhibits. Although we typically try to reach both audiences in our work, we encountered an unexpected issue while working on the dissemination of this reanalysis: We realized that the results in this paper could overpower the original results in the Exhibit Designs for Girls’ Engagement study for several reasons. First, when faced with the option to design for all visitors or design for girls, many practitioners expressed that they would choose the former even though the latter is necessary given current disparities in the field. Second, the reanalysis is descriptive while the original EDGE analysis was parametric; unfortunately, that subtle difference was difficult for our readers and reviewers to keep in mind when thinking about such a broad set of data. By focusing on researchers as the audience for this article and describing the original EDGE study at some length, we hope to mitigate the possibility that this reanalysis will be applied to practice or overshadow the original EDGE work.
Running head: TOWARDS DISRUPTING THE STATUS QUO
2
Abstract
This article describes the reanalysis of a large dataset containing engagement information
for 60 exhibit design attributes at 301 exhibits across three institutions. The reanalysis
revealed relationships between existing science museum exhibit designs and four
measures of children’s engagement (use, time spent, return, and behavior), providing
baseline data across exhibits and engagement results for all design attributes. We caution
against using the results to inform practice as research on inclusion has found that current
exhibit designs may promote disparities among learners. Instead, we present these results
to learning science researchers in hopes they will identify promising studies that can
inform the field. In our own exploration, we categorized design attributes as (a) positively
associated with engagement here and in prior studies, (b) common but understudied, (c)
rare and promising, and (d) possibly inhibiting engagement, making suggestions about
how future research can disrupt the status quo to create more welcoming learning
environments.
Keywords: learning research; informal learning, learning sciences; exhibit design
attributes; engagement; STEM
Running head: TOWARDS DISRUPTING THE STATUS QUO
3
Toward Disrupting the Status Quo of Exhibit Design: A Data Reanalysis to Support
Future Research
For decades, the museum field has sought out exhibit designs that will enhance
visitor engagement. Faced with myriad design approaches and the expense of conducting
research in ecologically valid museum settings, learning researchers in informal
environments typically must focus on only the few designs they believe are most
important. Here, we describe the reanalysis of a sizable dataset in the hopes that our
colleagues may use it to inform research regarding exhibit designs to advance free-choice
science education.
Recently, the first author undertook a comprehensive study of exhibits in order to
identify designs that most successfully engage girls ages 8-13 years. That project, called
Exhibit Designs for Girls Engagement (EDGE), explored nearly 60 exhibit design
attributes and found nine that consistently and significantly corresponded with positive
engagement for girls in science museums (Dancstep & Sindorf, 2016, 2018a, 2018b). In
this article, we briefly describe the methods and results of EDGE to set the scene for a
reanalysis of that project’s data. The reanalysis has a different goal from the original
study, and combines data gathered on both girls and boys.
Our aim here is to provide a snapshot of exhibit use and design in order to answer
new questions using the EDGE dataset:
1. What does typical child engagement look like at science exhibits in our sample?
That is, what are the baseline levels of exhibit engagement for children (ages 8-
13)?
Running head: TOWARDS DISRUPTING THE STATUS QUO
4
2. What is the relationship between each of the design attributes and children’s
engagement? How can researchers employ those relationships to identify
promising areas of study?
3. How common or rare are those design attributes in science museum exhibits? Can
this information also aid researchers in focusing on productive areas of study?
The reanalysis offered in this article describes the status quo in science museum
exhibit design, a status quo that may contribute to inequities, such as imbalances in
science center engagement in favor of white, middle-class, and male visitors (Borun,
1999; Dawson, 2014a, 2014b; Diamond, 1994; Feinstein, 2017; Girls Inc., 2004;
Hamilton, Michael, Kupermintz, Kerkhoven, & Snow, 1995; National Science
Foundation, 2003, 2007). Applying the results of this reanalysis to practice could
inadvertently strengthen current disparities. Consequently, our reanalysis does not
prescribe particular exhibit designs, but by significantly expanding on prior research, may
lend additional evidence in support of certain designs and be used to identify other
designs that merit additional investigation.
Literature Review
Identifying effective exhibit designs has been a goal of various museums and
educators for over twenty-five years (e.g., Ansbacher, 1999; Bailey, Bronnenkant,
Kelley, & Hein, 1998; Borun et al., 1998; Dancstep, Gutwill, & Sindorf, 2015; Falk,
1997; Garibay, 2013; Humphrey & Gutwill, 2005; Perry, 1993, 2012), and the field has
begun to build a set of design principles or attributes that generally seem to promote
learning. To lay the foundation for the reanalysis of the EDGE project’s dataset, we
Running head: TOWARDS DISRUPTING THE STATUS QUO
5
review prior research on exhibit designs that promote visitor engagement and learning.
Early studies on exhibit designs focused on the text and layout of labels for
object-based exhibits. For example, investigators found that headers and chunked text
help visitors scan a label for the information they need (Bitgood & Gregg, 1986; Bitgood
& Patterson, 1993; Screven, 1995; Serrell, 1996, 2015), and shorter labels are read more
thoroughly than longer labels (Bitgood & Gregg, 1986). Surprisingly, shorter labels even
entice visitors to read a greater number of words than longer labels (Bitgood &
Thompson, 1989). These early studies showed that general design principles could be
applied to labels to improve museum visitors’ learning experiences with object-based
exhibits.
During that same period of time, Deborah Perry published her influential
dissertation investigating the impact of exhibit design on learning and engagement (Perry,
1989). Perry (2012) recently expanded this work into a book called, What Makes
Learning Fun? Principles for the Design of Intrinsically Motivating Museum Exhibits. By
analyzing visitors at the Colored Shadows exhibit at the Indianapolis Children’s Museum,
Perry developed an instructional-design theory with six overarching design principles for
meeting visitors’ learning needs:
communication, curiosity, confidence, challenge, control, and play. Within each of the
six principles, Perry proposed specific design prescriptions, such as:
• Direct visitors’ attention to the important parts of the exhibit, object,
or artifact.
• Provide a variety of different types of activities.
Running head: TOWARDS DISRUPTING THE STATUS QUO
6
• Design spaces that encourage members of visiting social groups to
stay together and in close proximity.
Perry’s total number of design prescriptions is 65, each of which supports her model of
visitor learning that includes motivations, engagements, and outcomes. This research
provided a framework for developing exhibits based on learners’ motivations and needs,
helping to launch the field into an examination of exhibit designs.
The landmark PISEC study (1999) followed, finding that four interactive exhibits
were more effective at promoting family learning when they had seven key design
attributes: multisided, multiuser, multi-outcome, multimodal, accessible, readable, and
relevant. Many of Perry’s design prescriptions fit with the PISEC design attributes. These
publications inspired several exhibitions and research studies across the science museum
field. For example, the Museum of Science in Boston developed multiuser, multi-
outcome exhibits in its Investigate! exhibition (Bailey et al., 1998) and the Science
Museum of Minnesota elevated the attribute of multi-outcome even further in its
Experiment Benches exhibition (Perry, 1993). The Exploratorium fleshed out some of the
PISEC designs with a research and development project called Active Prolonged
Engagement (APE), which found that exhibits engaged visitors more deeply when they
offered multiple related options, allowed for multiple users who could not interfere with
each other, and provided forking paths of multiple intriguing outcomes (Humphrey &
Gutwill, 2005).
More recently, the Design Zone project at OMSI and the Geometry Playground
project at the Exploratorium investigated the impact of APE and other design attributes
Running head: TOWARDS DISRUPTING THE STATUS QUO
7
on mathematical thinking among visitors. Focusing on algebraic thinking, the Design
Zone team explored design attributes such as multiuser interfaces, multi-option math
challenges, cooperative experiences, and competitive activities. Ultimately, the exhibition
demonstrably encouraged prolonged engagement, inspired large percentages of visitors to
pose challenges for themselves, and sparked reasoning about functional relationships
(Garibay, 2013). Working at the Exploratorium in the domain of geometry, we
concentrated on spatial reasoning (Dancstep et al., 2015). Although the geometry team
also utilized APE designs, they explored the impact of exhibit scale, comparing visitors’
experiences at tabletop exhibits to those at whole-body, immersive exhibits. (The exhibits
differed in size, but shared identical phenomenological content and similar visitor
activity.) A quasi-experimental study found that tabletop designs promoted deeper spatial
reasoning, while immersive designs triggered more return visits and were described more
positively by visitors.
Each of these research or development projects involved a relatively small
number of exhibits and design attributes. Many of the label studies concentrated on only
single exhibits or exhibitions.1 Although Perry’s study involved recording 611 visitor
groups, those groups interacted with only two versions of the Colored Shadows exhibit
(380 at original and 231 at revised). The PISEC study looked at the largest number of
design attributes to date, seven; each of four exhibits was studied in both its original
condition and an enhanced condition instantiated with the seven attributes. The total
number of participants involved was 400, with 50 in each condition of each exhibit.
Studies of exhibit designs at Museum of Science in Boston, OMSI, Ruben Fleet Science
1Although some museums use the word exhibit to refer to an entire collection, we refer to the individual elements or components of an exhibition as exhibits.
Running head: TOWARDS DISRUPTING THE STATUS QUO
8
Center, Exploratorium, and Science Museum of Minnesota have ranged from single
exhibits to exhibition collections of 30–60 exhibits, and typically focus on only a few
design attributes (Bailey et al., 1998; Dancstep et al., 2015; Garibay, 2013; Humphrey &
Gutwill, 2005; Perry, 2012; Sandifer, 2003). The relatively small number of exhibits and
attributes in these studies may have limited what we are able to learn about those
attributes, how they compare to one another, and how they hold up across a large number
of exhibits and visitors.
As mentioned at the top of this article, the first author conducted the large-scale
EDGE study to investigate the relationship between female-responsive exhibit designs
and girls’ engagement. Across three science museums, researchers followed 456 girls and
450 boys as they moved among 301 STEM exhibits that represented nearly 60 distinct
design attributes. EDGE was motivated by research indicating that girls visit science
museums less frequently than boys (Borun, 1999; Hamilton et al., 1995; Lee & Burkam,
1996; National Science Foundation, 2003), and that once inside, girls often have less
engaging experiences at exhibits than boys (Diamond, 1994; Girls Inc., 2004; Greenfield,
1995; Kremer & Mullins, 1992; National Science Foundation, 2003; Verheyden, 2003).
Ultimately, the project identified nine design attributes that were most important for
engaging girls at STEM exhibits, none of which were harmful to boys’ engagement. (For
a list, see Dancstep & Sindorf (2016, 2018b)).
This article introduces a new analysis of the EDGE data, one focusing on the
combined engagement of girls and boys at the 301 exhibits. We begin with a brief
description of the methods employed in this reanalysis. Importantly, EDGE advocated for
changing science museum practice by making exhibit design approaches more inclusive.
Running head: TOWARDS DISRUPTING THE STATUS QUO
9
Our reanalysis provides a snapshot of the status quo, and identifies fruitful areas for
future research. Consequently, the analytic methods differ somewhat.
Data Collection Methods
To gather the original data for this reanalysis, our research followed a four-step
process:
1. Identify design attributes that had the potential to engage females.
2. Assess exhibits across three museums for those attributes.
3. Measure engagement of girls and boys across those exhibits.
4. Find the design attributes that consistently, positively, and strongly related to
girls’ engagement. (This step differs in the present reanalysis.)
What follows is a brief description of each step.
1. Identify Potential Design Attributes
Through an extensive review of prior research from various fields; input from experts in
design, gender, equity, and museum learning; and discussions with advisory groups of
girls and exhibit practitioners, the first author’s team compiled a list of nearly 60 specific
design attributes (DAs) that had the potential to engage girls. The list of DAs covered a
wide range of topics, such as seating, size of exhibit, number of users, look and feel, type
of interaction, and label characteristics, just to name a few (for the list of DAs see Table
V).
2. Assess Design Attributes at 301 Exhibits
The research included 301 exhibits at three U.S. institutions that varied in size,
region, demographics, and approach to exhibitry. At a large West Coast science center, a
Running head: TOWARDS DISRUPTING THE STATUS QUO
10
large Midwest science center, and a medium-sized Southwest science center, exhibits on
the topics of physics, engineering, mathematics, and perception were analyzed for the
presence or absence of each design attribute.2Therefore, the results are limited to
particular topic areas, and any summary of prevalence of the design attributes may not
generalize beyond our (nonrandom) sample of exhibits and institutions.
Multiple researchers examined each exhibit for each of the 60 design attributes.
For the majority of attributes, researchers noted their presence or absence. A few
attributes required a rating on a scale, signaling the extent to which the exhibit design
embodied that attribute (e.g., how whimsical or homey the design appeared); in those
cases, the attributes were later categorized into presence or absence based on the data. All
items were examined and adjusted for face and content validity via discussions with
advisory groups and practitioners.
During data collection on the exhibit designs, the team utilized a process of
double coding and then resolving disagreements. Two researchers independently scored
all exhibits for each design attribute. The pairs of researchers then compared their results,
discussing and resolving on final codes for each attribute at each exhibit. This process of
double coding followed by resolution of disagreements was used to improve reliability
(Baxter & Jack, 2008; Olszewski, Macey, & Lindstrom, 2006). For those interested, the
protocol and coding scheme can be found on our website (Sindorf & Dancstep, 2015).
3. Measure Engagement
The project employed traditional timing and tracking methodology (Serrell, 1998,
2010; Yalowitz & Bronnenkant, 2009). This method constitutes unobtrusive observation
2Size of museum based on operating revenue (Association of Science-Technology Centers, 2016).
Running head: TOWARDS DISRUPTING THE STATUS QUO
11
of participants as they move through a museum space, and is commonly used for
understanding visitors’ engagement and behaviors in museum settings (Serrell, 1998,
2010). There are many ways to measure exhibit engagement, and each offers important
information about different aspects of the exhibit’s design and its visitor experience
(Sanford, 2010). Pairs of researchers, blind to the study’s purpose, simultaneously
tracked 15% of participants to check inter-rater agreement. We explored four measures of
engagement at all 301 exhibits.
Exhibit Use. Do children stop to use the exhibit? The attraction power of an
exhibit in free-choice settings is often measured by the percentage of visitors who stop to
use an exhibit (Diamond, 1986; Falk, 1983; Humphrey & Gutwill, 2005; Sandifer, 2003;
Serrell, 1998). In our dataset, Exhibit Use was calculated by the percentage of children in
the study who touched an exhibit for three or more seconds; or stopped, feet planted,
facing the exhibit for five or more seconds. Inter-rater agreement, collected for 15% of
the participants, was high at 93% for all instances of Use.
Return Visits. Do children go back to use the exhibit again? Return visits may be
an important indicator of engagement in a free-choice environment where learners can
vote with their feet (Diamond, 1986; Falk, 1983; Sandifer, 2003; Serrell, 1998). Return
Visits was computed as the percentage of visitors who used an exhibit and who, after
leaving, chose to return to use it again. Participants were considered to have returned to
an exhibit if they met the requirements for Use at two or more unique times during their
visit. Inter-rater agreement across all instances of Use, including Return Visits, was 93%.
Time Spent. How much time do children spend at the exhibit? Holding power is
typically assessed by the average time visitors spend at an exhibit (Diamond, 1986; Falk,
Running head: TOWARDS DISRUPTING THE STATUS QUO
12
1983; Humphrey & Gutwill, 2005; Sandifer, 2003; Serrell, 1998). Time Spent adds
together the durations of all instances of Use for an individual, from when the child first
uses an exhibit to when the child walks away, or faces away from the exhibit for five or
more seconds. To measure Time Spent at an exhibit, the average time is calculated for all
children at that exhibit. Inter-rater agreement for Time Spent across all instances of
exhibit Use identified by either rater was high with Spearman’s Rho at .94 (Ferguson,
2009).
High Engagement Level. Do children repeat, vary, and/or talk about aspects of
the experience while at the exhibit? One of the most powerful approaches for assessing
engagement involves directly observing visitors’ various learning behaviors at exhibits
(Barriault & Pearson, 2010; Borun & Dritsas, 1997; Humphrey & Gutwill, 2005;
Sanford, 2010; Van Schijndel, Franse, & Raijmakers, 2010). For this measure, we drew
on two existing metrics (Barriault & Pearson, 2010; Van Schijndel et al., 2010) to create
a scale that would provide a sense of the depth of children’s engagement at exhibits.
Using the Engagement Level Scale (Table 1), researchers recorded the highest level of
engagement each participant reached at each exhibit, from inattentive engagement to
actively repeating, varying, and/or sharing their explorations. Collapsing across
categories, the analysis used the percentage of children who reached the third level of
engagement: High Engagement Level.3 Inter-rater agreement, collected for 15% of the
participants, was high at 92%.
3The majority of children reached Level 3, so we decided that the most useful unit of analysis would be whether or not participants reached the High Engagement Level.
Running head: TOWARDS DISRUPTING THE STATUS QUO
13
Table 1
Engagement Level Scale
Note.ThistablewasoriginallypresentedinDancstepandSindorf(2018b).
To gather engagement data, we randomly selected children ages 8–13 attending a
museum with their families and invited them to participate in the study. We tracked 906
children as they used or passed by the exhibits in a contained area of the museum in a
free-choice manner. Specifically, the researchers tracked 637 children across 213 exhibits
at the large West Coast science center; 131 children at the 70 exhibits at the large
Midwest science center; and 138 children at the 18 exhibits at the medium Southwest
science center. For some measures, it was necessary to remove a few exhibits from the
analysis.4
4 When relating each design attribute to the engagement measures, we removed confounding exhibits. That is, to minimize the confounding effects of exhibit location, we removed any exhibits that were in low-traffic areas when looking at the correlation between Use and the DAs. In the end we had a final sample size of 269 exhibits for the analyses exploring the relationships between the DAs and Use. Similarly, for some exhibits, Time Spent is not a good indicator of engagement. We identified and removed exhibits that were designed to offer a very short experience (Humphrey & Gutwill, 2005) when correlating Time Spent with the DAs. The final exhibit sample size for the analyses representing the relationships between DAs and Time Spent was 279 exhibits. All exhibits were included in the baseline analyses, which aim to provide a broader picture of exhibit engagement.
Level Description Examples (may do one or more of these)
1 Child passively or inattentively engages with the exhibit.
Observes inattentively, gaze may wander May passively touch/lean on exhibit May seem bored, inattentive
2 Child actively and attentively engages with the exhibit and may seem to enjoy the experience.
May watch attentively May touch exhibit elements Focuses on exhibit May show strong emotions; signs of interest/enjoyment/frustration
3 Child explores the parameters of the exhibit and may want to share the experience with others.
Actively and attentively applies repetition or variation to activity (tries multiple things, tests variables) Shares exhibit experience with others (on-topic talk, collaboration, showing others, calling others over, etc.)
Running head: TOWARDS DISRUPTING THE STATUS QUO
14
Prior research has found that visitors use only a portion of exhibits in any given
museum space (Serrell, 1998), and the design of the space affects exhibit use and
engagement (Adamson, 2008; Allen, 2004; Falk & Dierking, 2000). The layout and
exhibit density of the three museum spaces studied in EDGE differed, contributing to the
external validity and generalizability of the results. The West Coast museum offered its
213 exhibits in a large, open area of approximately 24,000 square feet (8.9 exhibits per
1000 sq ft), with several corridors leading in and out, a single wide path for foot traffic
through the center, and few walls or smaller rooms. The Midwest museum presented 70
exhibits in a total exhibit area of about 15,000 square feet (4.7 exhibits per 1000 sq ft),
and the spaces were broken up by empty regions or low walls as well as a few small
adjoining rooms. The exhibits were on the bottom floor of the institution; three other
exhibition spaces on that level were not included in the study. The Southwest museum
space contained all 18 exhibits in a single small room of approximately 2,800 square feet
(6.0 exhibits per 1000 sq ft) on the second floor of the institution. Figure 1 shows the
rough layout of exhibit spaces within each institution, approximately scaled by size of the
space.
The West Coast museum offered the greatest number of exhibits across the largest
area for the study, so we recruited a higher number of participants there. We used analytic
techniques to deal with the different sample sizes in the EDGE analyses but were not able
to address this disparity in the reanalysis (see next section). To obtain sampling parity by
gender, age, and interior location, we oversampled within each institution when any
imbalances greater than five occurred between girls and boys, younger and older
children, or recruit locations on a museum’s floor. See Table 2 for participants’
Running head: TOWARDS DISRUPTING THE STATUS QUO
15
demographics. Although the survey offered a non-binary and open-ended gender identity
option, no children chose it. In addition, the sample contained too few children of color
for meaningful analyses by ethnicity.
Figure 1. Exhibits and spaces at the three participating institutions.
Running head: TOWARDS DISRUPTING THE STATUS QUO
16
Table 2 Participant Demographics
Demographic Information n Percentage
Total participants 906 100%
Gender
Male 450 50% Female 456 50% Another category 0 0%
Race/Ethnicity
Caucasian or White 606 67% Hispanic or Latino 103 11% Asian 91 10% African-American or Black 43 5% American Indian and Alaska Native 16 2% Native Hawaiian and Other Pacific Islander 14 2% Other 10 1% Unspecified 78 9%
Attendance
First time 729 80% Return 154 17% Missing data 23 3%
Note.Thesurveyofferedthreeoptionsforgender,includingAnotherCategorywithawrite-infield.Ethnicitytotalsforeachcategoryincludemulti-racial/multi-ethnicparticipants;thustotalspercategorymaybegreaterthanparticipanttotals.Thistableoriginallypresentedin(Dancstep&Sindorf,2018b).
4. Relate Design Attributes to Girls’ Engagement (Differs in reanalysis)
The goal of the original EDGE research was to winnow the large set of promising
female-responsive design attributes down to the best ones for engaging girls in a STEM
learning environment in order to influence practice. To that end, the analysis required a
high bar of confidence, surfacing only those DAs that consistently, positively, and
significantly related to girls’ engagement at STEM exhibits. The best DAs from the girls’
data were then checked against the boys’ data for any ill effects on boys’ engagement;
none were found. The high bar allowed EDGE to advocate for changing exhibit designs
to be more inclusive for girls. The present reanalysis had the goal of describing the status
Running head: TOWARDS DISRUPTING THE STATUS QUO
17
quo, not making recommendations for practice, and thus employed a different process for
analyzing the data (explained below).
Reanalyzing the Data: State of the Field and Children’s Engagement
The goal of the present reanalysis is to provide a snapshot of the field’s current
state for researchers— how do the DAs relate to various measures of children’s
engagement, and how common or rare are they? This information will help researchers
begin to distinguish designs that offer low returns on investment from those that presently
show promise for visitor learning, in order to conduct studies that push the field beyond
the status quo. Additionally, our survey of DAs at exhibits in the content areas of physics,
technology, engineering, math, time, and perception begins to document the prevalence
of science museum exhibit designs in those domains. To our knowledge, no synthesis
study about exhibit design has been conducted to date with a large number of design
attributes across many exhibits and multiple institutions. Still, we must keep in mind that
the exhibits were selected based on convenience, and overrepresented at the larger
institutions (particularly the West Coast museum). Further, DAs were developed and
selected based on a review of girls’ STEM learning, rather than broader notions of exhibit
designs. For example, none of the DAs were concerned with wheelchair accessibility,
live specimens, or audio labels. Our reanalysis of the EDGE data applies to only a subset
of all possible exhibit design attributes.
Changes From EDGE
Because the research questions differed substantially from the original project, we
changed various aspects of both the dataset and the statistical analyses employed in
Running head: TOWARDS DISRUPTING THE STATUS QUO
18
EDGE. These changes affected our level of confidence in the outcomes and created some
differences between EDGE and the reanalysis, making direct comparisons difficult and
unwise. Table 3 shows the differences between the original analyses and the current
reanalysis. Perhaps the most important change was that the original analysis employed
inferential statistics (with p values) in order to make claims about which design attributes
would best support girls’ engagement. That project was looking for consistently strong
positive relationships with high confidence in order to affect practice. Based on the power
analysis for that original study, the inferential statistics required a minimum sample of 50
exhibits representing each DA to detect small effects, which are common in museum
studies (Cohen, 1992; Wilson, Voorhis, & Morgan, 2007).
Running head: TOWARDS DISRUPTING THE STATUS QUO
19
Table 3 Differences between Original Analysis and Reanalysis
EDGE Reanalysis of dataset
Research question
What are the best design attributes for engaging girls at exhibits?
What is the status quo regarding use and design of exhibits?
Focal group Girls Children: girls and boys combined Analyses Correlation followed by focused
regression with semi-partial correlations. Includes inferential statistics and p values.
Rationale: Identifies the best design attributes with high confidence and ensures that no results are driven by a single institution.
Semi-partial correlation. Solely descriptive statistics without p values.
Rationale: Provides current snapshot of
design attributes at STEM exhibits. Identifies promising areas for future R&D.
Minimum number of exhibits with an attribute needed
50 Rationale: Inferential regression
analyses recommend a large sample size.
3 Rationale: Museum studies typically require a
minimum of three exhibits for generalization.
Decisions regarding design attributes
• Included only girl-friendly DAs. • DAs rated on a scale were
categorized into presence or absence based on the data.
• Choices among combinations and variations of DAs were made based on the strength of their relationships with the four engagement measures.
Rationale: The aim was to identify the
strongest and best DAs for engaging girls.
• Included non-girl-friendly and girl-unfriendly DAs (e.g., competition).
• DAs rated on a scale were categorized into presence or absence based on an across-the-board decision of medium/low vs. high.
• Choices among combinations and variations of DAs were made prior to analysis based on across-the-board decisions. Many redundant DAs were retained to show how the relationship to engagement measures differs depending on subtle variations in their operationalization.
Rationale: The intent was to provide a full
picture; such comprehensiveness requires consistency and ease of interpretation.
Controls for institution and crowdedness
Yes Yes
Describes negative relationships
No Rationale: Analysis identifies best
attributes, so negative relationships are dropped.
Yes Rationale: Snapshot requires full picture
(positive and negative correlations).
Inter-rater reliability of highest engagement level
Kappa Rationale: We wanted to have greater
confidence in the claims so we set a higher bar for reliability by controlling for chance agreement. This affected the number of exhibits in the analysis.
Percentage agreement Rationale: This allowed us to include the
majority of the exhibits in the analysis. Because we are not making strong claims, we can learn from the less rigorous approach.
Running head: TOWARDS DISRUPTING THE STATUS QUO
20
In contrast, the current reanalysis attempts to document the landscape without
generalizing beyond the sample, acknowledging that such descriptive results may be due
to chance. This approach allows us to include rare DAs (present at a minimum of only
three exhibits) and DAs that correlate with poor outcomes. Unlike the original analysis,
we did not conduct significance tests, because doing so on over 60 DAs across each of
the four engagement measures would have increased the likelihood of a Type I error to
perilously high levels. Instead, we calculated correlations between each DA and
engagement measure, which differed in precision depending on the number of exhibits
with that design attribute. This allowed us to identify areas of promising research, rather
than make inferences about DAs to embrace or avoid.
Results of Reanalysis
Baseline levels of Engagement
How did the exhibits tend to engage children, regardless of specific design
attributes? Before delving into the DAs and their relationships with engagement, we
present descriptive statistics for the four engagement measures across all exhibits and
institutions in the study. These statistics serve two functions. First, they provide baselines
for the remaining results that describe associations between DAs and engagement
measures. Second, the descriptive statistics offer a portrait of typical engagement levels at
physics, engineering, math, time, and perception exhibits among 8–13 year olds at three
representative museums. However, it is important to note that these results are more
representative of engagement at the institutions with more exhibits (i.e., West Coast
followed by Midwest). Indeed, over two-thirds of the exhibits were located at the West
Coast museum.
Running head: TOWARDS DISRUPTING THE STATUS QUO
21
Table 4 shows the engagement baselines for children averaged across all exhibits.
The mean percentage of children who used an exhibit was 22%; the mean time children
spent at an exhibit was 62.5 seconds; on average, the percentage of children who returned
to an exhibit they had already used was 15%; and the average percentage of children who
reached the highest level of engagement at an exhibit was 88%.5
5The large percentage of children reaching the highest level of engagement represents restricted variability (range restriction between 70% and 100%), and may therefore underestimate some of the correlations between design attributes and High Engagement Level (presented in the next section).
Running head: TOWARDS DISRUPTING THE STATUS QUO
22
Table 4 Descriptive Statistics of Average Exhibit Engagement for Children Engagement measure Mean (SD) 95% CI Minimum Maximum
Use at exhibits (%)
Overall 22% (16%) 20% - 24% 2% 98%
Southwest 70% (16%) .62% - .78% 45% 98%
Midwest 24% (12%) .21% - .27% 7% 63%
West Coast 18% (8%) .16% - .19% 2% 44%
Time Spent (seconds)
Overall 62.5 (40.60) 57.91 - 67.11 9.2 273.9
Southwest 76.99 (39.42) 57.39 – 96.59 25.99 162.94
Midwest 83.75 (52.22) 71.3 – 96.2 9.24 270.76
West Coast 54.30 (32.89) 49.86 – 58.75 12.48 273.94
Return Visits (%)
Overall 15% (9%) 14% - 16% 0% 60%
Southwest 29% (15%) .22% - .36% 10% 63%
Midwest 14% (13%) .13% - .18% 0% 50%
West Coast 13% (9%) .12% - .14% 0% 50%
High Engagement Level (%)
Overall 88% (10%) 86% - 89% 36% 100%
Southwest 82% (7%) .79% - .86% 70% 94%
Midwest 83% (12%) .80% - .86% 36% 100%
West Coast 89% (8%) .88% - .91% 46% 100% Note.Descriptivestatisticsofparticipantbehaviorinvolvedaveragingacrossallexhibits.n=301exhibitsoverall;n=18exhibitsattheSouthwestmuseum;n=70exhibitsattheMidwestmuseum;andn=213exhibitsattheWestCoastmuseum.
There was variability across institutions, perhaps most strikingly at the Southwest
museum, which had the smallest space by far. With only 18 exhibits, that museum saw a
much larger percentage visitors using and returning to each exhibit on average. Visitors
also spent less time on average at exhibits in the large, West Coast museum. The
percentage of children reaching a High Engagement Level was similar across museums.
Running head: TOWARDS DISRUPTING THE STATUS QUO
23
Design Attributes and Children’s Engagement
The current analysis calculated semi-partial correlations between DAs and
children’s engagement for each of the four engagement measures. Semi-partial
correlations show the strength of the relationship between each design attribute and an
engagement measure (i.e., the size of the effect), over and above institutional differences
and the effects of exhibit crowdedness on the engagement measures. Partialing out
institutional differences helps us account for the differences in results that can be seen in
Table 4 (potentially due to museum size, pedagogical approach, visitor demographics,
location, and number of exhibits in the space). We provide the full list of attributes and
their semi-partial correlations with each type of engagement in Table 5.
To illustrate how we might use these effect sizes to inform future research, we
present a specific cutoff in the table: greater or less than 0.1. Our cutoff is based on the
standard rule of thumb used in social science research, where 0.1 is considered a small
but meaningful effect, 0.3 is considered moderate, and 0.5 is considered to be a large
effect (Cohen, 1988, 1992). Further, a group of advising practitioners expressed an
interest in the relationships that showed this size of effect, when we discussed the
relationship in terms of the baseline levels. For example, a correlation of 0.1 translates to
approximately a 4-9% increase in exhibit use (c.f., the baseline average of 22%);
approximately a 11-13 second increase in time spent (c.f., the baseline average of about
one minute); approximately a 2-5% increase in return visits (c.f., the baseline average of
15%); and approximately a 3-4% increase in highest engagement level reached (c.f., the
baseline average of 88%).6
6These ranges are based on the regression slopes for effect sizes of .10 to .14.
Running head: TOWARDS DISRUPTING THE STATUS QUO
24
For the purpose of illustration, we identify good, neutral, and poor semi-partial
correlations using the 0.1 cutoff in the table. A “good” semi-partial correlation means that
the presence of the design attribute was associated with higher engagement. In technical
terms, the semi-partial correlation was 0.1 or greater. A “poor” semi-partial correlation
means a negative correlation: The presence of that design attribute was related to lower
engagement (the semi-partial correlation was -0.1 or less). A “neutral” semi-partial
correlation means the design attribute was not correlated with the engagement measure
(semi-partial correlation was between 0.1 and -0.1). We report the detailed semi-partial
correlations to allow readers to interpret the size of the effect, choose their own cutoffs if
they wish, and employ the results in power analyses to determine sample sizes necessary
for future research. To sort our table, we calculated the ranking for each design attribute
within each engagement measure, averaged across the four rankings, sorted by the
resulting mean rank (highest to lowest), and finally adjusted for legibility.
The table also shows the prevalence of each DA by indicating the number and
percentage of exhibits incorporating that attribute. The prevalence of each DA across
exhibits helps us understand not only whether such designs arise frequently or rarely
across the three museums, but also how reliable or precise is our measure of their
relationship to engagement. That is, the fewer exhibits with that attribute, the less
precision in estimating the relationship (or the more heavily the finding relies on those
specific exhibits rather than the design attribute generally). Finally, in order to put these
results in the context of the larger EDGE findings, the exhibit design attributes that were
Running head: TOWARDS DISRUPTING THE STATUS QUO
25
the most important for engaging girls are marked with this symbol: . 7
7Although the table highlights the nine DAs that were discovered as helpful to girls in EDGE,several of the DA variables differ between the two analyses, and these differences affect the sample sizes. For example, EDGE explored combinations of variables, and one of the top DAs for designing for girls was the combination variable: exhibit has multiple sides or stations. The reanalysis reported here maintained these two DA variables as separate. Another example includes exhibits’ look-and-feel is homey, personal, homemade which was collected using a three-point scale; EDGE split the variable into high/medium vs. low, while the reanalysis split the variable into high vs. low/medium (based on the best approach for answering the respective research questions).
Running head: TOWARDS DISRUPTING THE STATUS QUO
26
Neutral= EDGE DA Poor Good Neutral= EDGE DA Poor Good
Design Attribute Use Time Spent
Return Visits
High Eng. Level
Exhibits with DA
(max = 301)
% of exhibits with DA
Has multiple stations .09 .23 .15 .1 44 15%
Phenomenon can be experienced by 2 or more people at the same time .09 .13 .1 .12 240 80%
Label suggests three or more distinct activities .02 .18 .14 .12 25 8%
Space to accommodate three or more people .17 .16 .24 .07 170 56%
Allows for more than one set of hands or bodies .12 .14 .15 .01 224 74%
Designed for multiple players to use without interfering with each other .2 .17 .23 -.1 94 31%
Exhibit has a competitive feel .07 .28 .11 .07 17 6%
Exhibit is open-ended .05 .31 .16 .04 132 44%
One or more visitors can lose .04 .15 .17 .03 4 1%
Can be used from multiple sides .24 .07 .26 .08 94 31%
Label invites visitors to work together .12 .02 .11 .07 20 7%
Visitors can watch others from afar to preview what to do .1 -.01 .14 -.03 232 77%
Exhibit includes at least one familiar object .05 .13 .12 -.001 97 32%
Title suggests what to do at the exhibit -.11 .22 -.01 .17 147 49%
Interactive elements are observable prior to use -.01 .12 -.03 .12 225 75%
Exhibit includes a use drawing .03 .08 .07 .13 98 33%
Exhibit includes at least one image of a person .05 .08 .09 .18 126 42%
Title suggests what the exhibit does .09 .01 0 .14 223 74%
Exhibit has a label .06 .01 -.02 .21 291 97%
Title is legible from afar .03 .02 .01 .12 256 85%
Label has bright, prominent color 0 .09 -.06 .14 70 23%
Label asks at least one open-ended question .02 0 -.01 .13 148 49%
Includes two or more required roles .11 .04 .09 .05 14 5%
The exhibits’ look-and-feel is playful, whimsical, or humorous .06 .06 .15 .001 83 28%
Exhibit includes any text about an ambiguously gendered (OR non-gendered) STEM professional .03 .06 .1 -.002 21 7%
Label provides history of the exhibit or the phenomenon .09 -.04 .11 -.03 60 20%
Seating for two or more -.03 .32 .04 -.01 56 19%
Table 5Relationships between Design Attributes and Children’s Engagement
Running head: TOWARDS DISRUPTING THE STATUS QUO
27
Neutral= EDGE DA Poor Good
Design Attribute Use Time Spent
Return Visits
High Eng. Level
Exhibits with DA
(max = 301)
% of exhibits with DA
Exhibit involves self-expression or authorship -.08 .36 .08 -.05 38 13%
Prior visitors’ work is visible from afar -.07 .37 .07 -.08 26 9%
Exhibit has seating -.08 .16 -.07 .06 147 49%
Exhibit includes any image of a female .06 0 .09 .07 38 13%
Exhibit includes any image of a male .03 .04 .06 .08 52 17%
The overall color palette for the exhibit is: Public signage or safety colors (red/orange/yellow with gray/black)
.02 .02 .05 .09 43 14%
The exhibits’ look-and-feel is industrial, slick, bulky, or indestructible .05 -.04 .05 .09 121 40%
The exhibit is designed to reflect a visitors’ self-image .05 .02 .05 .02 34 11%
The overall color palette for the exhibit is: Neutral colors (white, black, gray, brown, wood, metal, mirror, clear) -.01 .01 .08 .07 138 46%
Label images are informal, hand-drawn, and casual, rather than precise or formal .03 -.06 .05 .07 33 11%
Exhibit has bright, prominent color in the secondary elements (such as cabinetry or hardware) .07 -.02 .08 -.03 17 6%
Label encourages visitors to compare with others 0 .02 .02 .06 10 3%
Exhibit includes any image of a STEM professional .03 -.04 .01 .05 6 2%
The exhibits’ inner workings are visible .03 .01 -.03 .05 100 33%
Exhibit includes any text about a female STEM professional -.03 .02 .08 0 4 1%
The overall color palette for the exhibit is: Subdued, desaturated colors (pink, burgundy, rust, pale green, olive, aqua)
.02 .04 -.05 .02 23 8%
Exhibit includes any text about a male STEM professional .03 -.08 .04 .01 26 9%
Label text tone is informal and friendly, rather than formal or academic -.02 .09 -.002 -.04 33 11%
Visitors can watch others at the exhibit to preview what to do -.09 .06 .03 -.01 287 95%
Exhibit provides opportunity to use small motor skills -.07 .08 -.07 .04 264 88%
Label describes how the exhibit phenomenon is related to social issues for humans, animals, or the environment -.01 -.09 -.01 .01 42 14%
The exhibits’ look-and-feel is homey, personal, homemade, or delicate -.02 -.03 -.05 -.02 25 8%
Running head: TOWARDS DISRUPTING THE STATUS QUO
28
To further exemplify how we might use these data to guide our own future
research, we present four additional tables (6-9) that aid interpretation. These tables, one
for each of the engagement measures, present only the DAs that had a notable positive
semi-partial correlation (sr ≥ 0.1) with that engagement measure. The tables also
highlight the prevalence of each DA across exhibits, helping us understand, again, how
reliable or precise is our measure of their relationship to engagement. We considered
DAs to be “widespread” if they were found at more than 100 exhibits (>33%) in our
sample; “fairly common” DAs were present at 30-100 exhibits (10-33%); and rare DAs
were found at less than 30 exhibits (<10%). Finally, for ease of reading and
interpretation, we classified disparate DAs into three groups based on three aspects of
exhibits: exhibit labels, exhibit look-and-feel, and exhibit interactions. Table 5 provides
the full data for readers to create their own groupings and cutoff criteria to support their
own explorations. For each exhibit aspect, some design attributes were notably
correlated, and some were not. The following four tables only present those that were
positively correlated (sr ≥ 0.1). For ease of interpretation, each table has been sorted by
Neutral= EDGE DA Poor Good
Design Attribute Use Time Spent
Return Visits
High Eng. Level
Exhibits with DA
(max = 301)
% of exhibits with DA
Exhibit uses a familiar object in an unfamiliar way .03 -.12 .01 -.02 24 8%
Label describes how the exhibit phenomenon is used in the real world .004 -.13 -.02 -.02 157 52%
Exhibit has bright, prominent color in the phenomenon or interactive elements -.03 .09 -.04 -.1 113 38%
Exhibit has embedded story or narrative -.06 -.01 -.05 -.13 7 2%
The overall color palette for the exhibit is: Vivid, saturated colors (red, orange, yellow, green, blue, purple) -.02 -.05 -.1 -.15 97 32%
Running head: TOWARDS DISRUPTING THE STATUS QUO
29
these groupings (in the order listed above) and prevalence (from rare to fairly common to
widespread).
Table 6
Design Attributes Positively Correlated with Exhibit Use
Design Attribute Semi-partial correlation Exhibit n Prevalence
Labels
Label invites visitors to work together 0.12 20 Rare Interactions
Includes two or more required roles 0.11 14 Rare Can be used from multiple sides 0.24 94 Fairly Common Designed for multiple players to use without
interfering with each other 0.2 94 Fairly Common Space to accommodate three or more people 0.17 170 Widespread Allows for more than one set of hands or bodies 0.12 224 Widespread
Visitors can watch others from afar to preview what to do 0.1 232 Widespread
Note.Thegirlsymbol indicatesDAsthatwerealsofoundtobesuccessfulinengaginggirlsintheEDGEstudy.
Running head: TOWARDS DISRUPTING THE STATUS QUO
30
Table 7
Design Attributes Positively Correlated with Time Spent
Design Attribute Semi-partial correlation Exhibit n Prevalence
Labels Label suggests three or more distinct activities 0.18 25 Rare Title suggests what to do at the exhibit 0.22 147 Widespread
Look & Feel
Exhibit includes at least one familiar object 0.13 97 Fairly Common Interactions
Prior visitors’ work is visible from afar 0.37 26 Rare Exhibit has a competitive feel 0.28 17 Rare One or more visitors can lose 0.15 4 Rare Exhibit involves self-expression or authorship 0.36 38 Fairly Common Seating for two or more 0.32 56 Fairly Common
Has multiple stations 0.23 44 Fairly Common Designed for multiple players to use without
interfering with each other 0.17 94 Fairly Common Exhibit is open-ended 0.31 132 Widespread Exhibit has seating 0.16 147 Widespread
Space to accommodate three or more people 0.16 170 Widespread Allows for more than one set of hands or bodies 0.14 224 Widespread Phenomenon can be experienced by 2 or more
people at the same time 0.13 240 Widespread Interactive elements are observable prior to use 0.12 225 Widespread
Note.Thegirlsymbol indicatesDAsthatwerealsofoundtobesuccessfulinengaginggirlsintheEDGEstudy.
Running head: TOWARDS DISRUPTING THE STATUS QUO
31
Table 8
Design Attributes Positively Correlated with Return Visits
Design Attribute Semi-partial correlation Exhibit n Prevalence
Labels Label suggests three or more distinct activities 0.14 25 Rare Label invites visitors to work together 0.11 20 Rare Exhibit includes any text about an ambiguously
gendered (OR non-gendered) STEM professional 0.10 21 Rare
Label provides history of the exhibit or the phenomenon 0.11 60 Fairly Common
Look & Feel
The exhibits’ look-and-feel is playful, whimsical, or humorous 0.15 83 Fairly Common
Exhibit includes at least one familiar object 0.12 97 Fairly Common Interactions
One or more visitors can lose 0.17 4 Rare Exhibit has a competitive feel 0.11 17 Rare
Can be used from multiple sides 0.26 94 Fairly Common Designed for multiple players to use without
interfering with each other 0.23 94 Fairly Common Has multiple stations 0.15 44 Fairly Common Space to accommodate three or more people 0.24 170 Widespread Exhibit is open-ended 0.16 132 Widespread Allows for more than one set of hands or bodies 0.15 224 Widespread
Visitors can watch others from afar to preview what to do 0.14 232 Widespread Phenomenon can be experienced by 2 or more
people at the same time 0.10 240 Widespread Note.Thegirlsymbol indicatesDAsthatwerealsofoundtobesuccessfulinengaginggirlsintheEDGEstudy.
Running head: TOWARDS DISRUPTING THE STATUS QUO
32
Table 9
Design Attributes Positively Correlated with High Engagement Level
Design Attribute Semi-partial correlation Exhibit n Prevalence
Labels Label has bright, prominent color 0.14 70 Fairly Common
Exhibit includes a use drawing 0.13 98 Fairly Common Exhibit has a label 0.21 291 Widespread
Exhibit includes at least one image of a person 0.18 126 Widespread Title suggests what to do at the exhibit 0.17 147 Widespread Title suggests what the exhibit does 0.14 223 Widespread Label asks at least one open-ended question 0.13 148 Widespread Title is legible from afar 0.12 256 Widespread
Interactions
Has multiple stations 0.10 44 Fairly Common Phenomenon can be experienced by 2 or more
people at the same time 0.12 240 Widespread Interactive elements are observable prior to
use 0.12 225 Widespread Note.Thegirlsymbol indicatesDAsthatwerealsofoundtobesuccessfulinengaginggirlsintheEDGEstudy.
Discussion
This reanalysis highlights the relationships between existing science museum
exhibit designs and children’s engagement. Importantly, current designs represent the
status quo, a state of affairs in museums that unfortunately promote inequities, often
advantaging white, middle-class, and male ways of learning (Dawson, 2014a, 2014b;
Feinstein, 2017). Rather than reify present circumstances by using the results to
recommend some existing designs over others, we present the findings to identify
productive subjects of future research.
The reanalysis of the EDGE data provides two kinds of results. The first offers
baseline data. By looking across all of the exhibits in the study, we can provide an
Running head: TOWARDS DISRUPTING THE STATUS QUO
33
average or typical engagement for all four measures. By comparing those results to prior
studies that used similar measures, we can add to the understanding the field holds about
how children engage with science museum exhibits.
The second kind of result presents children’s engagement data for a variety of
design attributes along with how common or rare those attributes were. The reanalysis of
the EDGE data indicates that several design attributes were positively correlated with
children’s engagement, and a smaller number correlated with negative engagement for
children. Again, rather than prescribe use of the former and avoidance of the latter, we
suggest that the results guide future research and evaluation regarding exhibit design.
Baseline Statistics
Providing the baseline levels for each area of engagement can help researchers
set expectations and clarify impacts. Below we share what we learned from the reanalysis
about children’s general engagement with STEM exhibits. We also point out instances
when our data align with prior research (i.e., replicate prior results).
Using an exhibit is a pre-requisite to any kind of engagement or deeper learning
opportunities. Our analysis found that, averaging across exhibits, only 22% of children
used a typical exhibit. This lies at the low end of prior research in science exhibit
“attracting power,” the ability of an exhibit to entice museum-goers to stop and try it. For
example, Yalowitz and Bronnenkant (2009), describing a study of 44 exhibits in an
aquarium exhibition, reported that the average exhibit attracted 41% of general visitors;
Sandifer (2003) found that 38% of visitors of all ages stopped at the average science
museum exhibit among 61 exhibits spread across two exhibitions; finally, Bosivert &
Slez (1995) tracked visitors of all ages as they moved among 80 exhibits and discovered
Running head: TOWARDS DISRUPTING THE STATUS QUO
34
that only 19% stopped at the average exhibit. Perhaps there is an inverse relationship
between number of exhibits and attracting power, or perhaps other differences in context,
exhibits, or visitors’ ages bear responsibility for the variation. However, even if science
museum exhibits on average are used by only 22% of children, this could represent a
large number of people, especially if we extend that finding to approximate usage by
visitors of all ages. At the Exploratorium whose annual attendance is 800,000, 22% of
visitors would translate into the average exhibit being utilized by 176,000 visitors per
year. This suggests that over the life of a science museum exhibit, millions of people
experience its marvels. No wonder Frank Oppenheimer, founder of the Exploratorium,
suggested that every exhibit be iteratively improved over time to maximize its
effectiveness (Oppenheimer, 1986).
Perhaps one of the most-studied aspects of the exhibit encounter has been holding
time (Sanford, 2010; Serrell, 1998, 2010; Yalowitz & Bronnenkant, 2009). Across
multiple studies conducted at small numbers of science museum exhibits, the average
visitor or group was found to spend about one minute interacting with the average exhibit
(Falk, 1983; Humphrey & Gutwill, 2005; Sandifer, 2003). The present analysis found the
same result with children, providing robust replication and setting this baseline firmly in
place. As new projects develop or study innovative exhibits to deepen engagement,
especially for underserved audiences, we suggest they continue to use this result as a
benchmark beyond which to push their own development.
We found that 15% of children returned to an exhibit after using it. We believe
return visits are important because they reflect a visitor’s deep enough interest to
experience something again, use something after observing someone else use it, or bring
Running head: TOWARDS DISRUPTING THE STATUS QUO
35
someone else over to show them an exhibit. Again, assuming that children and adults
behave similarly in this regard, this would mean that approximately 120,000 people
annually use the average exhibit more than once during a visit to the Exploratorium.
Although such findings require replication with adults, they provide a benchmark for
exhibits seeking to offer rich sets of activities so millions of returning users can deepen
their learning experiences over the lifetime of the exhibit.
Lastly, we found that 88% of children reached the highest level of engagement in
our outcome measure. That is, the vast majority of children were actively repeating or
varying inputs or sharing their experience with others while at the exhibit. This finding
fluctuated little across the three museums. This may suggest that most interactive, hands-
on science museum exhibits successfully promote basic inquiry behaviors. Still, further
research on rates of minimum inquiry may be warranted in other types of informal
learning institutions, such as history, natural history, art, and cultural museums.
Notably, participants behaved differently at the three museums (as shown above
in Table 4). The results seem to underscore the importance of environmental design. For
example, the smaller Southwest institution had the highest mean, minimum, and
maximum percentage of Use and Returns; the Midwest institution had the midranges for
Use and Returns; and the West Coast museum’s Use and Return results were lowest.
Scrutinizing the environmental design (Figure 1), we see that when fewer exhibits were
presented, children spread their attention across the exhibits available to them, rather than
using a fixed percentage of exhibits in a given area. (The average length of time of a
museum visit tends to remain steady for large and medium size museums (R. Auster,
personal communication, February 16, 2018).) Regarding Time Spent, the West Coast
Running head: TOWARDS DISRUPTING THE STATUS QUO
36
museum had the lowest average across their exhibits. This may be partially due to a
larger number of exhibits in the study space or a higher density of exhibits. Finally, when
we look to the High Engagement Level results, we find that once again the smaller
institution had a much higher minimum average percentage of children who repeat, vary,
or talk about their experience. Again, this lack of variability at the medium-sized museum
may be due to visitors trying get the most out of a small number of exhibits, or the
environmental design of a small room allowing for deeper focus.
The differences across institutions in the four engagement results may be related
to prior research investigating the effects of walls on visitors’ exhibit experience. In two
studies conducted at different institutions, the presence of walls, which created a smaller
space with the same exhibit density, increased holding time (Adamson, 2008; Allen,
2004). (However, one study found that walls also increased attraction power while the
other study found the opposite effect.) In our view, the impact of environmental cues on
exhibit experience remains a rich domain for future research.
Informing Research to Advance the Field
This study provides ample information regarding the relationship between each
DA and each engagement measure, along with the prevalence of each attribute in the data
set. Next we detail several examples of how we might use these data to guide our own
research agendas. We encourage others to explore similar paths and use the data to
generate their own questions to further the field. Several of the findings support prior
research conducted in this arena, while other aspects of the results seem to raise new
questions.
Running head: TOWARDS DISRUPTING THE STATUS QUO
37
Well-studied design attributes/replication. Several of the DAs that have been
found effective in prior research received further support here. Specifically, multiuser /
multistation (i.e., exhibit has multiple sides or stations) designs showed positive
correlations across measures. Multiple option (i.e., label suggests three or more
activities) and multi-outcome (i.e., exhibit is open-ended) attributes also proved
themselves valuable in our study. As these design attributes were found effectual in a
variety of studies (Borun et al., 1998; Dancstep et al., 2015; Humphrey & Gutwill, 2005;
Perry, 2012; Sandifer, 2003; Serrell, 1998, 2010), they may be viewed as sound design
ideas for typical science museum audiences without further research.
In addition, the current study suggests that including at least one familiar object,
employing labels that clearly suggest what to do, and allowing visitors to watch others
from afar to preview what to do were all positively related to engagement. All three of
these relate to Perry’s design principles as well as concepts from label researchers like
Beverly Serrell (Perry, 2012; Serrell, 2015). Familiar objects and visitors watching others
from afar to preview what to do also emerged from the EDGE analyses as important to
girls. Discovering that they were positive for boys as well underscores the perspective
that sometimes designing for accessibility with one underserved group in mind can
sometimes improve experience for many (Reich, 2005). Based on past and present
results, we believe that familiar objects, clear labels, and orientation from afar probably
need little further research with existing science museums audiences. Of course, results
from this reanalysis represent the status quo; further research is needed with underserved
and intersectional audiences.
Running head: TOWARDS DISRUPTING THE STATUS QUO
38
Common but understudied. Two DAs arose that are common, but have not been
studied much (to our knowledge): label has bright, prominent color and exhibit includes
at least one image of a person. In our experience, color sometimes becomes an issue of
contention among designers; while some see real power in the inclusion of color, others
voice concern that color distracts from the phenomenon (unless used strategically to
highlight it). We have heard from many designers that they want more solid data to help
inform these decisions. The second design attribute, using an image of a person, emerged
in EDGE as an important design attribute for engaging girls and also has been found
elsewhere to contribute to a feeling of welcome for girls (Crowley, Callanan,
Tenenbaum, & Allen, 2001). Given the result that both label attributes correlated with
engagement for boys and girls suggests that more research is warranted.
Rare but promising design attributes. For three promising DAs that appeared in
exhibits only rarely, it is unclear whether the positive results would hold with a larger
sample size: exhibit has a competitive feel (N=17 exhibits); one or more visitors can lose
(N=4); and label suggests three or more distinct activities (N=25). The first two seem
related. In a prior study, competition was associated with longer holding times and more
return use for children (Garibay, 2013). Perhaps the competitive activities were exciting
and demanding, prompting children to keep playing until they could win. However, this
result challenges prior research on girls’ preferred ways of learning and knowing
(National Science Foundation, 2003; Peterson & Fennema, 1985; Rosser, 1991; Taylor,
2005) and thus seems an important area for further study and deeper understanding (more
on this below). Label suggests three or more distinct activities fits well with multi-option,
multi-outcome designs. However, we seldom found exhibits with this label attribute,
Running head: TOWARDS DISRUPTING THE STATUS QUO
39
indicating that further study could inform the field as to whether this simple label change
should be utilized more frequently.
Negative associations. A small number of DAs actually showed negative
correlations with measures of engagement. The EDGE study eliminated any DAs with
negative correlations, because the goal was to discover the best designs for engaging girls
(those that were consistently, strongly, and positively related). However, when we
consider how this full data set can inform our future research agenda, we believe that
common, negative design attributes require further investigation. Three DAs fell into this
category: Exhibit has bright, prominent color in the phenomenon or interactive elements;
overall color palette for the exhibit is vivid, saturated colors (red, orange, yellow, green,
blue, purple); and label describes how the exhibit phenomenon is used in the real world.
The first two seem related, but contradict an earlier finding. Recall that when the label
contained bright, prominent colors, children tended to show higher engagement at the
exhibits. The opposite occurred when the main aspects of the exhibit itself employed
bright colors. This inconsistency begs for more research on the use of bright color on
exhibits and labels. The last commonly found DA that negatively correlated with
engagement also surprised us. Making real world connections in labels is a widespread
practice, supported by research and practical wisdom (e.g., Ansbacher, 1999; Atkins,
Velez, Goudy, & Dunbar, 2009; Perry, 2012; Serrell, 2015). Why did it correlate with
less holding time for children? Does a major difference exist between children and adults
in this respect, or might the finding be related to label length (Bitgood & Gregg, 1986;
Bitgood & Thompson, 1989)? We hope for additional study.
Running head: TOWARDS DISRUPTING THE STATUS QUO
40
Taking a larger view. Stepping back from the individual DAs, we see a larger
pattern in the results, particularly in the positive correlations between DAs and
engagement. Exhibit Use, Time Spent, and Return Visits seemed to correlate positively
with a remarkably small, stable set of design attributes. Those designs tended to match
what were found in the PISEC, APE, and other studies: Science exhibits promote
engagement well when multiple stations offer groups numerous, non-interfering options
that can lead to multiple outcomes (Borun et al., 1998; Humphrey & Gutwill, 2005;
Sandifer, 2003). This result further supports the previously developed idea that science
museum visitors prefer to stay with their friends and family while using exhibits and will
explore more deeply when exhibits provide greater complexity (Falk & Dierking, 2000;
Humphrey & Gutwill, 2005; Perry, 2012; Sandifer, 2003).
In contrast to the other three engagement measures, a high engagement level score
(repeating, varying, or talking about an experience while at an exhibit) seemed to
correlate with a different set of design attributes. Label designs, more than exhibit
designs, appeared to be important for promoting manipulation of variables or
communication among visitors at exhibits. Titles that describe what to do at an exhibit or
what the exhibit does and the labels that include use drawings were related to a high
engagement level. These kinds of label designs have been tangentially studied before
(Humphrey & Gutwill, 2005), but the field would benefit from more rigorous research.
Differences from EDGE. These data were originally collected to address issues
of disparity in science centers that echo broader equity and inclusion issues for women
and girls in STEM (Borun, 1999; Diamond, 1994; Girls Inc., 2004; Greenfield, 1995;
Hill, Corbett, & St. Rose, 2010; Kremer & Mullins, 1992; National Science Foundation,
Running head: TOWARDS DISRUPTING THE STATUS QUO
41
2003, 2007; Verheyden, 2003). As described earlier, the results from the EDGE study
warrant a higher level of confidence and allow us to advocate for design attributes that
best support girls’ engagement. The results from the reanalysis differ from the EDGE in
many ways. These differences are due partly to the inclusion of boys’ data, and could,
therefore, serve to reify existing disparities. Differences also arise from the changes in the
analytic approach—exploring the status quo in the reanalysis allowed us to highlight
negative relationships and investigate designs represented by fewer exhibits, but limits
any claims of generalizability beyond the sample. We discuss a few of these discrepant
outcomes below.
Some of the design attributes that were strongly related to engagement in the
reanalysis differ from what we found in EDGE, or even run counter to existing research
regarding females’ STEM learning preferences. For example, regarding the DA of label
has bright, prominent color, EDGE found that the relationship with girls’ behavior was
not significant and therefore may be due to chance. However, the reanalysis did not
employ inferential statistics, so we do not know if the positive relationship found herein
is significant or simply due to chance. Another example includes the competitive
attributes—exhibit has a competitive feel and one or more visitors can lose—which were
positively related to children’s engagement in the reanalysis. EDGE did not analyze the
impact of either of these competition attributes because too few exhibits represented
them. As mentioned earlier, competition as a design approach may negatively impact
females (National Science Foundation, 2003; Peterson & Fennema, 1985; Rosser, 1991;
Sammet & Kekelis, 2016; Taylor, 2005; Wellcome Trust, 2013). However, the impact
may depend on the type of competition: One of the project’s expert interviewees
Running head: TOWARDS DISRUPTING THE STATUS QUO
42
suggested that girls may enjoy group competition, but not wish to be pitted against one
another in individual competition (Koke, 2007). Moreover, issues with competition may
no longer hold in informal learning environments: The EDGE project’s Girl Advisory
Committee expressed enjoyment at an exhibit that incorporated competition. These two
attributes represent complicated relationships between design and visitor engagement.
We recommend studies that try to tease out the effects of variations in these designs, in
addition to keeping a keen eye out for negative outcomes for underrepresented audience
members, such as females.
Finally, four of the design attributes that were most successful for engaging girls
in EDGE did not consistently rise to the top in the reanalysis: (1) exhibits’ look-and-feel
is playful, whimsical, or humorous; (2) exhibits’ look-and-feel is homey, personal,
homemade, or delicate; (3) exhibit includes a use drawing; and (4) exhibit includes at
least one image of a person. We advocate for incorporating the EDGE DAs in order to
push museum practice and create learning environments that are welcoming to a broader
group of visitors, rather than remaining rooted in the status quo.
Conclusion
This reanalysis of EDGE data has provided a snapshot of the status quo of science
museum exhibit designs. Although the dataset was quite large compared to those of
previous studies, the exhibits in the current research came from only three representative
science museums (using a convenience sample that overrepresents the larger institutions),
focused on STEM content that excluded biology and social science, and were originally
investigated with coding schemes for design attributes related to females’ ways of
learning and knowing (Dancstep & Sindorf, 2018a). Further, while the team increased
Running head: TOWARDS DISRUPTING THE STATUS QUO
43
data collection efforts on free days and Latino family program days, the sample included
relatively few participants from non-dominant economic or cultural communities. The
sample also lacked participants with non-binary genders, perhaps due to the
comparatively young age of children in the study. These results may not extend to adults
or even children in other age ranges. While we value ecologically valid studies, research
within museums is often limited to certain audiences. We recommend that future research
explore mixed and alternate methodologies that amplify non-dominant voices. More
work is needed to understand exhibit design and the intersection of multiple identities
including gender, economic, ethnic, and STEM.
Given these limitations, along with the reliance on descriptive rather than
inferential statistics, we recommend caution in applying the results of this reanalysis to
practice. In addition, the data presented here represent the current state of the art in
exhibit design, which may actually inhibit engagement for disenfranchised learners
(Dawson, 2014a, 2014b; Feinstein, 2017). Simply implementing the positive design
attributes and eradicating the negative ones could reify existing inequities. Instead, we
present these data for researchers as one resource for determining where the field’s
investigative energy is best spent. To that end, we have shown examples where support
already exists, and where we see promising areas for further research. Other researchers
are encouraged to explore the data to inform their own next steps. The present reanalysis
has shown us, and hopefully others in the museum field, where to begin.
Running head: TOWARDS DISRUPTING THE STATUS QUO
44
Acknowledgements
We appreciate the thoughtful contributions by the project team, including researchers,
exhibit and content developers, graphics and editorial staff, and project management.
This research would not be possible without the museums and visitors that agreed to
participate. We thank the members of our esteemed Advisory board: Todd Bodner, Judy
Brown, Lynn Dierking, Cecilia Garibay, Kris Gutierrez, Judy Lee Haworth, Laura Huerta
Migus, and Don Norman. We developed this article with generous feedback from
reviewers, several EDGE Workshop participants, and museology graduate students at the
University of Washington (special thanks to Dr. Jessica Luke and Dylan High). This
material is based upon work supported by the National Science Foundation under Grant
Number DRL-1323806. Any opinions, findings, and conclusions or recommendations
expressed in this material are those of the author(s) and do not necessarily reflect the
views of the National Science Foundation.
Running head: TOWARDS DISRUPTING THE STATUS QUO
45
References
Adamson,B.(2008).Therelationshipbetweenexhibitionenvironmentsandvisitors'behaviors.PaperpresentedattheVisitorStudiesAssociation,Houston,TX.
Allen,S.(2004).Topartitionornottopartition:Astudyoftheimpactofwallsonvisitorbehavioratanexhibitcluster.PaperpresentedattheVisitorStudiesAssociation,Albuquerque,NM.
Ansbacher,T.(1999).Experience,inquiryandmakingmeaning.Exhibitionist,18(2),22-26.AssociationofScience-TechnologyCenters.(2016).ASTCStatisticsPackage.Washington,D.C.Atkins,L.J.,Velez,L.,Goudy,D.,&Dunbar,K.N.(2009).Theunintendedeffectsofinteractive
objectsandlabelsinthesciencemuseum.ScienceEducation,93(1),161-184.Bailey,E.,Bronnenkant,K.,Kelley,J.,&Hein,G.(1998).Visitorbehaviorataconstructivist
exhibition:EvaluatingInvestigate!atBoston'sMuseumofScience.InC.Dufresne-Tassé(Ed.),Évaluationetéducationmuséale;nouvellestendances/EvaluationandMuseumEducation:NewTrends/Evalacionyeducacionmuseistica:Nuevastendencias(pp.149-168).Montreal:ICOM/CECA.
Barriault,C.,&Pearson,D.(2010).Assessingexhibitsforlearninginsciencecenters:Apracticaltool.VisitorStudies,13(1),90-106.doi:10.1080/10645571003618824
Baxter,P.,&Jack,S.(2008).Qualitativecasestudymethodology:Studydesignandimplementationfornoviceresearchers.TheQualitativeReport,13(4),544-559.
Bitgood,S.,&Gregg,G.(1986).Abriefreviewoftheresearchonsignsandlabels:Wherearethedata?VisitorBehavior,1(3),4-6.
Bitgood,S.,&Patterson,D.(1993).Theeffectsofgallerychangesonvisitorreadingandobjectviewingtime.EnvironmentandBehavior,25(6),761-781.
Bitgood,S.,&Thompson,D.(1989).Theeffectsofsignlength,lettersize,andproximityonreading.VisitorStudies,1(1),101-112.
Boisvert,D.L.,&Slez,B.J.(1995).Therelationshipbetweenexhibitcharacteristicsandlearning-associatedbehaviorsinasciencemuseumdiscoveryspace.ScienceEducation,79(5),503-518.
Borun,M.(1999).Genderrolesinsciencemuseumlearning.VisitorStudiesToday,3(3),11-14.Borun,M.,&Dritsas,J.(1997).Developingfamily-friendlyexhibits.Curator:TheMuseum
Journal,40(3),178-196.doi:10.1111/j.2151-6952.1997.tb01302.xBorun,M.,Dritsas,J.,Johnson,J.I.,Peter,N.E.,Wagner,K.F.,Fadigan,K.,Jangaard,A.,Stroup,
E.,&Wenger,A.(1998).FamilyLearninginMuseums:ThePISECPerspective.Philadelphia,PA:FranklinInstitute.
Cohen,J.(1988).StatisticalPowerAnalysisfortheBehaviorSciences(2nded.).Hillsdale,NJ:LawrenceEarlbaumAssociates.
Cohen,J.(1992).Quanitativemethodsinpsychology:Apowerprimer.PsychologicalBulletin,112(1),155-159.
Crowley,K.,Callanan,M.A.,Tenenbaum,H.R.,&Allen,E.(2001).Parentsexplainmoreoftentoboysthantogirlsduringsharedscientificthinking.PsychologicalScience,12(3),258-261.
Dancstep,T.,Gutwill,J.P.,&Sindorf,L.(2015).Comparingthevisitorexperienceatimmersiveandtabletopexhibits.Curator:TheMuseumJournal,58(4),401-422.
Dancstep,T.,&Sindorf,L.(2016).ExhibitDesignsforGirls’Engagement:AGuidetotheEDGEDesignAttributes.SanFrancisco,CA:Exploratorium.
Running head: TOWARDS DISRUPTING THE STATUS QUO
46
Dancstep,T.,&Sindorf,L.(2018a).Creatingafemale-responsivedesignframeworkforSTEMexhibits.Curator:TheMuseumJournal,61(3),469-484.doi:doi:10.1111/cura.12268
Dancstep,T.,&Sindorf,L.(2018b).ExhibitDesignsforGirls'Engagement(EDGE).Curator:TheMuseumJournal,61(3),485-506.doi:doi:10.1111/cura.12267
Dawson,E.(2014a).Equityininformalscienceeducation:developinganaccessandequityframeworkforsciencemuseumsandsciencecentres.StudiesinScienceEducation,50(2),209-247.doi:10.1080/03057267.2014.957558
Dawson,E.(2014b).“Notdesignedforus”:Howsciencemuseumsandsciencecenterssociallyexcludelow-income,minorityethnicgroups.ScienceEducation,98(6),981-1008.doi:10.1002/sce.21133
Diamond,J.(1986).Thebehavioroffamilygroupsinsciencemuseums.Curator:TheMuseumJournal,29(2),139-154.
Diamond,J.(1994).Sexdifferencesinsciencemuseums:Areview.Curator:TheMuseumJournal,37(1),17-24.
Falk,J.H.(1983).Timeandbehavioraspredictorsoflearning.ScienceEducation,67(2),267-276.Falk,J.H.(1997).Testingamuseumexhibitiondesignassumption:Effectofexplicitlabelingof
exhibitclustersonvisitorconceptdevelopment.ScienceEducation,81(6),679-688.Falk,J.H.,&Dierking,L.(2000).LearningFromMuseums:VisitorExperiencesandtheMakingof
Meaning.Lanham,MD:AltaMiraPress.Feinstein,N.W.(2017).Equityandthemeaningofsciencelearning:Adefiningchallengefor
sciencemuseums.ScienceEducation,101(4),533-538.doi:doi:10.1002/sce.21287Ferguson,C.(2009).Aneffectsizeprimer:Aguideforcliniciansandresearchers.Professional
Psychology:ResearchandPractice,40(5),532-538.Garibay,C.(2013).DesignZoneExhibitionSummativeReport.Chicago,IL:GaribayGroup.GirlsInc.(2004).Girlsandscience,math,andengineering.RetrievedJanuary19,2012,from
http://www.girlsinc.org/resources/fact-sheets.htmlGreenfield,T.A.(1995).Sexdifferencesinsciencemuseumexhibitattraction.Journalof
ResearchinScienceTeaching,32(9),925-938.Hamilton,L.S.,Michael,N.E.,Kupermintz,H.,Kerkhoven,J.I.M.,&Snow,R.E.(1995).
Enhancingthevalidityandusefulnessoflarge-scaleeducationalassessments:II.NELS:88Scienceachievement.AmericanEducationalResearchJournal,32(3),555-581.
Hill,C.,Corbett,C.,&St.Rose,A.(2010).WhysoFew?WomeninScience,Technology,Engineering,andMathematics.Washington,D.C.:AmericanAssociationofUniversityWomen.
Humphrey,T.,&Gutwill,J.P.(Eds.).(2005).FosteringActiveProlongedEngagement:TheArtofCreatingAPEEexhibits.WalnutCreek:LeftCoastPress.
Koke,J.(2007).[Expertinterview].Kremer,K.,&Mullins,G.(1992).Children'sgenderbehavioratsciencemuseumexhibits.
Curator:TheMuseumJournal,35(1),39-48.Lee,V.E.,&Burkam,D.T.(1996).Genderdifferencesinmiddlegradescienceachievement:
Subjectdomain,abilitylevel,andcourseemphasis.ScienceEducation,80(6),613-650.NationalScienceFoundation.(2003).NewFormulasforAmerica'sWorkforce:GirlsinScience
andEngineering.Arlington,VA:NSF03-207.NationalScienceFoundation.(2007).NewFormulasforAmerica'sWorkforce2:GirlsinScience
andEngineering.Arlington,VA:NationalScienceFoundation.Olszewski,B.,Macey,D.,&Lindstrom,L.(2006).Thepracticalworkof<coding>:An
ethnomethodologicalinquiry.HumanStudies,29(3),363-380.
Running head: TOWARDS DISRUPTING THE STATUS QUO
47
Oppenheimer,F.(1986).WorkingPrototypes:ExhibitDesignattheExploratorium.SanFrancisco,CA:Exploratorium.
Perry,D.(1989).TheCreationandVerificationofaDevelopmentalModelfortheDesignofaMuseumExhibit.DoctoralDissertation,IndianaUniversity.
Perry,D.(1993).ExperimentBenchProject:SummativeEvaluationReport.SaintPaul,MN:ScienceMuseumofMinnesota.
Perry,D.(2012).WhatMakesLearningFun?PrinciplesfortheDesignofIntrinsicallyMotivatingMuseumExhibits.Lanham,MD:AltaMiraPress.
Peterson,P.,&Fennema,E.(1985).Effectiveteaching,studentengagementinclassroomactivities,andsex-relateddifferencesinlearningmathematics.AmericanEducationalResearchJournal,22(3),309-335.
Reich,C.(2005).UniversalDesignofInteractivesforMuseumExhibitions.MasterofEducation,LesleyUniversity,Cambridge.
Rosser,S.V.(1991).FemaleFriendlyScience:ApplyingWomen'sStudiesMethodsandTheoriestoAttractStudents.NewYork:Teacher'sCollegePress.
Sammet,K.,&Kekelis,L.(2016).ChangingtheGameforGirlsinSTEM:FindingsonHighImpactProgramsandSystem-BuildingStrategies.
Sandifer,C.(2003).Technologicalnoveltyandopen-endedness:Twocharacteristicsofinteractiveexhibitsthatcontributetotheholdingofvisitorattentioninasciencemuseum.JournalofResearchinScienceTeaching,40(2),121-137.
Sanford,C.W.(2010).Evaluatingfamilyinteractionstoinformexhibitdesign:Comparingthreedifferentlearningbehaviorsinamuseumsetting.VisitorStudies,13(1),67-89.
Screven,C.(1995).Motivatingvisitorstoreadlabels.InA.Blais(Ed.),TextintheExhibitionMedium(pp.97-132).Quebec:MuseedelaCivilisation.
Serrell,B.(1996).ExhibitLabels:AnInterpretiveApproach.WalnutCreek,CA:AltaMiraPress.Serrell,B.(1998).PayingAttention:VisitorsandMuseumExhibitions.Washington,D.C.:
AmericanAssociationofMuseums.Serrell,B.(2010).Payingmoreattentiontopayingattention.CenterforAdvancementof
InformalScienceEducation.Serrell,B.(2015).ExhibitLabels:AnInterpretiveApproach(2nded.).London,UK:Rowman&
Littlefield.Sindorf,L.,&Dancstep,T.(2015).ExhibitDesignsforGirls’Engagement(EDGE)ExhibitCoding
SchemeandProtocol.SanFrancisco,CA:ExploratoriumTaylor,D.(2005).Socialscience:Observingwomenandgirlsinthemuseum.ASTC
Dimensions(May/June),11-12.VanSchijndel,T.J.P.,Franse,R.K.,&Raijmakers,M.E.J.(2010).TheExploratoryBehavior
Scale:Assessingyoungvisitors'hands-onbehaviorinsciencemuseums.ScienceEducation,94(5),794-809.doi:10.1002/sce.20394
Verheyden,P.(2003).Thegreatsexperiment.ECSITENewsletter,54(Spring),10-11.WellcomeTrust.(2013).RisksandRewards:HowPhDStudentsChooseTheirCareers:Qualitative
ResearchReport.London,UK.Wilson,C.,Voorhis,V.,&Morgan,B.(2007).Understandingpowerandrulesofthumbfor
determiningsamplesizes.TutorialsinQuantitativeMethodsforPsychology,3(2),43-50.Yalowitz,S.S.,&Bronnenkant,K.(2009).Timingandtracking:Unlockingvisitorbehavior.Visitor
Studies,12(1),47-64.doi:10.1080/10645570902769134