The Initial Knowledge State of High School Astronomy Students Philip Michael Sadler A Dissertation Presented to the Faculty of the Graduate School of Education of Harvard University in Partial Fulfillment of the Requirements for the Degree of Doctor of Education 1992


The Initial Knowledge State of High School Astronomy Students

Philip Michael Sadler

A Dissertation Presented to the Faculty

of the Graduate School of Education of Harvard University

in Partial Fulfillment of the Requirements for the Degree of Doctor of Education

1992


© 1992 Philip M. Sadler

All Rights Reserved


Dedication

To Professor Jerrold Reinach Zacharias (1905-1986) of the Massachusetts Institute of Technology, a world-class scientist, a wonderful teacher, and a national leader, whom I first met in the Fall of 1969. He, more than any other person, inspired me to dedicate my life to teaching science and mathematics to young people. Forever berating schools of education, I can imagine Zach saying, “Philip, a Ph.D. in Physics is better than an Ed.D., especially from that school up the river.” But Jerrold, that “school up the river” is actually quite wonderful. I know this is the best path to carry your traditions and ideas to the next generation of teachers and students.


Acknowledgments

This dissertation has been a long time in coming and many people have encouraged me to start and cajoled me into finishing. I wish to thank all those involved with Project STAR at the Harvard-Smithsonian Center for Astrophysics, especially Professor Irwin Shapiro, for the risk he took in bringing in an outsider to direct the project and for his thoughtful counsel in matters of both science and education; Bruce Gregory, for always having time to discuss problems of misconceptions and science learning; and Professor Darrel Hoff for encouragement, ideas, and enthusiasm for my chosen path.

Dr. Marcus Lieberman, consultant to Project STAR, has organized, coded, and provided preliminary analysis of the data. He has always had the patience to explain obscure statistical procedures and has been a sounding board for my research methodology. Heather Whipple supported these efforts by tracking participating teachers and providing them with copies of pre- and post-tests. Sam Palmer has helped me by just being himself, a profoundly creative and reflective teacher who, along with Professor Charles Whitney and Professor Owen Gingerich, has helped to mold the items on the pre-test through their many incarnations. Marvin Grossman, a long-time friend and colleague, has been a sounding board for my ideas as well.

Test items dealing with light and color were originally worked on with the advice of Professor Eleanor Duckworth. Professor Ann Young of the Rochester Institute of Technology was instrumental in developing and refining many of the test items. Jenny Bond Hickman of Phillips Andover and Paul Hickman of Belmont High School contributed by taping interviews with many of their students. This process revealed students’ ideas about light and color. Professor Linda Shore of Boston University also examined light and color, as well as questions that tease out students’ ideas about gravity. Dr. Matthew Schneps directed and produced A Private Universe, which has promoted many of the ideas within this dissertation to tens of thousands of teachers.

This study could not have been performed if it were not for an extraordinary group of teachers dedicated to improving Project STAR: Andrew Anzalone, Richard Ayache, Lynn Bastoni, Russell Blake, Richard Brown, Daniel Francetic, Linda French, Gita Hakerem, Jennifer Bond Hickman, Paul Hickman, Robert Holtzman, Jeff Lane, Larry Mascotti, Bruce Mellin, Fiona McDonnell, Mark Petricone, Michael Richard, Larry Sabbath, Gary Sampson, Dorothy Walk, and especially Harold Coyle, Jr., who is now on the project staff, and Bill Luzader, who spent a year’s sabbatical at the Center for Astrophysics. Mark Petricone labored for a summer helping to rewrite a version of the misconception test that resulted in the improvement of many of the items.

I want to thank the National Science Foundation for the generous funding they have provided for Project STAR from 1985 to 1992 and the three Program Directors who have supported our activities: John Thorpe, Mary Kohlerman, and Gerhard Salinger. A special thanks goes to Andy Molnar of the NSF, who, as well as being an enthusiastic supporter of our efforts, has never ended a conversation with me without asking about the progress of my dissertation. The Smithsonian Institution has also supported our activities with grants and the time of scientists.


My wife Jane has put up with the long hours and grumpy moods provoked by my doctoral studies and I appreciate her generosity of time and spirit. Thanks to my kids Benjamin, Samuel, and Daniel, who have provided so much in the way of pleasant diversion, which more than makes up for their interruptions at the keyboard.

A good friend and experienced editor, Bessie Blum, proofread, commented upon, and corrected several versions of both my qualifying paper and this dissertation.

Thanks also go to my readers. Professor Israel Scheffler was my advisor through most of my doctoral work. He always found the time to advise me on my research and confided that I gave him “less trouble” than any other graduate student. Since his retirement, Professor Judah Schwartz has more than filled the breach. He was the natural choice, since he was instrumental in providing me with an exceptional undergraduate education at MIT as the Director of the Unified Science Studies Program. I am proud to have joined him as a colleague at HUGSE. Professor Irwin Shapiro has aided me with excellent feedback through his instrument of choice, the red pen. I am sure he will continue to do so in future projects. Terrence Tivnan has somehow found time to read my work and to discuss my methodology while seemingly being a reader for every other doctoral student at the Graduate School of Education. Moreover, he carried out the initial pre- and post-test analyses of Project STAR, along with Terresa Tatto, in 1987. As crude as our instrument was at that time, he planted the seeds of this dissertation by shaping the formative evaluation of the project. His insights are the seeds from which this study grew. Thank you all. I will try to measure up to the confidence you have shown in me. I will attempt to give back to my own students the caring and guidance that you have shown to me.


Table of Contents

Dedication
Acknowledgments
Table of Contents
Abstract

I. Introduction
    A. The Problem Context: Difficulty in Learning Science
    B. Statement of the Problem
    C. Significance of this Study

II. Review of the Literature
    A. Cognitive Roots of Misconception Research
    B. History of Scientific Misconceptions
    C. A Hierarchy of Misconception Research
    D. Methodological Problems of Past Research
    E. Implications of Past Studies

III. Methodology
    A. Description of the Dataset
    B. Research Questions
    C. Hypotheses
    D. Instrument
    E. Procedure
    F. Statistical Analyses

IV. Reliability and Validity
    A. Reliability
    B. Validity Tests

V. Item Analysis Results
    A. Total Score
    B. Earth and Sun
    C. Earth and Moon
    D. Mathematics
    E. Solar System
    F. Stars
    G. Galaxies
    H. Light and Color

VI. Whole Test Results
    A. Ranking of Test Items by P-value
    B. Ranking of Test Items by D-value
    C. Mean Item Characteristic Curve
    D. Discrimination/Difficulty Graph
    E. Distribution of Correct Answers
    F. Which Questions Should Be Included on a Shortened Test?
    G. Predictors of Difficulty and Discrimination

VII. Demographic and Schooling Factors Results
    A. Demographic Factors
    B. Schooling Factors
    C. Attitude Factors
    D. Analysis of Variance

VIII. Discussion
    A. Research Questions
    B. Characterizing Misconception Questions
    C. Dissemination
    D. Errors, Omissions, and Problems
    E. Future Extensions

IX. References
X. Bibliography

Appendices
    A. School Data
    B. Pre-test Instrument
    C. P-Value, D-Value Tables
    D. Classical Test Theory Tables
    E. Item Correlation Matrix
    F. Chi-Square Analysis

Vitae


Abstract

This study of 1,414 high school earth science and astronomy students characterizes the prevalence of their astronomical misconceptions. The multiple-choice instrument was prepared by scouring the literature on scientific misconceptions for evidence of preconceptions and from the author’s interviews with students. Views that were incorrect, but espoused by a large fraction of students, were included as distractors. Results have been analyzed using classical test theory. A linear multiple regression model has helped to show the relative contributions of demographic and school factors to the number of misconceptions held by students.

The instrument was found to be a reliable and valid test of students’ misconceptions. The mean student score was 34 percent. Fifty-one student misconceptions were revealed by this test, nineteen of which were preferred by students to the correct answer. Several misconceptions appeared more frequently among the higher-performing students. Significant differences in student performance were found in several subgroups based upon schooling and demographic factors. Twenty-five percent out of a total of 30 percent of the variance in total test score could be accounted for by gender, race, and math level courses taken. Grade level and previous enrollment in an earth science course were not found to be predictive of total score. Mother’s education proved to be of small import; level of father’s education was not significant.

This test is a useful addition to instruments that measure student misconceptions. It could find application in tests of effective intervention for conceptual learning. Significantly shortened versions of this instrument that account for 75 and 90 percent of the variance in the forty-seven-item instrument are recommended. Such tests of misconceptions may be somewhat disheartening to teachers and their students. A test made up of only misconception questions will probably have average total scores less than 40 percent. If teachers are to test their students using misconception questions, they should adjust grading policies to reflect this lower average score.


I. Introduction

A. The Problem Context: Difficulty in Learning Science

Science education in the United States is a disaster. According to tests that compare our students with those of other countries, the United States is ranked close to, if not at, the bottom of the list (International Association for the Evaluation of Educational Achievement 1988). Our major economic competitors, Germany, Korea, and Japan, place at or near the top. Science has dropped from the lofty position it held in the nation’s schools in the late 1950s and 1960s. Can we reverse this decline and make our society more scientifically literate and economically competitive?

When examined, tests of students’ performance show that American students know their science facts. Where they stumble badly is in conceptual understanding and in the use of concepts to solve problems.1 Several factors could help account for these differences:

• Our textbooks are long and filled with jargon. Comparable Japanese texts are short and filled with concepts (Troost 1985).

• American students spend nearly one-half of their class time reading science texts and listening to their teachers lecture (Weiss 1987a). There is little cooperative or hands-on learning (Hofwolt 1985).

• Our texts rarely apply scientific concepts directly to the world with which students are familiar (Goodlad 1984; National Assessment of Educational Progress 1989). German schools emphasize relevance and technical applications of physics (Klein 1985).

Few would contest that our curriculum materials and teaching methods could use improvement, but in the United States, professional evaluation of curricula and teaching is for the most part ignored. Whereas Germany, Japan, and Korea have powerful Ministries of Education that create curricula and evaluate them with national tests, in the United States there are 16,000 independent school systems and almost 200,000 full- and part-time teachers of science (Fisher and Lipson 1985). Teachers are simply left on their own to design instruments to assess the learning of their own students. Such tests and quizzes as they do design are amateur efforts. They do little to discriminate between effective and poor teaching or curricula against national standards (Rutherford 1985).

1 Examples of questions from the 1986 National Assessment of Educational Progress:

Fact (level 150): To which of the following is the wolf most closely related? Buffalo, Deer, Dog, Rabbit, Sheep, I don’t know

Concept (level 350): An ore sample contains 50 grams of radioisotope with a half-life of 5 seconds. After 10 seconds, how many grams of radioisotope are in the sample? 12.5 grams, 25 grams, 50 grams, 75 grams
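The concept item in this footnote can be checked with the standard half-life relation, in which the remaining amount equals the initial amount multiplied by one half raised to the number of elapsed half-lives. The following is a minimal sketch; the function name is illustrative and does not come from the assessment or from this study.

```python
def remaining_mass(initial_grams, half_life_s, elapsed_s):
    """Mass left after elapsed_s seconds, given a half-life in seconds."""
    # Number of half-lives that have passed (may be fractional in general).
    half_lives = elapsed_s / half_life_s
    return initial_grams * 0.5 ** half_lives

# Values from the sample question: 50 g, 5 s half-life, 10 s elapsed.
# Two half-lives pass, so the mass halves twice: 50 -> 25 -> 12.5 grams.
answer = remaining_mass(50, 5, 10)
```

The distractors of 25 and 50 grams correspond to halving only once or not at all, which is why the item tests the concept rather than recall.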


Teaching for conceptual understanding rather than the memorization of facts and algorithms is far from easy. Many hurdles face teachers who hope to impart to their students even a few powerful scientific concepts. Perhaps the most daunting problem is that students enter the classroom with beliefs about how the natural world behaves that are at odds with accepted views in science. Students apply these “misconceptions”2 to make predictions about events, such as that gravity is the result of air pressure (Minstrell 1982a) or that light from a candle will travel further at night than in the day (Stead and Osborne 1980). Dozens of studies in recent years show that these beliefs, constructed by the students themselves or garnered from misinformed adults, are quite tenacious. Once in place, they rarely change in the course of even the best instruction. Most students leave their science courses with no better conceptual understanding of scientific ideas than when they enter.3

The fact that students maintain conceptions that are at odds with scientific fact has major consequences. For the individual student, these beliefs can become critical barriers that often keep them from succeeding in their science courses, prematurely ending their study in science (Arons 1983). Teachers become frustrated when dealing with the confusion of students who have difficulty learning simple concepts because of their faulty foundations. Yet teachers are woefully unaware of the misconceptions of their students (Sadler 1987). Perhaps the most damaging indictment of science teaching is that students who took high school physics do no better in college physics than those who chose not to take a high school course (Champagne and Klopfer 1982; Halloun and Hestenes 1985). For the U.S. educational system, it is a monumental waste of resources to teach ineffectively, yet there is little proof of the existence of teaching effective enough to alter students’ misconceptions.

2 The term “scientific misconceptions,” as used in this paper, refers to ideas that people possess that are different from accepted scientific views. Alternatives for the term “misconception” have been suggested by some researchers, because use of the term can seem to denigrate student ideas. After all, it is wonderful that students do come up with such amazing and original constructions. “Alternative frameworks” has been suggested as less judgmental (Driver, 1978). “Preconceptions” emphasizes the prior knowledge that students bring to class (Ausubel, 1978). “Naive theories” is a term that recognizes that students' ideas are theories that result from thought, not guessing (Arnaudin, 1985). “Alternative conceptions” was coined to characterize those false ideas that do not change even after instruction (Hewson, 1983).

3 An interesting example is that in spite of a fine education, commencement-day interviews showed 21 out of 23 Harvard graduates, alumni, and faculty thought that the earth's changing distance from the sun is responsible for its seasons or that the moon's phases are caused by the shadow of the earth (Sadler and Schneps, 1988).


Increasing the efficacy of conveying scientific concepts demands that teachers become aware of and seek to change the misconceptions of their students (Champagne et al. 1982). Much of the science that students are required to study in school is “alien” and in conflict with their experiences and thinking (Gardner 1991). Students have difficulty learning science because of the difficulty of reconciling their own beliefs with the conflicting ideas of their teacher. Misconceptions should be viewed instead as stumbling blocks for the student and signposts for the teacher (Narode 1987).

The examination of scientific misconceptions, however, is still in its infancy. Through clinical interviews and formal testing, the misconceptions that students bring to their science classes have been extensively investigated in only a few domains. The most attention has been paid to Newtonian mechanics; astronomy is much less studied. Yet the subject of astronomy is taught yearly in secondary school as a separate course to approximately 50,000 students (Weiss 1987b), as a part of middle school earth science to more than 1,000,000 students (Welch et al. 1984), and at the college level as an introductory course to more than 300,000 (Hoff 1982). For many pre-service teachers, especially those who will teach in the primary grades, taking a single astronomy course may fulfill their only physical science course requirement.

The misconceptions in this domain have not been adequately examined in large-scale surveys. There have been many investigations of single concepts through interviews of small numbers of students. This study characterizes the preconceptions of more than 1,400 high-school students taking introductory astronomy. Demographic information helps to determine the relationship of gender, race, age, and the role of previously taken science and mathematics courses to the quantity of misconceptions that students hold. The results of this study could help in the formulation of tests, new curriculum materials, and teaching techniques for introductory astronomy courses.

It is the intent of this study to explore the extent of common “wrong answers” among introductory astronomy students. Some may view these student responses as not being definitively indicative of particular scientific misconceptions. This is only partly true. Many previous studies have explored the “web of entailed and related” faulty answers in astronomy and linked them together into justifiable misconceptions. This test builds upon such work and attempts to quantify the prevalence of these misconceptions by having students choose among scientific explanations and distractors garnered from interviews. In my analysis, I have explicitly attempted to use statistical techniques to examine the relatedness of various misconceptions. I wished to avoid qualitative analyses in trying to establish these connections and use a purely quantitative approach. This quantitative approach to relatedness of misconceptions has not been commented upon in the literature.

This work is an outgrowth of Project STAR, a curriculum development project supported by the National Science Foundation. The course produced by this project based at the Harvard-Smithsonian Center for Astrophysics has had as a main objective the study of students’ ideas in astronomy and the investigation of methods for aiding students in building powerful and predictive scientific ideas.


B. Statement of the Problem

The purpose of this thesis is to identify and to investigate key scientific misconceptions that students may have when they enter introductory astronomy courses. It also seeks to determine the significance of schooling and demographic factors in the frequency of these misconceptions.

C. Significance of this Study

Astronomy is taught at many levels to over one million students each year. Astronomy and earth science texts are packed with sophisticated concepts that appear trivial to teachers, but interviews with students uncover the fact that they have notions contrary to those espoused by teacher or text. There has yet to be an attempt at formulating a comprehensive instrument that will evaluate basic astronomical misconceptions in a form that is both reliable and easy to administer. The validation of such a test would provide a way for:

• Over 15,000 teachers to test their students to identify prevalent misconceptions and allow them to plan their lessons accordingly.

• Teachers and researchers to measure the effect of instructional techniques or curriculum materials in reducing student misconceptions.

• Astronomy departments to measure the misconceptions of their graduate teaching fellows.4

This analysis of results from the 1,414 students who took a test of this kind in September of 1991 seeks to answer questions that were not examined in prior studies of student misconceptions in science. Studying the role of schooling and demographic factors would help to determine if:

• Previous science courses appear to reduce misconceptions.

• Older students have different misconceptions than younger ones.

• Boys and girls have different misconceptions.

• Members of minority groups have different misconceptions.

The application of advanced statistical tools on this test explores how items designed to reveal misconceptions may differ from conventional multiple-choice items. This may help substantiate the claim that standardized tests exclude items dealing with misconceptions because they do not fit the recommended profile (Narode 1987).
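The item statistics named in the table of contents (P-values and D-values) can be illustrated with a short sketch. It assumes the common classical test theory definitions: difficulty as the proportion of students answering an item correctly, and discrimination as the difference in that proportion between the highest- and lowest-scoring groups. The dissertation's own formulas are not given in this section, so the function names, the group fraction, and the response data below are hypothetical.

```python
def item_difficulty(responses):
    """P-value per item: fraction of students who answered it correctly."""
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_students
            for i in range(n_items)]


def item_discrimination(responses, fraction=0.27):
    """D-value per item: upper-group minus lower-group proportion correct."""
    # Rank students by total score, highest first (sorted() is stable).
    ranked = sorted(responses, key=sum, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    p_upper = item_difficulty(ranked[:k])
    p_lower = item_difficulty(ranked[-k:])
    return [u - l for u, l in zip(p_upper, p_lower)]


# Hypothetical responses: 6 students x 3 items, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
    [0, 0, 1],
]

p_values = item_difficulty(responses)
d_values = item_discrimination(responses, fraction=0.5)
```

In this invented data set the third item has a negative D-value, meaning that lower-scoring students answer it correctly more often than higher-scoring ones. An item of that kind would usually be dropped from a conventional standardized test, which is the sort of departure from the recommended profile that the paragraph above describes.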

Construction of an inventory of astronomical misconceptions will help many teachers to focus on changing student preconceptions. Test items can be used as the basis of classroom discussion, to help guide laboratory activities and writing assignments, and for the assessment of students on quizzes and tests.

4 This has been suggested by Professor Al Cameron of Harvard’s Department of Astronomy for implementation in September 1992.


II. Review of the Literature

A good literature review distinguishes what has been done from what needs to be done. There has been a great deal of effort expended in the last fifteen years to identify student conceptions in science and to develop instructional methods to change them. My first step was to begin a wide search to select and collect documents related to astronomy teaching. I began with my professional collection of articles, books, proceedings, and private correspondence, starting a data base in the bibliographic program Endnote. Articles, books, and reports that were referred to in these documents and appeared to have some use were entered into the program. I then began computer and manual searches of data bases and reference volumes, moving on to indexes of appropriate journals. Finally, I collected all the materials listed in my data base, read through all of them, and sought out any useful documents referenced by these sources.

Project STAR has been acquiring articles, books, and teaching materials concerned with astronomy education since December 1985. This resource library contains well over 1,000 items.5 Over forty articles and texts from this source were used for this paper.

A computer search of the ERIC (Educational Resources Information Center) CD ROM proved valuable, searching on these descriptors: astronomy; misconception; light; curriculum; evaluation; teaching; and secondary school, in various combinations. This produced about 200 abstracts, of which I determined sixty would be useful. Articles dealing with misconceptions are widely dispersed in a variety of journals. I looked through the indexes of the following journals back to 1980 for appropriate articles: Journal of Research in Science Teaching; Science Education; American Journal of Physics; The Science Teacher; Science & Children; American Educational Research Journal; and International (originally European) Journal of Science Education.

Since misconception research is recent, dissertations report on research that has been carried out in the last few years. I searched through listings of dissertations and qualifying papers at Harvard and through Dissertation Abstracts International using these descriptors: astronomy; physics; science; misconceptions; and evaluation. Twenty-three dissertations were found using this method.

Conference announcements and proceedings also turned out to be a valuable source of information, especially the International Seminars on Misconceptions and Educational Strategies in Science and Mathematics (1983, 1987), the American Association of Physics Teachers Conferences (twice yearly), and the International Conference of GIREP (1986), an international physics education society. In addition, I searched through the abstracts for the meetings of the American Astronomical Society, the National Science Teachers Association, and the International Planetarium Society.

5 This collection is located at the Science Education Department at the Harvard-Smithsonian Center for Astrophysics.

My personal correspondence was also useful. Since I have given over forty workshops and papers on science teaching throughout the world, many researchers have sent me private letters and pre-publication reports on their work.

Altogether, approximately 225 documents turned out to be pertinent to this study (see Figure 1). Some were especially fruitful. A doctoral dissertation (Schoon 1988) had surveyed the literature pertaining to misconceptions in earth science. A study had evaluated a major NSF-sponsored curriculum development project in astronomy (Klopfer 1964a). The local peaks in the graph appear to be the result of the publication of two collections of articles on misconceptions (both in 1985), and four proceedings of major conferences: two on misconceptions (1983, 1987) and two on astronomy education (1986, 1988). Since this period, the number of references has been much smaller. There has not been a conference on scientific misconceptions since 1987.

A. Cognitive Roots of Misconception Research

What are the roots of the recent work in scientific misconceptions? How did all this interest in the student’s view of the world begin? A progression of psychologists and educators have noted that students have answers for questions that differ from the answers of knowledgeable adults. These differences have been explained in several different ways and have roots in the field of cognitive psychology.

Figure 1. References and Sources by Year Published (vertical axis: # of Items, 0 to 30; horizontal axis: year published, 1950 to 1990).

1. Disequilibrium of Piaget

The work of Jean Piaget is the earliest attempt to investigate the influence of children’s prior knowledge. Piaget developed the method of the “clinical interview” to assess how children think about certain concepts. This method involves the child explaining what is happening in a particular situation and responding to proposed changes. Perhaps this technique of careful inquiry and observation grew out of Piaget’s formal training in natural history. When he began his career working on the standardization of intelligence tests in France, he was amazed at the types of wrong answers children would give and how these answers changed qualitatively with older children (Novak 1977).

Piaget makes specific mention of “pre-concepts” when discussing youngsters’ attempts to distinguish between different spatial viewpoints (Piaget and Inhelder 1929). In a series of clinical interviews, Piaget showed a drawing to a child and asked her to select a point on a three-dimensional model from which the view would be the same as the drawing. Subjects showed great difficulty in differentiating between different viewpoints and often mistook right for left. Drawings that were reversed right-for-left were often as acceptable as a true view.

Piaget theorized that children eventually learn to perform these tasks through a process of “assimilation” and “accommodation.” New experiences, such as taking photographs and looking at them later at a different location, will lead a child to fit this experience into his or her way of thinking about the world. The assimilation happens as the child’s cognitive structure is changed permanently, leading him to give different answers than he had done previously.6

Piaget’s ideas had relatively little impact on school curricula. His investigations focused on areas that were not a part of the traditional school science curriculum, for example, conservation of volume. Piaget’s research results were reported in journals and books that use the specialized jargon of the psychologist, and are not easily understood by teachers and curriculum developers. Little connection was made in trying to improve teaching or student materials as a result (Anderson and Smith 1986). That is not to say that Piaget has had no influence at all. Most pre-service teachers study some of Piaget’s work in their educational psychology courses. Some curriculum development projects have been based on Piaget’s ideas. The Lawrence Hall of Science (University of California at Berkeley) developed the SCIS (Science Curriculum Improvement Study) Program, which emphasizes Piaget’s stage theory in its student materials and teachers’ guides. This program saw great success in the 1960s and 1970s in the nation’s elementary schools.

6 I have found similar difficulties in high school and college students when they were asked to identify the phase of the moon from different viewpoints (item 7 in the STAR pre-test). The ability to visualize physical systems from different points of view is important in the study of astronomy. Problems such as the changing brightness of eclipsing binary stars or the appearance of the Earth from the Moon all rely on adequate spatial reasoning abilities, which are still undeveloped in many older students.


2. Cognitive Dissonance

When two ideas seem to be in contradiction and yet both are indisputably true, the learner experiences “cognitive dissonance.”7 Much has been written about trying to avoid or reduce cognitive dissonance in instruction so that these “unhappy” experiences do not occur (Festinger 1957). The preference is expressed in the literature that students should integrate new concepts into their way of thinking so that they will not run into the problem of “getting the wrong answer.” Within STAR, we have found that experiences of wrongly predicting events are not detrimental, but rather can be beneficial. They provide the motivation for students to explore the inconsistency and hold the potential for students to change their ideas. Without such obvious examples of the powerlessness of student conceptions, students will learn new material by rote and promptly forget it after the term is over.

3. Preconceptions and Meaningful Learning

David P. Ausubel was the first cognitive psychologist to realize that student conceptions are amazingly tenacious and almost impossible to change through instruction (Ausubel et al. 1978). Ausubel posited that preconceptions are not simply isolated beliefs, but elements of a very stable and comprehensive view of the world, and that formal instruction vies for the student’s beliefs but often loses the battle. In his Educational Psychology (1978, p. 336), he says, “the unlearning of preconceptions might very well prove to be the most determinative single factor in the acquisition and retention of subject-matter knowledge.” Ausubel laments that thorough studies of these preconceptions in science had never been undertaken at the time of his publication. Yet, because of his theories and writings, hundreds have been conducted since then.

Ausubel makes a clear distinction between what he calls “meaningful” learning and rote learning. Meaningful learning denotes the incorporation of new concepts and facts into a student’s scheme of the world that results in the restructuring of the student’s knowledge.8

7 For example, within the STAR curriculum, students are asked to predict what will happen to the appearance of a beautiful spectrum (white light broken up by a diffraction grating) when viewed through a piece of transparent red plastic. Many students believe that a filter changes the color of light that passes through it or that a filter acts on the spectrum to change its actual color. Students with either of these beliefs predict that the appearance of the entire spectrum will change to red or that portions of the spectrum will change to a different color. When students actually try this experiment and look through the filter at the spectrum, many get quite upset that their conception contradicts what they see. Unable to accommodate this new experience in their scheme of the world, initially many blame this situation on the filter, believing that it has some unusual or magical properties.

Anchoring ideas, or “anchors,” are preconceptions that are correct and may provide a path to assimilate new concepts. Although anchors may be structurally connected to ideas that are false, they can still provide some help.9 Teachers who search out the anchoring ideas of their students prior to instruction can more effectively eliminate student misconceptions by concentrating on examples and analogies that act as a bridge between these anchors and the desired concept (Clement 1986). For example, almost all students are aware that the Sun rises and sets, and that the Sun is up in the sky in other parts of the world at times when it is night here. That the Sun rises and sets is an anchor for believing that the Moon may rise and set as well. Over time, students can observe this behavior. Since Moonset happens about an hour later every day, it takes only several days for Moonset to begin happening during daylight hours. In this way the Moon can be regularly placed in the daytime sky and a new way of thinking about its motions must be constructed.

For students to change their misconceptions, four conditions must be fulfilled. First, students must be aware of and unhappy with the power of existing conceptions. Second, any new conception must be understandable. The student need not believe it, but must be able to express it. Third, this new idea must be plausible. It must seem reasonable that this concept could be true and that it does not conflict with the way the student sees the world. Finally, it must be productive. This new idea must predict and explain more than the student’s old idea (Posner et al. 1982; Roth 1985a).

8 For example, students in my Celestial Navigation course (Astronomy 2, Harvard University) must keep journals of their observations of the night sky. Many who go out on a cloudless night expect to see the Moon in the sky, regardless of time or phase. Shocked that the Moon is not visible, they often keep an eye out for it on successive nights and days until they finally see it, then track it for the rest of the month. Incorporation of their observations, through record keeping and class discussion, completely revamps how they think about the Moon's visibility. Out of the restructuring comes the ability to predict Moonrise, Moonset, and the cycle and cause of phases. Simply memorizing by rote that the Moon is not always up at night and the names of its phases would have no effect on the structure of their knowledge. It would do nothing to change their ability to make any substantial predictions of the Moon's motions or appearance.

9 In the example in footnote 8, although many students looked for the Moon at night, they did remember that on some occasions they had seen the Moon in the daytime. A correct theory of the Moon's motions must include the fact that the Moon is found in the daytime sky as much as in the nighttime sky. The anchor of seeing the Moon in the daytime can help in reformulation of the learner's ideas in a positive fashion.

B. History of Scientific Misconceptions

1. Historical Recapitulation

Children’s theories often appear well thought out and quite remarkable. The development of scientific conceptions in the child in many cases appears to be similar to historical theories in a field of science. Three examples are presented to document this claim.

Children often talk of heat as a substance that flows from object to object. Flames can put heat into a kettle of water or heat can escape out of an open window. This is reminiscent of the “caloric” theory of heat. So-named by Lavoisier in 1787, caloric is a fluid that will “flow from the hotter to the colder body until an equilibrium has been achieved” (Holton 1985). The mechanism of heat transfer that children describe, however, is very different from that of caloric. Caloric could flow by itself, although it is governed by certain physical laws. Youngsters rarely attribute an inherent motive force to heat, but talk of its movement through “fumes, rays, or waves” (Erickson and Tiberghien 1985).

Many students think of physical motion as dependent on a force inherent in an object. Any object, such as a ball thrown into the air, is kept in upward motion by a force that slowly diminishes, after which the ball falls back to Earth (Caramazza et al. 1981). The imparting and dissipation of “impetus” was a key element of medieval physics and can be traced as far back in origin as the Greek astronomer Hipparchus (ca. 150 BC) (McClosky 1983). Leonardo da Vinci (1452-1519 AD) believed in this theory. Even the great Galileo (1564-1642 AD), whom many consider the first modern scientist, began his career believing in the impetus theory.

Many children describe vision as an active process, whereby emanations from their eyes travel out to the object and behold it (Guesne 1985). Plato (ca. 427-347 BC) and the Pythagoreans thought of vision in virtually the same way (Lindberg 1976). In their “extramission” theory, vision was caused when:

Visual current issues forth [from the eyes]...and is formed into a single homogeneous body in a direct line with the eyes, in whatever quarter the stream issuing from within strikes upon any object it encounters outside. (Plato 1937, pp. 152-53)

Although to the Pythagoreans there are many additional processes (e.g., the interaction of this stream with daylight), vision is definitely an active process. This theory did not go undisputed. Aristotle (384-322 BC), for one, did not believe in extramission:

In general it is unreasonable to suppose that seeing occurs by something issuing from the eye; that the ray of vision reaches as far as the stars, or goes to a certain point and coalesces with the object (Aristotle 1957, p. 225)

We should not, however, draw too close a parallel between the scientific theories of old and the ideas of students. Although some common features can be found, students’ theories are never as sophisticated, internally consistent, and coherent, or even as powerful as those of the scientists of old (Driver et al. 1985).

Many researchers have suggested that a historical approach to the teaching of science would help reduce misconceptions (Champagne et al. 1980; Nussbaum 1986; Prather 1985). Studying science by simply reviewing the historical development of the field, however, cannot be effective in changing students’ misconceptions. Understanding what scientists believed in the past is a very difficult and time-consuming process. The evolution of scientific ideas is a field all its own:

It is not enough to discover what our predecessors believed and leave it at that: we must try to see the world through their untutored eyes, recognize the problems which faced them, and so find out for ourselves why it was that their ideas were so different than our own. (Toumlin and Goodfield 1967, intro.)

Others suggest that modern and ancient paradigms could be contrasted by reading original sources. This would help to acknowledge preconceptions held by the students and foster a major reconceptualization (Champagne et al. 1980). Although this all seems reasonable, most astronomy texts already begin with a historical treatment of the development of astronomy, and many teachers expand on this treatment in their courses. Yet misconceptions persist in these courses (Sadler and Luzader 1988). There has yet to be a definitive study showing whether emphasis on the history of science changes even a single misconception.

One university biologist carried out an investigation into whether students’ conception of photosynthesis across grade levels modeled the historical development of the concept (Wandersee 1986). By testing 1,405 students in grade 5, grade 8, and grade 11, and college sophomores, he found that younger students are more likely to have conceptions of photosynthesis that were accepted long ago, but have since been discarded. Their conceptions do not necessarily pass through the same stages that the field has passed through historically. Yet, Wandersee notes, teachers should still expose students to misconceptions of the past as a useful activity.

Professionals in every field usually organize their own ideas around certain grand schemes: philosophical (evolution is the engine of nature); historical (the place of the Earth in the universe); or symbolic (Newton’s three laws of motion are the basis of mechanics). We must guard against believing that the way we think as adults (and scientists) can provide a framework for the structure of curricula or teaching methodology. Philosophical, historical, or symbolic approaches to the teaching of science may not be effective in changing student misconceptions. They are expert views and we, as experts, are comfortable with them. Students, however, may benefit from methods that may seem objectionable or vacuous. The real determinant is whether they work with youngsters and not whether they appeal to us as experts who have since forgotten how we learned these concepts ourselves.

2. Concept Lists

Before the modern investigation of scientific misconceptions, several studies were conducted that had as a goal the identification of key concepts taught in secondary school science courses. Some studies undertook the measurement of concept attainment for certain groups. Others hoped that by enumerating these concepts, one could compare this course content with that of college level courses. With this information, college courses could then be modified so that teachers-in-training would have a better foundation in the subject that they were preparing to teach.

In 1983, 100 high schools in Texas participated in the assessment of the earth science knowledge of 492 randomly selected seniors (Rollins et al. 1983). Six concepts were tested by a 72-item multiple-choice instrument, with 12 questions per concept. Two of the concepts were astronomy-related: the reason for seasons and the cause of day and night. The questions were rather simple and mixed factual and conceptual items.10 Each item consisted of a stem that posed the problem, one right answer, and three distractors. Random guessing with only four choices would have resulted in an average score of 25 percent. Students had not completely mastered either of these astronomy concepts by the end of twelfth grade. Students answered 79 percent of the day and night questions correctly and 67 percent of the seasons questions correctly. Students who had taken more science courses performed marginally better on these items (7 to 10 percent) than the others.
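The arithmetic behind the chance baseline can be made explicit. The following is a minimal sketch in Python, not part of the original study: the function names are hypothetical, and the only inputs are the figures reported above (six concepts, 12 items each, four choices per item, and the two reported percentages).

```python
# Structure of the instrument as described in the text (Rollins et al. 1983).
N_CONCEPTS = 6
ITEMS_PER_CONCEPT = 12
CHOICES_PER_ITEM = 4  # one right answer plus three distractors


def chance_percent(choices: int) -> float:
    """Expected percent correct if every answer is a random guess."""
    return 100.0 / choices


def points_above_chance(observed_percent: float, choices: int) -> float:
    """Percentage points by which an observed score exceeds pure guessing."""
    return observed_percent - chance_percent(choices)


total_items = N_CONCEPTS * ITEMS_PER_CONCEPT
baseline = chance_percent(CHOICES_PER_ITEM)

# Percent correct reported in the text for the two astronomy concepts.
reported = {"day and night": 79.0, "seasons": 67.0}

print(f"{total_items} items, chance baseline {baseline:.0f} percent")
for concept, percent in reported.items():
    margin = points_above_chance(percent, CHOICES_PER_ITEM)
    print(f"{concept}: {percent:.0f} percent correct, "
          f"{margin:.0f} points above chance")
```

Read this way, both scores sit well above what guessing alone would produce, yet each still leaves a sizable fraction of items answered incorrectly.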

In 1966, an Ed.D. candidate at Colorado State College identified 119 key astronomical concepts being taught by the ESCP (Earth Science Curriculum Project), at the time a new ninth-grade curriculum, and compared these with concepts in earth science courses being taught in local colleges (Sonnier 1966). The researcher found that the concept content of the high school and college courses was very similar. Only 4 out of the 119 concepts covered in high school courses were not covered in college courses.

Sonnier was especially interested in whether teachers’ comfort in teaching earth science was related to the courses they took in college. Sonnier found that teachers did not feel especially well prepared by their college courses, although teachers with more earth science in college knew more of the ESCP concepts. Of special significance, a negative correlation was found between the amount of college preparation in earth science and the number of concepts teachers felt they learned in college. Teachers with more courses under their belt felt they had learned less than those with fewer courses. Perhaps these more advanced teachers recognized after several semesters that they held misconceptions that took a long time to change and were less likely than others with less experience to assume that they had learned these concepts in college. These more experienced teachers were much more likely to attribute learning these concepts to other professional activities or simply to reading the ESCP textbook, according to Sonnier.

10 An example of the latter is: The tilt of the earth's axis is a cause of:
A. the Coriolis Parameter B. earthquakes C. day and night D. the seasons.

A list of earth science concepts was prepared by fifty-four earth science professors in 1972 to help guide curriculum efforts in grades K-12 (Janke and Pella 1972). The authors solicited and then selected earth science concepts for grades K-12, asking each professor to “list what you believe to be the three to five most important concepts in your area of specialization within earth science” (p. 225). This resulted in a list of fifty-two concepts, of which eleven related to astronomy. Of the six top choices, three were astronomical: the reason for seasons; the cause of night and day; and the Sun as a provider of almost all the energy available on the surface of the Earth. All three are subject to misconceptions within school and adult populations.

In 1963, James L. Keuthe investigated the popular understanding of many facts and concepts in science by interviewing 100 college-bound high school seniors. Drawing from his own store of questions, he discovered several answers that were very popular, but wrong. He deemed these “misconceptions,” stipulating that they were produced when a question “evoked the same error” in most of his subjects. These were not simply guesses, which would produce a great number of different answers; rather, the same wrong answers appeared again and again. Some of Keuthe’s discoveries about these students were:

- 70 percent believed that the Earth’s shadow causes the phases of the Moon.
- 54 percent believed that the Sun is not the nearest star.
- 33 percent were unable to state why the Sun rises in the East.

He concluded that these misconceptions were a cause for concern because students really do believe that their answers are correct. Although the students had perhaps once been taught this material correctly, Keuthe explained why misconceptions appear: “forgetting occurs because the memorization was rote and not in the framework of a logically meaningful system” (Keuthe 1963). This is an idea very similar to the ideas of David Ausubel.

In 1976, a study was conducted with 220 students in grades K-4 to examine their explanations for various natural phenomena (Za’rour 1976). One question concerned the Moon. While 75 percent of the students remarked that it changed shape, 49 percent thought it changed weight as its shape changed. Za’rour found that youngsters are very observant, but waste no time in fitting their observations into their current conceptual framework. In this case, they thought that the Moon was not a sphere that was being illuminated in different ways, but an object that was losing part of itself.

C. A Hierarchy of Misconception Research

Since the first clinical interviews with students, misconception studies have become more sophisticated. Because the interview process is so time-consuming, many researchers have attempted to construct simple written instruments to assess the misconceptions of students. Such tests have been used to inform teachers of the potential difficulties that will arise when they are trying to teach certain concepts. These “pre-tests” have also been used to predict student achievement. Most encouraging is that attempts have been made to apply tests of misconceptions to the improvement of instruction, both by testing instructional methods and by curriculum revision.

1. Clinical Interviews

While working on his doctoral dissertation testing the efficacy of audio-tutorial methods, a Cornell graduate student found that he could not make sense of the explanations given by his second-grade subjects. His interviews were open-ended and based on getting the answers to these questions: What is the shape of the Earth? How do you know that the Earth is round? Which way do we have to look to see the Earth? Why don’t we see the Earth as a ball? What does one have to do to see the Earth as a ball? (Nussbaum and Novak 1976).

Although he started by following Piaget’s structured clinical interview technique, Nussbaum soon found that the props he provided (a globe and a tiny figure of a person) were not having the desired result. Student explanations still did not make much sense to him. He finally settled on a series of questions about drawings he had prepared that helped to place the children into five categories based on their notions of the Earth in space. Even so, he was sure that he had not discovered all their conceptions, and he recommended that they be confirmed through further interviews. The notions that Nussbaum found prevalent in second graders were:

1. The Earth is flat, although it may be circular and hence “round.”
2. The Earth is a ball, but we live inside or above an absolute “ground.”
3. The Earth is a ball surrounded by space, but there is an absolute up and down.
4. The Earth is a ball, but objects fall only to the Earth’s surface.
5. The Earth is a ball and objects fall toward the center of the Earth.

Nussbaum returned again to test more students and developed an “Earth Notion Classification Scheme” that roughly approximated the stages of thought students pass through in developing their notions of the Earth in space (Nussbaum 1979). Because of the ground-breaking results of Nussbaum’s research, his studies were replicated by many researchers around the world: in Nepal (Mali and Howe 1979); in California (Sneider and Pulos 1983); in Israel (Nussbaum 1979); with Mexican-American children (Klein 1982); and in Greece (Vosniadou and Brewer 1987). Nussbaum’s results were substantiated and expanded. His scheme was refined to:

1. The Earth is flat.
2. We live within an Earth that is shaped like a ball and surrounded by space.
3. We live only on top of an Earth that is shaped like a ball and surrounded by space.
4. (#3 above) and objects fall to the surface of the ball.
5. (#3 above) and objects fall toward the center of the Earth.

Sneider and Pulos added another category to this scheme, which they named 3/4, in which students believe that objects fall to the surface of the Earth, but that people do not live all over the Earth’s surface. Vosniadou and Brewer found that many children believe that there are two Earths, one that is in the sky (or in space) and the other on which we live.11 In the studies of Nussbaum and Sneider and Pulos, many 13- and 14-year-old students still believe that the Earth is flat or that we all live inside a hollow Earth. This confounded the middle school science teachers with whom I have talked. Could it be that their students have these beliefs without the teachers being aware of them?

I decided to pursue my own line of inquiry into this matter by asking teachers to predict the Earth notions of their students. For three summers (1988, 1989, 1990), as a part of my presentation of astronomy activities to teachers at the National Science Resources Center Institute and the Independent School Association of Massachusetts, I had 111 teachers of grades K-8 predict the distribution of Earth notions among students in their classes (Lightman and Sadler 1986). The level of Earth notion was based on Nussbaum’s scale as described above. For each grade level, an average score was calculated for both the teachers’ predictions and the actual students’ performance. Sneider and Pulos carried out testing in grades three to eight. Teacher predictions were made for grades kindergarten to eight. The results were quite surprising (see Figure 2). At each grade level, teachers predicted that their students would perform well above the average levels found by Sneider and Pulos.
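The per-grade averaging described above can be illustrated with a short Python sketch. It assumes each distribution is recorded as a count of students at each of the five Earth notion levels. The function name and the two distributions below are invented purely for illustration; they are not data from this study or from Sneider and Pulos.

```python
def mean_notion_level(counts: dict[int, int]) -> float:
    """Average Earth notion level for one grade.

    `counts` maps a notion level (1 to 5 on Nussbaum's scale) to the
    number of students placed at that level.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no students in this distribution")
    return sum(level * n for level, n in counts.items()) / total


# Hypothetical distributions for a single grade of 25 students.
# These numbers are made up for the example only.
teacher_prediction = {1: 0, 2: 2, 3: 6, 4: 8, 5: 9}
student_interviews = {1: 5, 2: 8, 3: 6, 4: 4, 5: 2}

predicted = mean_notion_level(teacher_prediction)
observed = mean_notion_level(student_interviews)

print(f"predicted average level: {predicted:.2f}")
print(f"observed average level:  {observed:.2f}")
print(f"overestimate:            {predicted - observed:.2f}")
```

Repeating this calculation for each grade gives two series of averages, one for predictions and one for interviews, which is the form of comparison plotted in Figure 2.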

If the data collected by Sneider and Pulos are representative of the students in the sample teachers’ classrooms, then teachers are woefully unaware of their students’ conceptions of the world. Student conceptions in grade 3 (and most likely in K, 1, and 2), are of a flat or hollow Earth. Yet, teachers believe that students’ notions are much more sophisticated. Undoubtedly, instruction about space or world geography is very ineffective at these grades because of the disparity between the conceptions students hold and the concepts their teachers think they hold. Many students at the lower primary level think the globe in the classroom is a model of some other planet in outer space, not the one on which we live. How can they make sense of geography instruction whenever a globe is used? Teachers’ awareness of their own students’ misconceptions may be a fruitful subject for further study.

11 I have had discussions with my own son when he was five years old about his notion of the Earth: he does believe that there are two Earths, the one that is in outer space, which he sees in his picture books clustered with the other planets, and the one on which we live.


Interviewing students to find out what their own ideas are is a very difficult and time-consuming affair. It requires attention to finding out what youngsters really think, as opposed to having them reiterate what they have been taught. The technique of “Interviews about Instances” was developed to help get the interviewee to talk about his or her ideas (Bell et al. 1985). This method involves asking a series of open-ended (“What do you think the Earth looks like from space?”) and closed-ended (“Does the Moon have night and day?”) questions that the interviewer can prepare in advance. Each question can be written on a card with a simple picture or diagram that helps to illustrate it. These instances, however, are only the starting point. The interviewer must not stick to a rigorous line of questioning, but must always seek to improvise questions to help him understand the ideas of the child.

Whenever scientific or specialized vocabulary is mentioned by the child, the interviewer must follow up until she is confident that she understands what the child means by any such word. Children often mean quite different things when they use scientific terms than do scientists. To young children, the word “animal” usually means four-legged mammals and specifically excludes humans and insects (Carey 1985). The terms “velocity” and “acceleration,” so clearly distinguishable to the physicist, are used interchangeably by many physics students (McDermott 1984). The phrase “I don’t know” should be a stimulus to the interviewer to push further, albeit perhaps gently, to determine whether the student is avoiding the question because he thinks the interviewer is judgmental of the answer. It is important that the interviewer be seen as interested in all of the student’s ideas and not as someone who is judging the student on his or her worth (Driver and Easley 1978). I have found that students rarely stick to their “I don’t know” answer when given a little encouragement and support. As a result, they feel safer and are more forthcoming with their ideas.

[Figure 2. Earth Notions Development Level. Average Earth notion development level (1 to 5) versus grade level (K to 8). Series: Student Interviews, N=134 (Sneider and Pulos 1983); Teacher Predictions, N=111 (Lightman and Sadler 1986). Source note: student interviews from Sneider and Pulos 1983; teacher predictions from Sadler 1987-90.]


Videotaping the interview can be very helpful. It frees up the interviewer’s attention by removing the need to take notes, so that the subject is never told to wait while the interviewer catches up. The tapes can later be reviewed to create an inventory of student misconceptions (Sadler 1987). I have found that students seem less reticent in front of a camera than when their ideas are being recorded on paper.

Researchers recommend that teachers interview their students about their scientific conceptions (Duckworth 1987; Langford 1989; Novick and Nussbaum 1978; Nussbaum and Novak 1982). Such action is recommended to help clarify teachers’ own ideas and lead to an appreciation of “children’s science” (Osborne and Bell 1983). While this is an admirable suggestion, teachers have great difficulty developing expertise in this procedure.12

The misconception literature says nothing about teacher difficulties in learning to interview their students, and I could find no study that looks at how effective teachers can become successful interviewers. Judging from my own experience, listening to my classmates, and watching videos of teacher-led interviews, I have found that teachers have great difficulty simply because they expect to teach when they interview. Many express a desire that students will learn something as a result of the interview. This leads them to correct any student statements that they deem incorrect, and, of even more consequence, to limit the exploration of statements that appear correct or use scientific jargon. Teachers tend to turn interviews into situations with which they are more familiar and more comfortable: either testing sessions where they reward right answers or tutoring sessions where they instruct the students. Papers that recommend that teachers interview their students should stress two indispensable rules:

• Never try to teach in an interview.
• Continue until the student’s conception disagrees with your own.

2. Development of Written Instruments

The major shortcoming of finding student misconceptions through interviews is that these investigations, because of their one-to-one nature, always rely on few subjects. Interview studies with more than 100 subjects are rare in the literature. Because of their small size, these investigations are not really

12 The technique of using teachers to interview learners is used by Professor Eleanor Duckworth in two of her classes at the Harvard Graduate School of Education (T-440, Teaching and Learning; and T-150, Research Based on Understanding). Her students, most of whom have experience as teachers or soon will be teachers, learn to use clinical interviews to explore students' ideas. Even after an entire semester of practice, I have observed that many of these teachers and teachers-to-be still appear to be struggling to master this technique. To suppose that this skill can be learned by only reading an article is naive.


representative of the school-age population they seek to study. Researchers have investigated various ways to get around this problem in order to be able to test larger and more diverse samples of students.

Prather called for the development of “reliable diagnostic tests designed to identify and classify as many common misconceptions of science as possible” (Prather 1985, p. 27). He assigns this task an immediate priority in that any work on improving the teaching of science by reducing misconceptions must wait for appropriate evaluation instruments.

Open-ended written tests, where students fill in an explanation for the example posed, have been used by researchers: students are asked to give a written explanation for a particular event; they can include diagrams if they wish. Although this technique can be quite fruitful, students often simply recall the “official” explanation for a particular phenomenon and load their explanation with jargon to cover their confusion. However, these written instruments can still lead to identification of many popular misconceptions, from which multiple-choice tests can be constructed (Halloun and Hestenes 1985).

Written tests have been constructed that force a choice between a single correct answer and several misconceptions that have been identified through interviews (Freyberg and Osborne 1985). This technique limits the student’s incorrect responses to previously identified misconceptions. The inclusion of an “I don’t know” or “none of the above” category reduces the efficacy of the test because students are no longer obliged to choose among conceptions that may closely, but not exactly, match their own.

A two-tier pencil-and-paper test has proved to be somewhat more helpful, in that it combines the elements of an open-ended test with a multiple-choice test (Treagust 1986). Each test item consists of two parts, each of which is multiple-choice in nature. The first part asks the student to predict the outcome of a situation. The second part lets him select a reason for this answer and provides a space for filling in his or her own, if the stated answers are inadequate. In interpreting these multiple-choice tests, responses that draw more than 10 percent of the answers are usually examined in depth (Gilbert 1977).

A major shortcoming of using written instruments instead of interviews is that the results have to be interpreted in the light of differences in students’ reading and writing abilities, rather than only in terms of their misconceptions (Cohen 1980). Written tests do not allow the interviewer to pursue the subject’s ideas until they are clearly captured. However, in an attempt to find out if multiple-choice tests could discriminate between students’ Piagetian operational levels, Gilbert interviewed and later tested with a written instrument 20 college students (Gilbert 1977). He found the students’ performance identical in 93 out of 120 items. However, I could find no study in which the effectiveness of multiple-choice tests of misconceptions was compared with that of interviews. This could be a valuable research opportunity.

Multiple-choice tests need not be used exclusively. Interviewing students along with giving them multiple-choice tests can be used for the purpose of validation of the tests (Halloun and Hestenes 1985). Interviewing even a modest fraction of students taking a formal written test can help describe in detail any conceptual changes that have taken place. Interviews, because of their open-ended nature, can help uncover areas of conceptual change that may not be picked up by the prepared pre-tests and post-tests (Finley 1986).

Table I shows the type of misconception test used in each of thirty-eight studies that I have examined in the course of preparing this paper: clinical interview; open-ended statements requiring writing or drawing by the subject; and multiple-choice. The median size of studies using interviews is only thirty-six subjects, so results may not often be significant. The mean and standard deviation of the number of subjects are calculated for these three types of test.

Written tests are quickly and easily administered to large groups of students. Tests requiring written answers have a median size of 113 subjects, roughly three times the size of studies using interviews. Reading level and the ability to express oneself through writing become important in these tests (Stead and Osborne 1980).

Open-ended questions have the advantage of uncovering unexpected misconceptions, whereas multiple-choice tests produce standardized answers that are easily compared. Multiple-choice tests show a median of 189 subjects in the studies I have examined. They require that the misconceptions of the subjects be identified previously and codified into a structure of short, unambiguous answers. Multiple-choice tests are easily scored, even by those with no knowledge of misconceptions, and their results can be used immediately. They are useful not only for studies, but also for teachers to easily ascertain misconceptions in their own classrooms.


Table I. Misconception Study Sample Size by Domain and Type of Test

Study Domain Interview Written Mult. Ch.

Anderson and Karrqvist 1983b light 21 207

Anderson and Smith 1986 light 11 125

Arnaudin and Mintzes 1985 human biology 50

Bouwens 1986 light 639

Brown and Clement 1986 mechanics 50

Caramazza et al. 1981 mechanics 50

Clement 1986 mechanics 132

Cohen 1982 astronomy 50

Dai 1990 astronomy 185

Dufresne et al. 1986 mechanics 42

Finley 1986 magnets 16 32

Gunstone and White 1981, trial gravity 175

Gunstone and White 1981 gravity 468

Halloun and Hestenes 1985 mechanics 1,500

Happs and Coulstock 1987 astronomy 25

Hardiman et al. 1986 mechanics 42

Kenealy 1987 mechanics 513

Keuthe 1963 general science 100

Klein 1982 cosmography 24

Lightman and Miller 1989 cosmology 250 1,111

Mali and Howe 1979 cosmography

Nussbaum 1979 cosmography 240

Nussbaum and Novak 1976 cosmography 60

Ogar 1986 gravity 189

Placek 1987 mechanics 49

Rhoneck and Grob 1987 electricity 10

Roth 1985a biology 18

Sadler 1987 astronomy 25 213

Shipstone et al. 1987 electricity 1,250

Sneider and Pulos 1983 cosmography 159

Stead and Osborne 1980 light 36 144

Thijs 1987 mechanics 162


Touger 1985 astronomy 113 99

Treagust and Smith 1986 astronomy 24 113

Viglietta 1986 astronomy 25

Vosniadou and Brewer 1987 cosmography 60

Wandersee 1986 biology 1,405

Za’rour 1976 general science 220 0 0

Median: Interview = 36, Written = 113, Mult. Ch. = 189

The advantage to interviews is that the reasons behind a student’s response can be pursued by the interviewer and ambiguous responses can be clarified. However, the interviewer must be very skilled and aware of student misconceptions. Tapes or video must be transcribed for a full analysis to be completed (Stead and Osborne 1980), and this may be both time-consuming and expensive.

Figure 3 illustrates the relative frequency of use of these three different test methods. Each of the studies was grouped by its number of subjects into one of four categories: 1 to 9; 10 to 99; 100 to 999; 1,000 to 9,999. Interviews are generally used in small groups, while open-ended questions and multiple-choice tests are used in large groups, with the latter preferred for very large groups.

[Figure 3. Range of Study Size by Instrument Type. Horizontal axis: # of Subjects in Study (10 to 10,000); vertical axis: # of Studies (0 to 15). Series: Multiple choice; Written or drawn; Interview.]

Table II. % of Studies Using Different Instruments by Sample Size

Size of Study      Multiple-choice   Written or drawn   Interview
11 to 100          29%               38%                78%
101 to 1,000       50%               56%                22%
1,001 to 10,000    21%               6%                 0%

3. Comparative Studies of Scientific Misconceptions

Several studies have attempted to compile misconceptions that cover an entire scientific domain. Compared to studies based on interviews, these attempts usually are based on earlier interview research. The reason given for pursuing this line of inquiry is to inform the practitioner of the broad range of misconceptions that must be dealt with while teaching a specific course.

In 1988, a doctoral student at Loyola University produced an eighteen-item multiple-choice test to identify misconceptions in earth science (Schoon 1988). Of these items, thirteen pertained to astronomy.13 Schoon developed many of his test questions from those of earlier researchers (Janke and Pella 1972; Keuthe 1963; Lightman et al. 1987; Sadler 1987). He wished to determine how the number of misconceptions varied across gender, race, grade level, geographic location, exposure to earth science courses, and last science grade received in school. He gave his test to 1,213 students in the greater Chicago area in grades 5, 8, 11, college, and trade school.

Schoon found that females had significantly more misconceptions than males (at p ≤ 0.05), although by a magnitude of only one-third of a test item. Black and Hispanic students had significantly more misconceptions than white students (at p ≤ 0.05), although again the magnitude was small, only about one-half of a test item. One might expect students at different grade levels to exhibit vastly different levels of misconceptions; however, the mean number of misconceptions of fifth-grade students was only one-half an item higher than the mean for college students (at p ≤ 0.05). A significant difference was found between urban and suburban students (at p ≤ 0.05), although it too was only about one-half a test item. Surprisingly, there was no significant difference (at p ≤ 0.05) between students who had taken earth science courses, either in college or in high school, and those who had not. Finally, no significant difference was detected in total number of misconceptions based upon the last science grade the student

13 Here is an example of one of the test questions:
Summer is warmer than winter, because in summer:
a) The sky has fewer clouds.
b) The earth is nearer the sun.
c) The earth is better insulated.
d) The sun is higher in the sky.


received (at p ≤ 0.05). The “best” students in science were no different from their classmates in the number of misconceptions they possessed.

Schoon’s study shows very small but statistically significant differences in mean number of misconceptions between his subgroups. This may be a result of the similarity in knowledge of these groups, or it could be a result of problems with his test. It appears that the test was devised with little attention paid to the establishment of its validity. He did construct a pilot instrument of sixty-three multiple-choice items, but these questions do not appear to be identical to those used in the studies of earlier researchers. Three science teachers reworded questions to improve clarity and accuracy and eliminated others. A fifty-question test resulted from this procedure.

Schoon’s work has made a major contribution to the study of misconceptions; however, several changes would have increased its validity. A discussion of these problems of readability, comprehension, expert evaluation, reliability, and number of distractors follows.

Three fifth-grade teachers reviewed this instrument for readability and content, but no formal readability measurement was applied. I applied two readability tests to Schoon’s test.14 The Flesch grade level (Hopkins 1981) is 5.8 and is based on the average number of words per sentence and the average number of syllables per word. The Gunning Fog Index (Microsoft Corporation 1991) is 7.6 and is based upon the overall sentence length and the number of words per sentence containing more than one syllable.

Since the estimated readability of the test is somewhere between 5.8 and 7.6, the majority of the fifth graders would have had difficulty reading it and some of the eighth-grade students would have had difficulty as well. Had the test been designed to be more readable, the younger students might have performed the same as or possibly better than the older students in Schoon’s study.
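As a rough illustration of how grade-level readability indices of this kind are computed, the following is a minimal sketch in Python. It assumes the commonly published Flesch-Kincaid grade-level and Gunning Fog formulas, which may differ from the exact variants implemented in the grammar checker used above; in particular, the standard Fog definition counts words of three or more syllables as "complex." The function names and the word, sentence, and syllable counts are hypothetical and are not taken from Schoon's test.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts (standard published formula)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunning_fog(words, sentences, complex_words):
    """Gunning Fog Index; complex_words counts words of three or more syllables."""
    return 0.4 * ((words / sentences) + 100.0 * (complex_words / words))


# Hypothetical counts for a short passage (illustrative only).
counts = {"words": 100, "sentences": 10, "syllables": 150, "complex_words": 10}

fk_grade = flesch_kincaid_grade(counts["words"], counts["sentences"], counts["syllables"])
fog_index = gunning_fog(counts["words"], counts["sentences"], counts["complex_words"])

print(f"Flesch-Kincaid grade: {fk_grade:.2f}")
print(f"Gunning Fog index:    {fog_index:.2f}")
```

Both indices rise with longer sentences and longer words, which is why rewording by older reviewers could plausibly push a test above a fifth-grade reading level.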

This preliminary instrument was administered on the very first day of the fall semester to seventy-five high school and college students. Students suggested that some questions be reworded. This pilot test was never tried out with students in the fifth or eighth grades. The suggestions and improvements made by high school teachers and students and college students may have made the test too difficult for younger students. The author does not discuss how he made the final selection of items to include. An example of the type of question that may have been too difficult for fifth graders to understand is Schoon’s question #2:

14 Applied by the Grammar Checker in the word processing program, Microsoft Word 5.0.


Each day during the summer months, the amount of daylight:
a. is more than the day before.
b. is less than the day before.
c. is the same as the day before.
d. has nothing to do with the day before.

Fifth-grade students may not be aware which months are the summer months (“Is June a summer month?”) or that the “amount of daylight” means the number of hours the Sun is above the horizon, and they may not understand the proposed relationship between daylight on subsequent days. If June is a summer month, then the correct answer is not listed, since daylight lengthens until the solstice in late June and shortens afterward.

The logical structure of some of the questions seems very complex and beyond the ability of younger students to decipher. For example, take question #15:

If a crystal can scratch glass, then:
a. it is a diamond.
b. it is not a diamond.
c. it may be a diamond.
d. it probably is not a diamond.

Students may not be able to discriminate clearly between statements that are true, may be true, are probably not true, and are not true. In addition, the “it” in these answers may be confusing to students. “It” may be construed as a crystal, glass, or diamond.

Schoon did not administer the test to earth science experts to determine if they agreed on all of the answers. He never administered the test in an open-ended format to determine if students had any additional misconceptions.

Reliability was never formally established. Students were never asked to answer the test questions orally to check if they would still give the same answers. Students were never given the test twice to ascertain the role of guessing in their answers. There was no comparison of test scores among different but equivalent groups. The Kuder-Richardson test (Aiken 1985) would have proven useful for examining the internal consistency of the instrument (Anderson 1975).
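To make the internal-consistency check concrete, here is a minimal sketch of the Kuder-Richardson formula 20 (KR-20) in Python. It assumes items are scored 1 (correct) or 0 (incorrect) and uses the population variance of total scores; some texts use the sample variance instead, which changes the value slightly. The function name and the tiny response matrix are hypothetical and are not Schoon's data.

```python
def kr20(responses):
    """KR-20 reliability for a list of per-student lists of 0/1 item scores."""
    n_students = len(responses)
    n_items = len(responses[0])

    # Variance of students' total scores (population form, an assumption).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    if var_total == 0:
        raise ValueError("total scores have zero variance; KR-20 is undefined")

    # Sum over items of p * q, where p is the proportion answering correctly.
    pq_sum = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_students
        pq_sum += p * (1.0 - p)

    return (n_items / (n_items - 1)) * (1.0 - pq_sum / var_total)


# Hypothetical scores: four students, three items.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
reliability = kr20(scores)
print(f"KR-20 = {reliability:.2f}")
```

Values near 1 indicate that the items tend to rank students consistently; low values suggest that guessing or unrelated items dominate the total score.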

4. Causal Models of Achievement Using Scientific Misconceptions

Students have various misconceptions that persist throughout their schooling and carry over into adulthood. With the development of instruments to ascertain misconceptions in various fields, it becomes possible to investigate the role of misconceptions in learning. Studies have been constructed to determine: How important is the holding of scientific misconceptions to student achievement? What is the relative importance of preconceptions when compared to the variety of demographic attributes and the previous schooling on student achievement? To what degree is it possible to predict students’ performance in a course on the basis of the misconceptions they hold at the start?


One might suspect that students with high Intelligence Quotients hold fewer misconceptions than less bright students, or that students who receive higher grades in courses hold fewer misconceptions than those who receive lower grades. Using the statistical tools of multiple regression or factorial modeling procedures (Lohnes 1979), one can investigate the causes of student performance.

Two investigations have been carried out to explore the role of IQ and the number of scientific misconceptions that students hold. One study involved the administration of a fourteen-item Newtonian mechanics misconception test to two groups taking a similar physics course in the same school (Placek 1987). The gifted group of twenty-five students had a mean IQ of 146. The less gifted group had a mean IQ of 116. The test consisted of nine prediction questions that required a written response and five tasks requiring predicting the outcome of an experiment. The test items were based on those that had been administered by other researchers (Clement 1982; McDermott 1984; Minstrell 1982; Osborne 1984). Although the mean final course grades given by the physics teacher were very different (93 for the more gifted versus 81 for the less gifted), the misconception test did not show a similar dichotomy. A chi-square test detected no significant difference at the p < 0.01 level between the two groups on any of the fourteen questions. On half of these items the more gifted students did better; on half they did worse than the less gifted group.
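The kind of item-by-item comparison described here can be sketched as a Pearson chi-square test on a 2 × 2 table of correct and incorrect answers for the two groups. The following Python sketch is illustrative only: the counts are hypothetical (the size of the less gifted group is not reported above, so two groups of twenty-five are assumed), the function name is my own, and no continuity correction is applied. The critical value 6.635 is the standard chi-square cutoff for one degree of freedom at p = 0.01.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if group membership and correctness were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (table[i][j] - expected) ** 2 / expected
    return statistic


CRITICAL_P01_DF1 = 6.635  # chi-square critical value, 1 degree of freedom, p = 0.01

# Hypothetical counts for one test item: [correct, incorrect] per group.
item_table = [
    [14, 11],  # more gifted group (assumed n = 25)
    [12, 13],  # less gifted group (assumed n = 25)
]
chi2 = chi_square_2x2(item_table)
significant = chi2 > CRITICAL_P01_DF1
print(f"chi-square = {chi2:.3f}, significant at p < 0.01: {significant}")
```

With groups this small, an exact test or a continuity correction might be preferred; the sketch only shows the structure of the comparison, repeated once per question.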

Another investigation in Germany was conducted with ten students whose IQs ranged from 107 to 141 (Rhoneck and Grob 1987). These ninth graders were taught lessons on basic electricity using apparatus with which they could test their ideas. They took a multiple-choice misconception test based on misconceptions uncovered by prior investigators (Shipstone et al. 1987). Rhoneck found no significant correlation between IQ and initial misconceptions or misconception at the end of the course at the p < 0.01 level. However, he detected a significant positive correlation between pre-test and post-test scores of 0.90. This implies that students tended to hold on to their original conceptions throughout the course, regardless of their IQ.

Newtonian mechanics is one area within physics in which many student misconceptions have been identified through interviews. There have been so many interviews and small studies that a comprehensive inventory can be created (Halloun and Hestenes 1985). It has become possible to use such an inventory of items to find out how the holding of misconceptions may affect student performance. Three studies have sought to measure students’ qualitative understanding of Newtonian mechanics and to assess the role of misconceptions in mastery of mechanics.

In 1978, a study was conducted to determine factors predictive of student achievement in an introductory, non-calculus-based physics course at the University of Pittsburgh (Champagne et al. 1980). Two classes of factors were investigated. The first were measures of the students’ background and included gender, and the number and type of courses previously taken in mathematics and science. The second set of factors were measured by testing the students using instruments constructed especially for the study:


• a motion preconception test, in which students watched an adult carry out demonstrations, and then described their observations and offered predictions of future events (e.g., many thought objects hung on a string exerted more force on the string when closer to the floor).
• a reasoning test, which required the application of logical reasoning to representations of the real world.
• a math test, which covered skills such as use of the quadratic formula, scientific notation, and trigonometric functions.

These three specially prepared instruments were given at the start of the course to a total of 110 participants. Subjects also answered questions about their science and math backgrounds. Students took mid-term and final one-hour exams that tested their ability to apply physical principles to new problems.

The authors used correlations between each of the variables and regression analysis to examine the effect of the students’ previous knowledge on their mechanics achievement score (their examination grade). The only significant correlations (p < 0.01) between the variety of factors and the mechanics achievement score were with the instruments that were especially prepared for this study: the preconception test, the reasoning test, and the math skills test. Number and type of science and math courses taken were not significant.

A multiple regression analysis was used to determine the amount of variance in the achievement test that could be accounted for by each of the components. All of the background measures examined in the study (gender and years of physics, mathematics, or science taken in high school or college) had no significant effect (at p = 0.05). The authors were especially surprised at the apparent lack of effect (R = 0.09, not significant at the p < 0.05 level) of taking high school physics (some had taken two years of it, a standard course and an advanced placement course). Since the primary justification given by teachers at every grade level for science coursework is the preparation it provides for the next level, it appears that this reason is unjustified by the results of this study (Hofwolt 1985). High school physics courses are, on average, ineffective in preparing students properly for introductory college physics courses.

The three specially prepared tests were significant (at the p = 0.01 level) in accounting for some of the variance in the mechanics achievement score. The multiple regression analysis estimated the contribution of each factor to explaining the variation of test scores. All correlations were significant at the p = 0.01 level. In Table III the cumulative effect of these three significant factors is tabulated. Multiple R is the cumulative correlation coefficient. R² is the amount of variance explained by the listed factors. The “Variance explained” is the cumulative effect of all the previous factors as a percentage.


Table III. Contribution of Factors to Variance

                             Multiple R²   Variance explained
Motion preconception test    0.056         5.6%
Logical reasoning            0.143         8.7%
Math skills                  0.325         18.3%
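The relationship between the two columns of Table III can be checked with a short calculation. In the sketch below (Python, with names of my own choosing), the “Variance explained” column appears to correspond to the increase in cumulative R² as each factor is added in turn. This reading is an assumption; it reproduces 5.6% and 8.7% exactly and gives 18.2% for math skills, which may differ from the tabulated 18.3% only through rounding of the published values.

```python
# Cumulative R-squared values as listed in Table III, in order of entry.
cumulative_r2 = [
    ("Motion preconception test", 0.056),
    ("Logical reasoning", 0.143),
    ("Math skills", 0.325),
]


def incremental_r2(steps):
    """Return the increase in cumulative R-squared contributed at each step."""
    increments = []
    previous = 0.0
    for name, r2 in steps:
        increments.append((name, round(r2 - previous, 3)))
        previous = r2
    return increments


increments = incremental_r2(cumulative_r2)
for name, gain in increments:
    print(f"{name:28s} adds {gain * 100:.1f}% of variance")
```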

The authors were confounded by the relatively small amount of variance in achievement score explained by the motion preconception test. They examined the students’ written answers on the motion preconception test, concentrating on answers from only the highest- and lowest-performing students in the class. The contrast between these two groups was obvious in that the highest-scoring students made heavy use of the terms and concepts of mechanics and were consciously aware that they had accepted the Newtonian model of the world. Low-scoring students explained motion on the basis of some other paradigm.

Champagne and Klopfer continued this line of inquiry and in 1982 carried out further analysis using the same data, but different statistical tools (Champagne and Klopfer 1982). Their application of Factorial Modeling gives a more detailed look at the structure and strength of the relationship between tested variables. It produces factors that are minimized for their intercorrelation. The advantage of this procedure is that predictions can be made concerning the effect changes of the predictors will have on the criterion variate. In this study, they modeled the impact on the mechanics achievement score of changing the motion concepts, reasoning, or math skills scores. (Raising the score on any of several independent variables will increase the dependent variable by a calculated amount.) The example the authors give is the effect of separately raising the scores of each of the pre-tests by one standard deviation. The predicted increase in achievement score for each test would be: motion concepts, 0.1 SD; reasoning, 0.2 SD; math skills, 0.3 SD. It is interesting that the two components that have the least to do with physics raise the achievement score the most.
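The prediction described above amounts to multiplying a standardized effect by the size of the change in the predictor. The following Python sketch uses the three effects quoted in the text (0.1, 0.2, and 0.3 SD per one-SD increase); the function name, the dictionary keys, the assumption that the effect scales linearly with the size of the change, and the achievement-score standard deviation of 10 points are all hypothetical and are not taken from Champagne and Klopfer.

```python
# Predicted change in achievement (in SD units) per one-SD rise in each pre-test,
# as quoted in the text above.
EFFECT_PER_SD = {
    "motion_concepts": 0.1,
    "reasoning": 0.2,
    "math_skills": 0.3,
}


def predicted_gain(pre_test, sd_increase=1.0, achievement_sd=1.0):
    """Predicted achievement gain when one pre-test score rises by sd_increase SDs.

    The result is in achievement-score units; with achievement_sd=1.0 it is in SDs.
    Linear scaling with sd_increase is an assumption of this sketch.
    """
    return EFFECT_PER_SD[pre_test] * sd_increase * achievement_sd


# Hypothetical exam with a standard deviation of 10 points.
gains_in_points = {name: predicted_gain(name, 1.0, 10.0) for name in EFFECT_PER_SD}
for name, gain in gains_in_points.items():
    print(f"{name:16s} +1 SD -> about {gain:.1f} exam points")
```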

To improve the reasoning score, the authors suggest working with puzzles, word problems, and logical problems. To improve the math score, practice with solving equations and graphing would be most appropriate. I have often heard high school physics teachers complain that their students “can’t think” and “can’t do math.” They lament that these abilities should have been honed in earlier grades and there is no time in physics class for the teacher to pursue improvement in these areas. Given that high school physics instruction has virtually no predictive value for college achievement in physics, physics teachers may be wise to teach less of what they deem physics and more reasoning and math. In this way they might make a significant impact on the learning of physics by their students. A study that explores this hypothesis in comparison with conventional instruction would be very useful.

In 1985, researchers at Arizona State University carried out an investigation similar in parts to that of the study above (Halloun and Hestenes 1985). Two college physics teachers constructed a thirty-six-item physics misconceptions test and administered it to a much larger population than


Champagne and Klopfer: 1,500 college students taking either a first semester, calculus-based or non-calculus-based physics course. They also ascertained the students’ math skills, and the number of physics and math courses taken previously. The dependent variable was the course grade. This was determined almost entirely from examination results.

These results are consistent with those of Champagne and Klopfer. However, with the help of a large sample size, statistically significant results were found for four factors: the physics pre-test, the math pre-test, and the physics and mathematics courses taken previously. High school physics had some impact on the non-calculus-based course, but none on the calculus-based class. High school math had some impact on the calculus-based course, but almost none on the non-calculus-based class (see Table IV). Although both these factors were significant, their effect was tiny compared to the contribution of the physics and mathematics pretests.

Table IV. Contribution of Courses to Variance

R² for students in these courses:

                               Calculus-based   Non-calculus-based
Physics pre-test               0.30             0.32
Math pre-test                  0.26             0.22
All previous physics courses   0.07             0.12
All previous math courses      0.10             0.04

The R² statistic is the fraction of variance of the course grade for the student population that is accounted for by each factor independently. By using the math and physics pre-test scores together, Halloun and Hestenes found that they could accurately predict 53 percent of the variance in letter grades of the students in the physics courses they examined. The physics pre-test score had the greatest predictive power for how well students would learn the subject. The authors suggest that professors give the pre-tests in their courses and that students with low scores be offered special instruction or a pre-physics preparatory course that is more student-centered than large lecture courses.

These two ground-breaking studies have a number of minor flaws. The factors may not have a causal effect. Simply because there is a correlation between math ability and performance in college physics does not guarantee that the math ability of students is a major cause of physics performance, no matter how plausible the idea. The authors do not fully explore alternative explanations for their data. There may have been hidden variables that accounted for these correlations.

The authors also lost the opportunity to confront high school teachers of physics with their results. How would teachers explain the lack of correlation between taking their course and doing well in college physics? There is an opportunity to replicate these studies in chemistry and biology as well. Determining the role that high school science courses play in how students perform in college science courses may give some hints on what type of science curriculum is appropriate for those students in high school who wish to take science in college.

Neither of these studies uses item analysis to determine which items on each of the math and physics pre-tests are the best predictors of student performance and which have no predictive power. A much shortened test would be almost trivial to administer on the first day of class, if it were made up of only a few questions. It would then be easy to shunt those students who would ordinarily do poorly in introductory physics away from the course or offer them additional help.15

15 I have taken the advice of Halloun and Hestenes in Celestial Navigation. The course enrollment is limited by the fact that the department has only twenty-five sextants. Nevertheless, for the last three semesters fifty to seventy students have shown up each semester on the first day of class. I administered a misconception test based on navigational concepts to students in the spring of 1990. The test also documented each student's math and science background, as well as his or her year. At the end of the term, a Spearman correlation (Aikens 1990) was carried out to find which questions on the misconception test were most highly correlated with students' final grades and which were not, or were negatively correlated. A stepwise regression was then performed to find a small set of questions that would account for the largest amount of variance in the student grades. I was surprised that the math and science background of the students had virtually no impact on their performance in the course and that two questions explained 73 percent of the variance in the final grade.

I now use this pre-test to winnow down the number of students in the course to twenty-five and I have noticed that the students seem to learn the material much more easily (I can cover several more topics) and get much higher grades on exams (there are fewer Ds and Fs). I have had misgivings about this being the primary criterion for selection of students (I am also swayed by students who have tried to enroll previously). This improvement could instead be the result of improvement in my teaching, the laboratory material, the text, or some combination of variables. I am continuing to give pre-tests to my students to see if there are other misconceptions that I can test for that are also predictive.

5. Evaluation of Instruction and Curriculum

We all have misconceptions and it is difficult to comprehend how they persist in spite of instruction. Many researchers have questioned the efficacy of conventional instruction in changing students’ conceptions (Anderson and Smith 1986; Brumby 1984; Carey 1985; Champagne et al. 1980; Eaton 1984; Nussbaum and Novak 1976). They often found that the conceptions of students do not change as a result of instruction. Others have sought alternative methods that do work in changing students’ ideas. These attempts fall into two broad areas: changing teaching methods, or changing instructional materials.

Those who suggest modifications in curriculum materials or teaching methods bear the responsibility for demonstrating their effectiveness (Atkin 1963). To be taken seriously, one must show that any alternative curriculum or teaching technique makes a significant difference, when compared either with a control group or with the same group prior to instruction. These methods can be qualitative, such as using clinical interviews, or quantitative, using multiple-choice tests. The number of educational experiments where new schemes are tested is too large to comment on in this paper. I have restricted myself to a few characteristic examples that bear on astronomy teaching or methods that deal with misconceptions.

The launching of Sputnik by the Soviet Union on October 4, 1957 gave rise to the curriculum reform movement of the late 1950s and 1960s. Many new approaches to teaching science were funded by the National Science Foundation. These programs were unusual in that they were created by large teams that included scientists, science educators, and teachers, not simply by one or two authors (Kyle 1985). Several of the programs produced are still in use today. BSCS (Biological Science Curriculum Study) Biology and PSSC (Physical Sciences Study Committee) Physics are two examples of programs that remain popular. ESSP (Elementary School Science Project) is an example of one project that has passed from the scene since its initial funding in 1960.

ESSP was developed at the University of Illinois under the direction of Stanley Wyatt, an astronomer, and J. Myron Atkin, a science educator. Their goal was to produce:

materials that are sound astronomically, that reflect the structure of the subject as it is viewed by astronomers of stature, and that can be handled by teachers and children in actual classrooms. (Atkin 1963, p. 129).

It took only three years for this team to create a series of six booklets for elementary students, complete with descriptions of activities, text, and clever illustrations. By 1963, 350 teachers were trying the materials and providing feedback to the project when formal evaluation of the program began (Klopfer 1964b). Leopold Klopfer, then of the University of Chicago School of Education, designed and carried out a study to assess the effectiveness of ESSP materials, beginning in February and ending in May 1964. Klopfer selected a subset of ESSP materials, the first book of the six-book series, Charting the Universe, for his evaluation. He hypothesized that use of the ESSP materials would increase students’ knowledge of astronomy and general understanding of science, and positively affect students’ attitudes toward scientists and science.

The study used a pre-test/post-test format and three test components were used to explore the hypotheses. A subject matter pre-test consisted of 28 questions: 15 that dealt with the subject matter specifically covered in the book, and 13 that tested for general knowledge in astronomy. The post-test included these 28 questions plus 14 additional questions. A 36-item test of understanding the processes of science, TOUS (Test On Understanding Science), was also given as a pre-test and post-test, and a “semantic differential” instrument was created and administered to determine affective changes (Osgood et al. 1957).16

The tests were given to ninety-two students during the 1963-64 school year, in four classes comprising the entire fifth grade of the University of Chicago Laboratory School. The same teacher taught all the classes in three fifty-minute periods each week for a total of ten weeks. She followed the teacher guide exactly and also chose to include most of the supplementary activities and exercises in the book. She taught each class in the same way so that the four different classes were considered to be a single population. None of the tests was administered by the teacher.

The results of these tests were depressing. The gains on the subject matter tests were quite small, although they were all significant at the p < 0.001 level. Table V shows mean pre-test and post-test scores. The “Book 1 Content” section tested items that were explicitly addressed in the ESSP materials. The “General Knowledge” section tested for gains in astronomical concepts that were not explicitly addressed in the materials. The authors had hoped that exposure to the ESSP materials would increase knowledge of related topics.

Table V, ESSP Test Scores

                     Pre-test    Post-test
Book 1 Content         37%         49%
General Knowledge      37%         43%

IQ accounted for only 7 percent of the variance in these scores. Klopfer went on to calculate the significance of individual test items using a chi-square test. Of the 28 test items on the pre-test, he reported that only 11 had significant changes: 5 from Book 1 and 6 from the general knowledge test. However, Klopfer used a rather generous significance level in interpreting his data, that of p < 0.05. If we use a more conservative level of p < 0.01, only 7 out of 28 questions show gains that are significant.

Results on the TOUS were even less impressive. Klopfer found a significant difference on this test, but the gain in the mean of the class scores was less than one test item. It was only a 2 percent gain. Only 6 of the 36 questions showed significant gains, but, again, Klopfer was generous in setting the level of significance at p < 0.05, as the differences for only 2 out of 36 questions were significant at the p < 0.01 level. Of the test population, 33 students (36 percent) actually scored lower on the post-test than on the pre-test.
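One way to see why the choice of significance level matters when many items are tested separately is to estimate how many items would be expected to pass the threshold by chance alone if instruction had changed nothing. The sketch below is an illustrative calculation only and is not part of Klopfer's analysis; the function name is hypothetical.

```python
def expected_chance_hits(n_items, alpha):
    """Expected number of items passing significance level alpha
    when no item has truly changed (n_items * alpha)."""
    return n_items * alpha

# Item counts taken from the text: 28 subject matter items, 36 TOUS items
for name, n_items in [("subject matter test", 28), ("TOUS", 36)]:
    for alpha in (0.05, 0.01):
        hits = expected_chance_hits(n_items, alpha)
        print(name, "alpha =", alpha, "expected chance hits =", round(hits, 2))
```

At the looser level, one or two items on each test would be expected to appear significant by chance, which is why the counts at p < 0.01 are the more conservative guide.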

The results from the 44-item semantic differential test are a bit more difficult to interpret. At the p < 0.05 level, 23 items show a significant change. At the p < 0.01 level, 17 show a significant change. Surprisingly, most of the changes are in the negative direction. Students viewed astronomy as less exciting, reading about science as less enjoyable, scientists as less interesting, and their science teacher as less useful after the ten-week experiment.

16 TOUS, of a form in common usage today, asked students to rate certain statements on a scale of 1 to 5; for example:
Doing science is: dull A B C D E exciting

Clearly this curriculum had some very serious problems. Students who took it seemed to be in worse shape as a result. Although they learned a few specific skills, there was almost no gain at all in their understanding of how science works. They developed a worse attitude toward science and its practitioners as a result of the course. There were flaws in the evaluation design as well that could have affected the results. The subject matter test questions were made up especially for the test, but no mention is made of testing its readability or of having any feedback from teachers about the chance that fifth-grade students would be able to understand it. The questions appear to be the sort that would appeal to astronomers, but that even elementary school teachers themselves would have difficulty answering.17

I used two tests of readability to determine if question #13 (see footnote 17) was understandable by Klopfer’s subjects. This question’s Flesch Grade Level is 5.6 and its Gunning Fog Index is 7.0. These are too high for most fifth graders to understand the item. Klopfer never interviewed students to determine if they understood the questions. The test’s reliability (Kuder-Richardson 21) was only 0.353 for the Book 1 items and 0.475 for the general knowledge items. These are quite low measures for reliability. Low reliability on an achievement test is considered to be at the 0.66 level (Aiken 1985). For tests with high reliability, students would be expected to answer the same question in exactly the same way if they took the test more than once. Klopfer made no use of a control group. He could have used half the group for treatment and taught the other half some biology by reading stories; this might have led him to discover some other variable responsible for the poor showing of the course, such as an ineffective teacher.
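For readers unfamiliar with the Kuder-Richardson 21 statistic, the sketch below shows the standard formula, which needs only the number of items k, the mean total score M, and the variance of the total scores. The scores are hypothetical, and the use of the population variance is an assumption; the source does not say which form of the variance was used.

```python
from statistics import mean, pvariance

def kr21(total_scores, n_items):
    """Kuder-Richardson 21 reliability estimate:
    (k / (k - 1)) * (1 - M * (k - M) / (k * variance))."""
    m = mean(total_scores)
    var = pvariance(total_scores)  # assumption: population variance
    return (n_items / (n_items - 1)) * (1 - m * (n_items - m) / (n_items * var))

# Hypothetical total scores on a 15-item test (illustration only)
scores = [4, 5, 5, 6, 7, 7, 8, 9, 10, 12]
print(round(kr21(scores, 15), 3))
```

When scores bunch closely around the mean, as they do when most students are guessing, the variance is small and the estimate falls toward zero or below.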

The population of students was quite extraordinary. This was a laboratory school. The mean IQ of the subjects was 124. Whatever the results might have been from Klopfer’s study, he would have a hard time applying them to the average fifth grader.

17 An example of this is question #13:
To find the scale of a model boat, you would
A. find the difference between the length of the model boat and the length of the real boat.
B. measure both the length of the mast and the length of the sail since at least two measurements are always needed
C. divide the length of the sail on the model boat by the length of the sail on the real boat.
D. multiply the length of the model boat by the length of the real boat.
E. divide the length of the real boat by the length of the model boat. (Klopfer 1964b, p. 21)

By having the same individual teach all of the ESSP classes, the study may be demonstrating the poor teaching of this individual and not the failings of the materials. Using more than one teacher, especially since the treatment classes were randomized, would have helped determine what factor the teachers played in the changes in student knowledge and attitude.

In 1985, two researchers at Michigan State University decided to focus on trying to change the misconceptions of fifth graders concerning light (Anderson and Smith 1986). They developed a simple and efficient method for testing groups of children and embedded this within a sound evaluative design. The experiment consisted of teaching a control group of 102 fifth graders about light using a conventional elementary science textbook (Eaton 1984). The second year, another group of 125 fifth graders were taught the same content, but the text was supplemented by a teachers’ manual and a series of transparencies that helped explore students’ misconceptions about color and light. Teachers taught about light two or three times a week during a four to six week period. Students were given identical pre-tests and post-tests.

The tests were developed by the authors and were based on the content of the textbook and commonly held student beliefs. They contained forty-seven questions in year 1 and were shortened to thirty-seven questions in year 2. The tests were a mixture of open-ended, yes/no, and multiple-choice questions. They covered four topics:

a. how people see;
b. the nature of color vision;
c. the interaction of light with various objects; and
d. the structure and function of human eyes.

Anderson and Smith developed a 5-point scale (-2, -1, 0, 1, 2) that rated the students’ test answers on the degree to which they were naive or scientific beliefs. The tests were corrected by coders who assigned ratings to each student’s answer to each question. To increase reliability, only questions for which two independent coders had agreements of greater than 80 percent were used during the first year. For the second year, the questions that did not meet this threshold were dropped from the test. Comparisons are made of only questions that were posed in both years. The authors conducted eleven clinical interviews, five before instruction and six after instruction, to determine the construct validity of the tests. Only two items were found to have agreement of less than 80 percent between interviews and written tests.
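The coder-agreement screen described above can be sketched as follows. This is an illustration under stated assumptions, not Anderson and Smith's procedure: the function names, the question labels, and the ratings are hypothetical, and agreement is taken to mean that both coders assigned exactly the same rating on the -2 to 2 scale.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of answers to which two coders gave the same rating."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must rate the same answers")
    matches = sum(1 for a, b in zip(coder_a, coder_b) if a == b)
    return 100.0 * matches / len(coder_a)

def reliable_questions(ratings_a, ratings_b, threshold=80.0):
    """Keep only questions whose agreement is greater than the threshold."""
    return [q for q in ratings_a
            if percent_agreement(ratings_a[q], ratings_b[q]) > threshold]

# Hypothetical ratings of five students' answers to two questions
coder_1 = {"Q1": [2, 1, 0, -1, -2], "Q2": [2, 2, 1, 0, -1]}
coder_2 = {"Q1": [2, 1, 0, -1, 2], "Q2": [2, 2, 1, 0, -1]}
print(reliable_questions(coder_1, coder_2))
```

Simple percent agreement does not correct for agreement that would occur by chance, so it tends to overstate reliability when a few ratings dominate.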

The authors found statistically significant differences in the post-test scores of the treatment group over the control group when they used a “consistent commitment” to the scientific view as their gauge of student understanding. For example, by the post-test only 20 percent of students in the control group thought that people’s eyes see light that is reflected from objects, whereas within the treatment group the score was an impressive 78 percent.


The large gains in learning of the treatment group may not have simply been the result of the misconception focus of the lessons. The treatment group was taught by the same teachers as the control group, but a year later. The study also found measurable learning in both control and treatment groups, from pre-test to post-test. Undoubtedly, teachers were more experienced in teaching about light during the second year, so larger gains should have been expected. Also, the teachers may have been less likely to transfer their own misconceptions to the students during the second year, because they were the objects of study and they learned from the teaching materials as well. The authors report that “Several of the teachers told us that the transparencies and the manual had helped them to understand color and color vision” (Anderson and Smith 1986, p. 26). The authors should have taken the precaution of administering their tests to the teachers as well, to see if their changes of conception played a role in measured gain of the students.

The instrument that was constructed for this study relied almost exclusively on requesting explanations for events, and not on predictions, for example:

Is white light a mixture of colors of light? If you answered yes, list some of the colors that make up white light. (Anderson and Smith 1986, p. 20)

The answer used to grade this question is not even scientifically accurate. Light that appears white can be made from two (blue and orange), three (red, green, and blue, the colors of phosphors in a color television set), or up to an infinite number of colors. The so-called seven colors of the spectrum are an arbitrary set named by Isaac Newton (Newton 1721). This question and others on the test may not identify misconceptions at all, but may simply elicit the recall of what the teacher has said or what the student has read. To get a high score for this question in particular, a student needed only to answer “yes” followed by “all colors” or a listing of colors in the spectrum. This answer could easily have been memorized. A better discriminator between scientific and misconceived concepts would have been, “What happens to white light when it passes through a prism?,” followed by “What happens when you pass only red light through a prism?” Presented with a new situation, students would have had to use their conceptions to come up with an answer.

The rating scale used by the study changed significantly from the first year to the second, so that the results are not so easily compared. The test instrument had 21 percent fewer questions in the second year compared to the first. Students may have had more time to complete the test or they may have just been fresher from taking a shorter test. The study would have been better if two teachers were involved. The first could have taught a control group during the first year and used the misconception materials in the second. The second teacher could have used the misconception materials during the first year and taught the control group in the second.

The textbook defines the science class. It is generally followed slavishly by teachers, who attempt to cover its entire contents and rarely depart from the topics and the order presented in the book (Hofwolt 1985). This dominance of the text discourages alternative techniques of instruction. Texts present facts and concepts as truth, with little need for discovery or verification by the student. Students’ own ideas are not involved. Basing instruction on a text is arguably not as productive when compared to alternative instructional techniques. In an attempt to explore why students have great difficulty learning from texts, a doctoral student at Michigan State University created text material that dealt explicitly with student misconceptions and compared it with conventional texts (Roth 1985b). Roth found that conventional texts do little to change middle school students’ conception of photosynthesis (that plants make their own food, rather than absorb it from the soil). She selected two conventional textual treatments of photosynthesis (control 1 and control 2) and constructed one herself of the same length (treatment). She assigned six students to read each text and interviewed them immediately afterwards. Pre-tests and post-tests were also given containing multiple-choice and open-ended items.

The experimental text first elicits students’ conceptions by asking for definitions and explanations. It then presents experimental evidence that challenges their conceptions. Only after carefully examining and ruling out the most popular student misconceptions by giving evidence for their falsity is the scientific concept elucidated. Finally, students must apply this new conception to a variety of situations and problems.

Through interviews, Roth studied how students answered questions after reading the text. She found that students who read the conventional texts would answer text-based questions by recalling “big” words and facts from the text or by calling on their prior knowledge and ignoring the text. When answering questions about the real world, they would always use their prior knowledge, although they would recall examples from the text that they maintained would support their ideas. Students who read the experimental text were more often reported to use the ideas in the text to change their conception.

An analysis of post-test scores showed that all students who had used the experimental text were successful in giving up at least some of their misconceptions for scientific conceptions (these four conceptions are labeled q1, q2, q3, q4). Students with the correct conception are labeled as “1.” Those who had nonscientific conceptions are marked “0.” The students who changed their conceptions were aware that the ideas presented in the text were in conflict with their own views. They used the information in the text to help discard their own ideas in favor of more accurate ones. Roth did notice that some of the students who read the conventional text also changed their misconceptions, but they did not change them with the same frequency as those in the treatment group (see Table VI). Also, only those students who had reading levels roughly six years above grade level could effectively use the conventional texts to change their ideas. Those who were closer to average or below average in reading level (RL) made little or no progress (Roth 1985a).


Table VI, Misconception Test Results from Text Study

Group       Name       RL     q1   q2   q3   q4
Treatment   Daryl       3.4   1    1    1    0
Treatment   Evalina     5.6   1    1    1    1
Treatment   Allison     7.6   0    1    0    1
Treatment   Doug        8.1   1    1    1    1
Treatment   Vera        8.6   1    1    1    1
Treatment   James      11.3   1    1    1    1
Treatment   Sheila     13.0   1    1    1    1
Control 1   Jill        4.0
Control 1   Maria       4.0   0    0    0
Control 1   Myra        6.0   0    0    0
Control 1   Phil        6.0   0    1    0
Control 1   Deborah    10.0   0    1    0
Control 1   Parker     13.0   0    0    0
Control 2   Linda       4.5   0    0    0    0
Control 2   Tracey      5.6   0    0    0    0
Control 2   Danny       7.1   0    1    0    0
Control 2   Sally       8.4   0    1    0    0
Control 2   Kevin      12.6   0    1    0    1
Control 2   Susan      12.6   1    1    1    1

The design of Roth’s study is adequate for an exploratory purpose, but the quantitative analysis is too weak to make its case. The author takes care to build a case for the types of learning strategies that each student uses to answer questions, but each is a qualitative assessment. She performs no statistical analysis on her results to find the probability that her impressive post-test results could have occurred by chance selection of those students who were ready to change their ideas. After all, there were only seven students in her treatment group. The study does not fully explore the relationship between reading level and conceptual change. Nor was statistical analysis performed to determine what fraction of the variance in outcome might be explained by use of the text, student reading level, or any other variable.

In my own analysis of Roth’s data, I found her conclusions not to be well supported (see Table VII). Separate t-tests (Aiken 1985) were computed for each control group and the treatment group. Using a t-test to determine the probability that her results could have occurred by chance, I found that two of the concepts fail the test at p < 0.01 for control group 2, and one question for control group 1. The control and treatment groups were just too small for the results of the experiment to be significant.

A simple regression test for student reading grade level against the test results shows a disturbingly high regression coefficient. One could interpret the results of the fourth question as resulting from difference of reading level and not difference in treatment.

Table VII, Regression Analysis

                                       Group 1       Group 2       Group 1 and 2
Treatment vs. Control group            t-test, p=    t-test, p=    regression R2 =
q1. Plants make food                   0.0005        0.009         0.037
q2. Need light to make food            0.01          0.1132        0.096
q3. Get food only by making it         0.0005        0.009         0.037
q4. Plants first get food from seeds   ---           0.0586        0.425
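The kind of comparison summarized in Table VII can be sketched with a two-sample t statistic. The example below uses the q1 column of Table VI for the treatment group and control group 2. The pooled-variance form of the test is an assumption, since the text does not say which variant was used, and the function name is hypothetical. Converting the statistic to a probability requires a t-distribution table or a statistics package, which is left out here.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with pooled variance (assumed variant).
    Returns the statistic and its degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(sample_a) +
                  (nb - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df

# q1 results from Table VI (1 = scientific conception, 0 = misconception)
treatment_q1 = [1, 1, 0, 1, 1, 1, 1]
control2_q1 = [0, 0, 0, 0, 0, 1]
t, df = pooled_t(treatment_q1, control2_q1)
print(round(t, 2), df)
```

With groups this small, a single student changing an answer shifts the statistic noticeably, which illustrates how fragile the comparison is.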

Roth’s study makes wonderful reading and many of her ideas have been incorporated in the writing of Project STAR’s text. Each of our chapters now starts with prediction activities with which the students identify and clarify their own conceptions through either answering questions or interviewing others. Upon reflection, I believe that the Project STAR text does not handle each misconception explicitly and give students reason to disbelieve it. Revisions this spring should probably include the specific evidence so that each misconception can be discarded by the students who hold it.

D. Methodological Problems of Past Research

Misconception testing is in its infancy. Many of the studies that I have examined were carried out by science professors on their own students, often with little regard for accepted testing procedures. Many had severe methodological flaws. There is little evidence that their conclusions were supported by their tests. Test items are rarely analyzed individually for appropriateness. The use of small samples of students inhibits the finding of statistically significant results. The probability that the same results can be obtained at random drops as the sample size grows. Modern statistical tools for analysis of data are infrequently used. Of the thirty-nine studies that I examined in the preparation of the paper, each had some deficiency in validity, reliability, or sampling.

Validation is the process of accumulating evidence to support inferences made from test results (Cronbach 1990); it includes not ignoring evidence that does not support inferences. It is the most important single measure of test quality. However, misconception tests are often devised with little attempt to assess their validity. In the process of reviewing published misconception surveys, I found that many were deficient in the establishment of their validity. For example:

• Testmakers did not clearly define their objectives. The purpose and domain of the test were not clearly stated and the need for a new test was not justified.


• Test items did not measure only student misconceptions. Individual items should independently measure a single psychological construct, namely, a single misconception. Subjects who score well on the test should, on average, outperform students who do poorly on the test, on each item.

• There were no systematic criteria to minimize test error by examining each test item. This often helps to reduce random errors. Reliability was not measured by retesting or some other suitable method to determine if students gave the same answers repeatedly or were guessing. Of course, the process of interviewing students to find their misconceptions has an inherently low reliability. The test-retest method is not very effective for interviews since learning may occur during the initial interview, leading to different results in the second (Hoz 1983).

• The format of the test was often inappropriate. Establishment of test validity should include a discussion of why a timed test, multiple-choice, or difficulty of items was chosen and why it is appropriate.

• Items did not appear to be statistically independent from each other. Items should not provide clues to other questions and should not assess precisely the same notions. The correlation between different test items should not be too high.

• Teachers did not feel that the items were good indicators of learning within the curriculum. Tests should be based upon objectives that teachers feel characterize their courses rather than simply being an assemblage of random items drawn from the discipline.

• Items were not well written. They must be at the appropriate reading level, and minimize complex grammar and scientific jargon.

E. Implications of Past Studies

The discovery of deeply held beliefs that are at odds with scientific conceptions has had a profound effect within the field of science education. Students are not learning many of the concepts that were thought in the past to be easy to learn. Students manage to cope with being asked to accept beliefs that they feel are foreign by simply memorizing a few terms and regurgitating them on exams, or they may accept portions of the concept and discard others that do not fit with their models. These misconceptions persist in spite of instruction. Many astronomical conceptions are covered in subjects taught in elementary school, in junior high earth science, in high school physics, and again in college (where they may be taken for granted by the professor and never discussed). No subset of students is spared from holding misconceptions. Gifted students have as many misconceptions as less gifted students. Students who take high school physics exhibit as many misconceptions in that field as those who do not. College professors are no more effective at changing students’ ideas than high school teachers. All the while, students continue to pass their science courses with the same misconceptions they hold upon entering and often they get As.


9Several research groups have attempted to produce inventories ofstudent misconceptions through interviews. These have been found to be usefulin cross-sectional studies that have determined that many misconceptions remainvirtually unchanged over the course of schooling. Pre-test/post-test studies havebeen used to attempt to find teaching strategies and instructional materials thatare effective in modifying students’ beliefs. Although many people have pointedout that the progression of misconceptions that a student may hold over timebears some similarity to the historical development of the discipline, no one hasyet determined if the study of the history of a scientific field has any beneficialeffects in overcoming student misconceptions.

Many researchers have suggested that conventional high school sciencecourses do little to prepare students for courses in college, in spite of what theirteachers believe. Several studies have shown that performance in science coursesis highly correlated with students having few misconceptions and having a highlevel of skill with mathematics. No one has yet proposed doing away with highschool science courses in favor of courses that deal only with studentmisconceptions and math skills. Such an experiment could be performed todetermine if students taught in this fashion achieve more when they takeadditional courses in science.

Many researchers think that teachers will benefit from interviewing their own students in order to become aware of their misconceptions. Although this is a noble idea, no experiment has yet verified that this is true. Knowledge of student misconceptions on the part of the teacher may not be enough to change the students’ ideas. In fact, the additional time needed to carry out interviews may decrease instructional time and could turn out to be counterproductive. The construction of curriculum materials that effect changes in student beliefs when used by classroom teachers would be a much more cost-effective solution. Progress has already been made in the development of textual material and visual aids to teaching. Alternative instructional methods, such as teachers concentrating on anchoring concepts and reasoning skills, may also prove useful.

The study of scientific misconceptions as a field has grown dramatically in the last decade. Many researchers have investigated the naive beliefs of students in a variety of science disciplines. These investigations cover grade levels from kindergarten to college. Several involve questions on astronomical subjects, but none is comprehensive in this field.

Most investigations of students’ misconceptions are presently performed using clinical interviews or open-ended written instruments and are dependent on the ability of the interviewer to identify the conceptions of the student. As investigatory tools, these open-ended techniques work well, but as evaluatory instruments they have many failings. They are not free of the bias of the investigator, who may be looking for certain results and may find them even when they do not exist. Some investigators change the instrument used during the study or do not control for variables that may invalidate the results. Many of these studies examine too few students for their results to be statistically significant.


Many studies have investigated student conceptions about ideas that are included in astronomy courses at the high school and college level. It is possible to construct a comprehensive multiple-choice test using these investigations.18

The few studies that have attempted to evaluate the efficacy of experimental instructional methods or materials in changing student misconceptions are often flawed by problems with their design. Several use control groups and post-tests without pre-tests, so one cannot tell if the groups were really similar at the start. Others use pre-tests and post-tests without control groups, so they cannot separate out the effect of changes that would occur naturally over the course of the experiment. Some use populations too small for adequate statistical significance or populations not sufficiently randomized. Several studies use instruments with reading levels that are too difficult for their subjects.

Not analyzing results in enough detail is another problem; examining the results from individual items or even the change in selection of distractors could prove very useful. Often the validity of a test is only ascertained by those connected with the study and outside validation is not sought. This reduces the potential discriminating power of the instrument. Sometimes no attempt is made even to ascertain the reliability of the test instrument by retesting or interviewing subjects, or by applying conventional statistical tests.

The major question that still lies unanswered by this review of the literature is:

18 My investigation has identified these sources of the most common student misconceptions in astronomy:
Cosmography: Klein 1982; Mali and Howe 1979; Nussbaum 1979; Nussbaum 1985; Nussbaum 1986; Nussbaum and Novak 1976; Sneider and Pulos 1983.
Cosmology: Lightman and Miller 1989; Lightman et al. 1987; Viglietta 1986.
Gravity: Gunstone and White 1981; Mali and Howe 1979; Sneider and Pulos 1983; Stead and Osborne 1981; Vincentini-Missoni 1981.
Light and Color: Anderson and Karrqvist 1983a; Anderson and Karrqvist 1983b; Anderson and Smith 1986; Bouwens 1986; Brown and Clement 1986; Eaton 1984; Eaton et al. 1983; Feher 1986; Guesne 1985; Jung 1987; Slinger 1982; Stead and Osborne 1980; Watts 1985.
Seasons: Furuness and Cohen 1989; Klein 1982; Rollins et al. 1983.
Moon Phases: Camp 1981; Cohen 1982; Cohen and Kagan 1979; Dai 1990; Kelsey 1980; Za'rour 1976.
The Solar System: Broman 1986; Dobson 1983; Edoff 1982; Friedman et al. 1980; Touger 1985; Treagust and Smith 1986.


Can instructional methods or curriculum materials dramatically and efficiently change student misconceptions?

The opportunity exists to develop, field-test, and validate an instrument for assessing student misconceptions in astronomy. This instrument should be based on the large body of interviews in this domain. It can be validated through a review by a panel of astronomers and science teachers. Reliability can be increased by attention to the instrument’s reading level and assured through interviewing subjects who have taken the written test. This instrument should be embodied in a pre-test/post-test, control-group design to test for changes in the conceptions of students who are instructed with experimental materials. Treatment and control groups should be chosen with attention to their equivalence. The groups should be large enough to ensure a reasonable statistical significance for small changes in conception.

Such a test can be used to examine conceptual changes in students wrought by learning with Project STAR (described in the following section) curriculum materials compared with conventional astronomy courses. If properly carried out, this could be the first significant test of science materials that seek to modify students’ beliefs in science. Such an investigation has the potential to change the way curricula are designed and taught, so that misconceptions may be more efficiently replaced by scientific conceptions.


III. Methodology

This study has been carried out as a survey of a large nonrandom population. It is a part of a much larger evaluative study of Project STAR using pre-testing and post-testing of control and treatment groups. For this study, the control and treatment pre-test groups have been combined and looked at together, since the pre-test was given before the treatment of using the Project STAR curriculum had begun.

In carrying out this study, a sixty-item multiple-choice instrument was constructed based on interviews reported on in the research literature and on interviews with high school students conducted by myself and members of the Project STAR staff. This instrument was administered to students at the start of their earth science or astronomy class. These data were collected and analyzed at Project STAR using a variety of computer-based statistics programs.

In this section, I will review the sample selection procedures, describe the sample as it relates to the larger Project STAR dataset and its history, describe the research questions and my hypotheses, describe the origins of the instrument, and review the administration and analysis procedures.

A. Description of the Dataset

In an attempt to remedy some of the current problems in pre-college science instruction, the Harvard-Smithsonian Center for Astrophysics (CfA) has undertaken to develop and test a new type of high school science course. Similar to its support of curriculum development efforts in the 1960s, the National Science Foundation (NSF) has funded a team of physicists, educators, and classroom teachers to develop Project STAR (Science Teaching through its Astronomical Roots), a modern, activity-based physical science course. This six-year project receives support from the NSF, the Smithsonian Institution, and Apple Computer.

The need for Project STAR emerged from growing concern that too few students were learning science at the high school level (Welch et al. 1984). Using astronomy as a focus, the development team hopes not simply to increase the enrollment in high school science courses, but, more importantly, to improve the students’ understanding of science and its role in making sense of the world. Unlike most high school science texts, STAR’s materials de-emphasize vocabulary and facts, and emphasize powerful scientific concepts. The educational approach of Project STAR is based on three principles:

• Students learn best through hands-on activities (Wise 1983, Shymansky 1982).
• Mastery of a few ideas is more effective than cursory exposure to many concepts (Bloom 1971).
• Students’ misconceptions, unless confronted and changed, obstruct learning (McDermott 1984).

Classroom observations of students and discussions with teachers show that students enjoy the “hands-on” nature of this course and actively participate in observation and experiment. Both teachers and students have remarked that the limited number of concepts treated by the preliminary materials is a welcome change from the more encyclopedic coverage of other science courses. Whether students actually undergo changes in their strongly held conceptions, however, remains open to question (Sadler and Luzader 1988).

A major component of the curriculum development process for Project STAR has been formative evaluation. Hundreds of students and teachers have been interviewed and thousands of subjects have taken multiple-item pre-tests and post-tests to measure their conceptions. This dataset provides a rich repository of information on student misconceptions in astronomy and how they change in the course of instruction.

1. History of the Dataset

In 1986, I began the program of formative evaluation for Project STAR. Each year, as new curriculum materials were trial-taught in participating schools, a pre-test was administered in September and a post-test given in either January or June, depending on whether the course was one or two semesters in length. These tests contained items covering misconceptions, astronomical facts, and math skills that students brought to the course. Each year, the gains on various test items were examined and compared with control classrooms. We modified the curriculum materials as a result. These tests were not explicitly designed to be used for summative evaluation, but over the years they evolved into a form that has proven quite useful for diagnosing student misconceptions. The validity and reliability have risen to such a level that they could be useful in classrooms across the country.

The initial work involved creating a two-tiered misconception test that combines the elements of an open-ended test with a multiple-choice test (Treagust 1986). Each test item consisted of two parts. The first part asked students to predict the outcome of a situation. The second part let them provide a reason for their answer.19 In interpreting these multiple-choice tests, responses that drew more than 10 percent of the answers were examined in depth (Gilbert 1977). These items were drawn from interviews reported in the literature that related to astronomy.

19 An example from this test:
On August 7, 1654, J. Kepler went outside, looked up and saw a star explode, a supernova. When do you think the star really exploded?
a. at least a few years before August 7, 1654.
b. on August 7, 1654.
c. at least a few years after August 7, 1654.
d. at some other time.
Give your reason: ____________________________________________________


In the spring of 1987, I used open-ended written tests with twenty-five ninth-grade students at Cambridge Rindge and Latin High School and then conducted interviews seeking to validate the written responses. These interviews were quite successful and documented the responses to the written instrument. Videotapes of these students explaining their astronomical ideas became the central theme in the production of a documentary film, A Private Universe, which compared the conceptions of high school students with those of Harvard graduates and faculty.20

As the curriculum project progressed, each year a new pre- and post-test was developed to match the year’s evolving curriculum objectives. Items that teachers deemed to be unclear were rewritten. Changes in the objectives of the Project STAR course forced the elimination or modification of some of the items.

Items that were answered correctly more than 80 percent of the time were deemed “anchors” and were incorporated into the curriculum materials as ideas that students would probably know and on which they could build. Examples of such ideas and facts are:

• Light takes time to reach us from the stars.
• The light year is a measure of distance.
• The Earth is 93,000,000 miles from the Sun.
• The Moon is closer to the Earth than is the Sun.

These facts were removed from subsequent tests. Distractors that were chosen less than 5 percent of the time were rewritten and replaced so that they would have greater appeal to the students. Often this meant combing the recent literature for more popular alternatives or inserting “scientific sounding” jargon.21 These changes tended to make the test more difficult each year as the distractors became more attractive.
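The two screening rules described above (anchors are answered correctly more than 80 percent of the time; distractors drawing less than 5 percent are candidates for rewriting) can be illustrated with a minimal sketch. The function and constant names are hypothetical and are not part of the Project STAR procedure; the example shares are the 1990 seasons-item percentages reported in footnote 21.

```python
ANCHOR_ABOVE = 0.80           # item answered correctly more than 80 percent of the time
WEAK_DISTRACTOR_BELOW = 0.05  # distractor chosen less than 5 percent of the time


def screen_item(option_shares, correct):
    """Return (is_anchor, weak_distractors) for one test item.

    option_shares maps each answer letter to the fraction of students who chose it.
    """
    is_anchor = option_shares[correct] > ANCHOR_ABOVE
    weak = [
        option
        for option, share in option_shares.items()
        if option != correct and share < WEAK_DISTRACTOR_BELOW
    ]
    return is_anchor, weak


# Shares reported in footnote 21 for the 1990 seasons item (B is keyed as correct).
seasons_1990 = {"A": 0.46, "B": 0.12, "C": 0.37, "D": 0.03, "E": 0.03}
print(screen_item(seasons_1990, correct="B"))  # (False, ['D', 'E'])
```

Under these rules the seasons item is far from being an anchor, and choices D and E would be flagged for replacement by more attractive alternatives.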

20 The film has since won four major awards:
Silver Apple, National Educational Film and Video Festival, Seattle, WA (1987);
Gold Medal, Documentary, Houston International Film Festival (1988);
Gold Plaque Award, Chicago International Film Festival (1989); and
Blue Ribbon, American Film and Video Association (1990).

21 In 1987 the question dealing with the cause of the seasons was:
What causes the seasons?
A. The Earth's distance from the Sun.
B. The Earth's axis flipping back and forth as it travels around the Sun.
C. The Sun's motion around the Earth.
D. The Earth's axis always pointing in the same direction.
E. The shifting seasons on the Earth.


2. Aspects of the Dataset

The Project STAR dataset is too unwieldy to be used in its entirety for this dissertation. I have therefore chosen to use only multiple-choice data for the most recent academic year, 1990-91 (see Appendix A for test). Interviews and open-ended questions will not be included except as they apply to issues of validity. This decision also maximizes the number of subjects who have included demographic information on their tests.

Of the sixty multiple-choice questions asked, thirteen dealt with demographics or descriptions of the students’ background. Many of the questions are based on pictures, diagrams, or graphs, which are intended to reduce the reading level required of subjects.

The target population for this study is students from grades eight through twelve who are beginning an earth science or astronomy course. The 1,414 subjects are the students of 22 teachers from four different groups:

• Students of seven teachers from the greater Boston area who have participated in the development and testing of Project STAR materials. This group was recruited by initially contacting each of the fifty-four high school science departments within the Route 495 area to find who, if anyone, taught astronomy. All twelve of these teachers of astronomy were interviewed individually by me and were offered a consultancy with the project. Ten teachers accepted; seven remain with the project today.

• Students of twelve teachers throughout the United States who volunteered to attend a two-week Project STAR summer institute at the CfA where they helped to develop Project STAR materials and agreed to test them in their classrooms.22 This group was self-selected by accepting all applicants from inner-city schools with full scholarships and by requesting that teachers from suburban and private schools pay a larger fraction of their own expenses.

• Students of three teachers who teach earth science or astronomy in the same schools as the teachers in the above two groups, who are not involved with Project STAR.

• Students of two teachers who were identified as teaching high school astronomy courses and were randomly selected from the StarNews mailing list. These teachers were successively contacted by telephone until we obtained a total of 200 control students. The primary reason for involving these teachers was to include a control group for control/treatment studies of Project STAR curriculum materials.

These teachers have characterized their schools according to a number of factors. Three teach in rural areas, sixteen in suburban districts, and three in cities. All but two teach in public schools. The economic status of their communities varies considerably. Four are characterized by these teachers as low, five as low to average, nine as average, two as average to high, and one as high. School sizes vary from 325 to 2,000 students with a mean of 1,282. The geographic distribution of these sites is shown in Figure 4.

21 (continued) By 1990 the question evolved into a different form:
The main reason for it being hotter in the summer than the winter is:
A. The Earth's distance from the Sun changes. (46%)
B. The Sun is higher in the sky in the summer. (12%, the correct answer)
C. The distance between the northern hemisphere and the Sun changes. (37%)
D. Oceans carry warm water north. (3%)
E. An increase in greenhouse gases. (3%)

22 This group initially consisted of respondents to a call for Project STAR participants announced yearly in StarNews, the project newsletter. The original list of 408 astronomy teachers responded to a national census of all 11,100 high school science department heads in the United States sent out by the project. These census cards were mailed postpaid on May 15, 1986. Since then, as I and other project staff have given over 160 papers and workshops, this list of newsletter recipients has grown to roughly 4,000 teachers.

Figure 4, Locations of School Sites

[Figure 4 map labels: Rockford, IL; Long Prairie, MN; Needham, MA; Plymouth, MA; Wausau, WI; Indianapolis, IN; Euclid, OH; Framingham, MA; Andover, MA; Rochester, MN; Hudson, NH; North Andover, MA; Watertown, MA; East Weymouth, MA; Oak Park, MI; Wauwatosa, WI; Maryville, MO]


The sixty-item tests were given to subjects within the first two weeks of the start of their astronomy or earth science class. They were given by their regular teachers during an ordinary class period of 45 to 55 minutes. Students generally finished the test in 30 to 35 minutes. An attempt was made to produce a positive testing atmosphere by having the teacher explain to each class that this was a test that would help to “design an even better course for students to take in the future” and that it would not count as a part of their grade. An identical test was given to each subject within two weeks of the end of their course as a post-test. Analysis of these data, however, will not be included in this study.

Tests and answer cards were shipped to participating teachers in time for the start of school. Teachers agreed not to discuss the items on the test with the students prior to or after the test administration.

B. Research Questions

1. Question 1 focuses on the validity of the Project STAR misconception test.

1(a) Is the test a valid instrument for measuring the misconceptions of students entering an introductory astronomy course?

1(b) Which test items appear to be most appropriate in assessing student misconceptions in astronomy, and should be included in a revised instrument?

1(c) How reliable is this test?

2. Question 2 focuses on the misconceptions revealed by the test.

2(a) For students enrolling in a course where astronomical concepts are taught, for which concepts will students initially hold conceptions that are at odds with accepted scientific views?

3. Question 3 focuses on the demographic aspects of the subjects and their relation to scientific misconceptions.

3(a) Are differences in the quantity of misconceptions related to gender?

3(b) Are differences in the quantity of misconceptions related to ethnic heritage?

3(c) Are differences in the quantity of misconceptions related to the educational accomplishment of parents or guardians?

4. Question 4 focuses on the school-based aspects of the subjects and their relation to scientific misconceptions.

4(a) Are differences in the quantity of misconceptions related to students’ grade level or age?

4(b) Are differences in the quantity of misconceptions related to students’ prior completion of specific mathematics or science courses?

C. Hypotheses

It is difficult to predict reliably how the analyses described above will turn out, but based on my own experience over the years with student interviews and a cursory analysis of student responses, I definitely had my own predictions.


I believe that the majority of test items will expose misconceptions. Distractors will be chosen by students with much greater frequency than one would expect from random choice. In many cases, distractors will be more popular than the scientifically correct answers.

Teachers will, for the most part, characterize the test items as reasonable. They will predict low initial pre-test scores and high post-test scores for their own students. Some test items will probably be found to be poor indicators of student misconceptions based on correlations with the total test score or through the application of classical test theory.

I believe that demographic and schooling factors will not account for more than one-third of the variance in the total scores. Older students will have no fewer misconceptions than younger ones. This prediction is supported by evidence that students who have taken earth science have no fewer misconceptions than those who have not. Gender and ethnicity will account for no reduction in variance.

D. Instrument

The Project STAR pre-test consists of directions to the student, forty-seven content questions, and thirteen demographic questions. The content questions, for the most part, deal with misconceptions, but there are seven items that were thought to cover astronomical facts, and six that address mathematical skills essential to astronomy. Demographic questions identify gender, age, ethnic heritage, grade level, math background, science background, parents’ educational level, the students’ educational plans, their view of the need for science in their future, and their reason for enrolling in the course.

All the questions are of multiple-choice format. Each content question consists of a stem and five alternatives that are selected by filling in the corresponding cell on a computer-readable “bubble” response card. Only one of the responses is scientifically correct. The other four represent misconceptions that are found in the literature on scientific misconceptions or have been revealed through interviews with students conducted either by me or by other project staff. These distractors have been written to be as plausible as possible. An effort has been made to ensure that:

• Correct answers are free of scientific jargon.
• Correct answers are similar in length to the distractors.
• Clues to correct answers are not found in other test items.
• Distractors do not overlap conceptually.
• Distractors do not have grammatical mismatch cues in tense or number.
• All responses are concise.
• All diagrams are labeled correctly.
• All responses are within a well-defined content area.
• There is sparing use of “none-of-the-above” or “all-of-the-above” answers.


Most of all, a concerted effort has been made to include only material relevant to uncovering scientific misconceptions. The questions have been kept simple, and the reading level has been kept low.

E. Procedure

Many operational details seem obvious to me only after completing this research, so I have included them in this section to help any future researcher wishing to carry out a similar study to save time and energy by replicating these methods.

1. Administration

The misconception test was administered by classroom teachers. Students filled out 100-item Chatsworth data cards with five alternatives, which were then mailed back to Project STAR, where they were read by a Chatsworth computer card reader. Cards that were unreadable because they were filled out in pen or hard pencil were entered into the database by hand. Several dozen cards were spot-checked to establish the accuracy of the computer input. Data are stored in text-file format so the dataset is accessible to both Macintosh and IBM-PC microcomputers.

2. Variables

Many variables were directly measured or calculated from the student tests. They fall into two groups: variables that vary by student and variables that vary by test item.

a. Measured Variables by Subjects

Students answered forty-seven questions that dealt with content knowledge and thirteen that described their demographic background and schooling. A total score was calculated as the total number out of forty-seven questions that students answered correctly.

All demographic and schooling variables let students choose an answer that was a category. For the purpose of analysis all these variables were converted into numbers directly, by assigning a ranking of 1 to 5, or by use of dummy variables.

A simple regression of the data revealed that students’ ages could be computed by adding 6 to their grade levels. This shows that the range of ages available for students to select should have been extended downward by one year. Eighth-grade students are, for the most part, fourteen years old. By coding all students who chose “other” as their grade and “15 yrs or younger” as their age as 14-year-old eighth graders, a much better regression fit was possible. This recoded age variable has been labeled Age*. The fraction of variance in grade level explained by students’ age was increased from 46 percent to 73 percent using this recoded variable.
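The recoding rule for Age* can be sketched as follows. This is an illustration only: the function name is hypothetical, the remaining age choices are assumed to be plain numbers, and treating other students who chose “15 yrs or younger” as 15 is an assumption made here rather than something the text states.

```python
YOUNGEST_CHOICE = "15 yrs or younger"


def recode_grade_and_age(grade_choice, age_choice):
    """Return (grade, age_star) as numbers from the raw answer choices."""
    # Rule from the text: "other" grade plus the youngest age choice
    # is coded as a 14-year-old eighth grader.
    if grade_choice == "other" and age_choice == YOUNGEST_CHOICE:
        return 8, 14
    # Assumption: any remaining "other" grade stays unknown (None).
    grade = None if grade_choice == "other" else int(grade_choice)
    # Assumption: the youngest choice otherwise counts as 15 years old.
    age_star = 15 if age_choice == YOUNGEST_CHOICE else int(age_choice)
    return grade, age_star


print(recode_grade_and_age("other", "15 yrs or younger"))  # (8, 14)
print(recode_grade_and_age("9", "15 yrs or younger"))      # (9, 15)
print(recode_grade_and_age("11", "17"))                    # (11, 17)
```

Each of the three example students satisfies the regression relationship noted above, age equal to grade level plus 6.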

Math level follows the usual sequence of courses offered in high schools (see Table VIII). Most students begin with Algebra I and continue in the sequence, Geometry, Algebra II, and Pre-Calculus or Trigonometry, until they stop taking math. Students who avoid this sequence usually follow a general math course. This progression is graphed in Figure 5.

Table VIII. Distribution of Math Level by Grade

                Grade 8   Grade 9   Grade 10   Grade 11   Grade 12   missing
General Math    88%       43%       14%        5%         8%         40%
Algebra I       8%        41%       56%        29%        10%        40%
Geometry        2%        10%       15%        36%        21%        0%
Algebra II      2%        3%        14%        25%        39%        0%
Trigonometry    1%        4%        1%         6%         23%        20%
missing         4%        13%       1%         1%         0%         0%
count           374       157       136        309        301        5

Figure 5 shows graphically how the profile of math knowledge changes from one grade level to another. In the eighth grade, most students have only taken general mathematics; by the tenth grade the majority have taken at least Algebra I, and by twelfth grade the majority have completed Algebra II.

Examining the three questions that relate to educational attainment of mothers, fathers, or the students themselves, I found that choice C, Graduate from Trade, Vocational, or Business School, and choice D, Some college, were reversed in level.

Item 59 on the importance of science to a student’s future occupation was designed to be rank ordered in the same way that it was written. Table IX summarizes the coding assignments described above.


Table IX. Coding Assignments for Student Background Variables

Numerical Assignment   1            2           3            4            5
Grade                  other        9           10           11           12
Math                   general      Algebra I   Geometry     Algebra II   Pre-calc.
Mother’s ed            <HS          HS          some coll.   trade sch.   college
Father’s ed            <HS          HS          some coll.   trade sch.   college
Ed. Aspir.             <HS          HS          some coll.   trade sch.   college
Imp. of Sci.           not at all   somewhat    important    very imp.    essential

The other demographic and schooling factors were readied for regression analysis by assigning dummy variables. Each choice was coded as a 1 if it was selected and as a 0 if it was not. Sex was coded as 1 for male, 0 for female. Similar codings were made for ethnic heritage, science courses completed, and reason for taking the astronomy or earth science course.
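The two coding schemes described above, the 1-to-5 rankings of Table IX and the 0/1 dummy variables, might be sketched as below. The dictionary keys follow the Table IX labels; the function names and the example category list for “reason for taking the course” are hypothetical, since the actual answer choices appear only in Appendix A.

```python
# Ordinal coding for the educational-attainment items, following Table IX
# (note that "some coll." ranks below "trade sch." in that table).
EDUCATION_RANK = {
    "<HS": 1,
    "HS": 2,
    "some coll.": 3,
    "trade sch.": 4,
    "college": 5,
}


def code_sex(choice):
    """Sex is coded 1 for male and 0 for female."""
    return 1 if choice == "male" else 0


def dummy_code(choice, categories):
    """One 0/1 indicator per category: 1 if it was selected, 0 if not."""
    return {category: int(choice == category) for category in categories}


# Hypothetical categories, used only to show the shape of the output.
reasons = ["required", "interested", "other"]
print(EDUCATION_RANK["trade sch."])       # 4
print(code_sex("female"))                 # 0
print(dummy_code("interested", reasons))  # {'required': 0, 'interested': 1, 'other': 0}
```

Dummy coding keeps unordered categories such as ethnic heritage from being treated as if they lay on a numeric scale when they enter a regression.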

Figure 5. Math Profile by Grade Level


b. Measured Variables by Test Items

Several variables were calculated from student data. These are used for analysis of the total population and subpopulations, and for analysis of the quality of the items themselves:

• P-value is the fraction of subjects who chose a particular answer to an item. It is usually calculated only for the correct answer. For this analysis of misconceptions, it is calculated for all answer choices. P-values cannot be negative and range from 0 to 1.

• D-value is the discrimination index of a particular answer to an item. It is a way of determining differences among subjects on the basis of some underlying construct. In the case of this test, it is the degree to which the correctness of a test item is correlated with the score on the entire test (Hopkins and Stanley 1981). For each of the five item answers, I have calculated the coefficient of correlation of the subjects’ scores on an item with their total scores. The results will test the assumption that subjects who do well on the test overall are more likely to answer any particular question correctly (Osterlind 1989). The D-value of both correct and incorrect answers has been calculated for this study. D-values can be positive or negative and must lie between -1 and 1 (see footnote 23).

• Item number is the assigned number of a question on the test and ranges from 1 to 60.

• Reading level can be calculated in any of several different ways to assess the difficulty or ease that students have in reading and understanding text.
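As an illustration of the P-value and D-value definitions above, the following minimal sketch computes both for any answer choice. It assumes the D-value is the ordinary correlation between a 0/1 indicator for choosing that answer and the total score; the function names and the six-student dataset are invented for the example.

```python
from math import sqrt


def p_value(choices, option):
    """Fraction of subjects who chose `option` on one item."""
    return sum(1 for c in choices if c == option) / len(choices)


def d_value(choices, option, totals):
    """Correlation between choosing `option` (scored 0 or 1) and the total score."""
    x = [1 if c == option else 0 for c in choices]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(totals) / n
    sd_x = sqrt(sum((xi - mean_x) ** 2 for xi in x) / n)
    sd_y = sqrt(sum((yi - mean_y) ** 2 for yi in totals) / n)
    if sd_x == 0 or sd_y == 0:
        return float("nan")  # undefined when everyone (or no one) chose the option
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, totals))
    return cross / (n * sd_x * sd_y)


# Invented data: each student's answer to one item and total test score.
choices = ["A", "B", "B", "C", "B", "A"]
totals = [10, 30, 35, 12, 28, 15]

print(p_value(choices, "B"))              # 0.5
print(d_value(choices, "B", totals) > 0)  # True: chosen by higher scorers
print(d_value(choices, "A", totals) < 0)  # True: chosen by lower scorers
```

In this toy dataset a positive D-value marks the answer favored by high scorers, while a negative D-value for a distractor indicates that it attracts students who score lower overall.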

23 In the equation below, X is the individual student’s score on a specific item (always 0 or 1), Y is the individual student’s total score, n is the total number of subjects, SD is the standard deviation, and ∑ is the sum over all subjects. A bar over a variable signifies it is the mean for all subjects.

$$d_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}} = \frac{\sum \left(X - \bar{X}\right)\left(Y - \bar{Y}\right)}{n\, SD_x\, SD_y}$$

c. Missing Demographic Observations

One of the major objectives of this study is to determine if demographic or schooling factors help explain the variance in student performance on tests of misconception. However, when taking this test, 185, or 13.1 percent, of the subjects left at least some of the demographic items blank. Lack of these statistics reduces the statistical significance of any analysis. Yet, all these students took the test again at the end of their course for post-test analysis. Forty-five students who did not fill out the pre-test questions later filled out these questions on the post-test. Using these latter data for demographic analysis reduces the average percentage of missing responses to 9.9 percent. These data are shown in Figure 6.

3. Computer Software

Data have initially been examined using Excel, a Macintosh-based spreadsheet program. Initial counts, means, standard deviations, and histograms have been generated with this program. Initial coding and analysis have been carried out by the Project Evaluator, Dr. Marcus Leiberman, on an IBM-PC running SPSS. All of the analyses reported in this study were carried out by me using DataDesk²⁴ and Statview 512+²⁵ (both powerful Macintosh statistical analysis programs), Microsoft Excel²⁶ (an advanced spreadsheet and graphics program), and DeltaGraph Professional²⁷ (a sophisticated graph generator).

24 Version 3.0 by Data Description Inc., available from Odesta Corporation, Northbrook, IL.

25 Version 1.0 by Abacus Concepts Inc., available from BrainPower Inc., Calabasas, CA.

Figure 6, Missing Observation versus Item Number

[Plot of the percentage of responses missing (0% to 20%) against item number (0 to 60). Two series are shown: missing from pre-test, and with substitution from post-test. Two groups of items are marked: demographic and schooling items; and astronomical concepts, astronomical facts, and mathematics items.]


4. How Research Questions Are Answered

1(a) Is the test a valid instrument for measuring the misconceptions of students entering an introductory astronomy course? Many researchers appear to be concerned with the validity of their test instrument after using it in a study. The best time to be concerned with a test’s validity is before it is used. Validity has been built into this test from its inception, rather than being addressed solely after administration (Anastasi 1986; Narode 1987). The test items were constructed by project staff and participating teachers to measure conceptual understanding of students in introductory astronomy courses. Items were developed to match previously developed objectives of the Project STAR course. In this study, I trace the origins of each test item back through the existing literature or to its source in student interviews by Project STAR staff. The correctness of item answers was validated by six astronomers who took the test, all agreeing on the scientifically correct response:

Dr. Richard Fienberg, Sky Publishing Corporation
Prof. Owen Gingerich, Harvard-Smithsonian CfA
Dr. Darrel Hoff, Harvard-Smithsonian CfA and retired Professor of Education and Astronomy, University of Northern Iowa
Dr. Donald Lautman, Instructor, Harvard Extension School
Mr. Samuel Palmer, Radio Astronomer, Harvard-Smithsonian CfA
Prof. Charles A. Whitney, Harvard-Smithsonian CfA

I address the question of whether the test is a pure measure of misconceptions or whether it is measuring something else in several ways:

i. The characteristic distribution in the selection of distractors due to guessing is calculated and compared with actual results. For example, if students were simply guessing at the answers to items, one would expect that, on the average, each of the five answers would be selected 20 percent of the time. With a finite number of subjects choosing answers at random, the P-values of answers will vary. I will perform a χ² test on each test item to determine if the distribution of chosen answers is random at the p = 0.05 level (Tuckman 1988).

ii. The validity of this test as a measure of misconceptions is related to its reliability. The extent to which a test measures what it is designed to measure is one definition of validity (Aiken 1985) and it is important that a test be consistent in measuring what it was designed to measure. Parallel forms of the same test are often used to establish reliability. It is expected that the same subjects will perform similarly on each form. A less direct, but powerful, method of establishing reliability, that of internal consistency, was developed to assess the reliability of a test more easily. By splitting a test into two parts (say, the even and odd questions) and treating them as parallel forms, the correlation coefficient calculated from all the subjects’ “split scores” can be used as a measure of reliability. There are several tests of internal consistency that will help establish the degree to which different items are all measuring the same overall characteristic. The most popular calculate a correlation coefficient among all possible combinations of all the split-scores of a test.

26 Version 3.0 available from Microsoft Corporation, Redmond, WA.

27 Version 2.0 by Deltapoint Inc., Monterey, CA.

iii. If the test is valid, students who have the lowest ability should show a preference for certain misconceptions. If the test simply measures knowledge, then low-ability students should not know the correct answer and will guess from among the five possibilities. By examining the preferences of the lowest-scoring quintile of students, I can determine the degree to which students show a preference for particular misconceptions over the correct response.

iv. I will seek to test plausible, rival hypotheses to the assumption that this test measures misconceptions (Cronbach 1990). I will examine how other, distinct traits correlate with the test scores. I will examine the correlations among these and other factors, such as grade level, to determine whether any are good predictors of misconception scores.
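The χ² comparison against random guessing described in step i can be sketched as follows. This is a minimal illustration, not the procedure actually run for the study: the answer counts are hypothetical, the function names are invented for the example, and the critical value 9.488 (χ² with 4 degrees of freedom at p = 0.05) is hard-coded so that only the Python standard library is needed.

```python
# Critical value of chi-square for 4 degrees of freedom (five answers) at p = 0.05.
CHI2_CRITICAL_DF4_P05 = 9.488

def chi_square_uniform(counts):
    """Chi-square statistic comparing observed answer counts with equal expected counts."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)

def differs_from_guessing(counts):
    """True if the distribution of answers is unlikely to arise from random guessing."""
    return chi_square_uniform(counts) > CHI2_CRITICAL_DF4_P05

# Hypothetical counts of answers A-E from 1,414 subjects.
near_uniform = [290, 275, 280, 285, 284]  # looks like guessing
peaked = [113, 877, 85, 141, 198]         # one answer strongly preferred

print(differs_from_guessing(near_uniform))
print(differs_from_guessing(peaked))
```

An item whose answers are indistinguishable from a uniform distribution gives little evidence that students hold any particular conception; a strongly peaked distribution, whether on the correct answer or on a distractor, does.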

Seventeen of the teachers in the study were prepared to teach Project STAR; five were teaching classes of their own design. While the content validity of the test can be ascertained by matching it against Project STAR’s curriculum objectives, this method does not work for the multitude of courses that do not have the same objectives. I wanted the results from this study to be applicable to classes across the country, not only to the twenty-two involved in the study. For this and other reasons, a colleague²⁸ and I sought a way to find out if our test items were viewed as reasonable tests of course objectives in a large variety of introductory astronomy courses. We sought to have astronomy teachers validate our assumption that the misconception questions on our test were good measures of the concepts that they covered in their classes. We hoped that teachers would predict that students would do poorly on these questions at the start of their course and well by the end of their course.

In May of 1991, we contacted 1,200 teachers in the United States, selected at random from the Project STAR mailing list, and asked them to participate in predicting pre-test and post-test scores for their introductory astronomy classes. Of this number, 240 carried out this process for a large fraction of the test items. As part of the validation process, I will analyze the teachers’ predicted pre-test and post-test scores. High predicted post-test scores, coupled with low predicted pre-test scores, are indicative of a teacher’s view that students learned these concepts as a result of instruction. I would argue that this affirms a good match of the test’s content with similar objectives in the teacher’s course.

1(b) Which test items appear to be most appropriate in assessing student misconceptions in astronomy and should be included in revised instruments? The similarity between test items is examined by generating a correlation matrix between each of the test items. Pairs of items with relatively high correlation coefficients (R > 0.50) are discussed and possible explanations are explored for their similarity. Questions that appear to be quite similar and do not help to discriminate among student misconceptions are recommended to be dropped from the test in the future. Low or negative correlations are discussed on an item-by-item basis. A factor analysis is performed to try to identify combinations of tested factors that appear to relate test items. A stepwise regression is performed in an attempt to ascertain which items are the most highly predictive of the test scores. The wisdom of retaining items with low predictive value in future tests is assessed.

28 Alan P. Lightman, Professor of Writing and Senior Lecturer in Physics, MIT.

A critical analysis of items attempts to reduce the sources of errors in measurement and to gauge the quality of items. One method that has proven fruitful for the task is Classical Test Theory (CTT).

Classical Test Theory is based on the idea that a student’s true ability is best measured by her or his true score and the difficulty of a test item (the fraction of subjects who answer an item correctly) (Hambleton et al. 1991). I calculate the P-value for each of the test items. I assume that students who have a greater mastery of the material overall will, on average, score higher on individual items than students whose mastery is lesser. Individual items are then characterized by how well each discriminates between high-scoring and low-scoring students (Miller and Erickson 1990). The most popular method for doing this involves separating subjects into several subpopulations (I have chosen five) based on their total score on a test. A graph is then constructed that plots this overall ability versus the average P-value for that group. Good test items show P-values rising monotonically and steeply with ability. Poor items do not fit this profile. The discriminating ability of an item can be estimated as the slope of the curve, but a D-value can also be calculated to determine discriminating ability (Osterlind 1989). This is a measure of the correlation between responses to an individual test item and the total test score. It can be calculated for each individual response but is usually calculated only for the correct answer. I will generate graphs and D-values for each item. Results are compared to the recommended statistics for this type of test (Tinkelman 1971).
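The grouping procedure described above, in which subjects are split into five groups by total score and the P-value of an item is found within each group, might look like the sketch below. The ten-subject data set and the function names are hypothetical, and ties in total score are broken by the original order of the subjects, which is a simplifying assumption.

```python
def quintile_p_values(item_correct, total_scores, groups=5):
    """P-value of one item within each score-ranked group, lowest-scoring group first."""
    # Rank subject indices by total score; sorted() is stable, so ties keep input order.
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    n = len(order)
    curve = []
    for g in range(groups):
        members = order[g * n // groups:(g + 1) * n // groups]
        curve.append(sum(item_correct[i] for i in members) / len(members))
    return curve

def rises_monotonically(curve):
    """True if the P-value never falls from one group to the next higher group."""
    return all(later >= earlier for earlier, later in zip(curve, curve[1:]))

# Hypothetical data: total scores of ten subjects and whether each answered the item correctly.
totals = [5, 8, 9, 11, 13, 15, 18, 20, 26, 33]
correct = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

curve = quintile_p_values(correct, totals)
print(curve, rises_monotonically(curve))
```

Plotting `curve` against group number gives the kind of graph described in the text; a good item produces a curve that climbs steeply from the lowest to the highest group.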

Classical Test Theory can be useful for evaluating test items. There is some evidence that it can characterize items dealing with misconceptions as lowering the reliability of the test and hence to be discarded (Narode 1987). Items with low D-values and P-values are often removed from standardized tests. These items are judged as poor because they are difficult and present distractors that are attractive to many high-ability students. I will explore how various item characteristics affect D-values and P-values.

1(c) How reliable is this test? The measurement of this test’s reliability will provide an estimate of the consistency of test results. To be considered reliable, multiple applications of the test in similar forms must yield consistently reproducible, noncontradictory results. I will use several methods to determine test reliability. All produce correlation coefficients that express the reliability. I will compare my results with those of similar types of tests.²⁹

a. Question 2 focuses on the misconceptions revealed by the test.

Counts, means, and standard deviations are calculated for each test item and each distractor. Means, medians, and standard deviations are calculated for the instrument’s total scores. The questions are rank ordered by difficulty using percentage of the population that chose the correct answer. The popularity of distractors is discussed for each item. I attempt to explain the source of each of these misconceptions through references to the literature, from my interviews with students, or from the answers to prior open-ended tests in the Project STAR dataset.

b. Questions 3a, 3b, 3c, 3d, 3e and 4a, 4b, 4c focus on the demographic and school-based aspects of the subjects and their relation to scientific misconceptions.

A one-way analysis of variance or regression analysis is performed on each subgroup of students to determine if the null hypothesis can be rejected at the p = 0.05 level. This helps to identify factors that may be significant in contributing to student misconceptions. A stepwise regression is performed to determine which factors account for the most variance in the population. The maximum amount of variance accounted for by all the student characteristics is also calculated.

F. Statistical Analyses

Test scores have been normalized by calculating the total score for each student as well as the fraction of the forty-seven items they answered correctly. For each of the five possible answers to each item, a P-value and the standard deviation of the P-value are calculated. This standard deviation is calculated by using the departure of each subject’s answer from the P-value of the population.³⁰ It is used for calculating other test characteristics. D-values for each answer are calculated as well.

29 Currently the most popular is the Kuder-Richardson Formula 21, which computes the reliability of a test from the mean and standard deviation of the results. As a test of internal consistency, it is a very easy test to carry out.

\[ KR_{21} = \frac{n}{n-1}\left[1 - \frac{\bar{X}\,(n - \bar{X})}{n\,SD^2}\right] \]

n = test length in items
X̄ = mean of the test scores
SD = standard deviation of the test scores

30 In this equation, n is the total number of subjects, X is the answer on a particular question (either 0 or 1), and X with a bar over it is the mean value or P-value for that answer.

\[ SD_{P\text{-value}} = \sqrt{\frac{\sum_{s=1}^{n} (X - \bar{X})^2}{n-1}} \]

Simple linear regressions are carried out on all demographic and schooling variables. Multiple regressions are carried out when these factors can be broken down by use of dummy variables into several factors. ANOVA (analysis of variance) is used when it is useful to preserve the categorical nature of the factors.

For item response curves (IRC), P-values are calculated for each successive quintile of subjects (Osterlind 1989). On the basis of their total score, the 1,414 students are assigned to one of five groups by quintile. The 283 students with the lowest score are assigned to the lowest quintile and higher-scoring students are assigned to higher-scoring groups, as shown in Table X. The range of student scores was from a minimum of three correct to a maximum of forty correct.

Table X. Student Quintile Assignment by Total Score

Quintile   Population Range   Range correct   Mean total score
1          0% to 20%          3 to 10         8.3
2          20% to 40%         10 to 13        11.9
3          40% to 60%         13 to 17        15.0
4          60% to 80%         17 to 21        18.0
5          80% to 100%        21 to 40        25.9

I have paid attention to the statistical significance of all tests carried out in this dissertation. I have settled upon p = 0.05 as the minimum level of significance allowed. The software used to carry out ANOVAs automatically calculates and displays the probability of the results occurring at random (p). The regression analysis software only calculates t-ratios. For tests with over 1,000 degrees of freedom, a t-ratio of 1.96 is significant at p = 0.05. This is very close to the “rule-of-thumb” of a t-ratio of 2 for tests with many degrees of freedom at p = 0.05. I have used a t-ratio of 1.96 in all regression analyses and calculations of standard errors.

For item analyses, it is worthwhile to calculate the standard error of a proportion. This can be accomplished using the equation below, in which n is the number of subjects and the t-ratio characterizes the statistical significance of a regression analysis. The standard error gives a range of means in which the actual mean would lie at some level of probability.

\[ \text{standard error}_{\text{proportion}} = t\text{-ratio}_{p=.05}\,\sqrt{\frac{P\text{-value}\,(1 - P\text{-value})}{n}} \]


For item analyses using the total number of students, n = 1,414. For item response curves, n is one-fifth that size (283). The standard error varies with the P-value of the question, so that a table of expected errors can be constructed.

Table XI, Standard Errors of P-values

P-value   SE for 1,414 cases   SE for 283 cases
.00       .000                 .000
.10       .016                 .035
.20       .021                 .047
.30       .024                 .053
.40       .026                 .057
.50       .026                 .058
.60       .026                 .057
.70       .024                 .053
.80       .021                 .047
.90       .016                 .035
1.00      .000                 .000

Making use of these standard errors is particularly important when interpreting item response curves. Random guessing will result in P-values of 0.20 for each answer. The standard error of a proportion gives a range of values for this P-value = ± 0.047 ≈ 5 percent. A P-value should be considered different from a random selection if it is outside the range of 0.15 ≤ P-value ≤ 0.25. For total scores characterizing all 1,414 students, the range of random selection is smaller: 0.18 ≤ P-value ≤ 0.22. I have not plotted error bars on the included item response curves since the key errors are those that distinguish P-values as random. Additional markings would only unnecessarily clutter the graphs. For standard errors associated with any P-value, the reader should consult Table XI.
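The entries of Table XI and the ranges of random selection quoted above are consistent with a standard error of 1.96 times the square root of P(1 - P)/n. The short Python sketch below works under that assumption; the function names are chosen for illustration only.

```python
import math

T_RATIO = 1.96  # t-ratio at p = 0.05 for many degrees of freedom, as used in the text

def standard_error(p_value, n, t_ratio=T_RATIO):
    """Standard error of a proportion, scaled by the t-ratio."""
    return t_ratio * math.sqrt(p_value * (1.0 - p_value) / n)

def random_band(n, options=5):
    """Range of P-values consistent with random guessing among `options` answers."""
    chance = 1.0 / options
    se = standard_error(chance, n)
    return chance - se, chance + se

se_all = standard_error(0.20, 1414)      # all subjects
se_quintile = standard_error(0.20, 283)  # a single quintile
print(round(se_all, 3), round(se_quintile, 3))
print(tuple(round(edge, 2) for edge in random_band(283)))
```

With n = 283 the band around the chance level of 0.20 is roughly 0.15 to 0.25, and with n = 1,414 it narrows to roughly 0.18 to 0.22, matching the ranges given in the text.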


IV. Reliability and Validity

A. Reliability

The reliability of a test is a measure of the degree to which a test is consistent and stable. It is a measure of how similar results would be if the test were taken at different times by the same subjects. Many factors can contribute to the “unreliability” of a test, including familiarity with a test (from taking it more than once), conditions at the site of the test, or the health or condition of the subject (Tuckman 1988). These factors can be unpredictable and test results that vary a great deal with changes in these factors are considered unreliable.

Standardized tests, such as those used for college admissions, should have reliability coefficients (correlation coefficients between equivalent forms or parts of a test) of 0.90 or greater (Hopkins and Stanley 1981). Misconception tests that have been used with large populations have somewhat lower reliability. The test instrument used in the Arizona State University survey discussed earlier had a Kuder-Richardson 21 reliability of 0.86.

I have calculated the reliability of this instrument by three different methods. The Kuder-Richardson 20 formula gives a value of 0.78, the Kuder-Richardson 21 formula gives a reliability of 0.76, and the Flanagan formula gives a reliability of 0.80. These reliability scores are not at the high level of standardized tests, yet they are within the range of many specialized or experimental tests. There may be ways to modify these test items to increase reliability. Eliminating or changing items that are not highly correlated with the total test score (i.e., that have low or negative discrimination indices) may help. Increasing the number of test items is another possibility.
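As a rough check, the Kuder-Richardson 21 value can be recomputed from the summary statistics reported for the total score (47 items, mean 16.028, standard deviation 6.407) using the formula given in the footnote on that method. The Python function below is a sketch written for this purpose and is not the software used in the study.

```python
def kr21(test_length, mean_score, sd_score):
    """Kuder-Richardson Formula 21 from test length, mean, and standard deviation."""
    n = test_length
    variance = sd_score ** 2
    return (n / (n - 1)) * (1.0 - mean_score * (n - mean_score) / (n * variance))

# Summary statistics reported for the 47-item test.
reliability = kr21(47, 16.028, 6.407)
print(round(reliability, 2))
```

The result rounds to the 0.76 quoted above. Because the formula uses only the mean and standard deviation, it treats all items as equally difficult, which is one reason it can differ from the Kuder-Richardson 20 value.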

B. Validity Tests

The validity of the test instrument was established in three different ways to determine if the test is a good measure of conceptions in astronomy. These were establishing the degree to which test questions were seen by teachers as measures of learning, test taking by experts in the field, and an analysis of missing answers.

Introductory astronomy teachers were asked to predict the percentage of students who would get each of a subset of the items on the test correct as a measure of how well this test dealt with the concepts they teach in their own classes.

Two groups of experts, graduate students in astronomy and astronomy teachers, took versions of the test to see if they chose the correct answers.

Item characteristics were explored to seek plausible explanations of why students chose not to answer certain questions.

1. Teacher Predictions

In a nationwide survey, 240 astronomy teachers predicted the pre-test and post-test scores of their students on sixteen selected questions from the Project STAR pre-test. These questions were chosen by Professor Alan Lightman and me as having the most similarity with the concepts taught in introductory astronomy classes. As a group, the teachers predicted that only 36 percent of the questions would be answered correctly on the pre-test and that 73 percent would be answered correctly on the post-test. The lowest mean predicted post-test item was 65 percent. Results calculated by individual item are shown in Table XII. Items are labeled with the same numbers as on the entire forty-seven-item instrument for comparison. Student scores are shown in the second column. Teacher predictions of pre-test and post-test score are shown in the third and fourth columns, respectively. These results lend support to the claim that teachers thought the test was a reasonable test of many of their course objectives.
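The correlation between actual student pre-test scores and the teachers’ pre-test predictions can be recomputed from the sixteen rows of Table XII. The sketch below transcribes those two columns and applies a plain Pearson correlation; the function name is illustrative, and the calculation is an assumption about how the reported coefficient was obtained.

```python
import math

# Columns transcribed from Table XII (items 1, 2, 3, 9, 12, 13, 17, 18, 20, 23, 31, 35, 42, 43, 44, 46).
student_pre = [0.66, 0.25, 0.30, 0.24, 0.18, 0.29, 0.11, 0.44,
               0.38, 0.52, 0.46, 0.22, 0.15, 0.39, 0.40, 0.29]
teacher_pre = [0.65, 0.34, 0.47, 0.20, 0.27, 0.36, 0.29, 0.49,
               0.48, 0.34, 0.25, 0.40, 0.23, 0.35, 0.38, 0.23]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

r_pre = pearson(student_pre, teacher_pre)
print(round(r_pre, 2))
```

The value obtained is close to the 0.63 listed in the last row of the table, which suggests that teachers ranked the relative difficulty of these items for entering students reasonably well.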

Table XII. Teacher Predictions

Item                               Student    Teacher Pred.   Teacher Pred.
                                   Pre-test   Pre-test        Post-test
1. Day/night                       0.66       0.65            0.89
2. Phase change                    0.25       0.34            0.72
3. HR Graph                        0.30       0.47            0.81
9. Model of Sun & Earth            0.24       0.20            0.61
12. Sun overhead                   0.18       0.27            0.73
13. Earth Diameter                 0.29       0.36            0.74
17. Seasons                        0.11       0.29            0.76
18. shuttle->planets->stars        0.44       0.49            0.86
20. Moon revolution                0.38       0.48            0.85
23. Moon revolution (around Sun)   0.52       0.34            0.73
31. time zones                     0.46       0.25            0.63
35. astrology                      0.22       0.40            0.80
42. filters                        0.15       0.23            0.50
43. light sources                  0.39       0.35            0.71
44. light propagation - night      0.40       0.38            0.73
46. gravity                        0.29       0.23            0.65
Averages                           0.33       0.36            0.73
SD                                 0.14       0.12            0.10
Corr. coeff. with student pre-test            0.63            0.75

2. Expert Validation

Fourteen graduate students in Harvard’s Department of Astronomy volunteered to take a forty-question version of the pre-test in the Fall of 1988. Many also added comments to the scoresheet pointing out inconsistencies or other problems with items. Their average score on this test was 36.7 items out of 40 (92 percent correct). The lowest-scoring student answered 33 questions correctly. One student answered all the questions correctly. Only two questions were answered incorrectly by more than 40 percent of the students. One asked them to estimate what size object would just cover the Moon when held at arm’s length. The other asked them to choose properties that would be the same for two stars of equal apparent brightness. Both questions were eliminated from future studies.

3. Missing Item Answers

Whereas a rather large fraction of students did not answer demographic and schooling questions on the test, fewer did not answer individual content questions. I have attempted to identify factors that would help explain these missing data. If these missing answers are related to problem difficulty or type, this would enter into an analysis of results and affect conclusions. If students skipped answering a question because it was too hard, then the actual difficulty of the question, calculated by dividing the number correct by the total number of subjects, should be revised upward because, if students had guessed at the answer rather than skipping the problem, more would have gotten it right.

I created several factors that help characterize test items:

P-value: the difficulty of the question.
Item #: from the order in which the question appeared.
Picture: whether the problem had an accompanying graphic.
Concept: whether the problem dealt with an astronomical concept.
Fact: whether the problem dealt with an astronomical fact.
Math: whether the problem required the exercise of a math skill.
Readability: Gunning Fog Index.

Data for each test item are presented in Table XIII.


Table XIII. Test Item Characteristics for Missing Answer Regression

Item #   # Missing   P-value   Picture   Concept   Facts   Math   GF Index
1        0           .66       0         1         0       0      2.9
2        6           .26       1         1         0       0      2.7
3        4           .31       1         0         0       1      8.2
4        1           .49       1         1         0       0      8.4
5        1           .10       1         1         0       0      10.1
6        6           .13       0         1         0       0      3.3
7        4           .45       1         1         0       0      12.4
8        11          .34       0         1         0       0      9.4
9        5           .25       0         1         0       0      12.4
10       8           .40       0         0         0       1      12.9
11       5           .13       1         1         0       0      8.2
12       3           .18       1         0         0       0      14.8
13       8           .29       0         0         1       0      8.1
14       12          .24       0         0         1       0      8.1
15       10          .30       0         0         1       0      8.1
16       16          .28       1         0         0       1      6.2
17       11          .12       0         1         0       0      4.2
18       5           .44       0         1         0       0      6.4
19       11          .66       0         0         0       1      7.0
20       19          .37       0         0         1       0      7.2
21       12          .62       0         0         1       0      7.2
22       20          .68       0         0         1       0      7.2
23       11          .51       0         1         0       0      7.2
24       10          .23       0         1         0       0      7.2
25       15          .37       1         0         0       1      5.8
26       16          .46       1         0         0       1      13.5
27       19          .39       0         1         0       0      7.3
28       17          .42       0         1         0       0      6.4
29       19          .33       0         1         0       0      6.4
30       14          .31       0         1         0       0      11.5
31       16          .45       0         1         0       0      3.6
32       13          .44       0         1         0       0      9.4
33       22          .36       0         0         0       1      11.3
34       26          .28       1         1         0       0      6.5
35       20          .23       0         1         0       0      9.1
36       25          .33       0         1         0       0      5.6
37       23          .19       1         1         0       0      6.5
38       30          .24       0         1         0       0      5.6
39       33          .39       0         1         0       0      4.6
40       30          .33       0         1         0       0      8.1
41       33          .07       1         1         0       0      6.0
42       35          .15       1         1         0       0      9.1
43       33          .38       0         1         0       0      6.9
44       38          .40       1         1         0       0      5.8
45       43          .30       1         1         0       0      5.7
46       51          .29       0         1         0       0      6.7
47       44          .48       1         1         0       0      9.1


These characteristics were entered into a regression equation to explain the variance in the number of missing answers. If the frequency of missing answers is the result of some random process, a regression analysis will explain none of the variance in the number of missing answers. This multiple regression explained 86 percent of the variance.

Regression Analysis of Missing Answers by Item Characteristics
Dependent variable is: # Missing Answers in Each Item
R² = 86.4%   R² (adjusted) = 84.0%
s = 4.989 with 47 - 8 = 39 degrees of freedom

Source       Sum of Squares   df   Mean Square   F-ratio
Regression   6187.31          7    884           35.5
Residual     970.907          39   24.8950

Variable   Coefficient   s.e. of Coeff   t-ratio
Constant   -3.96001      7.128           -0.556
Item #     0.838298      0.0564          14.9
P-value    -2.20499      5.529           -0.399
Picture    3.16731       1.688           1.88
Concept    3.33546       5.728           0.582
Facts      6.74269       6.078           1.11
Math       4.05691       5.730           0.708
GF Index   -0.396620     0.3068          -1.29

Reducing the regression equation to only a single factor, that of item number, still explained 84 percent of the variance. I feel that this single factor is indicative that students became bored or discouraged with the test, or ran out of time for completion. It does not appear that other item factors have much of a role in explaining students’ choice not to answer a question. In particular, the difficulty of the item (as represented by its P-value) has a t-ratio < 2. Its contribution is not significant at the p = 0.05 level. Students are not avoiding questions because they are difficult. This could be tested by changing the order of questions in a new test and carrying out this analysis again. R² should be similar if Item # is responsible for missing answers.


Regression Analysis of Missing Answers by Item Characteristics
Dependent variable is: # Missing Answers in Each Item
R² = 83.9%   R² (adjusted) = 83.6%
s = 5.056 with 47 - 2 = 45 degrees of freedom

Source       Sum of Squares   df   Mean Square   F-ratio
Regression   6007.78          1    6008          235
Residual     1150.43          45   25.5652

Variable   Coefficient   s.e. of Coeff   t-ratio
Constant   -2.68455      1.499           -1.79
Item #     0.833488      0.0544          15.3
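The single-factor regression can be reproduced from the "# Missing" column of Table XIII with an ordinary least-squares fit. The sketch below transcribes that column and uses only the Python standard library; the function name is illustrative and the original analysis was carried out with other software.

```python
# "# Missing" column of Table XIII, items 1 through 47.
missing = [0, 6, 4, 1, 1, 6, 4, 11, 5, 8,
           5, 3, 8, 12, 10, 16, 11, 5, 11, 19,
           12, 20, 11, 10, 15, 16, 19, 17, 19, 14,
           16, 13, 22, 26, 20, 25, 23, 30, 33, 30,
           33, 35, 33, 38, 43, 51, 44]
items = list(range(1, len(missing) + 1))

def simple_regression(xs, ys):
    """Least-squares slope, intercept, and R-squared for y regressed on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

slope, intercept, r_squared = simple_regression(items, missing)
print(round(slope, 3), round(intercept, 2), round(r_squared, 3))
```

The fitted slope of about 0.83 means that, on average, each step later in the test adds a little under one more missing answer, in agreement with the coefficient and R² reported in the table above.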


V. Item Analysis Results

In this section I examine each test item individually using a variety of techniques and tests. First, I discuss the descriptive statistics for students’ total scores. I have grouped the test items into categories for discussion based on conventional astronomical or curricular areas: Earth and Sun; Earth and Moon; Mathematics; Solar System; Stars; Galaxies; and Light and Color. Within each grouping, the items are subjectively ordered by complexity of the underlying concept.

I trace the origins of each item in the literature or through my own interviews, and identify the correct answer and the reasons for including each distractor. This is followed by the calculated P-value and D-value for the correct answer and each distractor. Distractors with P-values greater than 0.20 are discussed. The P-value of each answer is plotted for each quintile and its meaning is discussed. I have also included suggestions for improving items.

A. Total Score

The distribution of the 1,414 total scores on the test is a slightly skewed distribution with a mean of 16.0 and a standard deviation of 6.4 (see Figure 7). The highest score of any student on the test was forty items correct, while the lowest score recorded was only three items.

Figure 7, Histogram of Total Score

Summary Statistics for Total Score

Total Cases = 1414
Mean = 16.028
Median = 15
SD = 6.407
Range = 37
Variance = 41.062
Minimum = 3
Maximum = 40
10th percentile = 9
90th percentile = 25

Randomly guessing the answer to each question on this test would have resulted in an average score of 47/5 or 9.4 answers correct. A Monte Carlo model of 1,414 subjects, each choosing at random among the five answers on a forty-seven-item test, results in an average score of 9.45 with a standard deviation of 2.76 (see Figure 8). There may have been many students who simply guessed at the answers to questions.
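A Monte Carlo model of this kind is simple to sketch. The version below is an illustration written with the Python standard library, not the model used for Figure 8; the seed and function name are arbitrary, so the mean and standard deviation it produces will differ slightly from the 9.45 and 2.76 quoted above.

```python
import random
import statistics

def simulate_guessing(subjects=1414, items=47, options=5, seed=1992):
    """Total scores of subjects who pick one of `options` answers at random on every item."""
    rng = random.Random(seed)
    scores = []
    for _subject in range(subjects):
        # Treat option 0 as the correct answer for every item.
        score = sum(1 for _item in range(items) if rng.randrange(options) == 0)
        scores.append(score)
    return scores

scores = simulate_guessing()
mean_score = statistics.mean(scores)
sd_score = statistics.stdev(scores)
print(round(mean_score, 2), round(sd_score, 2))
```

For comparison, the binomial expectation is a mean of 47 × 0.2 = 9.4 and a standard deviation of √(47 × 0.2 × 0.8) ≈ 2.74, so a simulated sample of 1,414 subjects should land close to both.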


Figure 8. Monte Carlo Model of Random Selection of Answers

[Histogram of simulated scores; horizontal axis: Total Score.]

Graphs and tables describing individual test items are not numbered or labeled in this section. Every item analysis contains a table of P-values and D-values and an item response curve. Each item starts on a new page and is thereby separated from others.


B. Earth and Sun

The idea that the Sun is a star and that the Earth orbits around it is a fundamental concept that can be found in most primary school science books (Nussbaum 1986). Most teachers take it for granted that their students know the reason for day and night and the length of the year. In the prediction study carried out with Alan Lightman, participating teachers thought that two-thirds of their students would enter their classes knowing the reason for day and night and that only 36 percent would know the diameter of the Earth.

Item 21, The Earth’s Rotational Period
Choose the best estimate of the time for the Earth to turn on its axis.
A. Hour B. Day C. Week D. Month E. Year

For all items on this test dealing with astronomical periods, the same five choices were given for periods: hour, day, week, month, or year. These include three periods that are astronomical in nature:

- the rotational period of the Earth (a day);
- the orbital period of the Moon about the Earth (a month);
- and the orbital period of the Earth around the Sun (a year).

Two measurements of duration that have no astronomical significance are also included: the hour and the week.

A small study (N = 24) of second-grade students found that half of the sample thought that the Earth made one turn in 24 hours. Incorrect answers for the Earth’s rotational rate ranged from “6 minutes” to “200 hours” (Klein 1982). Six of these could demonstrate the reason for day and night; the other twelve could not.

Item 21     A     B     C     D     E
P-value   .08   .62   .06   .10   .13
D-value  -.18   .48  -.17  -.21  -.22

There are no distractors chosen with greater than .20 frequency. The majority of students appear to know the rotational period of the Earth.


The results show a classic item response curve that rises steeply with overall test performance. Although there may be some lower-performing students who think the Earth rotates on its axis once a year, all distractors diminish as the student performance increases. This curve is characteristic of factual information: the better students have it memorized while others do not. Distractor curves have roughly the same shape, diminishing from close to the 20 percent random level to almost zero for the high-performing students.

To improve this question, the least chosen distractor, that the Earth spins in a week, could be replaced by a choice that the Earth does not turn.

[Item response curve: p-value of answers A through E plotted by quintile]

Item 1, Reason for Day and Night
What causes night and day?
A. The Earth spins on its axis.
B. The Earth moves around the Sun.
C. Clouds block out the Sun’s light.
D. The Earth moves into and out of the Sun’s shadow.
E. The Sun goes around the Earth.

The reason for day and night is perhaps the most basic idea assumed by teachers of astronomy of their introductory students. In the prediction survey, participating teachers assumed that 65 percent of their students, on average, would enter their classes with this concept understood and, by the end of the course, 89 percent would leave knowing it. The Earth turning on its axis causing day and night has been described as one of the “most essential ideas which form the Earth conception” (Nussbaum 1985).

Children as old as twelve, however, have been shown to believe that the world is a spherical shell with the Earth consisting of the bottom of the hemisphere, and air in the top half. The Sun travels along the surface of the sphere in this model. Children think that the Sun is in the sky in the daytime and travels below the Earth at night (Nussbaum 1986). One student had integrated this with his other knowledge, explaining that, “at night the Sun travels below us... and this is how the lava in the Earth is heated.” Students with this belief would choose answer “E.”

In a study of elementary school students, the reasons stated for day and night included that the Earth revolves around the Sun (B), that the Earth or Sun moves into a shadow (D), or that clouds block out the Sun’s light (C) (Vosniadou and Brewer 1987). Another study of second-grade students found that many knew that the Sun was “on the other side of the Earth” at night, but showed no clear preference for whether it was the Earth or the Sun that moves (Klein 1982).

An item similar to this one was included in the 1969 National Assessment of Educational Progress of nine-year-old (third-grade) students. The percentage choosing each answer follows in parentheses:

One reason that there is day and night on Earth is that the
Sun turns. (8%)
Moon turns. (4%)
Earth turns. (81%)
Sun gets dark at night. (6%)
I don’t know. (1%)

This is a fine example of a question that is not designed to identify misconceptions. The high percentage of correct answers can be attributed to the lack of plausible distractors (Schoon 1988). Had the question included “the Earth goes around the Sun” and “the Sun goes around the Earth,” the students’ choices might have been quite different.

One researcher found that although college students prefer a heliocentric explanation of the solar system and reject a geocentric model, the majority could not give convincing arguments for their view when answering an exam question on the subject after an introductory astronomy course (Touger 1985). Justifications for heliocentrism took many surprising forms; two examples: “The Sun is the center... by observation we can see that the planets move around the Sun” (p. T-5) or “all our pictures and telescopes and space flights tell us that there is one big star, the Sun, with several smaller planets (moving) around it” (p. T-8). Touger argues that students’ belief in heliocentrism is derived almost exclusively from secondary sources and lacks an empirical base. Many students believe that scientists have actually viewed the entire solar system from a vantage point in space.

I would argue that this acceptance of heliocentrism as dogma, without an ability to muster a shred of supporting evidence, makes this concept an attractive and almost universal answer to astronomical questions. Much as one researcher found when interviewing young children that God was invoked frequently to explain certain natural events (Za’rour 1976), the hard-learned belief in heliocentrism is called upon as justification for any astronomical problem for which the individual cannot give evidence supporting her or his view.

The following table lists the P-values and D-values for each answer to Item 1. For all the following items a similar table is presented.

Item 1      A     B     C     D     E
P-value   .66   .26   .00   .03   .04
D-value   .39  -.29  -.06  -.09  -.19

Students do very well on this question, with 62 percent of them selecting the correct answer, that the reason for day and night is that the Earth spins on its axis. From this table, one can clearly see that answer B, day and night are caused by the Earth moving around the Sun, is preferred by .26 of the students in the survey. Surprisingly, this item is not highly correlated with Item 21 (R = 0.37). Twenty percent of the 877 students who get this question right answer Item 21 wrong. These students may not have connected the 24-hour period of the Earth’s rotation with day and night.

The following graph plots the P-values of each answer to Item 1 for each performance quintile of students. This and similar graphs for following items are not labeled.
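One way to tabulate the data behind such a graph is sketched below in Python. This is an illustration under stated assumptions, not the procedure used in the study: the function name, the equal-size grouping by rank, and the ten made-up students are all hypothetical.

```python
from collections import Counter

def response_curves(total_scores, answers, choices="ABCDE", n_groups=5):
    """Fraction of each performance group choosing each answer to one item."""
    if len(total_scores) != len(answers):
        raise ValueError("need one answer per student")
    if len(total_scores) < n_groups:
        raise ValueError("need at least one student per group")
    # Rank students from lowest to highest total score.
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    curves = {choice: [] for choice in choices}
    for g in range(n_groups):
        # Slice the ranked list into n_groups groups of (nearly) equal size.
        start = g * len(order) // n_groups
        stop = (g + 1) * len(order) // n_groups
        counts = Counter(answers[i] for i in order[start:stop])
        for choice in choices:
            curves[choice].append(counts[choice] / (stop - start))
    return curves

# Made-up data for ten students (not taken from the study).
total_scores = [5, 8, 10, 12, 14, 16, 18, 22, 27, 35]
answers = ["B", "E", "B", "A", "B", "A", "A", "A", "A", "A"]

curves = response_curves(total_scores, answers)
print("A:", curves["A"])
print("B:", curves["B"])
```

In this toy data the curve for answer "A" rises with performance while the curve for "B" falls, which is the pattern the text describes for a correct answer and a distractor.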


From this graph, it appears that answer “B,” the Earth circling the Sun causes day and night, is a major misconception for all but the best performing students in the test population. Note that the answer, “the Sun goes around the Earth,” appears unattractive to students. It does not seem that they are confusing the Earth’s rotation with their observations of the Sun apparently circling the Earth. They actually think that our orbiting the Sun causes day and night.

[Item response curve: p-value of answers A through E plotted by quintile]

Item 31, Time Zones
Boston is 90° east of Hawaii. If it is noon in Hawaii, in Boston it would be about:
A. Sunrise. B. Sunset. C. Noon. D. Midnight. E. Noon the next day.

This item attempts to have students apply the knowledge that the Earth makes one complete rotation in twenty-four hours along with the fact that there are 360° in a circle. With Boston 90° east of Hawaii, there should be a time difference of one quarter of a day or the time between noon and either sunrise or sunset.
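The arithmetic behind this item can be written out as a small Python sketch. The function name is hypothetical, and treating 18:00 as roughly sunset is a simplifying assumption.

```python
def time_difference_hours(longitude_difference_deg, day_hours=24.0):
    """Clock difference implied by a longitude difference on a uniformly rotating Earth."""
    return longitude_difference_deg / 360.0 * day_hours

offset = time_difference_hours(90.0)   # Boston is 90 degrees east of Hawaii
boston_clock = (12.0 + offset) % 24.0  # places to the east are later in the day

print(f"time difference = {offset:.0f} hours")
print(f"clock time in Boston when it is noon in Hawaii = {boston_clock:.0f}:00")
```

A quarter turn gives a six-hour difference, and because Boston lies to the east its clock is ahead, which points to sunset rather than sunrise.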

Item 31     A     B     C     D     E
P-value   .18   .45   .08   .22   .06
D-value  -.05   .34  -.19  -.14  -.13

This question is of moderate difficulty and discriminating power. None of the distractors appears to stand out based on the total score statistics. In the nationwide survey, teachers predicted that only 25 percent of entering students would be able to answer this question correctly. Roughly twice as many students understood this concept as teachers predicted.

I expected that students might get the direction of rotation wrong and pick answer “A” with greater frequency. Answer “D,” however, seems to attract a fair number of students. This can be explained by students not knowing that there are 360° in a circle, and thus thinking that there would be a twelve-hour time difference between Boston and Hawaii. This misconception is discussed in more detail in the Mathematics section.

[Item response curve: p-value of answers A through E plotted by quintile]

Item 22, Earth’s Orbital Period
Choose the best estimates of the time for the Earth to go around the Sun.
A. Hour B. Day C. Week D. Month E. Year

Much like the reason for day and night, the concept of the Earth’s revolution around the Sun in a year is fundamental to understanding solar system astronomy. Students who believe that the Sun orbits the Earth in a day would choose “B.”

Item 22     A     B     C     D     E
P-value   .03   .16   .06   .06   .68
D-value  -.16  -.24  -.24  -.18   .48

This question appears to be relatively easy and is moderately discriminating. No distractor appears to stand out when examining the statistics for the total test.

This appears to be an easy question for most students. More students get the correct answer to this item than to any other item on the test. However, there are still low-performing students who think that the Earth orbits the Sun in a day.

[Item response curve: p-value of answers A through E plotted by quintile]

Item 12, When is the Sun Overhead?
How often is the Sun directly overhead at noon in your hometown?
A. Every day.
B. Only in the summer.
C. Only for the week of the summer solstice.
D. Only for one day each year.
E. Never.

This test was given to students only in the continental United States. The Sun can be seen directly overhead only between the Tropics of Capricorn and Cancer (between about 23.5°S and 23.5°N latitude). The Sun is never overhead in the continental United States. The correct answer is “E. Never.” In Boston, the Sun is only 25° above the horizon at noon on the winter solstice. On the first day of summer it is much higher, but still rises only to 71° altitude at its maximum.
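The noon altitudes quoted for Boston follow from simple geometry, sketched here in Python. The axial tilt of 23.44° and the Boston latitude of 42.36°N are assumed values not given in the text, and the function names are hypothetical.

```python
AXIAL_TILT_DEG = 23.44        # assumed tilt of the Earth's axis
BOSTON_LATITUDE_DEG = 42.36   # assumed latitude of Boston (north positive)

def noon_altitude(latitude_deg, declination_deg):
    """Altitude of the Sun above the horizon at local noon, in degrees."""
    return 90.0 - abs(latitude_deg - declination_deg)

def sun_ever_overhead(latitude_deg):
    """True if the noon Sun can reach the zenith at this latitude."""
    return abs(latitude_deg) <= AXIAL_TILT_DEG

# The Sun's declination is +tilt at the June solstice and -tilt at the December solstice.
summer = noon_altitude(BOSTON_LATITUDE_DEG, +AXIAL_TILT_DEG)
winter = noon_altitude(BOSTON_LATITUDE_DEG, -AXIAL_TILT_DEG)

print(f"Boston noon altitude, summer solstice: {summer:.1f} degrees")
print(f"Boston noon altitude, winter solstice: {winter:.1f} degrees")
print("Sun ever overhead in Boston:", sun_ever_overhead(BOSTON_LATITUDE_DEG))
```

With these assumed inputs the summer value comes out near 71° and the winter value near 24°, in line with the rounded figures in the text.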

Schoon found that 12 out of 13 participating teachers and 20 out of 32 student teachers believed that the Sun is always overhead at noon (Schoon 1988).

Item 12     A     B     C     D     E
P-value   .41   .11   .12   .18   .18
D-value  -.14  -.14  -.07   .09   .27

This appears to be a difficult question, with “A” being students’ preferred answer. A plurality of students believe that the Sun is always directly overhead at noon. Teachers predicted that students would do poorly on this question, that only .27 would get this question right. Students do much worse than teachers predict. Not knowing that the Sun is lower in the sky in the winter precludes a proper understanding of the reason for seasons. The connection between the geocentric and heliocentric frames of reference is key to understanding this concept.


The misconception that the Sun is overhead at noontime is the most common answer among all performance levels. These youngsters have not noticed how much longer their shadow is at noon in the winter than in the summer or how the Sun always seems to be in their eyes in the winter. Only students in the highest performance quintile show a substantial reduction in this misconception. However, many still cannot let go of the belief that the Sun is directly overhead at some time, choosing “D, only for one day each year,” with greater frequency than other students.

[Item response curve: p-value of answers A through E plotted by quintile]

Item 39, Sun’s Path at the Pole
During July at the North Pole, the Sun would:
A. be overhead at noon.
B. never set.
C. be visible for 12 hours each day.
D. set in the northwest.
E. none of the above.

This item is an attempt to get students to apply the idea that the Sun’s path through the sky differs at different latitudes. In the summer, the “Land of the Midnight Sun” enjoys twenty-four hours of sunlight each day; the Sun never sets. In 1986, three Italian researchers found that the majority of 11-13-year-old pupils believed that “the Sun always rises from the same point on the horizon, the East, and always sets in the opposite point, the West” (Loria et al. 1986).

Item 39     A     B     C     D     E
P-value   .12   .39   .23   .09   .15
D-value  -.15   .41  -.17  -.19  -.04

A surprising number of students answer this question correctly, considering that they do so poorly on other questions dealing with the Sun’s motion. Perhaps they have heard the fact that the Sun never sets at the pole in the summer and could recall it for this test.

Many students think that the Sun’s apparent motion changes little at different latitudes and dates. Many students chose “C,” that the Sun would be visible for twelve hours each day at the North Pole. The same student would probably describe the length of daylight no differently for our latitude. Only the highest-performing students (in the highest quintile) seemed not to be taken in by this distractor.

[Item response curve: p-value of answers A through E plotted by quintile]

Item 40, Duration of Daylight
Which date below has the most hours of daylight in your hometown?
A. June 15 B. July 15 C. August 15 D. September 15 E. All dates are the same.

That there is more daylight in summer than in winter is easily noticed, but days begin to get shorter after the summer solstice (June 21) has passed. June 15 is less than a week from the solstice, while July 15 is over three weeks from that date. This question helps to find out if students know what the solstice signifies.
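The date comparison in this paragraph can be checked with a few lines of Python. This sketch rests on two assumptions: the calendar year (1991 is chosen arbitrarily) and the simplification that day length falls off about equally on either side of the solstice, so the date nearest June 21 has the most daylight.

```python
from datetime import date

YEAR = 1991  # arbitrary year, an assumption; the day counts do not depend on it
SOLSTICE = date(YEAR, 6, 21)

CANDIDATES = {
    "A": date(YEAR, 6, 15),
    "B": date(YEAR, 7, 15),
    "C": date(YEAR, 8, 15),
    "D": date(YEAR, 9, 15),
}

# Number of days separating each candidate date from the summer solstice.
days_from_solstice = {
    label: abs((day - SOLSTICE).days) for label, day in CANDIDATES.items()
}
closest = min(days_from_solstice, key=days_from_solstice.get)

for label, gap in days_from_solstice.items():
    print(f"{label}: {gap} days from the solstice")
print("Closest to the solstice:", closest)
```

June 15 is 6 days from the solstice and July 15 is 24 days from it, matching the "less than a week" and "over three weeks" in the text.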

Item 40     A     B     C     D     E
P-value   .33   .34   .13   .07   .10
D-value   .28   .00  -.14  -.11  -.13

This was a difficult question for most students. A large fraction of students at all performance levels think that the amount of daylight increases from the summer solstice or at least is longer during the summer than the spring. Many students appear to think that, because the summer is warmer than the winter, days must be longer (Schoon 1988).

[Item response curve: p-value of answers A through E plotted by quintile]

Item 4, Shape of the Earth’s Orbit
Of the following choices, which looks most like the Earth’s path around the Sun?
[Answers A through E: five drawings of candidate paths, each with the Sun marked]

Many students believe that the change of seasons is evidence of the Earth’s elliptical path (Touger 1985). Students who have this misconception would show a preference for answer “C,” since without the Earth’s varying distance from the Sun, there would be no summer and winter. Others explain that the Earth is simply closer to the Sun in the summer than the winter (Furuness and Cohen 1989); these students should show a preference for any choice but “A.” The Earth’s orbit is almost perfectly circular with the Sun very slightly displaced from the center of the circle. At the scale of these drawings, the orbit of the Earth is indistinguishable from a perfect circle.
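How close to circular the orbit is can be illustrated numerically. The Python sketch below assumes an orbital eccentricity of 0.0167, a value not stated in the text, and a hypothetical drawing two inches wide.

```python
import math

ECCENTRICITY = 0.0167    # assumed eccentricity of the Earth's orbit
DRAWING_WIDTH_IN = 2.0   # hypothetical width of a printed orbit drawing

# Ratio of the short axis of the ellipse to its long axis.
axis_ratio = math.sqrt(1.0 - ECCENTRICITY ** 2)

# How much narrower than a circle the drawing would be along its short axis.
flattening_in = DRAWING_WIDTH_IN * (1.0 - axis_ratio)

# Ratio of the closest Earth-Sun distance to the farthest.
distance_ratio = (1.0 - ECCENTRICITY) / (1.0 + ECCENTRICITY)

print(f"short axis / long axis = {axis_ratio:.5f}")
print(f"departure from a circle in the drawing = {flattening_in:.5f} inches")
print(f"closest / farthest distance = {distance_ratio:.3f}")
```

Under these assumptions the two axes differ by well under a thousandth of an inch in the drawing, while the Sun's distance varies by only a few percent over the year.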

Item 4      A     B     C     D     E
P-value   .49   .14   .28   .08   .01
D-value   .08  -.13   .14  -.20  -.06

This is a question of moderate difficulty with virtually no discriminating power. Students who score well on the test overall do no better on this item than students who do poorly on the test. The choice of the correct answer, “A,” appears to be virtually independent of student performance on the entire test. With a D-value of only 0.08, a question similar to this one would be rejected from any standardized test. From the overall scores on this item, it appears that answer “C” appeals to a larger fraction of students than one would expect if answers were chosen at random.


The distractor “C,” that the Earth’s orbit is highly elliptical, is popular with all groups but is preferred more strongly by higher-performing students. Perhaps these students are more likely to have heard that our orbit is elliptical, but do not know how tiny its eccentricity really is. This fact is used by better-performing students in thinking that the orbit of the Earth is highly elliptical.

[Item response curve: p-value of answers A through E plotted by quintile]

Item 17, The Reason for Seasons
The main reason for it being hotter in summer than in winter is:
A. the Earth’s distance from the Sun changes.
B. the Sun is higher in the sky.
C. the distance between the northern hemisphere and the Sun changes.
D. ocean currents carry warm water north.
E. an increase in “greenhouse” gases.

The Sun is lower in the sky in the winter than in the summer. This change in altitude spreads the Sun’s light over a much broader area on the Earth. The Boston Curriculum Objectives (Marshall and Lancaster 1983) for fifth grade explain correctly that the reason for winter is that “the Sun is lower in the sky,” but then go on to qualify this reason with the incorrect statement, “its rays have to shine through more atmosphere before they reach us, losing heat energy in the process.”
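The spreading of sunlight with altitude can be put into numbers. The Python sketch below uses the Boston noon altitudes of 71° and 25° quoted in the discussion of Item 12, and assumes a flat, level surface with no atmospheric loss; the function name is hypothetical.

```python
import math

def relative_intensity(altitude_deg):
    """Sunlight per unit of level ground, relative to a Sun directly overhead."""
    # A beam arriving at a lower altitude is spread over more ground,
    # so the energy per unit area scales with the sine of the altitude.
    return math.sin(math.radians(altitude_deg))

summer = relative_intensity(71.0)  # noon, summer solstice in Boston
winter = relative_intensity(25.0)  # noon, winter solstice in Boston
ratio = summer / winter

print(f"summer: {summer:.2f} of overhead intensity")
print(f"winter: {winter:.2f} of overhead intensity")
print(f"summer-to-winter ratio: {ratio:.2f}")
```

Under these assumptions level ground at noon receives a little more than twice as much sunlight per unit area in midsummer as in midwinter, before the longer summer days are even counted.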

Item 17     A     B     C     D     E
P-value   .45   .12   .36   .03   .03
D-value  -.15   .04   .21  -.08  -.10

This question is both extremely difficult and fails to discriminate between students based upon overall performance. Teachers in the prediction survey thought students in introductory astronomy courses would score .29 before their courses. The actual P-value is less than half of that score. Both answer “A” and answer “C” are far more popular than the scientifically correct response.

In answering this question, students appear to be torn between two distractors that mention changing distance. Many students believe that the Earth’s orbit is highly eccentric so that the entire Earth is physically closer to the Sun in the summer than in the winter. A more “evolved” explanation is that the Earth leans toward the Sun in the summer and away from the Sun in the winter. This is consistent with many diagrams in textbooks that show one pole proportionally much closer to the Sun in the summer. The correct answer, “B,” appears to be avoided by most students at all performance levels. It is clear that students have not connected the Earth’s tilt with the altitude of the Sun in the sky during different seasons.

[Item response curve: p-value of answers A through E plotted by quintile]

Item 13, Diameter of the Earth
Choose the best estimate of the diameter of the Earth.

A. 1,000 miles. B. 10,000 miles. C. 100,000 miles. D. 1,000,000 miles. E. 10,000,000 miles.

The Earth’s diameter is roughly 8,000 miles. A study of twenty-four second-grade students in 1982 found that boys and girls did not have a significant preference for the Sun being larger than the Earth (Klein 1982), even though the Sun is roughly one hundred times larger in diameter and one million times larger in volume.

Item 13     A     B     C     D     E
P-value   .06   .29   .32   .25   .08
D-value  -.07   .30   .00  -.15  -.18

The correct answer to this problem is preferred less often than the misconception represented by answer “C.” Although one may argue that this item tests for the knowledge of a fact, it is apparent that many students prefer a wrong answer to the correct one. It is doubtful that they have been taught that the Earth is 100,000 miles in diameter. There must be some reason why they prefer this answer to the correct one.

Not many students appear to know the diameter of the Earth. At all levels but the highest, students seem to prefer a much larger diameter for the Earth, ranging from 100,000 miles to 1,000,000 miles.

[Item response curve: p-value of answers A through E plotted by quintile]

Item 14, Diameter of the Sun
Choose the best estimate of the diameter of the Sun.

A. 1,000 miles. B. 10,000 miles. C. 100,000 miles. D. 1,000,000 miles. E. 10,000,000 miles.

The diameter of the Sun is approximately 865,000 miles, so answer “D, 1,000,000 miles,” is the closest choice.

Item 14     A     B     C     D     E
P-value   .03   .09   .17   .24   .46
D-value  -.12  -.21  -.08   .11   .14

Students prefer answer “E” to the correct answer by almost a two-to-one margin.

Students do not appear to be guessing when they answer this question. Students in all performance groups appear to have a preference for 10,000,000 miles for the diameter of the Sun. Coupled with the answer to the previous item, it appears that many students believe that the ratio of the Sun’s diameter to the Earth’s diameter is from 10:1 to 100:1. The actual ratio is 110:1, so most students think the Sun and Earth are both much larger than they really are and that the Earth is much closer in size to the Sun than it really is.
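The ratios discussed here can be recomputed from the round figures given for Items 13 and 14. The Python sketch below is only an arithmetic check; the variable names are hypothetical, and pairing the most popular Sun answer with the two popular Earth answers is an interpretation of the text rather than data from the study.

```python
EARTH_DIAMETER_MI = 8_000    # rough value quoted for Item 13
SUN_DIAMETER_MI = 865_000    # approximate value quoted for Item 14

actual_ratio = SUN_DIAMETER_MI / EARTH_DIAMETER_MI
volume_ratio = actual_ratio ** 3  # volumes scale as the cube of the diameter

# Popular student answers: 10,000,000 miles for the Sun, and
# 100,000 or 1,000,000 miles for the Earth.
popular_sun_mi = 10_000_000
popular_earth_mi = (100_000, 1_000_000)
implied_ratios = [popular_sun_mi / earth for earth in popular_earth_mi]

print(f"actual diameter ratio  = {actual_ratio:.0f}:1")
print(f"actual volume ratio    = {volume_ratio:,.0f}:1")
print(f"ratios implied by popular answers = {implied_ratios}")
```

With these rounded inputs the diameter ratio comes out near 108:1, consistent with the roughly 110:1 cited in the text, and the volume ratio is a little over one million.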

[Item response curve: p-value of answers A through E plotted by quintile]

Item 9, Scale Model of the Sun and the Earth
If you used a basketball to represent the Sun, about how far away would you put a scale model of the Earth?
A. 1 foot or less B. 5 feet C. 10 feet D. 25 feet E. 100 feet

The Earth is about 110 solar diameters from the Sun. Yet, textbook illustrations show the Earth and Sun to be very close in size and just a few solar diameters from each other.

Item 9      A     B     C     D     E
P-value   .07   .22   .23   .22   .25
D-value  -.07  -.14  -.10   .03   .25

This is a difficult question for most students. They get little practice in building scale models in school, and so they may have little idea what the Earth-Sun system would be like in scale.

The pattern of responses shows no overall misconception that appeals approximately equally to all performance groups. Students with low overall scores have a preference for a scale distance from 5 feet to 10 feet from the Sun. This translates to a ratio of the Earth orbiting from 6 to 12 solar diameters from the Sun. The correct ratio is only preferred by the top-performing students.
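The scale-model arithmetic can be sketched in Python as follows. The basketball diameter of 9.4 inches is an assumed value not given in the text, and the names used are hypothetical.

```python
BASKETBALL_DIAMETER_IN = 9.4    # assumed diameter of a basketball, in inches
SOLAR_DIAMETERS_TO_EARTH = 110  # Earth-Sun distance in solar diameters (from the text)
OPTIONS_FT = {"A": 1, "B": 5, "C": 10, "D": 25, "E": 100}

# Scale distance from the basketball "Sun" to the model Earth, in feet.
distance_ft = SOLAR_DIAMETERS_TO_EARTH * BASKETBALL_DIAMETER_IN / 12.0
closest = min(OPTIONS_FT, key=lambda label: abs(OPTIONS_FT[label] - distance_ft))

def feet_to_solar_diameters(feet):
    """Convert a model distance in feet into basketball (solar) diameters."""
    return feet * 12.0 / BASKETBALL_DIAMETER_IN

print(f"scale distance to the Earth = {distance_ft:.0f} feet (closest option: {closest})")
print(f"5 feet  = {feet_to_solar_diameters(5):.1f} solar diameters")
print(f"10 feet = {feet_to_solar_diameters(10):.1f} solar diameters")
```

With this assumed ball size the model Earth belongs about 86 feet away, and the 5-foot and 10-foot answers correspond to roughly 6 and 12 solar diameters, as stated in the text.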

This item could be improved by making it more like the previous questions. There appears to be little discrimination between answers “B,” “C,” “D,” and “E” among subjects, so expanding the dynamic range of the answers may be useful. Answers of 1 foot, 10 feet, 100 feet, 1,000 feet, and 10,000 feet may help to focus students on a single misconception.

[Item response curve: p-value of answers A through E plotted by quintile]

Item 46, Gravity
Which of the following would make you weigh half as much as you do right now?
A. Take away half of the Earth’s atmosphere.
B. Double the distance between the Earth and Sun.
C. Decrease the Earth’s rate of spin so that 1 day equals 48 hours instead of 24 hours.
D. More than one of the above.
E. None of these.

Gravity has nothing to do with the atmosphere, the distance to the Sun, or the length of the day. None of the first three actions would make you weigh less, so “E” is the correct answer.

The idea that air pressure is the cause of gravity could be the result of incorporation of the fact that people weigh less on the Moon, where there is no atmosphere. As a fourteen-year-old student explained, “There isn’t any gravity on the Moon...because...there’s hardly any air there, is there?” Helping out with another explanation is a sixteen-year-old student who stated, “there ain’t no air in space so they’re as light as anything...if they were on the Moon they’d have to wear steel boots to keep them on the ground” (Watts 1982). Even high school physics students are not immune to this idea; roughly 15 percent believe that gravity is the result of air pressure. One expresses his idea of what holds a book on a table: “If the air was taken away, the book might drift off” (Minstrell 1982b). A 1981 study involving interviews of 179 college physics students during their first week of class uncovered similar ideas (Gunstone and White 1981). When asked to predict the effect of a change in altitude on the weight of an object, several students explicitly stated that lower air pressure would make objects weigh less, “[It is] ... common sense that the rarefied air will make the bucket weigh less” (p. 298).

Through interviews with twenty-four tenth-grade students, several were found to believe that the Earth’s gravity was entirely dependent on either the Earth’s distance from the Sun or related to the Earth’s rate of rotation (Treagust and Smith 1986). When asked to explain from which of three identical planets, at different distances from the Sun, a rocket would have the easiest time lifting off, many students expected that the planet furthest from the Sun would manifest the least gravity. One explained: “it [the planet] is furthest away from the Sun and the gravitational pull is less there” (p. 365). When asked to explain if a rocket would have an easier time lifting off from a planet that was not spinning, nearly half of the students (47 percent) argued that the spin was related to gravity. Here is an example of one student’s explanation, “the Earth is a fast rotating planet and it takes an enormous amount of fuel to lift a rocket off the Earth. It will be easier with a planet (other than the Earth) or (if the Earth exhibited) no rotation” (p. 365). Another student argues for his point with an analogy, “Swinging a stone on a string slowly, it will go out a little way, and fast, a long way. That is the same as gravity. Then the ‘no rotation’ planet would be easiest” (p. 365).

The above misconceptions are accounted for in answers “A,” “B,” and “C.” Students who think that more than one of these ideas is operating would choose “D.” Students with a scientific understanding of gravity would choose “E.” The reason that the question was phrased to read “weigh half as much as you do right now” is that there are very slight effects on weight from each of two distractors. Removing the atmosphere would make you weigh a few ounces more, since the atmosphere provides some buoyancy. If the Earth were not spinning each of us would weigh a bit more. Both these effects would make us heavier, not lighter.
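The size of these two slight effects can be estimated with a rough Python sketch. Every physical constant below is an assumed round value not given in the text, the 150-pound person is hypothetical, and the spin estimate applies to someone standing at the equator, where the effect is largest.

```python
import math

G_SURFACE = 9.81          # m/s^2, typical surface gravity (assumed value)
EARTH_RADIUS_M = 6.378e6  # equatorial radius in meters (assumed value)
AIR_DENSITY = 1.2         # kg/m^3 at sea level (assumed value)
BODY_DENSITY = 1000.0     # kg/m^3, roughly that of water (assumed value)
BODY_WEIGHT_LB = 150.0    # hypothetical person

def spin_fraction(day_hours):
    """Fraction of weight offset by the Earth's spin for someone at the equator."""
    omega = 2.0 * math.pi / (day_hours * 3600.0)  # rotation rate in radians per second
    return omega ** 2 * EARTH_RADIUS_M / G_SURFACE

spin_24 = spin_fraction(24.0)
spin_48 = spin_fraction(48.0)

# Buoyant lift from the displaced air, as a fraction of body weight.
buoyancy_fraction = AIR_DENSITY / BODY_DENSITY
buoyancy_ounces = BODY_WEIGHT_LB * 16.0 * buoyancy_fraction

print(f"spin effect with a 24-hour day: {spin_24:.2%} of weight")
print(f"spin effect with a 48-hour day: {spin_48:.2%} of weight")
print(f"air buoyancy on a 150-pound person: {buoyancy_ounces:.1f} ounces")
```

Under these assumptions both effects are a fraction of one percent of body weight, nowhere near the factor of two the question asks about, and removing either one makes a person slightly heavier.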

Item 46     A     B     C     D     E
P-value   .21   .13   .17   .17   .29
D-value  -.09  -.14  -.01   .00   .25

This is a difficult question with limited discriminating power. The correct answer is chosen more frequently than other answers by students, although not by much.

The item response curves for this problem show that only answer “A,” that the Earth’s atmosphere affects gravity, is a possible misconception. Air pressure does affect weight at sea level by less than 1 percent. We must not confuse these minor effects with the major misconceptions. Replacing answer “B” with a less attractive alternative might help improve this problem as a test for misconceptions.

[Item response curve: p-value of answers A through E plotted by quintile]

C. Earth and Moon

The Moon is the Earth’s closest neighbor. It is about 2,000 miles in diameter and, as it orbits the Earth, it keeps an average distance of 240,000 miles. It is a spellbinding sight in the night sky. Over the course of a month in its orbit about the Earth, it goes through a cycle of phases.

Item 11, Scale Model of the Earth and the Moon
Which is the most accurate model of the Moon in relative size and distance from the Earth?
[Answers A through E: five drawings of the Earth and the Moon]
The larger object in each model is the Earth.

The Moon appears large in the night sky and even larger on the horizon, but its angular size is small, only one-half of a degree. That is small enough for the tip of your little finger to cover it with your hand outstretched. Picturing the Moon and Earth together from outer space, the Moon is a quarter of the Earth’s diameter and about thirty Earth diameters away. Answer “E” is an accurate scale representation of the Earth and the Moon.
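These proportions follow directly from the round figures used in this chapter (a 2,000-mile Moon, 240,000 miles away, and an 8,000-mile Earth). The Python sketch below is only an arithmetic check, with hypothetical variable names.

```python
import math

MOON_DIAMETER_MI = 2_000
MOON_DISTANCE_MI = 240_000
EARTH_DIAMETER_MI = 8_000

# Angle subtended by the Moon as seen from the Earth.
angular_size_deg = math.degrees(
    2.0 * math.atan((MOON_DIAMETER_MI / 2.0) / MOON_DISTANCE_MI)
)

size_ratio = MOON_DIAMETER_MI / EARTH_DIAMETER_MI
distance_in_earth_diameters = MOON_DISTANCE_MI / EARTH_DIAMETER_MI

print(f"angular size of the Moon = {angular_size_deg:.2f} degrees")
print(f"Moon diameter / Earth diameter = {size_ratio}")
print(f"Moon distance = {distance_in_earth_diameters:.0f} Earth diameters")
```

The computed angle is just under half a degree, and the size and distance ratios come out to one quarter and thirty, as the paragraph states.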

Item 11     A      B      C      D      E
P-value    .10    .30    .20    .26    .13
D-value   -.10    .12   -.16    .10   -.01

The correct answer to this problem is the second least popular answer. Distractors “B,” “C,” and “D” are chosen more often than the correct answer. This is a very difficult question with no discriminating power.


All students, regardless of performance, appear to prefer a model of the Earth-Moon system that is not to scale. Most students share a belief that the Moon is relatively close to the Earth, roughly from 3 to 10 diameters away. Many of the lower-performing students prefer a Moon that is close to the size of the Earth as well.

[Figure: item response curves for Item 11, plotted by quintile and p-value, with one curve for each answer A through E.]


Item 15, Distance to the Moon from the Earth

Choose the best estimates of the distance to the Moon from the Earth.

A. 1,000 miles. B. 10,000 miles. C. 100,000 miles. D. 1,000,000 miles. E. 10,000,000 miles.

The Earth is 240,000 miles from the Moon, on average. “C” would be the closest answer to being correct.

Item 15     A      B      C      D      E
P-value    .09    .18    .30    .25    .17
D-value   -.02   -.06    .15    .02   -.11

The correct answer is the one that is most frequently chosen for this item. Answer “D” attracts almost as many students.

[Figure: item response curves for Item 15, plotted by quintile and p-value, with one curve for each answer A through E.]


Item 20, Moon’s Orbital Period

Choose the best estimates of the time for the Moon to go around the Earth.

A. Hour B. Day C. Week D. Month E. Year

Most students in introductory astronomy courses know that the Moon orbits the Earth (Targan 1987). Watching the Moon over twenty-four hours, one easily gets the impression that the Moon orbits the Earth in a day. The Earth’s motion confounds the Moon’s. While the Earth is spinning, the Moon takes a leisurely trip around the Earth, rising about an hour later each day. Schoon found that 42 percent of his population knew that the Moon’s orbit period was one month, that 36 percent thought it took a day, and 20 percent thought it took a year.
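The daily delay in moonrise follows from the length of the Moon's cycle. The sketch below assumes a cycle of phases lasting 29.5 days, a value that is not given in the text, and treats the delay as uniform from day to day, which is a simplification.

```python
HOURS_PER_DAY = 24
SYNODIC_MONTH_DAYS = 29.5  # assumed length of one cycle of phases, not from the text

# Over one cycle of phases the Moon falls behind the Sun by one full day,
# so the average daily delay is one day divided by the cycle length.
delay_minutes = HOURS_PER_DAY * 60 / SYNODIC_MONTH_DAYS

print(f"Average moonrise delay: about {delay_minutes:.0f} minutes per day")
```

Under this assumption the average delay is a little under an hour, consistent with the rough figure quoted above.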

Item 20     A      B      C      D      E
P-value    .05    .37    .12    .37    .07
D-value   -.13   -.21   -.12    .46   -.16

The correct answer, “D,” shares the same popularity among students as answer “B.” This is a moderately difficult question with good discriminating power. From the point of view of an Earth-based observer, the Moon does appear to orbit the Earth in a day. This is precisely the misconception that this item tests. Even though most students think that the Earth spins, they cannot use this idea in helping to understand the positions of celestial bodies.

Among lower-performing students, the idea that the Moon orbits the Earth in a day is extremely popular. Higher-performing students have a much greater preference for the correct answer.

[Figure: item response curves for Item 20, plotted by quintile and p-value, with one curve for each answer A through E.]


Item 23, Moon’s Solar Orbital Period

Choose the best estimates of the time for the Moon to go around the Sun.

A. Hour B. Day C. Week D. Month E. Year

As the Moon circles the Earth, the Earth orbits the Sun, taking the Moon along for the ride. So the Moon goes around the Sun in a year, just as the Earth does.

Item 23     A      B      C      D      E
P-value    .06    .10    .14    .19    .51
D-value   -.18   -.21   -.21   -.25    .57

This appears to be a relatively easy question for students.

Only the lower-performing groups have any preference for the Moon’s orbital period about the Sun being shorter than a year.

[Figure: item response curves for Item 23, plotted by quintile and p-value, with one curve for each answer A through E.]


Item 7, Frame of Reference

[Diagram: a model with labels for Mars, Deimos, a person, and the Sun.]

The diagram above represents a model of the Sun, Mars, and one of Mars’ moons, Deimos. Please look at the model and determine which object looks most like Deimos to the person in the model who is observing from the north pole of Mars.

[Figure: answer choices A, B, C, D, and E.]

This item originally appeared in a book on astronomy teaching as an example of Piaget’s stage of “formal reasoning” (Schatz et al. 1978). To answer this question correctly, students must be able to switch their frame of reference from outside of this system to being on one of the objects within the system. Looking out from Mars, the dark portion of Deimos would be on one’s right, not on the left as seen from the outside.

Item 7      A      B      C      D      E
P-value    .45    .07    .37    .04    .07
D-value    .31   -.04   -.23   -.08   -.04

For all students, the notion that the dark portion of the Moon remains on the left is a powerful misconception. Notions that the Moon would appear any way other than half-illuminated appear unattractive. Students unable to answer this question correctly would have problems trying to change frames of reference.


Among students who perform less well on the entire test, answer “C” is very popular. None of the other distractors appears to have much popularity. This is another problem that tests a student’s ability to change his or her frame of reference. Although the problem appears much easier than, say, changing from geocentric to heliocentric frames, many students still have great difficulty with this problem. Learning about the phases of the Moon, the light curves of binaries, the apparent motion of the Sun at different latitudes, or the appearance of galaxies all requires some agility with spatial thinking. Without the ability to imagine what objects look like from different perspectives, students will find many astronomical concepts virtually impossible to learn.

[Figure: item response curves for Item 7, plotted by quintile and p-value, with one curve for each answer A through E.]


Item 2, Reason for Moon Phases

One night you looked at the Moon and saw this: [drawing of the Moon]

A few days later you looked again and saw this: [drawing of the Moon]

Why did the Moon change shape?

A. Something passed in front of it. B. It moved out of the Earth’s shadow. C. It moved out of the Sun’s shadow. D. Its far side is always dark. E. None of the above.

The Moon’s phases are caused by the fact that our view of the lighted side of the Moon changes as the Moon orbits the Earth. The Moon has no light of its own and is illuminated by the Sun. This answer is not listed in the first four choices, so “E” is the correct answer.

Even teachers are confused by this concept. The Boston Curriculum Objectives (Marshall 1983) urge teachers to test their students on identifying Moon phases with a drawing of two “phases,” a crescent Moon and a partial lunar eclipse. Clearly, whoever made up this guide would have answered Item 2 with “B” instead of the correct answer. In a study that interviewed fifty pre-service and in-service elementary school teachers, 74 percent of respondents were found to have incorrect concepts (Cohen 1982). In this study, eleven teachers thought that clouds, a planet, or a star blocked the Moon. Two thought that the Moon is black and white, and rotates, and twenty-four implicated the Earth or its shadow.

An early precursor to misconception studies examined the “sophisticated errors” of 100 recent high school graduates in 1963. Seventy percent believed that the Earth’s shadow caused the phases of the Moon (Keuthe 1963).

Item 2      A      B      C      D      E
P-value    .03    .41    .27    .04    .26
D-value   -.09    .19   -.21   -.05    .06

This is a difficult question, especially because the correct reason for the phases is not listed in the answers, only “none of the above.” However, teachers in our nationwide survey predicted that .34 of entering students would know the answer to this question and that the fraction who would learn it by the end of their course would rise to .72. Two distractors, “B” and “C,” appear to be more popular than the correct answer.


This question reveals that most students have the wrong idea about the cause of the phases of the Moon. Higher-performing students appear much more likely than lower-performing students to think that the Moon’s phases are caused by the Earth’s shadow.

[Figure: item response curves for Item 2, plotted by quintile and p-value, with one curve for each answer A through E.]


Item 30, Time from Moon Phases

Approximately what time could it be if you saw a thin crescent Moon on the western horizon?

A. Sunrise B. Sunset C. Noon D. Midnight E. Anytime of day or night.

If the thin crescent Moon is on the western horizon, the Sun must be close by it. It could only be around the time of sunset, since the Sun sets somewhere in the western part of the sky. Students who think that the Earth’s shadow causes the phases of the Moon might choose “A,” since for the Earth to be between the Moon and Sun, the Sun would have to be on the opposite side of the sky from the Moon.

Item 30     A      B      C      D      E
P-value    .25    .31    .09    .12    .22
D-value    .14    .05   -.19   -.15    .07

This question was designed to examine whether students could apply their theory of the phases of the Moon to predict the time of day from its position and phase. This item is difficult and has virtually no discriminating power.

“B” appears to be the most popular answer, but I do not believe it has been chosen for the right reason. It has been chosen with uniformity by all performance groups. Perhaps this question is just too difficult. One can see that misconception “A” appears to be more popular among better-performing students.

[Figure: item response curves for Item 30, plotted by quintile and p-value, with one curve for each answer A through E.]


Item 24, Moon’s Rotation

Choose the best estimate of the time for the Moon to turn on its axis.

A. Hour B. Day C. Week D. Month E. Year

The Moon turns on its axis once every month. It therefore always keeps the same face directed to the Earth. Until the space age, we never knew what the far side of the Moon looked like.31 Keuthe found that 19 percent of the high school graduates whom he studied had the common belief that the Moon does not rotate (Keuthe 1963). I was not aware of this study when I originally wrote this test. In future versions of this test, this item should be modified so that “it does not spin” replaces answer “E.”

Item 24     A      B      C      D      E
P-value    .23    .23    .21    .23    .10
D-value    .02    .00   -.03    .12   -.13

Students do not appear to have a clear preference for one answer over any other. Perhaps changing “year” to “it does not spin” would attract many students.

31 The Center for Astrophysics has several globes of the Moon with only one side painted. The other side was left unpainted.

[Figure: item response curves for Item 24, plotted by quintile and p-value, with one curve for each answer A through E.]


D. Mathematics

Many astronomy classes rely on mathematical presentations to present facts or concepts. Scientific notation is used to express astronomical sizes and distances. Angular measure is used to locate heavenly objects in a variety of coordinate systems. Ratio and proportion are relied upon to explain identical angular sizes. Graphs are used to illustrate relationships and patterns. None of the mathematical skills listed above is unfamiliar to students by the eighth grade and many science teachers assume familiarity with these abilities.

Item 3, Graph Interpretation

Which star on the graph has a temperature most like that of Betelgeuse?

[Figure: graph of Brightness against Temperature, with datapoints labeled A, B, C, D, and E.]

The ability to interpret graphs is a fundamental skill for students studying science. Graphs can show patterns and relationships that are virtually impossible to ascertain from data alone. They usually present concepts in a concise manner that would otherwise require a great deal of descriptive writing (Weintraub 1967). The position of a point can represent two values simultaneously. Several curriculum projects in science have placed a heavy emphasis on the use of graphs including SCIS (Science Curriculum Improvement Study), SAPA (Science-A Process Approach), and ESS (Elementary Science Study) (Padilla et al. 1991).

In the graph above, the datapoint closest in temperature to the star, Betelgeuse, is “D.”

Many students believe that graphs are concrete representations of physical systems. To these students graphs are maps, not abstractions. A graph is thought to be a picture of a situation (Bell et al. 1987). Interpreting the “closeness” of datapoints on a scatter-plot is considered only as an omnidirectional physical closeness and not related to position with respect to one axis only. Bell, Brekke, and Swan found that only 26 percent of British high school students were able to correctly answer questions about the data in a scatter-plot similar to the one above.

The graph above is a simplification of the Hertzsprung-Russell diagram, which relates stellar type (or temperature) and luminosity (or absolute brightness). During its lifetime, the star’s position on this graph will change.


Many students believe that this movement represents a real spatial movement of a star (Schatz et al. 1978). Graphs similar to this one are fixtures in introductory astronomy texts, although the data represented in this particular graph are of no import for solving the posed problem. Students who interpret the above graph in this way would think of “B” as the physically closest point to Betelgeuse. Those students who do not know how to interpret the axis might choose “A” because it is closely aligned horizontally with Betelgeuse. One would expect “E” and “C” to be chosen only by students who are guessing.
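The three readings of the graph described here can be contrasted in a small Python sketch. The coordinates are hypothetical, invented only to mimic the layout described in the text (they are not read off the original figure), and the variable names are illustrative.

```python
import math

# Hypothetical (temperature, brightness) coordinates in arbitrary units.
# Invented to mimic the layout described in the text, not taken from the graph.
betelgeuse = (3.0, 8.0)
stars = {
    "A": (8.0, 8.1),
    "B": (4.0, 7.0),
    "C": (6.0, 3.0),
    "D": (3.1, 2.0),
    "E": (9.0, 1.0),
}

def closest(distance):
    """Return the label of the star that minimizes the given distance measure."""
    return min(stars, key=lambda label: distance(stars[label]))

by_temperature = closest(lambda p: abs(p[0] - betelgeuse[0]))  # one-axis reading
by_picture = closest(lambda p: math.dist(p, betelgeuse))       # graph-as-map reading
by_brightness = closest(lambda p: abs(p[1] - betelgeuse[1]))   # wrong-axis reading

print(by_temperature, by_picture, by_brightness)
```

Only the first measure compares position along the temperature axis alone; the other two reproduce the "physically closest" and "horizontally aligned" readings that lead to the distractors.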

Item 3      A      B      C      D      E
P-value    .33    .31    .03    .31    .01
D-value   -.13   -.18   -.14    .38   -.02

The correct response to this question is not the most popular answer. Answers “A” and “B” attract many students.

Lower-performing students show a clear preference for distractors “A” and “B.” Those who choose “B” are choosing the closest datapoint. Those who choose “A” are answering in terms of the wrong axis.

[Figure: item response curves for Item 3, plotted by quintile and p-value, with one curve for each answer A through E.]


Item 26, Graph Extrapolation

This graph shows a plot of the distances of several galaxies from the Earth and the speeds at which they are moving away from us. If a galaxy were discovered to be 2,200 million light years from Earth, a good estimate of its speed would be:

[Figure: graph of Speed (miles/sec), from 0 to 40,000, against Distance (millions of light years), from 0 to 2,500.]

A. 0 miles/sec B. 200 miles/sec C. 16,000 miles/sec D. 25,000 miles/sec E. 32,000 miles/sec

Scientists infer relationships from collections of observations. That the universe is expanding is evidenced by the fact that distant galaxies are speeding away from us more quickly than are our closer neighbors. This relationship is typically presented in graphical form and is a good example of how graphs are used in teaching astronomy. In this item, the correct answer can be found by extrapolating a straight line through and beyond the datapoints. At a distance of 2,200 million light years from Earth, a galaxy would probably have a speed of about 32,000 miles/sec away from us.
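The extrapolation step can be illustrated with a straight-line fit. The datapoints in the sketch below are hypothetical, invented to resemble the trend described above rather than read from the original graph, and ordinary least squares is used here only as one reasonable way of drawing "a straight line through the datapoints."

```python
# Hypothetical datapoints: distance in millions of light years, speed in miles/sec.
# Invented to resemble the described trend; not taken from the original graph.
distances = [500, 1000, 1500, 1700]
speeds = [7300, 14500, 21800, 25000]

n = len(distances)
mean_x = sum(distances) / n
mean_y = sum(speeds) / n

# Ordinary least-squares slope and intercept for a straight line.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, speeds)) / sum(
    (x - mean_x) ** 2 for x in distances
)
intercept = mean_y - slope * mean_x

# Extrapolate beyond the last datapoint and pick the nearest answer choice.
predicted = slope * 2200 + intercept
choices = {"A": 0, "B": 200, "C": 16_000, "D": 25_000, "E": 32_000}
best = min(choices, key=lambda label: abs(choices[label] - predicted))

print(f"predicted speed: {predicted:.0f} miles/sec, closest choice: {best}")
```

With these illustrative points the fitted line reaches roughly 32,000 miles/sec at 2,200 million light years, whereas holding the speed at the last datapoint would give answer “D.”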

One of the nine objectives for TOGS (Test of Graphing in Science) is: “Given a graph and a situation requiring interpolation and/or extrapolation, the students will identify trends displayed in a set of data” (McKenzie and Padilla 1986). This item was included in the test to determine whether students had the graph-reading ability to comprehend cosmological arguments based on graphs of recessional velocity.

Item 26     A      B      C      D      E
P-value    .04    .12    .15    .22    .46
D-value   -.11   -.21   -.22   -.16    .49

This is a relatively easy question for many students. No misconception stands out from the table of total statistics.


For the lowest-performing group there is a preference for the extrapolation to be maintained at the level of the final datapoint, answer “D.”

[Figure: item response curves for Item 26, plotted by quintile and p-value, with one curve for each answer A through E.]


Item 33, Scientific Notation Conversion

Convert to scientific notation: 25,600,000

A. 2.56 x 10^5 B. 2.56 x 10^6 C. 2.56 x 10^7 D. 2.56 x 10^8 E. None of the above.

Scientific notation is a shorthand for representing very large or very small numbers. It is particularly useful in astronomy because astronomical objects are so large that manipulating quantities represented in conventional form would be very unwieldy. Most textbooks present astronomical quantities in scientific notation and assume that students know how to interpret them. The correct answer for this question is found by moving the decimal point over seven places and representing this as the seventh power of ten.

Item 33     A      B      C      D      E
P-value    .32    .12    .36    .06    .12
D-value   -.11   -.20    .40   -.13   -.10

This item reveals a misconception that appears in students at all performance levels. Those who choose answer “A” are simply adding up the number of zeros in 25,600,000 and using this quantity of zeros, 5, as the exponent for the power of ten. These students have probably learned a rule by rote, to count the zeros only and use this number as an exponent. They most likely do not know what this exponent represents. The others who get the right answer may also not know.
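The difference between the correct procedure and the rote "count the zeros" rule can be made concrete in a few lines of Python. The function name `to_scientific` is illustrative only, and the sketch handles positive integers, which is all this item requires.

```python
def to_scientific(n: int) -> tuple[float, int]:
    """Return (mantissa, exponent) for a positive integer n, with 1 <= mantissa < 10."""
    exponent = len(str(n)) - 1  # number of places the decimal point moves
    return n / 10**exponent, exponent

number = 25_600_000
mantissa, exponent = to_scientific(number)

# The rote rule described above: use the count of zeros as the exponent.
zeros_counted = str(number).count("0")

print(f"{number:,} = {mantissa} x 10^{exponent}")
print(f"counting zeros would give an exponent of {zeros_counted}")
```

The two procedures agree only for numbers such as 10,000,000, where every digit after the first is a zero, which may help explain why the rote rule survives.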

[Figure: item response curves for Item 33, plotted by quintile and p-value, with one curve for each answer A through E.]


Item 10, Addition of Exponents

If there are 100,000,000,000 stars in a galaxy and 100,000 galaxies in the Local Supercluster, how many stars are there in the Local Supercluster?

A. 10^5 B. 10^11 C. 10^16 D. 10^55 E. None of the above.

Multiplying large numbers is easier using scientific notation. These two numbers can be multiplied by first converting both to scientific notation and then adding the exponents (10^11 stars/galaxy x 10^5 galaxies = 10^16 stars). Adding up exponents is much easier than multiplying quantities out longhand.
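As a quick check of the rule, the sketch below multiplies the two quantities out in full and compares the result with the sum of the exponents. The helper `power_of_ten` is a hypothetical name and assumes its argument is an exact power of ten.

```python
stars_per_galaxy = 100_000_000_000  # 10^11
galaxies = 100_000                  # 10^5

def power_of_ten(n: int) -> int:
    """Exponent k such that n == 10**k (n is assumed to be an exact power of ten)."""
    return len(str(n)) - 1

total = stars_per_galaxy * galaxies
exponent_sum = power_of_ten(stars_per_galaxy) + power_of_ten(galaxies)

# The longhand product and the added exponents describe the same number.
print(total == 10**exponent_sum, exponent_sum)
```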

Item 10     A      B      C      D      E
P-value    .11    .11    .40    .13    .25
D-value   -.14   -.14    .35   -.09   -.12

This skill does not appear to be particularly difficult for most students, although those at the low end of the performance spectrum appear to be guessing at the answer. The students who choose answer “E” are unable to calculate the answer correctly using the addition of exponents. They are made up of a larger fraction of lower-performing students than of higher-performing ones.

[Figure: item response curves for Item 10, plotted by quintile and p-value, with one curve for each answer A through E.]


Item 16, Similar Triangles

Looking out of a living room window, you see the following:

[Figure: the view through the window, with the window height marked as 3 ft.]

The window measures 3 feet from top to bottom. The house that you see is 24 feet tall. If you are 5 feet from the window, calculate how far you are from the house.

A. 40 ft B. 72 ft. C. 120 ft. D. 360 ft. E. None of the above

The application of reasoning about proportions allows students to solve problems involving angular size. In this case, the problem can be solved algebraically or geometrically.

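\[
\frac{\text{Height}_{\text{window}}}{\text{Distance}_{\text{window}}} = \frac{\text{Height}_{\text{house}}}{\text{Distance}_{\text{house}}}
\]

\[
\frac{3\ \text{ft}}{5\ \text{ft}} = \frac{24\ \text{ft}}{\text{Distance}_{\text{house}}}
\]

\[
\text{Distance}_{\text{house}} = \frac{24\ \text{ft} \times 5\ \text{ft}}{3\ \text{ft}} = 40\ \text{ft}
\]

[Diagram: similar triangles, with the window (3 ft tall) 5 ft from the observer and the house (24 ft tall) at the unknown distance.]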

Researchers have found that students often are not able to reason formally when it comes to thinking about proportions. Yet, explanations in textbooks require the handling of many variables (Karplus et al. 1978). Karplus later went on to suggest that “cross-multiplication” may actually inhibit the development of the understanding of proportion (Karplus et al. 1983). In a test of reasoning about proportions, only 22 out of 474 college-bound high school students could use proportions to solve a two-step problem involving shadows and similar triangles (Farrell and Farmer 1985).

Students who multiply the various dimensions together will end up with the other numerical choices. Those who cannot find the result among the choices will choose “E.”
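One way to see how the numerical distractors arise is to form the simple products of the three given dimensions. The sketch below does this; which products students actually form is an assumption made here for illustration, since the text says only that multiplying the dimensions together yields the other choices.

```python
window_height, window_distance, house_height = 3, 5, 24  # feet, from the item

# Correct answer: the proportion height/distance is the same for window and house.
correct = house_height * window_distance / window_height

# Products of the given dimensions, matched here to the numerical distractors.
products = {
    "B": window_height * house_height,                    # 3 x 24
    "C": window_distance * house_height,                  # 5 x 24
    "D": window_height * window_distance * house_height,  # 3 x 5 x 24
}

print(f"A (proportion): {correct:.0f} ft")
for label, value in products.items():
    print(f"{label} (product): {value} ft")
```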

Item 16     A      B      C      D      E
P-value    .28    .24    .24    .10    .14
D-value    .19   -.05   -.08   -.02   -.06

This is a difficult question for most students. Answers “B” and “C” are popular, compared with “D” and “E.” The relatively small fraction of students choosing “E” implies that students are at least attempting to calculate the answer to this problem. The difficulty that they have is that they made the calculation incorrectly.


Among lower-performing students, solving this problem appears to be an exercise in guesswork. They remember that one must use multiplication, so many simply multiply two or three of the numbers together to get an answer. All student groups appear to have great difficulty with solving this problem involving an application of simple ratios.

[Figure: item response curves for Item 16, plotted by quintile and p-value, with one curve for each answer A through E.]


Item 25, Degrees in a Circle

While at sea on a small boat, you see a ship on the horizon. It appears 5° in length. How many ships of the same size and at the same distance could fit around you in a circle?

A. 5 B. 36 C. 72 D. 180 E. None of the above.

Angular measure is very important in astronomy. Most objects in the sky are too far away to measure their size in any direct fashion. Astronomers must instead try to find the distance to the object and then use its angular size to compute its actual size.

This question is an attempt to determine if students think of a circle as having 360° and whether they are able to use this information. Many teachers start with this as a given in teaching astronomy, as stated in an article in The Science Teacher: “All of us know that circles are divided into 360°” (Russo 1988).

My belief is that for many students, any question dealing with the number of degrees would produce a rote response of 180°, since there are 180° total in the interior angles of a triangle. They would have some preference for “B” (180°/5° = 36).
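The two candidate calculations differ only in the number of degrees assumed for the full circle, as this minimal Python sketch shows (the variable names are illustrative).

```python
SHIP_ANGLE_DEG = 5  # apparent length of the ship, from the item

ships_full_circle = 360 // SHIP_ANGLE_DEG  # using 360 degrees in a circle
ships_rote_180 = 180 // SHIP_ANGLE_DEG     # using the rote figure of 180 degrees

print(ships_full_circle, ships_rote_180)
```

The first result corresponds to answer “C” and the second to distractor “B.”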

Item 25     A      B      C      D      E
P-value    .14    .17    .37    .08    .22
D-value   -.20   -.14    .45   -.11   -.13


This question appears to be easy for the highest-performing students but very difficult for those at the other end of the scale. Some students appear to think that there are only 180° in a circle. Many cannot use division to solve this simple problem in angles.

[Figure: item response curves for Item 25, plotted by quintile and p-value, with one curve for each answer A through E.]


Item 19, Probability

You have flipped a coin six times and it has come up heads each time. What is your best estimate of what will happen on the seventh flip?

A. definitely tails B. probably tails C. equal chances of heads or tails D. probably heads E. definitely heads

There are a few concepts in introductory astronomy courses that deal with probability. Most texts discuss astrology and the search for extraterrestrial intelligence and some do so at great length. The idea that events can exert “odds pressure” is common in people’s conversations. Sports fans talk of basketball players with “hot hands,” assuming that it is more likely for a player to sink a basket after a long run of success than after a long run of failures. This test was constructed to see if students had any preference for the outcome of a coin flip after many successes of flipping heads.
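For a fair coin, the absence of any "odds pressure" can be verified by direct enumeration. The sketch below lists every equally likely sequence of seven flips, keeps those that begin with six heads, and counts how often the seventh flip is heads; the assumption that the coin is fair and the flips independent is built into treating all sequences as equally likely.

```python
from fractions import Fraction
from itertools import product

# Every equally likely sequence of seven fair coin flips.
sequences = list(product("HT", repeat=7))

# Keep only the sequences that begin with six heads, as in the item.
six_heads_first = [s for s in sequences if s[:6] == ("H",) * 6]
heads_on_seventh = [s for s in six_heads_first if s[6] == "H"]

probability = Fraction(len(heads_on_seventh), len(six_heads_first))
print(probability)
```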

Item 19     A      B      C      D      E
P-value    .04    .10    .66    .16    .03
D-value   -.12   -.08    .27   -.17   -.06

Students do quite well on this problem and seem comfortable with an inability to predict the exact outcome of a random event.

This appears to be a relatively easy question for groups of all performance levels. Only a few of the lowest-performing group had any preference for an answer above the .20 level. It appears, at least as measured by this problem, that students do not believe in “odds pressure.”

[Figure: item response curves for Item 19, plotted by quintile and p-value, with one curve for each answer A through E.]


E. Solar System

The solar system consists of the Earth, Moon, and Sun and all the planets that orbit the Sun. The various planetary moons, asteroids, comets, and other objects gravitationally bound to the Sun can also be included. The names of these objects are often covered in elementary school, but the scale is often distorted when represented in diagrams. The distances in the solar system are vast when compared with the sizes of the objects it contains.

Item 27, Visual Parallax

Objects that can be seen with the unaided eye and appear to move against the background of stars during one month are always:

A. farther away from us than the stars. B. within the solar system. C. within the Earth’s atmosphere. D. at the edge of the visible universe. E. a part of a binary star system.

Objects that move against the background of stars are closer to us than the stars themselves. Airplanes and meteors are certainly within the atmosphere and move noticeably in seconds. Satellites, just above our atmosphere, move measurably in minutes. Objects such as planets or comets may move noticeably against the background of stars in a few days or months. So, objects that move in a month are definitely within our solar system: “B.”

Item 27     A      B      C      D      E
P-value    .13    .39    .24    .15    .07
D-value   -.16    .35   -.11   -.15   -.04


This question is typical of most factual information. Students prefer the correct explanation. There are a few students, however, who characterize any object that moves in the sky as within the Earth’s atmosphere. Perhaps they have never seen an artificial satellite or noticed the slow trek of the planets against the fixed stars.

[Figure: item response curves for Item 27, plotted by quintile and p-value, with one curve for each answer A through E.]


Item 28, Relative Distances in the Solar System

Which answer shows a pattern from closest object to the Earth to farthest from the Earth?

A. Sun → Saturn → Moon B. Saturn → Moon → Sun C. Moon → Sun → Saturn D. Moon → Saturn → Sun E. Sun → Moon → Saturn

Many children’s books show drawings of the solar system as vastly out of scale, with planets and the Sun all being about the same size and distance from each other. Measuring from the Earth, we find the Sun is about 400 times further away from the Earth than is the Moon. Saturn varies between 3,500 and 4,500 times further away from us than the Moon, as Saturn orbits the Sun. So the correct answer is C.
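The ratios above can be turned into approximate distances using the average Earth-Moon distance of 240,000 miles quoted earlier in this chapter. The sketch below is only an illustration of the ordering; the resulting mileages are rounded by construction, and Saturn's nearer distance is used for the comparison.

```python
MOON_DISTANCE_MI = 240_000  # average Earth-Moon distance used in this chapter

sun_distance = 400 * MOON_DISTANCE_MI
saturn_near = 3_500 * MOON_DISTANCE_MI
saturn_far = 4_500 * MOON_DISTANCE_MI

# Order the three objects by distance from the Earth (Saturn at its nearest).
distances = {"Moon": MOON_DISTANCE_MI, "Sun": sun_distance, "Saturn": saturn_near}
order = sorted(distances, key=distances.get)

print(" -> ".join(order))
print(f"Sun: {sun_distance:,} miles; Saturn: {saturn_near:,} to {saturn_far:,} miles")
```

Even at its nearest, Saturn is several times farther from the Earth than the Sun is, so the ordering does not depend on where Saturn is in its orbit.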

Item 28     A      B      C      D      E
P-value    .05    .10    .42    .32    .10
D-value   -.19   -.24    .29    .09   -.20

Many students have little difficulty with this question. Most know that the Moon is closer to the Earth than either the Sun or Saturn. A substantial number of students, however, believe that Saturn is closer to the Earth than is the Sun. The fraction of students who choose this distractor is larger for higher-performing students.

[Figure: item response curves for Item 28, plotted by quintile and p-value, with one curve for each answer A through E.]


F. Stars

Item 18, Relative Distances of Stars and Planets

Which answer shows a pattern from closest object to the Earth to farthest from the Earth?

A. Space Shuttle in orbit → Stars → Pluto B. Pluto → Space Shuttle in orbit → Stars C. Stars → Space Shuttle in orbit → Pluto D. Stars → Pluto → Space Shuttle in orbit E. Space Shuttle in orbit → Pluto → Stars

Stars are very far from us on the scale of the solar system. The closest star to our solar system is about 200,000 times further away than the Sun is from the Earth. Or, to put it another way, if the Sun were a grape, the Earth would be a speck of dust three feet away and the closest star would be another grape 100 miles from us. At this scale, even Pluto would be close to the Earth, at a distance of 100 feet.
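The grape model can be checked against the ratio quoted in the same paragraph. The sketch below uses only the figures given in the text (3 feet, 200,000 times, 100 feet) plus the standard conversion of 5,280 feet per mile.

```python
FEET_PER_MILE = 5280

earth_sun_model_ft = 3  # Earth-Sun distance in the grape model, from the text
star_ratio = 200_000    # closest star relative to the Earth-Sun distance, from the text
pluto_model_ft = 100    # Pluto's distance in the model, from the text

star_model_miles = earth_sun_model_ft * star_ratio / FEET_PER_MILE
pluto_ratio = pluto_model_ft / earth_sun_model_ft

print(f"closest star in the model: about {star_model_miles:.0f} miles away")
print(f"Pluto in the model: about {pluto_ratio:.0f} Earth-Sun distances away")
```

The scaled distance to the closest star comes out slightly above 100 miles, so the rounded figure in the text is consistent with the stated ratio.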

Since stars and planets are almost indistinguishable in the night sky, students may not make a distinction between their distances from us. Among 200 eleven- to thirteen-year-old Italian students interviewed, there was no distinction between stars and planets (Loria et al. 1986).

Item 18     A     B     C     D     E
P-value   .26   .07   .17   .06   .44
D-value  -.17  -.20  -.27  -.17   .55

The results of this question support the idea that many students believe that there are stars within the solar system, between the Earth and Pluto. Some students actually believe that the space shuttle goes out beyond the stars. I have talked to students who are mystified about the reason why our space ships have not visited other solar systems, since they think our spacecraft can reach them. Many of these students can see no impediments to human colonization of the galaxy and view the possibility that visitors from other solar systems have come to the Earth as totally reasonable.

[Figure: Item 18, answer choices A through E, p-value plotted against quintile.]

Item 8, Scale Model of the Sun and a Close Star

Two grapes would make a good scale model of the Sun and a close star, if separated by:

A. 1 foot. B. 1 yard. C. 100 yards. D. 1 mile. E. 100 miles.

As described in the preceding item, a good model of the Sun and a close star is two grapes separated by 100 miles.

Item 8      A     B     C     D     E
P-value   .19   .18   .14   .14   .34
D-value  -.20  -.14  -.04   .04   .29

This is a moderately difficult question for students. Surprisingly, the major misconception here appears to be that some students think that stars are only a few dozen diameters away from each other.

[Figure: Item 8, answer choices A through E, p-value plotted against quintile.]

Item 37, Sun’s Movement against the Stars

If you could see stars during the day, this is what the sky would look like at noon on a given day. The Sun is in the constellation of Gemini.

[Figure: star chart of the noon sky with the Sun marked in Gemini and the constellations Taurus, Orion, Canis Major, Canis Minor, Gemini, Cancer, and Leo labeled.]

In what constellation would you expect the Sun to be located at sunset on this day?

A. Leo B. Canis Major C. Gemini D. Cancer E. Orion

From a geocentric point of view, the celestial sphere makes one complete turn about the Earth in a day, with the Sun pretty much stuck in position. Over the course of a year, the Sun slowly makes its way through the zodiac at a rate of about 1°/day until it arrives back in its starting position. If the background of stars could be seen along with the Sun, its movement in a day would be barely perceptible against its background. The Sun would appear to be in the same constellation for a month at a time. The correct answer to this item is “C”: the Sun would set in the same constellation that it was in at noontime.
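The size of the Sun's drift between noon and sunset can be estimated from the yearly rate. The sketch below assumes about six hours from noon to sunset and treats the twelve zodiac constellations as equal 30° spans; both are simplifying assumptions, and the function name is hypothetical.

```python
DEGREES_PER_YEAR = 360.0
DAYS_PER_YEAR = 365.25


def sun_drift_degrees(hours):
    """Apparent motion of the Sun against the background stars over `hours`."""
    degrees_per_day = DEGREES_PER_YEAR / DAYS_PER_YEAR   # about 1 degree per day
    return degrees_per_day * hours / 24.0


noon_to_sunset = sun_drift_degrees(6.0)   # assumes roughly 6 hours to sunset
zodiac_span = 360.0 / 12                  # assumed equal-width constellations

print(f"Drift: {noon_to_sunset:.2f} degrees of a {zodiac_span:.0f}-degree constellation")
```

A drift of about a quarter of a degree is tiny compared with the width of a constellation, which is why the Sun sets in the constellation it occupied at noon.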

Many students view the night sky as permanent and unchanging. This could be because they view the universe as static (Lightman et al. 1987). I have found that even Harvard students are surprised to find the “stars have moved” when asked to measure the position of stars over several hours. Students who think of the starry sky as static could interpret this problem in a few different ways. If they view the above chart as a picture of the sky from the northern hemisphere, west would be to the right and the Sun would set down to the right, in Canis Major (answer “B”). Viewed as a map, with north always up, west is to the left, so the Sun would set in Leo (answer “A”).

Students who know that the Sun is always in the zodiac might recognize the span of Taurus, Gemini, Cancer, and Leo as part of the zodiac. They could reason that the Sun would set in the zodiacal constellation closest to the horizon, Leo (answer “A”).

Item 37     A     B     C     D     E
P-value   .32   .28   .19   .09   .10
D-value   .17  -.05   .05  -.19  -.02

The use of a celestial sphere can allow students to test their theories by modeling the appearance of the sky to determine whether the model matches actual observations. This has been done in Project STAR activities and by others (Carter and Stuart 1989). The discrimination power of this question is close to zero; students who perform well on the test do no better than those who do poorly. Two major misconceptions are apparent. The better-performing students appear to be attracted to the Sun setting in the constellation Leo. Perhaps they recognize Leo as a sign of the zodiac. Other students appear attracted to the Sun setting in Canis Major.

[Figure: Item 37, answer choices A through E, p-value plotted against quintile.]

Item 34, Stellar Parallax

[Figure: star pattern of the Big Dipper.]

The Big Dipper would have a noticeably different shape to the unaided eye:

A. if viewed from another star.
B. if viewed from Pluto.
C. if you looked at it a year from now.
D. if you viewed it from China.
E. never, it would always look the same.

The pattern of stars in the sky is a unique arrangement that remains unchanged to the naked eye wherever one looks in the solar system. The stars are so far away that changes of viewing location of a billion miles are insignificant. Only from another star would the constellation look different. Indeed, our own Sun would then appear as a part of some constellation if we were in another star system.
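A rough calculation shows why a billion-mile change in viewpoint goes unnoticed. The sketch below assumes a star about 80 light-years away (taken here as a typical distance for the Big Dipper stars) and a naked-eye resolution of about one arcminute; both values, like the function name, are assumptions made for illustration.

```python
import math

MILES_PER_LIGHT_YEAR = 5.879e12
ARCSEC_PER_RADIAN = 180.0 / math.pi * 3600.0


def apparent_shift_arcsec(baseline_miles, star_distance_ly):
    """Angle by which a star appears to shift when the observer moves sideways."""
    distance_miles = star_distance_ly * MILES_PER_LIGHT_YEAR
    return math.atan2(baseline_miles, distance_miles) * ARCSEC_PER_RADIAN


shift = apparent_shift_arcsec(1e9, 80.0)   # billion-mile move, assumed 80 ly star
EYE_RESOLUTION_ARCSEC = 60.0               # assumed naked-eye limit, about 1 arcminute

print(f"Shift: {shift:.2f} arcsec versus an eye limit of {EYE_RESOLUTION_ARCSEC:.0f} arcsec")
```

Under these assumptions the shift is well under one arcsecond, more than a hundred times smaller than what the unaided eye can resolve.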

Item 34     A     B     C     D     E
P-value   .28   .14   .07   .08   .41
D-value   .39  -.10  -.17  -.15  -.09


The invariance of the starry sky has a powerful pull on the minds of students. The view of the heavens as unchanging is much more popular than the scientific explanation that the patterns of stars change from different viewing locations. For all but the highest-performing students the stars are fixed.

[Figure: Item 34, answer choices A through E, p-value plotted against quintile.]

Item 35, Astrology

Most astronomers consider astrology to be:

A. a science.
B. a good way to determine personality traits.
C. helpful in predicting world events.
D. more than one of the above.
E. none of the above.

The topic of astrology often comes up in introductory astronomy classes; indeed, many teachers have told me that new students are often disappointed to find they will not be casting horoscopes. In Western countries, roughly one person in a thousand is practicing or studying serious astrology (Dean 1987). Although scientists view astrology as a pseudo-science and choose “E” as the correct answer, the inclusion of this question sought to determine how students viewed the subject.

With more than 100 periodicals and about 1,000 books in print (about the same as for astronomy), astrology can be a highly technical and mathematical undertaking (Dean 1987). Many students can confuse this analytical sophistication with science and choose “A.” This may only reflect ignorance of what science is. More confounding is that students may actually believe that astrology can predict either personality traits, “B,” or world events, “C.” Selection of “D,” more than one of the above, must include belief in either “B” or “C.” A choice of “B,” “C,” or “D” can be viewed as a true belief in astrology as predictive.

Item 35     A     B     C     D     E
P-value   .37   .07   .07   .25   .23
D-value  -.08  -.17  -.16  -.11   .45


Students think there is something to astrology. A plurality think it is a science. Many think it helps determine either personalities or world events.

[Figure: Item 35, answer choices A through E, p-value plotted against quintile.]

G. Galaxies

Galaxies are vast collections of stars and gas that are gravitationally bound to each other. They are so large that if one were to shrink our own Sun down to the size of a basketball, the center of our Milky Way galaxy on the same scale would be 100,000,000 miles away, or at the distance of the Earth from the Sun.

Item 29, Relative Distances in the Universe

Which answer shows a pattern from closest object to the Earth to farthest from the Earth?

A. center of Milky Way → Andromeda galaxy → North Star

B. center of Milky Way → North Star → Andromeda galaxy

C. Andromeda galaxy → North Star → center of Milky Way

D. North Star → Andromeda galaxy → center of Milky Way

E. North Star → center of Milky Way → Andromeda galaxy

The Sun exists as a not very special star among 100,000,000,000 others in our galaxy, the Milky Way. The closest large galaxy to us is Andromeda, which can be seen as a faint patch of light with the naked eye in the night sky. The North Star in comparison is relatively close to us. The correct answer is “E.” The center of the Milky Way is 100 times further away than the North Star. Andromeda is 3,000 times further away.

Item 29     A     B     C     D     E
P-value   .12   .23   .14   .16   .33
D-value   .04   .02  -.18  -.13   .21


Students do not seem to prefer one distractor over another in this item, except for a slight preference for the Milky Way being closer to us than the North Star. Perhaps some students do not know that the Milky Way is the name of our own galaxy.

[Figure: Item 29, answer choices A through E, p-value plotted against quintile.]

Item 38, Observable Galaxies

The best place to look for other galaxies in the night sky is:

A. near the Moon.
B. near dense concentrations of stars.
C. in the constellation Sagittarius.
D. away from the Milky Way.
E. close to planets.

The topic of galactic structure is often included in introductory astronomy and earth science courses. Activities on the classification of galaxies based on their structure are common, leading to the explanation that our own Milky Way is a spiral galaxy. The distribution of observable stars and globular clusters in the night sky is often used as evidence that we are within a large, relatively flat collection of stars. Since we are within a galaxy, we cannot see its structure as easily as we can view others from afar. Some teachers have proposed activities to help students construct models to describe our stellar system. Through thinking out what the night sky would look like if we were within different types of galaxies and at different positions within galaxies, a student can develop a very good idea of the shape of our galaxy and our position within it (Doménech and Casasús 1991). Since we are on an arm of our own spiral galaxy, we do not see any galaxies in the plane of the Milky Way. They are blocked by stars and dust in our own galaxy. One must look outside the galactic plane to see other galaxies. The correct answer is “D.”

Galaxies are dense concentrations of stars, so students may be swayed to answer “B,” but galaxies do not appear so in the sky. Galaxies visible to the naked eye, Andromeda or the Magellanic Clouds, appear as faint patches of light. The dense concentrations of stars that we can see, such as the Pleiades, are within our own galaxy. Planets within our solar system and the Moon have nothing to do with the far-away galaxies and would actually obstruct our view of such faint objects. Sagittarius, a constellation that coincides with the center of the Milky Way, is particularly devoid of other galaxies and was simply included as a jargon-laced distractor.

Item 38     A     B     C     D     E
P-value   .08   .45   .12   .24   .10
D-value  -.19   .13  -.16   .22  -.13


The most popular answer to this item is that galaxies can be found near dense concentrations of stars in the night sky. This is a misconception. When we look out at the night sky, dense concentrations of stars are inevitably accompanied by invisible dust clouds that block our view of galaxies.

[Figure: Item 38, answer choices A through E, p-value plotted against quintile.]

Item 36, Expansion of the Universe

When the observable universe was half its present age, it was:

A. larger than it is now.
B. smaller than it is now.
C. roughly the same size as it is now.
D. exactly the same size as it is now.
E. collapsed into a black hole.

The universe is expanding. The galaxies are shooting away from each other at enormous speeds. At half of the universe’s present age it had to be smaller than it is now (“B”). In a telephone study of 1,111 American adults concerning their cosmological beliefs, only 24 percent believed that the universe is expanding (Lightman and Miller 1989). The majority preferred to think that the universe is static. Greater preference for an expanding universe was found among males, college graduates, those younger than fifty years of age, and those who were not church members. This study went on to probe for the reasons supporting each individual’s belief in the size of the universe. The most prevalent reason was “observation.” The stars in the night sky appear motionless and this observation appears to be a fact that strongly motivates the belief in a static universe. An earlier study found that among eighty-three high school students, many expressed “fears of catastrophe to Earth” with the idea of a changing universe (Lightman et al. 1987). There appears to be a strong emotional component related to beliefs concerning the nature of the universe. In a series of interviews of Italian eleven-year-olds, a variety of interesting explanations were given for a static universe. One student put it succinctly: “The stars do not move. If only one will, all the universe will be untidy” (Viglietta 1986).

Item 39     A     B     C     D     E
P-value   .12   .39   .23   .09   .15
D-value  -.15   .41  -.17  -.19  -.04


All but the highest-performing students appear to prefer that the universe is constant in size over thinking that it is expanding.

[Figure: answer choices A through E for this item, p-value plotted against quintile.]

H. Light and Color

Item 43, Role of Illumination

You are in a completely dark room. There are no lights and no windows. Which group of objects do you believe you might be able to see?

A. bicycle reflectors, a cat’s eyes
B. silver coins, aluminum foil
C. white paper, white socks
D. more than one of these groups
E. none of these

For objects to be seen, they must either produce light or reflect light. None of the objects listed produces light and in a completely dark room, none could reflect light. The correct answer is “E,” none of the above.

So how could students choose any answer but “E”? The answer is that they do not understand the role of light in illumination. A study of 102 fifth-grade students found that the most prevalent belief about the role of illumination is that we see things because light shines on objects and “brightens” them, not because light is reflected from them (Eaton 1984). Only three students from this group mentioned reflection or bouncing light in their explanation. Bright objects that are “dazzling” or unusual in their appearance may be sensed as active in some way and not passive scatterers of light (Jung 1987).

In a study of twenty high school students, pupils were asked to explain “What is it that makes you see this object?” Most students never mentioned any linking mechanism between the object and the eye. Others explained that the act of seeing takes place by a “look” or “vision” going from the eye to the object (Anderson and Karrqvist 1983). Students who hold the latter view would have little problem thinking they could see nonluminous objects in the dark, since all they must do is “look” at an object to see it. Students with this view would choose answer “D.”

In responding to item 43, students who view some objects as “naturally bright” would choose those objects as being seen without illumination. White paper and white socks seem to fit this category. Objects that are seen in the dark, such as bicycle reflectors and cat’s eyes, may be thought of as emitting their own light, even though they are just directional reflectors of light.

Item 43     A     B     C     D     E
P-value   .17   .07   .20   .16   .38
D-value  -.19  -.18  -.04  -.17   .45


This relatively easy question still catches a few misconceptions. Among lower-performing students there is a slightly greater than random choice of “A,” bicycle reflectors and cats’ eyes. Since these objects are usually seen at night and appear to glow in the dark this answer may seem reasonable. These objects, however, glow only when reflecting light. Among higher-performing students, there is a slight preference over random for white paper and socks. Since these objects are very reflective, they would probably be the most noticeable in a darkened room, but they would be unseen in a completely dark room.

[Figure: Item 43, answer choices A through E, p-value plotted against quintile.]

Item 44, Light Propagation at Night

It is nighttime. Headlights from a parked automobile light up the road brightly from point A to point B. A person standing at point D can see the headlights glowing.

[Figure: a parked automobile with its headlights on and points A, B, C, and D marked along the road.]

Which statement best describes the farthest point that light from the headlights can reach?

A. Light does not leave the headlights.
B. The light reaches only as far as point A.
C. The light reaches only as far as point B.
D. The light reaches only as far as point C.
E. The light reaches at least as far as point D.

This question, in a slightly different form, was first proposed in a study of Swedish students 12 to 15 years of age (Anderson and Karrqvist 1983). Many students said that the light did not reach the observer, even though he could see the headlights. The explanation appears to be based on the conception that the light from the headlamps reaches only as far as the brightly lit road and that a separate activity of the observer “looking” at the headlamps allows him to see it. For many students there is no connection between the headlights being illuminated and them being perceptible. As one student remarked, “The light doesn’t reach further (than the spot on the road). The light is too weak, but the pedestrian can see it all the same” (p. 30). Another student had a similar prediction, but a different mechanism, “Light gets weaker the further away it goes. In the end it fades out” (p. 31). All in all, only 216 out of 558 students believed that light actually reached the observer.

Fifty-nine students in grades four through ten were interviewed about their ideas concerning the propagation of light (Stead and Osborne 1980). In individual interviews and later in a multiple-choice test, they were asked to explain how far light “could go” from a variety of sources of light. Students expressed a variety of misconceptions, including, “it stays there [in the candle],” and “it comes out a certain distance depending on the brightness ... but it stops after a while....” The majority of students thought that light either stayed in the source or could travel only a certain distance before it “fades away” or “just gets duller.” These students would prefer answers “A,” “B,” “C,” or “D.”

Many students confuse light propagation and vision. They do not understand how they are linked. While it may seem to be contradictory that a student can agree that the observer in the problem can see the headlights glowing and yet choose that the light never reaches him, many do not know that light must enter the eye to be seen. As one student remarked, “he can see it [the source], but the light doesn’t reach him” or “it wouldn’t reach him [the light] but he could still see the TV” (p. 86).

The word “light,” as in light coming from the headlamps, is somewhat ambiguous in this context. Students view light as not coming to the observer because the source does not perceptibly illuminate the observer (Jung 1987). Light exists only where it can be seen, in the headlights and on the road.

Item 44     A     B     C     D     E
P-value   .06   .10   .29   .12   .40
D-value  -.08  -.20  -.17  -.14   .45

The correct answer, “E,” is the most popular answer chosen. This question, at a P-value of 0.40, is of average difficulty and, with a D-value of 0.45, is of average discriminating power.

Many students choose the distractor “C,” that the light only reaches as far as the end of the spot on the road; it does not reach the eye of the observer. Among lower-performing students, it is the preferred answer.

[Figure: Item 44, answer choices A through E, p-value plotted against quintile.]

Item 45, Propagation in the Daytime

Imagine that the parked car described in the item above has its lights on during a bright sunny day. A person standing at point D can see the headlights glowing.

[Figure: the parked car with its headlights on and points A, B, C, and D marked along the road.]

Which statement best describes the farthest point that light from the headlights can reach?

A. Light does not leave the headlights.
B. The light reaches only as far as point A.
C. The light reaches only as far as point B.
D. The light reaches only as far as point C.
E. The light reaches at least as far as point D.

In the Stead and Osborne research discussed above, the authors noticed that students’ answers concerning the distance that light traveled were dependent on ambient lighting conditions. Among the students who thought that light traveled some distance from the source, most thought that it would travel less far in the daytime, and many thought that in the daytime the light would remain inside the source. It appears that many students believe that light is present only if there is enough light for visual effects such as shadows or bright spots to be noticeable (Stead and Osborne 1980).

Item 45     A     B     C     D     E
P-value   .26   .16   .15   .10   .30
D-value  -.11  -.17  -.13  -.13   .47


The misconception that light does not reach the observer still has a powerful hold on students in this item. Fewer students choose the correct answer in this problem than in the one above.

[Figure: Item 45, answer choices A through E, p-value plotted against quintile.]

Item 47, Shadows

[Figure: a book with lightbulb A (on) and lightbulb B (off) behind it, and the book’s shadow.]

Two identical lightbulbs are placed behind a book. If lightbulb A is on and lightbulb B is off, the book casts a shadow as shown to the right. If both lightbulbs are now turned on, which diagram best represents the shape of the shadow that will be cast by the book?

A. same shadow
B. no shadow
C. longer shadow
D. double shadow
E. shadow pointing directly toward you.

Teachers have misconceptions too. Elementary school teachers have been found to think that shadows are concrete entities (Apelman 1984). As a part of a large study of the process of conceptual change among elementary school teachers, ten teachers were studied to determine how their ideas about light and shadows changed as a result of a four-week intensive physics workshop (Smith 1987). Nearly all could state that a shadow was produced when light was blocked in some way, but failed in predicting the outcome of experiments with shadows. This lack of an accurate conceptual model resulted in many predicting that objects placed within a shadow would themselves have shadows. Teachers took little note of the role of light sources when discussing shadows.

Item 47     A     B     C     D     E
P-value   .08   .14   .11   .48   .17
D-value  -.16  -.14  -.20   .37  -.04

This item is of average difficulty and average discrimination. There are no obvious misconceptions.


This question reveals no strongly held misconceptions by any subgroups based on whole-test performance. All the distractors are below the 20 percent level, which would indicate the probability of random guessing for students who do not know the correct answer.

[Figure: Item 47, answer choices A through E, p-value plotted against quintile.]

Item 32, Relative Brightness

Stars A and B appear equally bright in the night sky. However, star A actually gives off more light than star B. Which of the following is true about star A?

A. It is the same distance from us as is star B.
B. It is farther away from us than is star B.
C. It is closer to us than is star B.
D. It is the same temperature as is star B.
E. It is the same diameter as star B.

Stars vary greatly in their intrinsic brightness or luminosity and in their distance from Earth. It is important for students to be able to puzzle out the effect of distance on the apparent brightness of stars. This question presents a situation of two stars that appear equally bright.

Item 32     A     B     C     D     E
P-value   .07   .44   .37   .07   .04
D-value  -.16   .43  -.19  -.19  -.11

This item is of moderate difficulty and discrimination. Answer “C” appears to be very attractive to many students. This distractor along with the correct answer make up 81 percent of the student choices. Since the answers represent opposite responses, it may be that students are confused by the complexity of the question.

Answer “C” appears to be preferred by lower- and average-performing students in the survey. Only the highest-performing students appear not to be particularly attracted to this response.
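The reasoning the item calls for can be sketched numerically. The example below assumes that apparent brightness equals luminosity divided by 4π times the distance squared; the function names and the factor of four in luminosity are hypothetical choices made for illustration.

```python
import math


def apparent_brightness(luminosity, distance):
    """Light received per unit area from a source at the given distance."""
    return luminosity / (4.0 * math.pi * distance ** 2)


def distance_ratio_for_equal_brightness(luminosity_a, luminosity_b):
    """How many times farther star A must be than star B to look equally bright."""
    return math.sqrt(luminosity_a / luminosity_b)


# Hypothetical case: star A gives off four times as much light as star B.
ratio = distance_ratio_for_equal_brightness(4.0, 1.0)
print(f"Star A must be {ratio:.1f} times farther away than star B")
```

Whenever star A gives off more light than star B, the ratio is greater than one, so star A must be the farther of the two (answer “B”).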

[Figure: Item 32, answer choices A through E, p-value plotted against quintile.]

Item 5, Inverse Square Law on Earth

The man is reading a newspaper by the light of a single candle 5 feet away. How many candles would be needed to light up the paper to the same brightness, if the candle holder were moved 10 feet from the paper?

A. 1 candle B. 2 candles C. 3 candles D. 4 candles E. More than 4 candles

This item seeks to quantify the nature of students’ view of the propagation of light. A prevalent view is that light is matter, emitted from the source, which slowly loses mass. In this view, the intensity of light slowly falls until light is no longer present (Reiner and Finegold 1987).

Teachers’ guides also perpetuate misconceptions. For example, one guide for eighth-grade science explains that light’s “brightness diminishes the further it gets from its source (except in the case of laser beams).” Light spreads out the further it gets from its source, but it never “tires” and laser beams behave exactly the same way (Marshall and Lancaster 1983).

Item 5      A     B     C     D     E
P-value   .05   .72   .07   .10   .07
D-value   .00  -.09  -.09   .22  -.01

This is a very difficult question for students, with very little discriminating power. The misconception represented by answer “B,” that light falls off as the inverse first power of distance, is the most popular answer chosen by students.
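The difference between the two models can be worked through for the distances in the item. This is a minimal sketch assuming that illumination falls off as a power of distance; the function name and the `exponent` parameter are hypothetical, with an exponent of 2 for the inverse square law and 1 for the misconception.

```python
def candles_needed(old_distance_ft, new_distance_ft, exponent=2):
    """Candles required at the new distance to match one candle at the old one."""
    return (new_distance_ft / old_distance_ft) ** exponent


inverse_square = candles_needed(5.0, 10.0)             # the physically correct model
inverse_first = candles_needed(5.0, 10.0, exponent=1)  # the popular misconception

print(f"Inverse square: {inverse_square:.0f} candles; inverse first power: {inverse_first:.0f}")
```

Doubling the distance calls for four candles under the inverse square law (answer “D”), while the inverse first power model gives the two candles of answer “B.”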


The popularity of answer “B” is uniform across all performance groups. Only for the highest-performing students does the choice of the correct answer even climb above the .20 line.

[Figure: Item 5, answer choices A through E, p-value plotted against quintile.]

Item 6, Inverse Square Law in Space

Saturn is 10 times farther from the Sun than is the Earth. From Saturn the Sun would appear:

A. 100 times brighter than from the Earth.
B. 10 times brighter than from the Earth.
C. the same brightness as from the Earth.
D. 10 times dimmer than from the Earth.
E. 100 times dimmer than from the Earth.

Item 6      A     B     C     D     E
P-value   .04   .05   .08   .71   .13
D-value  -.12  -.14  -.09   .13   .07

This is a very difficult question with little discriminating power. The most popular response, “D,” represents the 1/r misconception. This item is very similar to the one above, but asks students to apply the concept of light propagation in an astronomical context.

There is very little difference between performance groups for this item.The selection of misconception “D” actually increases with student performanceon the whole test. Higher-performing students have a higher probability ofselecting this misconception than lower-performing students.
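The gap between the inverse-square answer and the 1/r misconception in Item 6 can be made concrete with a short calculation. The following is a minimal sketch, not part of the original analysis; the function name `relative_brightness` and its `exponent` parameter are illustrative assumptions.

```python
def relative_brightness(distance_ratio, exponent=2):
    """Apparent brightness relative to a reference distance.

    distance_ratio is (new distance) / (reference distance).
    exponent=2 is the inverse-square law; exponent=1 models the
    1/r misconception (illustrative parameter, not from the study).
    """
    return 1.0 / distance_ratio ** exponent


saturn_ratio = 10.0  # Saturn is 10 times farther from the Sun than the Earth

inverse_square = relative_brightness(saturn_ratio)                    # physics
inverse_first_power = relative_brightness(saturn_ratio, exponent=1)   # misconception

print(f"Inverse-square law: {1 / inverse_square:.0f} times dimmer")
print(f"1/r misconception:  {1 / inverse_first_power:.0f} times dimmer")
```

Under the inverse-square law the Sun appears 100 times dimmer from Saturn (answer “E”), while the 1/r reasoning gives the 10 times dimmer of answer “D.”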

[Figure: item response curves for Item 6. Axes: quintile versus p-value (.00 to 1.00); one curve for each answer, A through E.]


Item 42, Filtering of Light
When green glass is placed between the flashlight and the white movie screen, a green spot appears on the screen.

[Diagram: a flashlight shines through green glass onto a white movie screen, producing a green spot.]

If green glass and red glass are placed between the flashlight and the movie screen (as shown on the right), what will happen to the spot?

[Diagram: a flashlight shines through green glass and red glass toward a white movie screen; the spot is marked “?”.]

A. It will be green.
B. It will be yellow.
C. It will be brown.
D. It will be red.
E. It will disappear.

Objects that appear to change the color of light, such as theatrical gels and stained glass, actually selectively absorb different colors from the beam and allow some colors to pass. In the top diagram, all but green light would be removed from the beam. The addition of a second filter that absorbs all colors but red would absorb the green light and let nothing through. The correct answer is “E.”
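The selective-absorption account can be sketched as set intersection: each filter can only remove colors from the beam, never add them. The three-band model of white light and the names below are simplifying assumptions made for illustration only.

```python
# Idealized model: white light as three color bands, each filter as a passband.
WHITE_LIGHT = {"red", "green", "blue"}
GREEN_GLASS = {"green"}   # passes green, absorbs everything else
RED_GLASS = {"red"}       # passes red, absorbs everything else


def transmit(light, *filters):
    """Return the set of color bands that survive every filter in turn."""
    for passband in filters:
        light = light & passband   # a filter removes colors; it adds none
    return light


one_filter = transmit(WHITE_LIGHT, GREEN_GLASS)
two_filters = transmit(WHITE_LIGHT, GREEN_GLASS, RED_GLASS)

print(one_filter)    # only green remains: the green spot
print(two_filters)   # empty set: no light reaches the screen
```

In this model the order of the two filters does not matter; either way the result is empty, which corresponds to the spot disappearing.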

This question was extended into its present form so that students’ concepts would be discernible from their selection of multiple-choice answers.

A simpler version of this question was first developed by Anderson and Kärrqvist as an “interview about instances.” They asked students how a piece of colored glass could change the color of a white flashlight beam, showing them a diagram much like the first illustration in item 42. Many said that the colored glass caused the light to change color, but when questioned about the mechanism, revealed that the red glass added color to the light, bent the light (perhaps like a prism), or lit up the glass so that it produced red light (Anderson and Kärrqvist 1983). Only 30 out of 558 students could explain that the glass allowed selective transmission of light, that is, “It’s only the red rays that penetrate the sheet of glass” (p. 59).

A study of 227 fifth-grade students showed that 72 percent thought that white light was not made up of a mixture of colors, but that white light was “clear” or “colorless” (Anderson and Smith 1986). So what does colored glass do to a beam of light? Many people believe that colored objects transform the light beam into another color, so that the green light would emerge from the red filter as being changed into red, “D” (Watts 1985).

Other students believe that the action of filters is the same as when paint mixes. They believe that color is a property of objects, not of light (Eaton et al. 1983). Combining green and red paint would produce the color brown. These students would answer the question as “C.”

Students who think that the first filter determines the color of the light would answer “A.”

Those who have mixed lights might know that two separate beams of light, a green one and a red one, when combined would appear yellow and answer “B.”

Item 42    A     B     C     D     E
P-value   .05   .13   .52   .13   .15
D-value  -.14  -.12   .11  -.04   .15

Misconception “C” is the most frequently chosen response by students. The choice of the correct answer is low. This is a very difficult question with little discriminating power.

With a D-value of 0.11, the choice of answer “C” actually increases with student performance.

[Figure: item response curves for Item 42. Axes: quintile versus p-value (.00 to 1.00); one curve for each answer, A through E.]


Item 41, Mixing of Light
[Diagram: a blue car under a yellow light.]

A driver came out of a shopping mall one night and looked at his car. His car is painted blue and the lights illuminating the parking lot are yellow. What color did his car appear to be?

A. white B. green C. yellow D. blue E. black

Color is described as an innate property of an object, e.g., “the book is red” (Anderson and Smith 1986). Scientists view color as the selective reflectivity by objects of light at different wavelengths. The source of illumination plays a large role in determining the colored appearance of objects. In a study of fifth-grade students, Charles Anderson found that few students share this view. Only 2 students of 125 described a green book as reflecting green light. Not a single student could successfully determine the appearance of an object when viewed in colored light.

Light of a given color is often thought of as a material of that color, so that combining colors, whatever their origin, is just a mixture of objects (Reiner and Finegold 1987). In this way, students can predict that a mixture of yellow light and blue car would be green (“B”). No distinction is made between the color of the source and the reflective properties of the car.
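The scientific view, in which appearance depends on both the illumination and the reflectivity of the object, can be sketched as a band-by-band product. The four color bands, the idealized lamp and paint values, and the function names are assumptions chosen for illustration; real lamps and paints are not this pure.

```python
BANDS = ("red", "yellow", "green", "blue")

# Idealized spectra: 1.0 means the band is fully present (lamp) or reflected (paint).
yellow_lamp = {"red": 0.0, "yellow": 1.0, "green": 0.0, "blue": 0.0}
white_lamp = {band: 1.0 for band in BANDS}
blue_paint = {"red": 0.0, "yellow": 0.0, "green": 0.0, "blue": 1.0}


def reflected(illumination, reflectance):
    """Light leaving the surface: incident light times reflectivity, per band."""
    return {band: illumination[band] * reflectance[band] for band in BANDS}


def appearance(spectrum):
    """Name the bands that reach the eye; no light at all appears black."""
    visible = [band for band in BANDS if spectrum[band] > 0]
    return "+".join(visible) if visible else "black"


in_daylight = appearance(reflected(white_lamp, blue_paint))
in_parking_lot = appearance(reflected(yellow_lamp, blue_paint))
print(in_daylight, in_parking_lot)
```

Because the idealized blue paint reflects none of the yellow light that falls on it, the car appears black under the yellow lamps, not green.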

Item 41    A     B     C     D     E
P-value   .04   .73   .06   .08   .07
D-value  -.12   .20  -.16  -.07   .05

This is a very difficult problem for students. At 0.07, it has the lowest P-value of any correct answer on the entire test. The misconception represented by answer “B,” that a blue car would look green under a yellow light, is chosen far more often than any other distractor on the test.


The result that students chose distractor “B” with increasing frequency as their overall test performance increases shows that it is a very powerful misconception.

[Figure: item response curves for Item 41. Axes: quintile versus p-value (.00 to 1.00); one curve for each answer, A through E.]


VI. Whole Test Results
The preceding section discussed how subjects answered individual test items. This section deals with the test items in a more general fashion. Statistics are calculated that characterize the test as a whole. Comparisons are made between test items and their average characteristics, beginning with an analysis of differences in the P-values and the D-values of test items. This is followed by a discussion of mean item responses. P-values and D-values are graphed against one another to characterize the different types of answers to questions. The frequency of answer codes (A, B, C, D, or E) chosen by subjects is discussed in relation to guessing. Suggestions are made as to which questions should be included on a shortened test based on a stepwise regression model. This is followed by an analysis of item characteristics and a discussion of the degree to which these help to characterize item discrimination and difficulty.

A. Ranking of Test Items by P-value
P-value denotes the probability of choosing an answer as determined a posteriori from a large sample of subjects. It most often refers to the probability of choosing the correct answer, although I have used it to characterize each answer, whether correct or not. The P-values of correct test items range from a low of 0.07 to a high of 0.68 with a mean of 0.34 and a standard deviation of 0.15. These values are quite different from the P-values of problems that most teachers generate; in constructing a test for which the average student grade is 75/100, the average P-value of the items must be 0.75. Compared with those of most standardized tests, P-values of teacher-constructed problems are higher because they are easier. The common procedure of “grading on a curve” does not change the fact that these problems are relatively easy; this technique simply employs artificial means to change the inherent discriminating power of the test.
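As an illustration of the definition, the P-value of each answer is simply the fraction of subjects who chose it, and the average grade on a test (as a fraction) equals the mean P-value of its correct answers. The sketch below uses hypothetical responses, not data from this study, and the function names are my own.

```python
from collections import Counter


def answer_p_values(responses, choices="ABCDE"):
    """Fraction of subjects choosing each answer to a single item."""
    counts = Counter(responses)
    return {choice: counts[choice] / len(responses) for choice in choices}


def mean_fraction_correct(correct_p_values):
    """Average test grade (as a fraction) implied by the items' correct-answer P-values."""
    return sum(correct_p_values) / len(correct_p_values)


# Hypothetical responses of 20 subjects to one item (not data from the study).
responses = list("B" * 14 + "ADCDEE")
p_values = answer_p_values(responses)
print(p_values)

# A test whose items all have a correct-answer P-value of 0.75 averages 75/100.
print(mean_fraction_correct([0.75, 0.75, 0.75, 0.75]))
```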

A histogram of the forty-seven P-values of correct answers is plotted in Figure 9. There is a wide range of difficulties represented by the items on this test.

[Histogram omitted. Horizontal axis: P-value (0.00 to 0.70), running from difficult to easy; vertical axis: count of items.]

Figure 9, Histogram of Item P-values

In many standardized tests of academic achievement, items are restricted to only those with P-values between 0.40 and 0.80 (Osterlind, 1989). Using this convention, Figure 10 shows that only fifteen out of forty-seven items of the Project STAR pre-test are acceptable. One can conclude that items that deal with misconceptions appear to have much lower P-values and may be excluded from many tests based on such “rules-of-thumb.”

Distributing test items by difficulty reveals that four items have P-values greater than 0.60 and comprise a separate grouping distinct from the other questions (see Figure 9 and Table XIV). These questions can be deemed anchors, in that the majority of students have answered them correctly.

Table XIV. Anchors Revealed by Misconception Test
Item  Topic                        P-value
22    Earth's Orbital Period       0.69
19    Probability                  0.67
 1    Reason for Day and Night     0.66
21    Earth's Rotational Period    0.63

Clement (1986) suggests that anchors can be used to advantage in instruction as starting points to help overcome misconceptions. To do this, teachers should try to build upon these known concepts. For the four problems above, it is relatively easy to imagine ways in which the preexisting knowledge of the students can be used in teaching new concepts. Four examples follow:

• Reference should be made to the Earth’s orbital period when discussing the periods of other bodies such as planets.

• The reason for day and night should be revisited when discussing the phases of the Moon; after all, the “dark side of the Moon” is simply nighttime on the Moon.

• The 24-hour rotational period of the Earth can be brought up to explain the periodicity in the positions of the Sun and stars.

• The randomness of astrology or meteor impact can be compared to the flipping of coins.

The forty-seven-item pre-test was designed to uncover misconceptions. To that end, I have calculated the P-values of the twenty-five most popular distractors in Table XV. Those that were chosen with a frequency greater than the correct answer for each problem are marked with an “X.”


Table XV. Ranking of Misconceptions by P-Value
Answer #  P-value  > Correct answer  Misconception exhibited
41B       .75      X                 Colors of light mix like paint.
5B        .72      X                 Light intensity drops as 1/r.
6D        .71      X                 Light intensity drops as 1/r.
42C       .53      X                 Colored filters mix like paints.
14E       .46      X                 The Sun is 10x larger than it really is.
17A       .46      X                 Changing distance is responsible for seasons.
38B       .46      X                 Galaxies can be seen near star clusters.
34E       .42      X                 Constellations look the same from any star.
2B        .41      X                 The Earth’s shadow makes Moon phases.
12A       .41      X                 The Sun is overhead every day.
35A       .38      X                 Astrology is a science.
20B       .37                        The Moon orbits the Earth in a day.
32C       .37                        Inability to reason with two variables.
7C        .37                        Objects look the same from the back.
17C       .37      X                 Hemispheres are at different distances from the Sun.
40B       .35      X                 Daylight lengthens in the summer.
3A        .33      X                 Inability to use one axis.
33A       .33                        Number of zeros is the power of ten.
37A       .33      X                 The Sun moves rapidly against the celestial sphere.
28D       .32                        Saturn is closer than the Sun.
13C       .32      X                 The Earth is 10x larger than it is.
36C       .32                        The universe is constant in size.
3B        .31      X                 Inability to use one axis.
11B       .30      X                 The Earth and Moon are a few diameters from each other.
44C       .30                        Light exists only where it can be seen.
37B       .29      X                 The Sun moves rapidly against the celestial sphere.
4C        .28                        The Earth’s orbit is highly elliptical.
45A       .27                        Light does not leave sources in daylight.
2C        .27      X                 Moon moves through the Sun’s shadow.

It is not unusual for students to prefer a misconception to the correct answer on this test. The majority of the misconceptions listed in Table XV have P-values greater than the P-values of the correct answer. Moreover, an attractive misconception has a powerful effect on the P-value of an item’s correct answer. The more attractive the misconception is, the lower the P-value of the correct answer. The P-value of the maximum distractors is plotted against the P-value of the correct answer in the graph below (Figure 10). Note the overall trend that the items with the most attractive misconception have the lowest P-values for the correct answer. All datapoints must be beneath the dotted line, which represents the limiting case of no other distractors being chosen by subjects.

I have performed a simple linear regression on this data, shown in Figure 10 as a solid diagonal line. Knowing the P-value of the maximum distractor accounts for 48 percent of the variance in the P-value of the correct answer. This result is significant at the p = 0.05 level, since the magnitude of the slope's t-ratio (6.46) exceeds the critical value of 2.01 (Tuckman 1988).

Figure 10. Effect of Distractor Popularity on Item Difficulty

[Scatterplot omitted. Horizontal axis: Maximum Distractor P-value (.00 to 1.00); vertical axis: Correct Answer P-Value (.00 to 1.00); the solid line is the regression fit.]

Regression Analysis of Correct Answer P-value by Distractor P-value
Dependent variable is: Correct Answer
R² = 47.6%   R²(adjusted) = 46.4%
s = 0.1056 with 48 - 2 = 46 degrees of freedom

Source       Sum of Squares   df   Mean Square   F-ratio
Regression   0.465495          1   0.465495      41.8
Residual     0.512753         46   0.011147

Variable          Coefficient   s.e. of Coeff   t-ratio
Constant           0.571137     0.0387          14.7
Max. Distractor   -0.726616     0.1124          -6.46

This regression analysis produces a model for predicting the fraction of students that choose the correct answer to any item based upon the P-value of the most popular distractor. The equation is:

\[
P_{\text{correct answer}} = 0.57 - 0.73 \, P_{\text{maximum distractor}}
\]

It may appear obvious that the larger the P-value of the maximum distractor, the smaller the P-value of the correct answer.
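The reported regression output can be checked for internal consistency, and the fitted equation applied, with a few lines of arithmetic. This is a minimal sketch using only the sums of squares and coefficients reported in the regression analysis; the function name `predicted_correct_p` is my own.

```python
INTERCEPT = 0.571137   # "Constant" coefficient from the regression output
SLOPE = -0.726616      # "Max. Distractor" coefficient


def predicted_correct_p(max_distractor_p):
    """Predicted P-value of the correct answer from the most popular distractor's P-value."""
    return INTERCEPT + SLOPE * max_distractor_p


# Sums of squares and degrees of freedom as reported in the analysis of variance.
ss_regression, df_regression = 0.465495, 1
ss_residual, df_residual = 0.512753, 46

r_squared = ss_regression / (ss_regression + ss_residual)
f_ratio = (ss_regression / df_regression) / (ss_residual / df_residual)

print(f"R^2 = {r_squared:.1%}")
print(f"F   = {f_ratio:.1f}")
print(f"Predicted correct-answer P-value for a distractor at 0.72: "
      f"{predicted_correct_p(0.72):.2f}")
```

The recomputed R² and F-ratio agree with the reported 47.6 percent and 41.8, and the F-ratio is close to the square of the slope's t-ratio, as expected for a regression with a single predictor.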

B. Ranking of Test Items by D-value
The discriminating power of each test item has been calculated to characterize the items on the basis of how well they discriminate between students who do well or poorly on the test as a whole (see Figure 11). These D-values range from –0.01 to 0.57 with a mean of 0.29 and a standard deviation of 0.16. Items with low D-values are of little help in discriminating between students based upon their overall performance.

Researchers suggest that items with D-values less than 0.40 are subject to improvement and less than 0.20 are unacceptable (Hopkins 1981, Ebel 1991). The D-values of test items for the Project STAR pre-test are graphed below. According to these standards, only fourteen out of forty-seven items are “very good” items and should remain unchanged.


To what degree does the inclusion of an attractive misconception distractor affect the D-value of an item? In Figure 12, I have plotted the D-value of the correct answer against P-value of the maximum distractor.

In this case, one can see a trend in that items with the highest distractor P-values have low D-values. Only those items with low distractor P-values (e.g., less than 0.40) have D-values greater than the desired 0.40. How can the effects of the inclusion of very attractive distractors on the difficulty and discriminating power of items be characterized? The inclusion of misconception distractors significantly lowers the P-values and D-values of test items. Test-makers would probably exclude such items based on these standard measures. Classroom teachers, who have a preference for test items with relatively high P-values, have no choice but to discard these distractors. Inclusion of such items could easily lower average test grades to 50 percent or below. In order to use such items, teachers and their students must feel comfortable with these lower average test scores.

[Histogram omitted. Horizontal axis: D-value (-0.100 to 0.500); vertical axis: count of items.]

Figure 11, Histogram of Item D-values

[Scatterplot omitted. Horizontal axis: P-value of Maximum Distractor (.00 to 1.00); vertical axis: D-Value of Correct Answer (.00 to 1.00).]

Figure 12, Effectiveness of Distractor Popularity on Item Discrimination

C. Mean Item Characteristic Curve
In addition to looking at item response curves (graphs of P-value versus student quintiles for each of the five answers), as we have done for each question, one can combine them all to generate an average curve. This was done by first averaging the P-values of the correct answer for each item. Next, the distractors for each item were arranged separately by popularity, from the distractor with the maximum P-value to the distractor with the minimum P-value. These were then averaged across all the items. A pie graph of these data is shown in Figure 13. The correct answer was chosen 34 percent of the time. The most popular distractor was chosen almost as frequently, 32 percent of the time. These differences are not significant at the p = 0.05 level. A further analysis is carried out below. The remaining distractors were chosen much less frequently.
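The averaging procedure just described can be sketched as follows. For brevity the example uses only the three items whose P-values are tabulated immediately above (Items 6, 42, and 41), so its output will not match the whole-test averages; the function name is my own, and taking “E” as the correct answer to Item 6 is my reading of the inverse-square law rather than something stated with the item.

```python
def mean_ranked_p_values(items):
    """Average P-values by rank across items.

    items: list of (correct_letter, {letter: p_value}) pairs, five answers each.
    Returns [correct, maximum distractor, 2nd, 3rd, minimum distractor].
    """
    sums = [0.0] * 5
    for correct, p_values in items:
        # Rank this item's four distractors from most to least popular.
        distractors = sorted(
            (p for letter, p in p_values.items() if letter != correct),
            reverse=True,
        )
        for rank, p in enumerate([p_values[correct]] + distractors):
            sums[rank] += p
    return [total / len(items) for total in sums]


items = [
    ("E", {"A": .04, "B": .05, "C": .08, "D": .71, "E": .13}),  # Item 6
    ("E", {"A": .05, "B": .13, "C": .52, "D": .13, "E": .15}),  # Item 42
    ("E", {"A": .04, "B": .73, "C": .06, "D": .08, "E": .07}),  # Item 41
]
averages = mean_ranked_p_values(items)
labels = ["Correct", "Maximum", "2nd", "3rd", "Minimum"]
for label, value in zip(labels, averages):
    print(f"{label:8s} {value:.2f}")
```

Ranking the distractors within each item before averaging is what allows the "maximum distractor" of one item (say, answer “D”) to be pooled with that of another (say, answer “B”).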

Figure 14 shows the averages for all five answers in which P-values are broken down by quintiles to produce composite item response curves for the entire test. This graph shows that the correct answers, as a whole, display a monotonically increasing behavior with respect to the overall performance of students. This result is essentially a consequence of overall student performance on the test. The maximum distractor average has a P-value almost independent of students’ overall performance. It is surprising to see the flatness of this curve. Successively more unpopular distractors have item response curves that are similar to each other in shape. These lowest three distractors each diminish in P-value by roughly a factor of 2 from the lowest-performing quintile to the highest.

[Pie graph omitted. Correct Answer 34%; Maximum Distractor 32%; 2nd Distractor 16%; 3rd Distractor 10%; Minimum Distractor 6%.]

Figure 13, Average P-value of Correct Answers and Distractors


D. Discrimination/Difficulty Graph
P-values and D-values can be examined simultaneously by plotting them against each other, as suggested by Hopkins and Stanley. Items that are very difficult have little potential to discriminate between individuals. The same can be said of items with high P-values. Items with P-values between 0.25 and 0.75 have the potential to be highly discriminatory. In Figure 15, I have plotted three sets of data on one graph. Correct answers are plotted as circles, the most chosen distractor as squares, the least chosen distractor as diamonds. The mean value in both P-value and D-value for each of these groups is plotted as the same shape, but filled in. Almost all test items fall very near or within the diamond shape.

Only two items fall into the region of “ideal” items, where their potential discriminating power is greater than 0.50. Almost all are scattered throughout the top half of the graph, where the D-value is greater than zero. Few correct answers are in the bottom half of the graph. A D-value less than zero corresponds to an answer that is chosen less frequently by students performing better on the entire test and chosen more frequently by students performing poorly on the test.

Minimum distractors, those chosen least frequently by subjects, have characteristics similar to each other. They appear constrained to one small section of the graph in Figure 15. Their average P-values and D-values are tightly grouped. The average P-value of this class of answers is 0.06 with a standard deviation of 0.03. The average D-value is –0.11, with a standard deviation of 0.06.

Maximum distractors, on the other hand, appear to be much more varied in their P-value and D-value and are much more similar to the correct answers in these characteristics. Their average P-value is 0.32, with a standard deviation of 0.14. This is very similar to the P-value of correct answers of 0.34. The average D-value of maximum distractors is –0.06, with a standard deviation of 0.13. This is quite different from the average D-value of the correct answers of 0.29. The D-values of the maximum distractors are much smaller than the correct answer and very close to zero. This characteristic is reflected in the flatness of the corresponding curve in Figure 14.

[Graph omitted. Horizontal axis: quintile; vertical axis: p-value (.00 to .90); one curve each for Correct Answer, Maximum Distractor, 2nd Distractor, 3rd Distractor, and Minimum Distractor.]

Figure 14, Average P-value of Correct Answers and Distractors by Quintile

From the detailed graph in Figure 16 of the correct answers only, it is possible to identify those items with the highest D-values as numbers 23, 18, and 26.

[Scatterplot omitted. Horizontal axis: P-value, difficulty coefficient (.00 to 1.00); vertical axis: D-value, discrimination coefficient (-1.00 to 1.00); plotted series: Correct Answers, Maximum Distractors, Minimum Distractors; the region of “Ideal Items” is marked.]

Figure 15, P-value versus D-value for Item Responses


E. Distribution of Correct Answers
In any multiple-choice test, the ideal distribution of correct answers draws equally from each of the possible choices. In this test, the distribution of correct answers is heavily biased toward category “E” (see Figure 17 and Table XVI).

Table XVI. Answer Response Frequency
         A     B     C     D     E     χ²
Count    6     9     7     7     18    10.34
%        13%   19%   15%   15%   38%

This was certainly not my intention in creating this test. I had thought that I had randomly assigned the answers. We can apply a test to find out, in much the same way as in other statistical analyses in this paper, whether this distribution could be considered random.

[Scatterplot omitted. Horizontal axis: P-value (0 to 0.7); vertical axis: D-Value (-0.1 to 0.6); each point is labeled with its item number, 1 through 47.]

Figure 16, P-value versus D-value for Correct Responses


Chi-square, χ², is a measure of the departure of P-values from those expected by chance. A χ² test was performed on each category to determine if the distribution of answers could be explained by a random selection at the p = 0.05 level. I used the following formula to calculate χ² for the test answers:

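\[
\begin{aligned}
\chi^2 &= \sum_{a=1}^{5} \frac{(x_{\text{observed}} - x_{\text{expected}})^2}{x_{\text{expected}}}
        = \sum_{a=1}^{5} \frac{(x_{\text{observed}} - 0.20 \times 47)^2}{0.20 \times 47} \\
       &= \frac{(6-9.4)^2 + (9-9.4)^2 + (7-9.4)^2 + (7-9.4)^2 + (18-9.4)^2}{9.4} \\
       &= \frac{97.2}{9.4} = 10.34
\end{aligned}
\]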

In this equation, “a” ranges through each of the five answers: A, B, C, D, and E. The observed frequency count (x_observed) of each correct letter answer varies according to Table XVI. The expected frequency (x_expected) for each of the five answers is the same: x_expected = 0.20 * 47 items. So one would expect each answer to be selected 9.4 times out of a total of 47. For four degrees of freedom, the value of χ² = 9.49 at p = 0.05 (Tuckman 1988, p. 484). I have calculated χ² = 10.34. This distribution could not have occurred by chance at the p = 0.05 level. It appears that the test has too many answers that are “E,” the last choice in each question. In revisions of this test, answers should be redistributed so that the correct answers fall more equally into the five possible choices.
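The calculation above can be reproduced in a few lines. This is a minimal sketch using the counts from Table XVI and the critical value quoted in the text; the function name `chi_square_uniform` is my own.

```python
def chi_square_uniform(observed):
    """Chi-square statistic against equal expected counts in every category."""
    expected = sum(observed) / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed)


correct_answer_counts = [6, 9, 7, 7, 18]   # answers A through E, Table XVI
CRITICAL_VALUE_DF4 = 9.49                  # p = 0.05, four degrees of freedom (from the text)

chi2 = chi_square_uniform(correct_answer_counts)
print(f"chi-square = {chi2:.2f}")
print("exceeds critical value:", chi2 > CRITICAL_VALUE_DF4)
```

The same function applies to the responses to a single item if the counts of students choosing each of the five answers are supplied in place of the counts of correct answers.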

A chi-square test was also performed on individual test items to determine if the distribution of responses to any particular test item could be explained by random guessing. For four degrees of freedom, the critical value of χ² is the same as in the test above, χ² = 9.49.

[Histogram omitted. Horizontal axis: Letter Alternative of Correct Item Answer (A through E); vertical axis: percentage of items (0% to 40%).]

Figure 17, Histogram of Answer Response Frequency

I have calculated χ² for each item; all are greater than 9.49 (see Appendix 3). This means that selection of answers by students cannot be accounted for by chance at the p = 0.05 level. Although some subjects were undoubtedly answering some questions in a random fashion, too few were answering this way to characterize any item as being answered randomly. The distribution of answers one would expect for each item if one were answering randomly (roughly 20 percent for each answer) does not occur in this dataset.

I have also calculated χ² for each item’s four distractors to determine if the choice of distractor could be explained by chance selection. For three degrees of freedom, the critical value is χ² = 7.82. All groups of four item distractors have χ² > 7.82. Distractors, as a whole, were not chosen randomly at the p = 0.05 level.

F. Which Questions Should Be Included on a Shortened Test?
Many items on this test have little or no discriminating power. A few items have a great deal of discriminating power. It is possible to build a model that is made of items that account for the most variance in total test scores. I have built such a model using the technique of stepwise regression. Starting with the item with the highest D-value, one tests each other item to find the one that accounts for the most incremental variance. This process is iterated until all items are used. Figure 18 below shows the amount of variance accounted for by models of increasing numbers of items.
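The idea of adding, at each step, the item that accounts for the most incremental variance can be sketched as a forward selection. This is not the statistical package used for the analysis: the sketch is written in plain Python, it chooses every item (including the first) by variance explained rather than by D-value, and the scores are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def centered(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def forward_stepwise(item_scores, n_steps):
    """item_scores: {item_number: [0/1 score per student]}.

    Returns [(item_number, cumulative R^2 for the total test score)].
    """
    n_students = len(next(iter(item_scores.values())))
    totals = [sum(col[s] for col in item_scores.values()) for s in range(n_students)]
    residual = centered(totals)          # part of the total score not yet explained
    total_ss = dot(residual, residual)
    columns = {item: centered(col) for item, col in item_scores.items()}
    steps = []
    for _ in range(n_steps):
        # Incremental variance each remaining item would account for.
        gains = {
            item: dot(col, residual) ** 2 / dot(col, col)
            for item, col in columns.items()
            if dot(col, col) > 1e-12
        }
        if not gains:
            break
        best = max(gains, key=gains.get)
        chosen = columns.pop(best)
        norm = dot(chosen, chosen)
        scale = dot(chosen, residual) / norm
        residual = [r - scale * c for r, c in zip(residual, chosen)]
        # Remove the chosen item's contribution from the remaining candidates.
        for item in list(columns):
            k = dot(columns[item], chosen) / norm
            columns[item] = [x - k * c for x, c in zip(columns[item], chosen)]
        steps.append((best, 1 - dot(residual, residual) / total_ss))
    return steps


# Hypothetical scores of six students on four items (not data from the study).
scores = {
    1: [1, 1, 1, 0, 0, 0],
    2: [1, 1, 0, 1, 0, 0],
    3: [1, 0, 1, 0, 1, 0],
    4: [1, 1, 1, 1, 1, 0],
}
steps = forward_stepwise(scores, n_steps=4)
for item, r2 in steps:
    print(f"item {item}: R^2 = {r2:.0%}")
```

Because the total score is the sum of the item scores, the variance accounted for must reach 100 percent once every item has entered the model; the interest lies in how quickly the curve rises over the first few steps.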

Figure 18. Stepwise Regression of Test Items by Variance in Total Test Score

[Graph omitted. Horizontal axis: # of Items in Regression Equation (0 to 45); vertical axis: R^2, % of Variance Accounted For (0% to 100%).]

Any number of test items can be used to build a shortened test; the number of items to include is purely subjective. The more items, the more variance that can be accounted for (see Table XVII). Since the entire test has a KR-21 Reliability Coefficient of 0.76, one could argue that a shortened test need not account for more of the variance in the total test score than the entire test does of itself. The KR-21 can be thought of as the correlation coefficient that relates all the various sets of one-half of the test with each other.
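For readers unfamiliar with the coefficient, the standard Kuder-Richardson formula 21 needs only the number of items, the mean total score, and the variance of total scores. In the sketch below, the mean is taken as 0.34 × 47 from the mean P-value reported earlier, while the standard deviation of 6.4 is an illustrative value of my own choosing, not a statistic from this study.

```python
def kr21(n_items, mean_score, score_variance):
    """Kuder-Richardson formula 21 reliability estimate.

    KR-21 = (k / (k - 1)) * (1 - M * (k - M) / (k * variance))
    where k is the number of items and M the mean total score.
    """
    return (n_items / (n_items - 1)) * (
        1 - mean_score * (n_items - mean_score) / (n_items * score_variance)
    )


n_items = 47
mean_score = 0.34 * n_items     # mean P-value of correct answers times number of items
assumed_sd = 6.4                # illustrative assumption, not reported in the text

reliability = kr21(n_items, mean_score, assumed_sd ** 2)
print(f"KR-21 = {reliability:.2f}")
```

The formula assumes that all items are of roughly equal difficulty, which is one reason it is usually treated as a conservative estimate of reliability.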

One item alone accounts for almost one-third of the variance (how long it takes the Moon to go around the Sun). Adding two additional items will account for more than half of the variance (graphical extrapolation and knowing how to order the space shuttle, Pluto, and stars in distance from the Earth). Eight items account for 75 percent of the variance in the total score. Fewer than one-sixth of the test questions are needed to build a good shortened test. By including twenty-one items, we can account for 90 percent of the variance. Teachers may wish to use these shortened tests so that less class time is spent in testing.

Table XVII. Stepwise Regression Results for Shortened Test
Step  Item  R^2
 1    23    32%
 2    26    45%
 3    18    54%
 4    45    61%
 5    35    66%
 6    25    69%
 7    20    72%
 8    32    75%
 9    31    77%
10     8    78%
11    38    80%
12    33    81%
13     7    82%
14    28    83%
15    43    85%
16    27    86%
17    21    87%
18    13    88%
19    36    88%
20    10    89%
21     9    90%

G. Predictors of Difficulty and Discrimination
Writing questions to test misconceptions is difficult. One would like to find factors that help to identify questions that have high D-values. Conversely, finding markers for low D-values or for very high or very low P-values would help to weed out items with low utility. To this end, I have sought to identify factors that account for the P-values or D-values of items. I have hypothesized that certain attributes could make questions very difficult for students. These include questions that are too ambiguous to answer, questions at the end of the test, those that present a problem in diagrammatic or graphical form, those that can be characterized as conceptual, factual, or mathematical, those that require graph reading or calculation, or those that have high reading difficulty. Questions at the end of the test would be difficult to eliminate, but if item # is a factor, different forms of the test with items in different orders would be called for.

By carrying out a multiple regression analysis to account for the variance in either P-value or D-value, I have investigated the possibility that these factors could improve or degrade misconception questions. In this analysis, I have used only the Gunning Fog Index as a measure of readability. The Flesch-Kincaid and Flesch Grade Level could not be calculated reliably for some of the questions with few words.

For statistical significance at the p = 0.05 level, the t-ratio for 39 degrees of freedom (47 test questions minus 7 factors and a constant) should exceed 2.03.

A regression model built with all of these factors accounts for 17.1 percent of variance in P-value. However, no single factor is significant at the p ≤ 0.05 level. None of these factors appears to be useful in determining the difficulty of test questions. A regression model built with all of the above factors accounts for only 15.6 percent of the variance in D-value. Again, none of these factors is significant at the p ≤ 0.05 level.

The various item characteristics cannot be used to significantly predict the discriminating power or the difficulty of these test items. One could argue that this analysis shows that many factors play no role in how students answer these questions. The position of a particular item on the test (its item #) affects the number of students who choose not to answer the item, but not its D- or P-value. There is no significant difference in discriminating power or difficulty of questions whether or not they contain a picture, or whether they deal with concepts, facts, or math. The readability of a question also appears to play no role. This lends support to the view that the reading level of the test items is sufficiently low as not to affect the choices that students make in choosing an answer.


VII. Demographic and Schooling Factors Results

The purpose of this section is to investigate the connection between a variety of factors and the number of misconceptions held by students. Do gender, age, parents’ education, school background, and attitude each affect misconceptions in astronomy?

The thirteen items at the end of the test question students’ background and attitudes. I am interested in finding out how, if at all, these factors relate to how well or poorly students do on this test of misconceptions. I used several procedures to help determine these relationships. First, I graphed the response to each item from a table of these raw numbers. Next, I calculated the mean total score and standard deviation for each subgroup of students, based on their answers to each of the demographic and schooling questions. I created a “boxplot” graph that shows the median and several other statistics in graphical form as described in Figure 19.
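The subgroup summary described in this paragraph can be sketched in a few lines of Python. This is only an illustration, not the software used in the study; the record layout, the function name `subgroup_summary`, and the sample scores are all hypothetical.

```python
import statistics
from collections import defaultdict


def subgroup_summary(records):
    """Return {answer: (count, mean, sd)} of total scores per subgroup.

    `records` is a list of (answer, total_score) pairs; an answer of None
    is grouped under "missing". This layout is assumed for illustration.
    """
    groups = defaultdict(list)
    for answer, total in records:
        groups["missing" if answer is None else answer].append(total)

    summary = {}
    for answer, totals in groups.items():
        # Sample standard deviation; undefined for a single observation.
        sd = statistics.stdev(totals) if len(totals) > 1 else float("nan")
        summary[answer] = (len(totals), statistics.mean(totals), sd)
    return summary


# Hypothetical scores, not data from the study.
example = [("female", 14), ("female", 16), ("male", 17), ("male", 19), (None, 15)]
for answer, (count, mean, sd) in subgroup_summary(example).items():
    print(f"{answer:8s} n={count} mean={mean:.2f} sd={sd:.2f}")
```

Each tuple holds the count, mean, and standard deviation reported in the frequency and total-score tables for one subgroup.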

The purpose of these “boxplots” is to present data visually, showing the differences in students’ total test score by demographic and schooling subgroups. They allow easy comparison of key statistical features of the data. These graphs present two ways of comparing the “central tendency” of the data. The “box,” the small rectangle in the boxplot, encloses the middle 50 percent of the students who chose that particular answer; this shows the difference in data as the relative height of the boxes. The boxplots also show the median value of the data. While these are not the means, they still make it easy to see and compare subgroups (Velleman 1988). Displaying central tendencies as medians is much less sensitive to outliers than using means.

The 75 percent cutoff is called the high hinge; the 25 percent cutoff is called the low hinge. The median value of the total test score is shown as the horizontal line inside the rectangle, between the high hinge and low hinge. A 95 percent confidence interval is superimposed as a gray area around the median. The “whiskers” extend from the box to the highest data value not above the high hinge + 1.5 * (high hinge–low hinge) and to the lowest data value not below the low hinge –1.5 * (high hinge–low hinge). Beyond this limit, datapoints are plotted with a circle. The extreme outliers, datapoints beyond 3.0 * (high hinge–low hinge), are plotted as starbursts (Velleman 1988).

Figure 19. Key to Graphical Icons
[Figure labels: Extreme Outlier; Outliers; Highest Connected Data Value Whisker; High Hinge ≈ 75%; Median, surrounded by 95% Confidence Interval; Low Hinge ≈ 25%; Lowest Connected Data Value Whisker]

When two medians appear widely separated, one may think that the difference is statistically significant. However, medians do not take into account sample size. These plots are augmented so that one can tell if a difference in median is statistically significant (Velleman 1988). The shaded areas about the median line are 95 percent confidence intervals. Since the sample size of this survey is only 1,414 subjects out of a theoretically infinite population, the computed median is only an approximation of the population median. Since one cannot predict this unknown value with certainty, a range of medians for the entire population can be generated with some degree of probability. With a confidence level of 0.95, the median of the population can be calculated to lie within the sample median ± 1.58 * (high hinge–low hinge)/√n, where n is the number of students in the subgroup. Velleman and Hoaglin (1981) discuss the derivation of this interval and boxplots.
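The boxplot quantities defined in this section (hinges, whiskers, outliers, and the confidence interval about the median) can be computed as in the following sketch. It assumes the hinges are the medians of the lower and upper halves of the sorted data, which is one common convention and may differ slightly from the software used for the original plots; the function name `boxplot_summary` and the sample data are hypothetical.

```python
import math
import statistics


def boxplot_summary(scores):
    """Compute the boxplot statistics described in the text for one subgroup."""
    data = sorted(scores)
    n = len(data)
    median = statistics.median(data)

    # Assumed convention: hinges are medians of the lower and upper halves,
    # with the overall median included in both halves when n is odd.
    low_hinge = statistics.median(data[: (n + 1) // 2])
    high_hinge = statistics.median(data[n // 2:])
    spread = high_hinge - low_hinge

    # Whiskers reach the most extreme values within 1.5 * spread of the box.
    upper_fence = high_hinge + 1.5 * spread
    lower_fence = low_hinge - 1.5 * spread
    whisker_high = max(x for x in data if x <= upper_fence)
    whisker_low = min(x for x in data if x >= lower_fence)

    # Values beyond 3.0 * spread are extreme outliers; the rest are outliers.
    far_high = high_hinge + 3.0 * spread
    far_low = low_hinge - 3.0 * spread
    extreme = [x for x in data if x > far_high or x < far_low]
    outliers = [x for x in data
                if (x > upper_fence or x < lower_fence) and x not in extreme]

    # Half-width of the 95 percent confidence interval about the median.
    ci_half_width = 1.58 * spread / math.sqrt(n)

    return {
        "median": median,
        "low_hinge": low_hinge,
        "high_hinge": high_hinge,
        "whisker_low": whisker_low,
        "whisker_high": whisker_high,
        "outliers": outliers,
        "extreme_outliers": extreme,
        "ci": (median - ci_half_width, median + ci_half_width),
    }


# Hypothetical scores, not data from the study.
summary = boxplot_summary([10, 12, 13, 14, 15, 16, 17, 18, 40])
print(summary)
```

Two subgroups whose confidence intervals do not overlap would be read as having significantly different medians, which is how the plots in this section are interpreted.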

One can see from inspection if the means of these subgroups are different or if there appears to be a trend in the data. Following this is a simple linear regression model to fit each factor to the total scores and to calculate if the difference in mean is significant. This determines the amount of variance accounted for by this single factor alone. The significance of these models is determined by the size of the t-ratio. For p = 0.05 with greater than 120 degrees of freedom (beyond this number, the t-ratio changes only by a tiny amount), t = 1.960 (Tuckman 1988). T-ratios greater than this number have probabilities less than 0.05. For a level of significance of p = 0.01, the t-ratio must exceed 2.62.

A. Demographic Factors

Item 48, Gender

Sex: A. Male B. Female

Frequency breakdown of gender

Group     Count   %
female    609     43.1
male      658     46.5
missing   147     10.4
Total     1,414

About equal numbers of girls and boys specified their gender in this study. This fact is supported by teachers who describe their astronomy and earth science classes as having roughly equal numbers of boys and girls. Chemistry and physics classes, by comparison, are skewed more toward boys.

[Figure: bar chart of the number of students by Gender (female, male, missing)]

Total score by gender

Group     Mean    SD
female    14.80   5.48
male      17.50   6.68
missing   14.51   7.27

There is a difference of 2.70 items answered correctly in the means of the two subgroups, with boys answering more questions correctly than girls. This difference is significant at the p = 0.05 level.

[Figure: boxplots of Total score by Gender (Female, Male, missing)]

Regression Analysis of Total Score Based on Gender
Dependent variable is: Total
1,414 total cases of which 147 are missing
R2 = 4.6%   R2(adjusted) = 4.5%
s = 6.136 with 1,267 – 2 = 1265 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   2292.63          1      2293          60.9
Residual     47630.2          1265   37.6524

Variable   Coefficient   s.e. of Coeff   t-ratio
Constant   14.8046       0.2486          59.5
Male       2.69236       0.3450          7.80
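The summary statistics in a single-factor regression table such as this one are linked by a few standard identities, and recomputing them is a quick consistency check. The sketch below uses only the sums of squares, sample size, coefficient, and standard error reported for gender; the function name `check_simple_regression` is hypothetical and this is not the software used in the study.

```python
import math


def check_simple_regression(ss_reg, ss_res, n, coef, se):
    """Recompute summary statistics for a regression with one predictor."""
    df_res = n - 2                      # n cases minus slope and constant
    ms_res = ss_res / df_res            # residual mean square
    r2 = ss_reg / (ss_reg + ss_res)     # share of variance accounted for
    return {
        "df_residual": df_res,
        "r2_percent": 100 * r2,
        "r2_adj_percent": 100 * (1 - (1 - r2) * (n - 1) / df_res),
        "s": math.sqrt(ms_res),         # residual standard deviation
        "f_ratio": ss_reg / ms_res,     # regression has 1 degree of freedom
        "t_ratio": coef / se,
    }


# Values as reported in the gender regression table.
gender = check_simple_regression(
    ss_reg=2292.63, ss_res=47630.2, n=1267, coef=2.69236, se=0.3450
)
for name, value in gender.items():
    print(f"{name:15s} {value:.3f}")
```

With one predictor, the F-ratio equals the square of the t-ratio up to rounding, and the coefficient for the 0/1 variable equals the difference between the two subgroup means (here about 2.7 items).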

That boys score higher than girls on this test of science should come as no surprise. Results are similar to those obtained on National Assessment of Educational Progress science content questions (Schoon 1988). Boys score about 5 percent higher than girls on the NAEP. On this test they performed about 7 percent better than girls. Lightman and Miller found that males scored 13 percent higher than females on their test of cosmological beliefs.

Girls and boys exhibit statistically significant differences in answering fifteen out of the forty-seven items. Girls score higher than boys on items 10, 11, 29, 32, and 35. Boys do better on items 3, 12, 13, 18, 19, 20, 21, 36, 43, and 47. I have examined these problems and have found no patterns or similarities that would help explain this result.

Item 50, Ethnic Heritage
What is your ethnic heritage? (Indicate the one that you consider the most important part of your background.)
A. Latin American/Caribbean B. African C. Asian D. European E. Other

Frequency breakdown of heritage

Group      Count   %
African    86      6.08
Asian      70      4.95
European   532     37.6
Latin      99      7.00
Missing    118     8.35
Other      509     36.0
Total      1,414

The analysis of total scores by ethnic heritage is made problematic by the way in which students answered this question. A large percentage of subjects chose “other” as their heritage. About 8 percent of the subjects chose not to answer this question at all. Relatively small numbers of students were of African, Asian, or Latin descent.

Total scores by heritage

Group      Mean    SD
African    13.91   5.51
Asian      13.90   5.37
European   18.71   6.50
Latin      13.61   5.34
Missing    16.30   7.42
Other      14.26   5.44

[Figure: bar chart of the number of students by Ethnic Heritage (Afn, Asian, Eur, Latin, Msg, Other)]

[Figure: boxplots of Total score by Ethnic Heritage (Afn, Asian, Eur, Latin, Msg, Other)]

Students who considered themselves of European heritage score much higher on the total test score, while all other groups (except “other”) appear to score an average of about five points lower. I have built a multiple regression model by creating “dummy” variables for each of the ethnic heritage subgroups. I have excluded the “other” subgroup, since it is accounted for by the presence of the other four variables.

Regression Analysis of Total Score Based on Ethnicity
Dependent variable is: Total
1,414 total cases of which 118 are missing
R2 = 13.0%   R2(adjusted) = 12.7%
s = 5.895 with 1296 – 5 = 1291 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   6699.02          4      1675          48.2
Residual     44864.0          1291   34.7513

Variable   Coefficient   s.e. of Coeff   t-ratio
Constant   14.2692       0.2613          54.6
European   4.44889       0.3655          12.2
African    -0.350551     0.6873          -0.510
Latin      -0.652994     0.6475          -1.01
Asian      -0.369155     0.7515          -0.491

This analysis accounts for 12.7 percent of the variance in total score, most of it attributable to students who select their heritage as European. None of the minority subgroups is significant at the p = 0.05 level. Only the European subgroup is significant. Excluding all the subgroups but European still accounts for 10.6 percent of the variance. As far as this analysis is concerned, it does not appear that the particular minority group of a subject matters, only whether they see themselves as of European heritage or not. This is an important finding.
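The dummy-variable coding used in this model can be illustrated with the reported coefficients. With “other” as the omitted baseline, the constant is the fitted mean for that subgroup and each coefficient is the difference between a subgroup’s mean and the baseline. The sketch below is an illustration under that reading; the function names `dummy_code` and `fitted_mean` are hypothetical.

```python
# Values copied from the ethnicity regression table.
CONSTANT = 14.2692  # fitted mean for the omitted baseline subgroup ("Other")
COEFFICIENTS = {
    "European": 4.44889,
    "African": -0.350551,
    "Latin": -0.652994,
    "Asian": -0.369155,
}


def dummy_code(heritage):
    """Return the 0/1 dummy variables for one student's heritage.

    "Other" is the omitted baseline, so all four dummies are 0 for it.
    """
    return {name: int(heritage == name) for name in COEFFICIENTS}


def fitted_mean(heritage):
    """Fitted total score: the constant plus the coefficient of the active dummy."""
    dummies = dummy_code(heritage)
    return CONSTANT + sum(COEFFICIENTS[name] * value
                          for name, value in dummies.items())


for group in ["African", "Asian", "European", "Latin", "Other"]:
    print(f"{group:9s} {fitted_mean(group):.2f}")
```

The fitted values agree with the subgroup means in the table of total scores by heritage to within about 0.01, which is what one expects when the only predictors are a complete set of group dummies.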

Item 56, Mother’s Education
What was the highest level of education that your female parent or guardian completed?

A. Did not complete High School.
B. Graduated from High School only.
C. Graduated from Trade, Vocational, or Business School.
D. Some college
E. College degree

For this question and the following one concerning fathers’ education, answers “C” and “D” have been reassigned in coding, so that “graduating from trade, vocational, and business school” is rated higher than “some college.” A more extensive analysis is contained in Section III, Methodology under the subheading of 2. Variables.

Frequency breakdown of mother’s education

Group     Count   %
1 - A     96      6.79
2 - B     411     29.1
3 - D     229     16.2
4 - C     126     8.91
5 - E     402     28.4
Missing   150     10.6
Total     1,414

Most of the subjects in this study reported the education of their parents, which allows for an analysis based upon their parents’ schooling. Looking at the graph of the means, it appears that students with more highly educated mothers do better on this test.

[Figure: bar chart of the number of students by Mother's Education (1 to 5, Msg)]

Total score by mother’s education

Group     Mean    SD
1 - A     14.15   5.86
2 - B     15.14   5.64
3 - D     16.06   6.18
4 - C     16.73   6.97
5 - E     17.56   6.64
Missing   14.86   6.99

Dependent variable is: Total
1414 total cases of which 150 are missing
R2 = 3.2%   R2(adjusted) = 3.2%
s = 6.222 with 1264 – 2 = 1262 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   1639.55          1      1640          42.4
Residual     48854.9          1262   38.7123

Variable       Coefficient   s.e. of Coeff   t-ratio
Constant       13.4965       0.4459          30.3
Mother’s Ed.   0.818985      0.1258          6.51

Mothers’ education only accounts for 3.2 percent of the variance in total score.

[Figure: boxplots of Total score by Mother's Education (1 to 5, Msg)]

Item 57, Father’s Education
What was the highest level of education that your male parent or guardian completed?

A. Did not complete High School.
B. Graduated from High School only.
C. Graduated from Trade, Vocational, or Business School.
D. Some college.
E. College degree

Frequency breakdown of father’s education

Group     Count   %
1 - A     103     7.28
2 - B     361     25.5
3 - D     133     9.41
4 - C     162     11.5
5 - E     478     33.8
Missing   177     12.5
Total     1,414

Total score by father’s education

Group     Mean    SD
1 - A     14.00   5.26
2 - B     15.69   6.31
3 - D     16.93   6.36
4 - C     15.50   5.90
5 - E     16.93   6.54
Missing   15.24   6.85

[Figure: bar chart of the number of students by Father's Education (1 to 5, Msg)]

[Figure: boxplots of Total score by Father's Education (1 to 5, Msg)]

Dependent variable is: Total
1414 total cases of which 177 are missing
R2 = 1.2%   R2(adjusted) = 1.2%
s = 6.300 with 1237 – 2 = 1235 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   610.621          1      611           15.4
Residual     49013.5          1235   39.6870

Variable      Coefficient   s.e. of Coeff   t-ratio
Constant      14.4698       0.4617          31.3
Father’s Ed   0.484472      0.1235          3.92

Fathers’ education accounts for only 1.2 percent of the variance in total scores.

B. Schooling Factors

Item 49, Age*

Age: A. 15 yrs or younger B. 16 yrs C. 17 yrs D. 18 yrs E. 19 yrs or older

It originally appeared that the greatest numbers of students chose answer “A,” 15 or younger. However, since there were many eighth-grade students in the sample, this called for a recoding, as the 15-year-old group is composed of both 14- and 15-year-old students. Students’ answers to this question have been recoded to a new variable, Age*, based on their answers to Item 51 on their grade level. This has allowed an additional age level to be added. The recoding is discussed in detail in Section III, Methodology under the subheading of 2. Variables.

Frequency breakdown of age*

Group     Count   %
14        389     27.5
15        275     19.5
16        279     19.7
17        295     20.9
18        99      7.00
19        17      1.20
Missing   82      5.80
Total     1,414

[Figure: bar chart of the number of students by Age* (14 to 19, Msg)]

Total scores by age*

Group     Mean    SD
14        13.63   4.92
15        16.38   6.34
16        16.90   6.04
17        17.67   6.98
18        16.64   6.82
≥19       11.64   3.67
Missing   16.72   8.26

By examining the 95 percent confidence intervals about each of the median test scores for each age, one can see a significant overlap in students of ages 15, 16, 17, and 18. These medians are not different at the p = 0.05 level. There appears to be a curvilinear trend in the data, not a linear trend. Students at the extremes of age in this study have lower test scores than those in the central region; students who are 14 or 19 appear to perform significantly worse than others. The oldest students, those who are 19 years of age or older, have the lowest scores of any age group, but they make up only 1 percent of the sample. They represent a group of students who are older than their classmates. Most students this age have already graduated from high school. These students have most likely repeated at least one grade. I have created a new variable, acceleration (discussed later), to account separately for students who are behind or ahead of their classmates.

[Figure: boxplots of Total score by Age* (14 to 19, Msg)]

Dependent variable is: Total
1414 total cases of which 82 are missing
R2 = 3.7%   R2(adjusted) = 3.7%
s = 6.162 with 1332 – 2 = 1330 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   1968.21          1      1968          51.8
Residual     50530.7          1331   37.9644

Variable   Coefficient   s.e. of Coeff   t-ratio
Constant   1.80174       1.978           0.911
Age*       0.910080      0.1264          7.20

With a t-ratio of 7.20, the age of subjects in this study is significant at the p = 0.05 level. The variance in total test scores accounted for is 3.7 percent. The results on this test do not depend in any large way on the age of the students.

Item 51, Grade
Current Grade Level: A. 9 B. 10 C. 11 D. 12 E. other

Frequency breakdown of grade

Group     Count   %
8         389     27.5
9         177     12.5
10        137     9.69
11        311     22.0
12        302     21.4
Missing   98      6.93
Total     1,414

Students who took this test were in grades eight to twelve with a mean grade level of 10.0. The large number of eighth-grade students is a natural consequence of astronomy being taught as a part of earth science, as shown in the graph below.

[Figure: Science Enrollments in US Schools 1981-82. Total enrollment by grade (grades 7 to 12) in General Science, Life Science, Earth Science, Physical Science, Biology, Chemistry, Physics, and other courses. From "How Many are Enrolled in Science?" The Science Teacher, NSTA, December 1984.]

[Figure: bar chart of the number of students by Grade (8 to 12, Msg)]

Total scores by grade

Group     Mean    SD
8         13.63   4.91
9         15.18   6.44
10        16.15   5.40
11        17.03   6.19
12        18.65   6.93
Missing   15.59   8.01

Students at higher grade levels appear to have fewer misconceptions. The mean total test score rises with each grade level. This gain is significant at the p = 0.05 level. Grade level accounts for 9.1 percent of the variance. Examining the 95 percent confidence intervals for the median test scores, one can see that these intervals overlap in grades 9, 10, and 11. It appears that students in these grades do not perform significantly differently from each other at the p = 0.05 level. Those in grade twelve do significantly better and those in grade eight do significantly worse.

Dependent variable is: Total
1414 total cases of which 98 are missing
R2 = 9.1%   R2(adjusted) = 9.0%
s = 5.984 with 1316 – 2 = 1314 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   4720.72          1      4721          132
Residual     47047.5          1314   35.8048

Variable   Coefficient   s.e. of Coeff   t-ratio
Constant   4.04602       1.059           3.82
Grade      1.20506       0.1049          11.5

Although this result is significant, one must withhold judgment until after performing a multiple linear regression on these data. Students in higher grades have generally taken more mathematics and science courses. These may contribute to student scores more than grade level.

[Figure: boxplots of Total score by Grade (8 to 12, Msg)]

Item 53, Earth Science
Have you completed a course in Earth Science? A. Yes B. No

One would assume that taking courses in science would help students overcome misconceptions. In particular, earth science, usually taught in grades eight or nine, typically deals with astronomical concepts for as long as one-quarter of the course. However, earth science taught at the eleventh- and twelfth-grade levels is frequently recommended to students who are not pursuing the more rigorous science sequence of biology, chemistry, and physics (Schoon 1988).

Frequency breakdown of earth science

Group     Count   %
No        709     50.1
Yes       538     38.0
Missing   167     11.8
Total     1,414

Total score by earth science

Group     Mean    SD
No        15.58   6.11
Yes       17.17   6.41
Missing   14.26   6.97

[Figure: bar chart of the number of students by Earth Science (No, Yes, Missing)]

[Figure: boxplots of Total score by Earth Science (No, Yes, Missing)]

A large fraction of the students in this study have taken earth science. These students appear to do somewhat better on the test than do students who have not taken the subject. I followed up on Schoon’s suggestion that some students who take earth science may actually perform less well on certain items. On this test, students who had taken earth science did better on average on some problems and worse on others. One of the items exhibited a difference significant at the p = 0.05 level and was more often answered incorrectly by earth science students. This was item 4, which deals with the shape of the Earth’s orbit. Students were more likely to choose a highly elliptical shape for the orbit after taking earth science. This weakness was offset by statistically significant gains in four other items, numbers 3, 7, 23, and 35, which include the ability to read graphs, switch frames of reference, know that the Moon takes one year to orbit the Sun, and knowledge that astrology is not a science or useful for predicting events.

Dependent variable is: Total
1414 total cases of which 167 are missing
R2 = 1.6%   R2(adjusted) = 1.5%
s = 6.244 with 1247 – 2 = 1245 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   770.915          1      771           19.8
Residual     48545.0          1245   38.9920

Variable        Coefficient   s.e. of Coeff   t-ratio
Constant        15.5853       0.2345          66.5
Earth Science   1.58753       0.3570          4.45

Taking earth science appears to help students with a few misconceptions, but having taken the subject accounts for only a one and one-half item gain in total test score.

Item 54, Chemistry
Have you completed a course in Chemistry? A. Yes B. No

This question was included to account for students who had taken multiple science courses. It was thought that the total number of science courses taken might be a good predictor of the number of misconceptions in astronomy that students hold.

Frequency breakdown of chemistry

Group     Count   %
No        925     65.4
Yes       317     22.4
Missing   172     12.2
Total     1,414

Almost one-quarter of the students in this study have taken chemistry.

Total score by chemistry

Group     Mean    SD
No        15.34   5.74
Yes       19.16   6.89
Missing   13.92   6.92

[Figure: bar chart of the number of students by Chemistry (No, Yes, Missing)]

[Figure: boxplots of Total score by Chemistry (No, Yes, Missing)]

Dependent variable is: Total
1414 total cases of which 172 are missing
R2 = 7.0%   R2(adjusted) = 7.0%
s = 6.058 with 1242 – 2 = 1240 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   3445.59          1      3446          93.9
Residual     45514.1          1240   36.7050

Variable    Coefficient   s.e. of Coeff   t-ratio
Constant    15.3438       0.1992          77.0
Chemistry   3.82025       0.3943          9.69

Students who have taken chemistry do an average of almost four points better on total test score than students who have not. The reason for this difference, however, may have more to do with other factors than with having taken chemistry. Correlation and cause and effect are quite different. Unexamined factors, such as IQ, may be responsible for this difference, or factors examined in this study that are highly correlated with chemistry, such as mathematics or grade level, may explain even more variance.

Item 55, Physics
Have you completed a course in Physics? A. Yes B. No

Many of the questions on this test deal with material that is covered in physics courses. Many items are presumed by physics teachers to be known by students upon entering their physics classes. Of particular note is the set of questions dealing with light. One would expect that students who have taken physics would do quite a bit better than others who have not.

Frequency breakdown of physics

Group     Count   %
No        1097    77.6
Yes       147     10.4
Missing   170     12.0
Total     1,414

[Figure: bar chart of the number of students by Physics (No, Yes, Missing)]

Few students who take introductory astronomy or earth science have taken physics. Physics is usually offered only at the twelfth grade level for regular students and at the eleventh grade level for accelerated students. An analysis calls the accuracy of student responses into question. Fifty-eight students in grades eight, nine, and ten have designated that they have taken physics.

If these answers, which can be checked in some way, exhibit problems with accuracy, what can we conclude about the reliability of the answers to other demographic questions? In this case, students may be confusing an eighth- or ninth-grade physical science course with a high school physics course. Errors in answering questions may extend only as far as this question alone. If they extend further, however, this may have the result of reducing correlation and regression coefficients. Somewhat more variance could be explained by factors if students had answered them accurately.

Total scores by physics

Group     Mean    SD
0         15.86   6.04
1         19.32   7.32
Missing   14.24   6.88

In spite of this problem, students who say they have taken physics score about three and one-half points higher than those who have not. This result is statistically significant at the p = 0.05 level.

[Figure: % at Each Grade Level Reporting that They Have Taken Physics, by Grade (8 to 12); bar labels as extracted: 5%, 14%, 10%, 6%, 26%]

[Figure: boxplots of Total score by Physics (0, 1, Missing)]

Dependent variable is: Total
1414 total cases of which 170 are missing
R2 = 3.1%   R2(adjusted) = 3.1%
s = 6.208 with 1244 – 2 = 1242 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   1549.52          1      1550          40.2
Residual     47858.2          1242   38.5332

Variable   Coefficient   s.e. of Coeff   t-ratio
Constant   15.8624       0.1874          84.6
Physics    3.45738       0.5452          6.34

Item 52, Math Level
Highest level math course you have completed:
A. General Math B. Algebra I C. Geometry D. Algebra II E. Pre-calculus or Trigonometry

The four-course sequence of Algebra I, Geometry, Algebra II, and Pre-calculus or Trigonometry is standard in most American high schools. Algebra I starts the exponential decline in the college-level math sequence. A student can drop out of the high school math sequence at any grade. For the most part, those who do either take no more math in high school or drop to a general math course. Several problems on this test deal with mathematics in the areas of graph reading, angular measurement, or scientific notation.

Frequency breakdown of math level

Group            Count   %
1-General        456     32.2
2-Algebra I      290     20.5
3-Geometry       214     15.1
4-Algebra II     224     15.8
5-Pre-calculus   98      6.93
Missing          132     9.34
Total            1,414

Total score by math level

Group            Mean    SD
1-General        13.77   4.80
2-Algebra I      15.93   5.99
3-Geometry       16.36   6.30
4-Algebra II     18.58   6.39
5-Pre-calculus   21.14   7.97
Missing          15.33   7.27

[Figure: bar chart of the number of students by Math Level (1 to 5, Msg)]

[Figure: boxplots of Total score by Math Level (1 to 5, Msg)]

A substantial rise in total test score is apparent in the regression graph and is substantiated by the regression analysis below.

Dependent variable is: Total
1414 total cases of which 132 are missing
R2 = 12.0%   R2(adjusted) = 12.0%
s = 5.921 with 1282 – 2 = 1280 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   6137.17          1      6137          175
Residual     44871.2          1280   35.0557

Variable     Coefficient   s.e. of Coeff   t-ratio
Constant     12.1544       0.3409          35.7
Math Level   1.65046       0.1247          13.2

A background in mathematics makes a statistically significant difference in students’ performance on many problems. Problems 3, 10, 16, 19, 25, 26, and 33 were designed to test students’ misconceptions in mathematics. The regressions with these problems are all statistically significant. Math experience also appears to help students with problems 20, 22, and 45. Math level appears to hurt students in answering two questions: 11 and 41. These are significant at the p = 0.05 level.

Derived Variable: Acceleration

The grade level of a student is a somewhat redundant measure with age. By subtracting a student’s age from his or her grade and adding six, one has a measure of how advanced or impeded a student is in progress through school. I have defined a factor, acceleration, that is a measure of the degree to which students are ahead of or behind their classmates. Students who are one year younger than the average for students in their grade have an acceleration measure of +1. Those who are one year older than the average of their classmates’ ages have an acceleration measure of –1.
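The derived variable can be written directly from the rule stated here (grade minus age plus six). The sketch below is only an illustration; the function name `acceleration`, the treatment of missing answers as `None`, and the example students are assumptions, not details from the study.

```python
def acceleration(grade, age):
    """Acceleration measure as stated in the text: grade - age + 6.

    Returns None when either answer is missing (an assumed convention).
    """
    if grade is None or age is None:
        return None
    return grade - age + 6


# Hypothetical students, given as (grade, age in years).
students = [(9, 14), (9, 15), (9, 16), (12, 19), (10, None)]
for grade, age in students:
    print(f"grade={grade} age={age} acceleration={acceleration(grade, age)}")
```

Under this rule a ninth grader who reports age 14 receives +1, one who reports 15 receives 0, and one who reports 16 receives –1, so a positive value marks a student who is young for the grade.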

Frequency breakdown of acceleration

Group     Count   %
-4        4       0.283
-3        14      0.990
-2        16      1.13
-1        29      2.05
0         699     49.4
1         479     33.9
2         63      4.46
3         5       0.354
Missing   105     7.43
Total     1,414

Most students appear to be at the 0 or 1 level for acceleration. This disparity exists because of round-off error in students reporting their ages. For example, in tenth grade, the average student turns fifteen in September, so when the test was taken, about half the students reported that they were fourteen and half reported that they were fifteen. This results in an acceleration measure of 0 for about half the students and 1 for the other half.

[Figure: bar chart of the number of students by Acceleration (-4 to 3, Msg)]

Total score by acceleration

Group     Mean    SD
-4        8.75    .95
-3        11.28   2.30
-2        13.06   3.73
-1        13.41   5.73
0         14.69   5.79
1         18.37   6.30
2         17.47   6.86
3         13.00   3.87
Missing   15.56   13.41

Note that overall performance on the test increases with increasing acceleration. However, there is a substantial reduction when acceleration exceeds +2. This is indicative of a nonlinear trend in the data. Students who are two or more years younger or older than their classmates appear to hold more misconceptions than their peers.

Dependent variable is: Total
1414 total cases of which 105 are missing
R2 = 6.7%   R2(adjusted) = 6.6%
s = 6.067 with 1309 – 2 = 1307 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   3448.90          1      3449          93.7
Residual     48104.6          1307   36.8053

Variable       Coefficient   s.e. of Coeff   t-ratio
Constant       15.3017       0.1853          82.6
Acceleration   1.99420       0.2060          9.68

This acceleration factor is significant at the p = 0.05 level. Students with high acceleration measures appear to answer the following questions correctly more often than others: 1, 21, 26, 28, 31, 39, and 40. These correlations are significant at the p = 0.05 level.

[Figure: plot of total score (vertical axis, 10 to 40) by acceleration level (horizontal axis, -4 to 3 and Missing).]


C. Attitude Factors

A set of attitude factors was originally included on this test as a way to measure if certain attitudes changed as a result of taking the Project STAR course. The plan was to see if there were any changes from pre-test to post-test. I have included them here only for informational purposes. They are not to be treated as predictive factors.

Item 58, Educational Aspirations
What is the highest level of education that you plan to complete?

A. Not finish High School.
B. High school.
C. Trade, Vocational, or Business School.
D. Some college
E. College degree

Frequency breakdown of educational aspiration

Group     Count   %
1 - A        49   3.47
2 - B        88   6.22
3 - D       130   9.19
4 - C       137   9.69
5 - E       883   62.4
Missing     127   8.98
Total     1,414

A large majority of the students in this study are planning to attend some postsecondary school.

[Figure: bar chart of the number of students (vertical axis, 0 to 1000) in each educational aspiration group (horizontal axis, 1 to 5 and Missing).]


Total score by student’s educational aspirations

Group     Mean    SD
1 - A     12.49   5.28
2 - B     13.14   4.70
3 - D     13.62   5.19
4 - C     14.53   5.02
5 - E     17.20   6.51
Missing   15.29   7.27

Students with higher postsecondary aspirations appear to do much better on this test, especially those who intend to graduate from college. This trend is significant at the p = 0.05 level. Comparing the confidence intervals for the plotted medians, one can see an overlap for the first four categories and the missing data. Only students who aspire to finish college appear to perform significantly better (at the p = 0.05 level) than those with lesser goals.

Dependent variable is: Total
1414 total cases of which 127 are missing
R^2 = 6.5%   R^2(adjusted) = 6.4%
s = 6.110 with 1287 – 2 = 1285 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   3310.56          1      3311          88.7
Residual     47967.5          1285   37.3288

Variable         Coefficient   s.e. of Coeff   t-ratio
Constant         9.98389       0.6714          14.9
Ed. Aspiration   1.41121       0.1499          9.42

[Figure: plot of total score (vertical axis, 10 to 40) by educational aspiration group (horizontal axis, 1 to 5 and Missing).]


Item 59, Importance of Science
How important do you feel science will be in your future occupation?

A. Not at all B. Somewhat C. Important D. Very important E. Essential

This question was included in the instrument to see if students’ interest in pursuing scientific careers changed from pre-test to post-test. In the context of this study, it helps to show how students’ attitudes toward scientific careers relate to their test performance.

Frequency breakdown of importance of science

Group     Count   %
1           207   14.6
2           460   32.5
3           304   21.5
4           153   10.8
5           163   11.5
Missing     127   8.98
Total     1,414

Judging from the graph above, we see that roughly half the students who took this test are interested in pursuing careers in which science plays a major role.

Total score by importance of science

Group     Mean    SD
1         13.87   5.15
2         15.19   5.86
3         16.25   6.09
4         17.58   6.56
5         19.64   7.14
Missing   15.48   7.36

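[Figure: bar chart of the number of students (vertical axis, 0 to 500) in each importance-of-science group (horizontal axis, 1 to 5 and Missing).]

[Figure: plot of total score (vertical axis, 10 to 40) by importance-of-science group (horizontal axis, 1 to 5 and Missing).]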


Students who are interested in scientific careers do much better on this test than those who are not. This result is significant at the p = 0.05 level.

Dependent variable is: Total
1414 total cases of which 127 are missing
R^2 = 7.3%   R^2(adjusted) = 7.2%
s = 6.074 with 1287 – 2 = 1285 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   3733.50          1      3733          101
Residual     47408.9          1285   36.8941

Variable          Coefficient   s.e. of Coeff   t-ratio
Constant          12.3758       0.4054          30.5
Importance of …   1.37604       0.1368          10.1


Item 60, Reason for Taking Course
Why did you decide to take this course?

A. Curiosity or interest.
B. Hobby or amateur astronomer.
C. Needed credit.
D. Recommended by an adult.
E. Friend has taken the course.

This question was included on the original test to help determine how students make the decision to take a science elective in high school.

Frequency breakdown of reason for taking course (in alphabetical order)

Group       Count   %
Adult         132   9.34
Credit        423   29.9
Curiosity     584   41.3
Friend         69   4.88
Hobby          53   3.75
Missing       153   10.8
Total       1,414

Relatively few students took this course because a friend recommended it or because astronomy is their hobby.

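[Figure: bar chart of the number of students (vertical axis, 0 to 600) giving each reason for taking the course (horizontal axis: Adult, Credit, Curiosity, Friend, Hobby, Missing).]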


Total score by reason for taking course

Group       Mean    SD
Adult       15.17   5.45
Credit      13.89   5.25
Curiosity   17.81   6.51
Friend      15.00   5.67
Hobby       18.77   8.42
Missing     15.36   6.98

Students who gave the reason that they were curious about astronomy or that astronomy was their hobby did better than average in total score. Those who took the course to get credit had fewer correct answers. An adult’s or a friend’s referral was not significant at the p = 0.05 level.

Analysis of Variance For: Total

Source   df     Sum of Squares   Mean Square   F-ratio   Probability
cry      1      726.150          726.150       19.078    0.0000
hby      1      457.078          457.078       12.009    0.0005
crt      1      244.372          244.372       6.4204    0.0114
adt      1      2.60606          2.60606       0.06847   0.7936
frd      1      6.37061          6.37061       0.16737   0.6825
Error    1408   53591.4          38.0621
Total    1413   58019.9

[Figure: plot of total score (vertical axis, 10 to 40) by reason for taking the course (horizontal axis: Adult, Credit, Curiosity, Friend, Hobby, Missing).]


Dependent variable is: Total
1414 total cases of which 153 are missing
R^2 = 8.6%   R^2(adjusted) = 8.3%
s = 6.063 with 1261 – 5 = 1256 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   4353.40          4      1088          29.6
Residual     46167.9          1256   36.7579

Variable    Coefficient   s.e. of Coeff   t-ratio
Constant    15.1742       0.5277          28.8
Curiosity   2.63911       0.5843          4.52
Hobby       3.59934       0.9859          3.65
Credit      -1.28299      0.6045          -2.12
Friend      -0.174242     0.9007          -0.193


D. Analysis of Variance

The preceding section analyzed each factor alone for its contribution to the variance in total test score. This section applies the technique of multiple linear regression to explain the variance using all of the appropriate factors simultaneously. All demographic and schooling factors are included; attitude factors are not included in this model.

Multiple regression analysis is a technique for examining the effects of many independent variables on a dependent variable. In the case of this study, the independent variables are the demographic and schooling factors, and the dependent variable is the total test score. Using all of these available factors shows that 30.4 percent of the variance in total test score can be accounted for. This is a substantial portion of the variance, but 69.6 percent of the variance remains unaccounted for.


Regression Analysis of Total Score by Demographic and Schooling Factors

Dependent variable is: Total
1414 total cases of which 360 are missing
R^2 = 31.1%   R^2(adjusted) = 30.2%
s = 5.331 with 1054 – 14 = 1040 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   13339.2          13     1026          36.1
Residual     29555.4          1040   28.4187

Variable        Coefficient   s.e. of Coeff   t-ratio
Constant        8.56079       1.646           5.20
European        3.33754       0.3710          9.00
Male            2.31198       0.3333          6.94
Math level      1.00495       0.2027          4.96
Physics         2.75583       0.5805          4.75
Acceleration    1.06361       0.3118          3.41
Mother’s ed.    0.484542      0.1345          3.60
African         -0.866513     0.7749          -1.12
Earth science   0.329892      0.3578          0.922
Chemistry       0.410729      0.4634          0.886
Father’s ed.    0.105647      0.1266          0.834
Latin           0.039820      0.7012          0.057
Grade           0.004254      0.1863          0.023
Asian           0.009125      0.9449          0.010

Using all of these factors, one can see that some have large coefficients and some have small ones. The ratio of the absolute values of the largest to the smallest is 1,117:1. The factor coefficients appear to be very different in this multiple regression than when factors are treated singly in simple regression. For example, the coefficient for the impact of grade level in this model is a minuscule –0.001, while when treated alone it is 1.21. However, the key statistic in this analysis is the t-ratio. Those factors with t-ratios less than 2.00 in absolute value are not significant at the p = 0.05 level. The t-ratio for grade level, when all the factors are entered together, is a measly –0.008 and is not significant at the p = 0.05 level.
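The t-ratio rule of thumb described here can be applied mechanically to the full regression table. The sketch below is only an illustration of that screening step, using the coefficients and t-ratios as printed in the table; the list and variable names are illustrative.

```python
# (variable, coefficient, t-ratio) rows copied from the full regression table.
rows = [
    ("European", 3.33754, 9.00),
    ("Male", 2.31198, 6.94),
    ("Math level", 1.00495, 4.96),
    ("Physics", 2.75583, 4.75),
    ("Acceleration", 1.06361, 3.41),
    ("Mother's ed.", 0.484542, 3.60),
    ("African", -0.866513, -1.12),
    ("Earth science", 0.329892, 0.922),
    ("Chemistry", 0.410729, 0.886),
    ("Father's ed.", 0.105647, 0.834),
    ("Latin", 0.039820, 0.057),
    ("Grade", 0.004254, 0.023),
    ("Asian", 0.009125, 0.010),
]

T_CUTOFF = 2.00  # rule of thumb quoted in the text for the p = 0.05 level

# Compare the absolute value of each t-ratio with the cutoff.
significant = [name for name, _, t in rows if abs(t) >= T_CUTOFF]
not_significant = [name for name, _, t in rows if abs(t) < T_CUTOFF]

print("significant:", significant)
print("not significant:", not_significant)
```

The sign of a t-ratio only follows the sign of its coefficient, so the comparison uses the absolute value.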

As a consequence of this multiple regression, many factors are revealed to be not significant at the p = 0.05 level. These nonsignificant factors are grade level, Latin heritage, African heritage, Asian heritage, the taking of Earth Science, and the taking of Chemistry. The magnitude of the effect of each of these factors is the coefficient in the second column.

Using this particular stepwise regression model, one could predict the total score of a student by adding to the constant coefficient of 8.81 points, 2.3 points if male, 1 point for each year of acceleration, 3.2 points for being of European heritage, 1 point for each math course, 2.8 points for taking physics, and 0.5 points for each level of mother’s education.

Reduced Model of Total Score by Demographic and Schooling Factors

Dependent variable is: Total
1414 total cases of which 297 are missing
R^2 = 28.8%   R^2(adjusted) = 28.5%
s = 5.399 with 1117 – 6 = 1111 degrees of freedom

Source       Sum of Squares   df     Mean Square   F-ratio
Regression   13079.5          5      2616          89.8
Residual     32381.7          1111   29.1464

Variable       Coefficient   s.e. of Coeff   t-ratio
Constant       8.75662       0.5114          17.1
European       3.58453       0.3360          10.7
Math level     1.32554       0.1290          10.3
Male           2.38195       0.3252          7.33
Mother’s ed.   0.485370      0.1191          4.08
Physics        2.05254       0.5315          3.86
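The prediction rule quoted in the paragraph before the reduced model (a constant of 8.81 plus rounded per-factor points) can be written as a short function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and the numeric coding of math courses and of mother’s education levels is not given in this section, so the example inputs are illustrative only.

```python
def predict_total(male, acceleration_years, european,
                  math_courses, physics, mother_ed_level):
    """Predicted total score using the rounded points quoted in the text."""
    score = 8.81                          # constant
    score += 2.3 if male else 0.0         # male
    score += 1.0 * acceleration_years     # per year of acceleration
    score += 3.2 if european else 0.0     # European heritage
    score += 1.0 * math_courses           # per math course
    score += 2.8 if physics else 0.0      # has taken physics
    score += 0.5 * mother_ed_level        # per level of mother's education
    return score


# Hypothetical student: male, on schedule, European heritage, one math
# course, no physics, mother's education coded as level 3 (assumed coding).
example = predict_total(True, 0, True, 1, False, 3)
print(round(example, 2))
```

Because the points are rounded and come from one particular model, a function like this is useful for seeing the relative size of each factor rather than for scoring any individual student.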

These factors have been recalculated by building a stepwise model. Each factor was added to a linear equation based on the maximum amount of additional variance accounted for. Six factors are statistically significant and are included in the model. What are we to make of this model? The first three factors seem to contribute the most to the regression equation; the contribution of each of the last three is small (see Table XIX).

Table XIX. Variance Explained by Reduced Regression Model

Step   Item                 R^2   R
1      Math                 12%   35%
2      European heritage    21%   46%
3      Gender               25%   50%
4      Physics course       27%   52%
5      Mother's education   28%   53%
6      Acceleration         29%   54%

These results are graphed in Figure 20.
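The cumulative R^2 column of Table XIX can be turned into the additional variance contributed at each step, and the R column can be recovered as the square root of R^2. The following is a minimal sketch of that arithmetic; the variable names are illustrative.

```python
import math

# (item, cumulative R^2 in percent) copied from Table XIX, in step order.
steps = [
    ("Math", 12),
    ("European heritage", 21),
    ("Gender", 25),
    ("Physics course", 27),
    ("Mother's education", 28),
    ("Acceleration", 29),
]

previous = 0
increments = []
for name, cumulative in steps:
    gain = cumulative - previous              # variance added at this step
    r_percent = round(math.sqrt(cumulative / 100) * 100)  # R as a percent
    increments.append((name, gain, r_percent))
    previous = cumulative

for name, gain, r_percent in increments:
    print(f"{name}: +{gain}% of variance, R = {r_percent}%")
```

Read this way, the first three factors add 12, 9, and 4 percentage points, while each of the last three adds 2 points or less.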


To what degree are the first three factors independent from each other? One can tell from the magnitude of the correlation coefficients, summarized in Table XVIII (from Appendix E). These factors are not highly correlated.

Table XVIII. Correlation of Major Background Factors

                    Math   Euro.   Gender
Math                1.00
European heritage   0.20   1.00
Gender              0.11   0.05    1.00

That the mathematics level of students is the major contributor to student test scores is surprising. There are seven problems that deal with mathematics on the test, and students with higher-level math courses do significantly better on these questions. These students also do better on several questions that are not related to mathematics: items 7, 21, 23, and 45. These are questions on frames of reference, the rotational rate of the Earth, the Moon’s orbital period about the Sun, and light propagation in the daytime.

[Figure 20: bar chart; vertical axis: R^2, % of Variance Accounted For (0% to 30%); bars for Math, European heritage, Gender, Physics course, Mother's education, and Acceleration.]

Figure 20. Stepwise Regression Model of Total Score by Six Background Factors


VIII. Discussion

This work confirms misconceptions that have been addressed in smaller studies and develops a profile of conceptions of introductory astronomy students. It addresses the relationship between performance on a test of misconceptions and various background experiences and factors of the subjects. I shall first address each of the research questions in turn. I will then discuss the items on the test and follow with dissemination issues and problems specific to this dissertation.

A. Research Questions

1. Validity

1(a) Is the test a valid instrument for measuring the misconceptions of students entering an introductory astronomy course?

This test is a valid instrument for measuring the misconceptions of students in astronomy. The test was generated from interviews with students about their ideas and by combing the literature for misconceptions about astronomical concepts. The instrument was pilot-tested on thousands of students to refine the wording of questions, so that distractors reflected popular ideas of the students.

A group of astronomers took the test and agreed on the answers to all forty-seven questions. Graduate students in Harvard’s Department of Astronomy took early versions of the test. Those questions that were answered incorrectly by these students were changed or eliminated. Teachers involved in the Project STAR curriculum reviewed the test and suggested changes in items that they viewed as confusing or inaccurate.

A large group of 240 teachers of introductory astronomy and earth science helped to validate the test by predicting how their own students would perform on sixteen items from this test. They predicted that their own students would answer the items correctly at a 73 percent mastery level, on the average, after taking their course.

Alternative hypotheses were explored for why some questions were answered with greater frequency than others. The factors explored for each item were: the order in the test, number of students who chose not to answer, the inclusion of a picture or diagram, concept, fact, or math skill, and reading level. None was found to be significant at the p = 0.05 level.

Two χ^2 tests were performed. The first was to determine if any student answers to items could be explained as based on random guessing of all the possible answers. The second looked only at the distractors, discounting any students who answered items correctly. Both hypotheses, that the statistics were the result of guessing, were rejected at the p = 0.05 level.
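The two guessing tests described above can be sketched as Pearson chi-square goodness-of-fit tests against equal expected counts. This is an illustration only: the response counts below are hypothetical, not data from the study, the correct answer is assumed to be choice E, and the function name is illustrative. The critical values are the standard chi-square values at p = 0.05 for 4 and 3 degrees of freedom.

```python
def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal expected counts."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


# Hypothetical counts of students choosing A, B, C, D, E on one item.
item_counts = [120, 480, 95, 60, 245]

CRITICAL_DF4 = 9.488  # five answer choices, 4 degrees of freedom
CRITICAL_DF3 = 7.815  # four distractors, 3 degrees of freedom

# First test: are all five choices consistent with random guessing?
all_choices = chi_square_uniform(item_counts)

# Second test: drop the correct answer (assumed to be E) and ask whether
# the four distractors alone are consistent with random guessing.
distractors_only = chi_square_uniform(item_counts[:4])

print("guessing rejected (all choices):", all_choices > CRITICAL_DF4)
print("guessing rejected (distractors):", distractors_only > CRITICAL_DF3)
```

When one distractor attracts far more students than the others, as choice B does in this made-up item, both statistics greatly exceed their critical values, which is the pattern expected when a specific misconception rather than guessing drives the responses.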

1(b) Which test items appear to be most appropriate in assessing student misconceptions in astronomy and should be included in revised instruments?

All test items were subject to a calculation and discussion of their difficulty (P-value) and discrimination (D-value). A stepwise regression was performed that generated a list of items that were, as a group, highly discriminatory. This technique selected a group of items that could more efficiently predict students’ total scores by eliminating questions that were highly correlated. A set of seven problems was able to account for 75 percent of the variance in the population. A set of twenty-one questions could account for 90 percent of the variance. Either set could be used by researchers and teachers as a highly discriminatory test of students’ astronomical misconceptions.
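As an illustration of the two item statistics named above, the sketch below computes a P-value as the fraction of students answering an item correctly and a D-value as the difference in that fraction between the highest- and lowest-scoring students. The exact D-value formula used in this study is not given in this section, so the upper-minus-lower definition and the 27 percent group size are assumptions, as are the function names and the data.

```python
def p_value(item_correct):
    """Difficulty: fraction of students who answered the item correctly."""
    return sum(item_correct) / len(item_correct)


def d_value(item_correct, total_scores, fraction=0.27):
    """Discrimination: P in the top group minus P in the bottom group."""
    n = len(total_scores)
    k = max(1, round(n * fraction))  # size of each comparison group
    order = sorted(range(n), key=lambda i: total_scores[i])
    bottom, top = order[:k], order[-k:]
    p_top = sum(item_correct[i] for i in top) / k
    p_bottom = sum(item_correct[i] for i in bottom) / k
    return p_top - p_bottom


# Hypothetical data for ten students: total test score and whether each
# answered one particular item correctly (1) or not (0).
total_scores = [8, 10, 12, 13, 15, 17, 19, 22, 25, 30]
item_correct = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

print("P-value:", p_value(item_correct))
print("D-value:", d_value(item_correct, total_scores))
```

An item answered correctly only by the strongest students has a D-value near 1, while an item that strong and weak students answer equally well has a D-value near 0.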

1(c) How reliable is this test?

Several tests of internal validity were carried out as measures of this instrument’s reliability. These measures range from a low of 0.76 to a high of 0.80. These results are consistent with other tests of this type. For achievement tests that are used to determine whether the mean scores of subgroups are significantly different, a reliability coefficient of 0.65 is satisfactory (Aiken 1985). The test–retest method was not used.
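This section does not name the specific reliability measures that were computed. As one common example of an internal-consistency coefficient for items scored right or wrong, the sketch below computes the Kuder-Richardson formula 20 (KR-20); treating it as representative of the measures used here is an assumption, and the response matrix is hypothetical.

```python
def kr20(responses):
    """KR-20 reliability for a list of per-student lists of 0/1 item scores."""
    n_students = len(responses)
    n_items = len(responses[0])

    # Variance of students' total scores (population form).
    totals = [sum(student) for student in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students

    # Sum over items of p * (1 - p), where p is the item's fraction correct.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(student[j] for student in responses) / n_students
        sum_pq += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses: five students, four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print(round(kr20(responses), 2))
```

A coefficient like this rises when items tend to be answered correctly by the same students, which is why it is read as a measure of how consistently the items measure one thing.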

2. Misconceptions Revealed

2(a) For students enrolling in a course where astronomical concepts are taught, for which concepts will students initially hold conceptions that are at odds with accepted scientific views? 2(b) Which misconceptions appear to be the most prevalent among students?

This test revealed that students held a level of mastery over only three astronomical concepts tested. They held misconceptions, however, about most of the major astronomical concepts treated in introductory astronomy and earth science courses.

Fifty-one student misconceptions were revealed by this test, nineteen of which were preferred by more students than was the correct answer. These misconceptions were listed by overall student preference. An item response curve was generated for each question, comparing the P-values of the correct answer and four distractors across five student performance levels. Surprisingly, some twenty-two wrong answers were preferred with greater frequency by higher-performing students.
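An item response curve of the kind described above can be sketched by sorting students by total score, splitting them into five performance levels, and computing the fraction choosing each answer within each level. How the study formed its five levels is not stated in this section, so the equal-sized groups used here are an assumption, and the function name and data are hypothetical.

```python
from collections import Counter


def item_response_curve(choices, total_scores, options="ABCDE", n_levels=5):
    """Fraction of students choosing each option at each performance level.

    Students are ranked by total score and split into n_levels groups of
    (nearly) equal size; requires at least n_levels students.
    """
    n = len(choices)
    order = sorted(range(n), key=lambda i: total_scores[i])
    curve = []
    for level in range(n_levels):
        group = order[level * n // n_levels:(level + 1) * n // n_levels]
        counts = Counter(choices[i] for i in group)
        curve.append({opt: counts[opt] / len(group) for opt in options})
    return curve


# Hypothetical data for one item: ten students, correct answer assumed to be E.
total_scores = [5, 8, 11, 14, 17, 20, 23, 26, 29, 32]
choices = ["B", "B", "B", "D", "B", "E", "B", "E", "E", "E"]

for level, fractions in enumerate(item_response_curve(choices, total_scores), 1):
    print(level, fractions)
```

A distractor whose fraction rises from the lowest to the middle or upper levels is the kind of wrong answer the text describes as being preferred by higher-performing students.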

Perhaps the best way to characterize student understanding is to present a profile of a hypothetical student entering an earth science or astronomy course. Although there are probably no students with all of these ideas, this composite represents the average student in our sample. In this profile, I have taken some liberty in reporting misconceptions that are very popular, but may not be held by a plurality of students:

The composite student is a male high schooler who has not taken earth science, chemistry, or physics previously. His math background consists of a course in Algebra I. He identifies himself as being of European heritage. Both his father and his mother have some higher education, a few years at college or a degree from a two-year school. The student intends to go on to get a college degree and thinks that science will, at least, be somewhat important in his future occupation. He has chosen to take this astronomy course because he is curious about the subject, but he does not consider it a hobby. He has firm ideas about most scientific concepts and rarely guesses at the answers to the questions in this test.


He tends to think of the astronomical world as fixed, or, at least, as constant. He can state that the Earth turns on its axis, but he is not quite sure of the ramifications of this motion. The length of daylight, the path of the Sun in the sky, and the movement of the Sun against the background of stars are all misconceived. In his view, the Sun moves in a uniform, unchanging way, rising in the East, being overhead at noon, and setting in the West. Its path is independent of geographic location or season. He knows that the Earth orbits the Sun in a year, but thinks that its path is highly elliptical.

He has ideas about the size of and relative distance between astronomical objects that are vastly out of proportion. Both the Earth and the Sun are thought to be about ten times their actual diameter. Solar system objects are thought to be much closer to each other than they actually are. This supports his view that the seasons are caused by the Earth’s changing distance from the Sun and the Moon’s phases are caused by the Earth’s shadow. The Moon circles the Earth in a day while the stars appear fixed in the sky. The entire universe is compressed. Since the stars are fixed, traveling to another star would not change the appearance of constellations. Galaxies are much further away than the visible stars. The universe itself is static, neither expanding nor contracting. Gravity does not play a major role in the structure of the universe since it is not dependent on mass and distance, but only on air pressure.

The nature of light is thought to be understood. Light takes time to reach us from the stars, but its role in vision is misconceived. Light exists only where it can be seen. When a flashlight illuminates an object at night, he thinks there is no light between the flashlight and the object; light only exists where its effect can be seen by the observer’s eyes. During the daytime, sunlight is thought to force the light from leaving a source. Moreover, objects can be seen without light traveling from them to one’s eye. He also believes that light intensity diminishes with the inverse first power of distance. Colored objects transform the color of light, as opposed to selectively absorbing different wavelengths. Light is not believed to be composed of particles, but to be a condition.

Misconceptions in mathematics limit the usefulness of graphs and calculations in helping to understand astronomical concepts. He can extrapolate graphical data, but has difficulty reading graphs and extracting useful information or patterns. His understanding of scientific notation is poor. Order-of-magnitude calculations are difficult for him and are often performed incorrectly. He understands angles only when they are concrete and small. More abstract arguments using angular measure are not effective with him. He thinks a circle has only 180° of internal angle. Size-to-distance ratios are a foreign idea to him. He solves simple algebraic equations, but cannot apply proportional reasoning to real-world or word problems. He sees math as a separate subject with little relevance to or utility in learning science.

3. Demographic Factors and School-Based Factors

Significant differences in student performance on this test relate to several demographic factors. By using the technique of multiple linear regression, the most important factors were identified. The model constructed from this analysis accounts for roughly 30 percent of the variance in total test scores of the student population. Thus, most of the variance remains unexplained. Somehow students who do well on this test are exposed to information or processes that were undetected by this test. Because overall test performance was poor, one explanation for the small amount of variance explained by identified factors is that there was considerable guessing on the part of students and that there were other unaccounted-for factors.

3(a) Are differences in the quantity of misconceptions related to gender? 3(b) Are differences in the quantity of misconceptions related to ethnic heritage? 3(c) Are differences in the quantity of misconceptions related to the educational accomplishment of parents or guardians?

Male students perform better on this test by about 3.5 items, or about 7 percent. Students who identify themselves as of European heritage average 5 items, or 11 percent, higher on the test than others. Both these results are significant at p = 0.05, but are not unusual for tests of ability.

What is much more surprising is that other background factors play such a limited role in accounting for the variance in results. Mother’s education is a small factor, explaining only an additional 1 percent of variance in the regression model. Father’s education was not significant at the p = 0.05 level. The fact that fathers of some students have much more education than others was not significantly related to student scores. It appears that the educational level of parents has little influence on the quantity of misconceptions that students hold. Two possible explanations are that parents may transmit misconceptions to their students, or that, even if parents are aware of scientific conceptions, they have difficulty transmitting these views effectively to their children.

These two factors are usually used to characterize the socioeconomic background of students. It appears that the income level of a student’s parents has little relation to his or her conceptual understanding in astronomy. Students without the benefit of highly educated parents appear to be at no disadvantage as far as scientific misconceptions are concerned.

4(a) Are differences in the quantity of misconceptions related to a student’s grade level or age? 4(b) Are differences in the quantity of misconceptions related to a student’s prior completion of specific mathematics or science courses?

Only one highly significant schooling factor was uncovered by this study. The level of math courses taken explained the largest fraction of variance in the regression model. Students with several math courses had fewer misconceptions. Even for items on the test that did not deal with mathematical skills, the level of mathematics explained a large amount of the variance. Two other factors were significant, but minor in their effect: Whether or not a student had taken a physics course explained an additional 2 percent of the variance. Only a 1 percent effect of the accounted variance was explained by “acceleration,” that is, whether a student is ahead of or behind her or his classmates on the basis of age.

Again, the most astounding result of this multiple regression is that several factors appear to have no impact on the number of student misconceptions. Student scores were found to be independent of grade level when all other factors were taken into account. It appears that spending more time in school has no impact on scientific misconceptions. Taking an earth science course did not appear to have a significant impact on scientific misconceptions either. Even though one-quarter of most earth science curricula is astronomy, this fact did not appear to reduce misconceptions in the sample. Taking an additional science course or a year of chemistry did not help either.

B. Characterizing Misconception Questions

Characteristics of items that identify student misconceptions are quite different from those on standardized tests or those that teachers create for student tests. The P-values of these misconception questions are often low, because students prefer an incorrect misconception to the correct answer. These questions are less able to discriminate between students who perform well on an entire test and those who do poorly, because these questions are more difficult. Analysis of test items may show that multiple-choice misconception items are very good test questions and should be included in standardized tests in spite of their low P-values.

Creating items that reliably identify misconceptions is not easy. There is no simple rule to help create these questions except, perhaps, one: misconception questions are more difficult than ordinary questions; they must contain very attractive distractors. Testmakers must use interviews or research results to find such distractors. A question that is answered correctly by 75 percent of subjects cannot reveal misconceptions. As revealed by this test, students appear often to choose misconceptions with the same average frequency as correct answers. A P-value greater than 0.50 prevents this type of result. As a rule-of-thumb, misconception questions must have P-values less than 0.50. On this test they have averaged 0.34.

This result on testing for misconceptions may be somewhat disheartening to teachers and their students. A test made up of only misconception questions might indeed yield a mean score of 34 percent. In most classrooms, this would be equivalent to a letter grade of F. If teachers are to test using misconception questions, they must let their assignments of letter grades reflect this lower average P-value of questions.

C. Dissemination

Dissemination of these findings should heighten the awareness of practitioners as to the prevalence of scientific misconceptions in their students. I expect the impact of these results on committed teachers to be a reduction in the number and complexity of concepts presented in introductory astronomy courses and the implementation of explicit treatment of student misconceptions through discussion and experimentation. The presentation of a revised misconception test, in multiple-choice format, will encourage teachers and researchers to use this test as a tool for assessing the misconceptions of their own students both prior to and after instruction. Individual items may find their way onto exams and standardized tests as well.

I expect to submit the results of this study in summary form to the journal Science Education. Two papers already submitted are:

Philip M. Sadler. High School Astronomy: Characteristics and Student Learning, Proceedings of the Workshop for Hands-on Astronomy for Education. Tucson, AZ: Fairborn Observatory, on 3/5/91.


Alan Lightman and Philip M. Sadler. How Well Can Science Teachers Predict Student Misconceptions before and after Instruction? Submitted to Science Education, March 1992.

This study will set the stage for further papers on the role of misconceptions in teaching astronomy. One is planned:

• The effectiveness of the Project STAR curriculum in reducing student misconceptions in astronomy. This will be an analysis of pre-test and post-test results for control and Project STAR groups.

D. Errors, Omissions, and ProblemsThis study is subject to the limitations of any survey. It is a sample of a

total population and, even though the sample is large, it may be biased. Studentsin the selected schools may be different from the general population of studentsenrolling in introductory astronomy classes.

The test itself may be biased in favor of males of European extraction whohave taken many science and math classes. After all, I certainly fit thatdescription, as do most of those who helped to create and administer the test.Perhaps some subtle nuances in wording or style worked their way into the test.Also, students may have worked harder to complete the test if they sharedcertain attributes with the administering teacher.

Correct answers on this test were not equally distributed; 38 percent ofthe correct answers corresponded to answer “E.” Students may have discoveredthis preference and chosen this letter disproportionately or may have avoidedthis answer. Future versions of this test should remove any such preference.

Demographic questions could have offered better choices. The extensionof student ages downward by recoding could have been avoided by offering awider range of possible responses. The fact that a large portion of the studentschose “other” for their ethnic heritage points to a need for an expanded range ofchoices for future tests or better definitions in the sense that all students willunderstand all choices.

Some questions on this test do not appear to measure misconceptions well. There is no dominant choice of an answer from students on these questions: the D-value of the correct answer is low. These questions should be reworked and retested, or eliminated from the test.

The reliability of the test could be improved. Internal consistency should not be the only measure used to evaluate reliability. A test–retest procedure could be carried out on a group of students sufficiently large to produce statistically significant results.
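The two kinds of reliability mentioned above can be illustrated with a short sketch. It assumes items are scored 0 or 1, uses the Kuder-Richardson 20 formula as one common internal-consistency measure (the passage does not say which coefficient was used in this study), and uses a Pearson correlation between two administrations for test–retest reliability. All scores shown are hypothetical.

```python
from statistics import mean, pvariance


def kr20(responses):
    """Kuder-Richardson 20 for a list of students' 0/1 item scores."""
    k = len(responses[0])                      # number of items
    totals = [sum(student) for student in responses]
    var_total = pvariance(totals)              # variance of total scores
    pq_sum = 0.0
    for item in range(k):
        p = mean(student[item] for student in responses)  # share correct
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / var_total)


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Hypothetical data: four students, three items, and a later retest total.
item_scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
first_totals = [sum(student) for student in item_scores]
retest_totals = [3, 1, 2, 0]

internal_consistency = kr20(item_scores)
test_retest = pearson_r(first_totals, retest_totals)
```

Internal consistency is computed from a single sitting, while the test–retest coefficient requires the same students to sit the test twice, which is why the second measure needs a separately recruited group.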

The tests of validity could be strengthened. Accomplished individuals, such as astronomy graduate students or astronomy teachers, should take the test in its entirety. Astronomy teachers should predict outcomes of their students on the entire test, rather than on only a subset of the test. Using only sixteen items is not as strong a validation procedure as using the entire test. A comparison could also be made between how students answer the written test questions and how they choose answers when taking the test orally.


E. Future Extensions

The level of overall understanding of astronomical concepts in this student population is appalling, even in a pre-test, and probably limits students’ ability to integrate new concepts into their already well-developed frameworks of understanding. Introductory earth science and astronomy books pay scant attention to many of the ideas that students hold. Without revisiting these misconceptions, students are damned to try to place new conceptions upon faulty foundations. Perhaps this is why taking an earth science course has no impact on student conceptions in astronomy. These misconceptions were never dealt with before attempting to teach new ideas.

Several researchers have found that students can abandon their misconceptions and learn scientifically correct ideas only with unusual teaching methods. The key technique is that students must elucidate their own preconceptions and then test them. Only by realizing that their own ideas cannot explain the outcomes of experiments or natural phenomena do students realize a need for a different theory. Teachers can then present the scientifically accurate concept as a powerful idea that can predict and explain events. One consequence of accepting these new ideas is, strangely enough, that the old conceptions are forgotten. So, misconceptions appear to be erased from students’ minds. This makes it very difficult for teachers to recall misconceptions from their own student days; they simply do not remember them. Teachers must rely on their own interviews or become familiar with the literature on scientific misconceptions in order to incorporate these ideas in their teaching.

The low test scores could be thought of as boding well for teachers of introductory earth science and astronomy courses. There is a lot of room for improving students’ scientific conceptions. This test will prove to be a useful instrument in attempts to determine if interventions that seek to modify students’ conceptions produce significant change. This test provides an essential baseline of student misconceptions for further studies. By analyzing post-test scores of Project STAR treatment groups and control groups, the impact of this program can be assessed and results documented for future efforts of curriculum reform.
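A comparison of this kind could be sketched as follows. The example assumes each student has a matched pre-test and post-test score, compares the mean gain of a treatment group with that of a control group, and expresses the difference as a standardized effect size using a pooled standard deviation. The scores are hypothetical, and the choice of statistic is an assumption made for illustration, not the analysis planned for Project STAR.

```python
from statistics import mean, stdev


def gains(pre, post):
    """Per-student gain: post-test score minus pre-test score."""
    return [after - before for before, after in zip(pre, post)]


def cohens_d(group_a, group_b):
    """Difference in group means divided by the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5


# Hypothetical scores for four students in each group.
treatment_gains = gains(pre=[10, 12, 14, 16], post=[18, 19, 24, 23])
control_gains = gains(pre=[11, 12, 15, 14], post=[13, 15, 16, 18])

gain_difference = mean(treatment_gains) - mean(control_gains)
effect_size = cohens_d(treatment_gains, control_gains)
```

Working with gains rather than raw post-test scores is one way to allow for groups that start at different levels, which is where a pre-test baseline such as this one becomes useful.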

Page 212: The Initial Knowledge State of High School Astronomy ... · Belmont High School contributed by taping interviews with many of their students. This process revealed students’ ideas

04IX. ReferencesAbacus Concepts Inc. Statview 512+. Calabasas, CA: Brainpower Inc., 1986.Aiken, Lewis R., Psychological Testing and Assessment. Boston: Allyn and Bacon,

1985.Anderson, B., and C. Karrqvist. “How Swedish pupils, aged 12-15 years,

understand light and its properties.” European Journal of Science Education 5 (41983a): 387-402.

Anderson, B., and C. Karrqvist. Light and Its Properties. Trans. by GillianThylander. Molndal, Sweden: University of Gothenburg, 1983b.

Anderson, Charles W., and Edward L. Smith. Children’s Conceptions of Light andColor: Understanding the Role of Unseen Rays. Institute for Research onTeaching, Michigan State University, 1986. ERIC ED 270 318.

Anderson, G. Encyclopedia of Educational Evaluation. London: Jossey-Bass, 1975.Apelman, M. “Critical barriers to understanding of elementary science: Learning

about light and color.” In Observing Science Classrooms: Observing SciencePerspectives from Research and Practice, ed. by Charles Anderson. Columbus,OH: ERIC/SMEAC, 1984.

Aristotle. de Sensu. Trans. by W. S. Hett. London: Loeb Classical Library, ed.,1957.

Arnaudin, Mary W., and Joel J. Mintzes. “Students’ alternative conceptions of thehuman circulatory system: A cross-age study.” Science Education 69 (5 1985):721-733.

Arons, Arnold B. “Student patterns of thinking and reasoning, part one ofthree.” The Physics Teacher (December 1983): 576-581.

Atkin, J. Myron. “Some evaluation problems in a course content improvementproject.” Journal of Research in Science Teaching 1 (1963): 129-132.

Ausubel, D.P., J.D. Novak, and H. Hanesian. Educational Psychology: A CognitiveView. New York: Holt, Rinehart and Winston, 1978.

Bell, Alan, Gard Brekke, and Malcom Swan. “Misconceptions, Conflict andDiscussion in the Teaching of Graphical Interpretation.” In 2nd InternationalSeminar on Misconception and Educational Strategies in Science and Mathematicsin Ithaca, NY, ed. by Joseph D. Novak. Ithaca, NY: Cornell University Press,1987, pp. 46-58.

Bell, Beverly, Roger Osborne, and Ross Tasker. “Finding out what childrenthink.” In Learning in Science, The Implication of Childrens’ Science, ed. by RogerJ. Osborne and Peter Freyberg. Auckland, New Zealand: Heineman, 1985, pp.151-165.

Bloom, B. S. “Mastery Learning.” In Mastery Learning: Theory and Practice, ed. byJ. Block. New York: Holt, Rinehart, and Winston, 1971.

Bouwens, Robert E. A. “Misconceptions among pupils regarding geometricaloptics.” In GIREP - Cosmos - and Educational Challenge in Copenhagen, EuropeanSpace Agency, 369-370, 1986.

Page 213: The Initial Knowledge State of High School Astronomy ... · Belmont High School contributed by taping interviews with many of their students. This process revealed students’ ideas

05Broman, Lars. 27 Steps to the Universe. Salt Lake City: International PlanetariumSociety/Hansen Planetarium, 1986.

Brown, David E., and John Clement. “Misconceptions concerning Newton’s lawof action and reaction: The underestimated importance of the third Law.” InGIREP - Cosmos - and Educational Challenge in Copenhagen, European SpaceAgency, 1986, pp. 39-53.

Brumby, Margaret N. “Misconceptions about the concept of natural selection bymedical biology students.” Science Education 68 (4 1984): 493-503.

Camp, Carol Ann. “Problem solving patterns in science: Gender and spatialability during early adolescence.” Ed.D., University of Massachusetts,Amherst, 1981.

Caramazza, A., M. McCloskey, and B. Green. “Naive beliefs in sophisticatedsubjects: Misconceptions about trajectories of objects.” Cognition 9 (1981):117.

Carey, Sue. Conceptual Development in Children. Cambridge, MA: MIT Press, 1985.Carter, Karl C., and Bruce R. Stuart. “Using a celestial sphere to test scientific

concepts.” Journal of College Science Teaching 19 (3 1989): 164-167.Champagne, A. B. and Leo E. Klopfer. “A causal model of students’ achievement

in a college physics course.” Journal of Research in Science Teaching 19 (1982):299.

Champagne, A. B., L. E. Klopfer, and J. D. Anderson. “Cognitive research andthe design of science instruction.” Educational Psychologist 17 (1 1982): 31-53.

Champagne, A. B., Leo E. Klopfer, and J. H. Anderson. “Factors influencing thelearning of classical mechanics.” American Journal of Physics 48 (1980): 1074.

Clement, John. “Overcoming students’ misconceptions in physics: The role ofanchoring intuitions and analogical validity.” In GIREP - Cosmos - andEducational Challenge in Copenhagen, ed. by J. Hunt. European Space Agency,1986, pp. 84-97.

Clement, John. “Students preconceptions in introductory mechanics.” AmericanJournal of Physics 50 (1982): 66.

Cohen, H.G. “Dilemma of the objective paper-and-pencil assessment within thePiagetian framework.” Science Education 64 (1980): 741-745.

Cohen, Michael R. “How can the sunlight hit the Moon if we are in the dark?Teacher’s concepts of phases of the Moon.” Paper presented at Henry LesterSmith Conference on Educational Research, 1982.

Cohen, Michael R., and Martin H. Kagan. “Where does the old Moon go?” TheScience Teacher 46 (1979): 22-23.

Dai, Meme F. “Misconceptions about the Moon held by fifth and sixth graders inTaiwan.” National Science Teachers Association, 1990.

Dean, Geoffey. “Does Astrology Need to Be True?” The Skeptical Inquirer 11(1987): 166-184.

Page 214: The Initial Knowledge State of High School Astronomy ... · Belmont High School contributed by taping interviews with many of their students. This process revealed students’ ideas

06Dobson, Henry David. “An Experimental Study of the Effectiveness of thePlanetarium in Teaching Selected Science Concepts in the Middle School.”Ph.D. dissertation, Pennsylvania State University, 1983.

Doménech, Antonio and Elena Casasús. “Galactic structure: A constructivistapproach to teaching astronomy.” School Science Review 72 (260 1991): 87-93.

Driver, R., and J. Easley. “Pupils and paradigms: A review of literature related toconcept development in adolescent science students.” Studies in ScienceEducation 5 (1978): 61-84.

Driver, Rosalind, Edith Guesne, and Andrée Tiberhien. “Some features ofchildren’s ideas and their implications for teaching.” In Children’s Ideas inScience, ed. by Rosalind Driver, Edith Guesne, and Andrée Tiberhien.Philadelphia: Open University Press, 1985, pp. 193-201.

Duckworth, Eleanor. “The Having of Wonderful Ideas” & Other Essays on Teachingand Learning. New York: Teachers College Press, 1987.

Dufresne, Robert, William Gerace, Pamela T. Hardiman, and Jose Mestre.“Hierarchically structured problem solving in elementary mechanics: Guidingnovices’ problem analysis.” In GIREP Conference, Cosmos — An EducationalChallenge in Copenhagen, European Space Agency, 1986, pp. 116-130.

Eaton, Janet F. “Student misconceptions interfere with science learning: casestudies of fifth-grade students.” Elementary School Journal 84 (4 1984): 365-379.

Eaton, Janet F., Charles W. Anderson, and Edward L. Smith. Student’sConceptions Interfere with Learning: Case Studies of Fifth Grade Students.Institute for Research on Teaching, Michigan State University, 1983. ERIC ED228 094.

Ebel, Robert L. and David A. Frisbie. Essential of Educational Measurement.Englewood Cliffs, NJ: Prentice Hall, 1991.

Edoff, James Dwight. “An experimental study of the effectiveness ofmanipulative use in planetarium astronomy lessons for fifth and eighth gradestudents.” Ed.D., Wayne State University, 1982.

Erickson, Gaalen and Andree Tiberghien. “Heat and temperature.” In Children’sIdeas in Science, ed. by Rosalind Driver, Edith Guesne, and Andrée Tiberhien.Philadelphia: Open University Press, 1985, pp. 52-84.

Farrell, Margaret A. and Walter A. Farmer. “Adolescents’ performance on asequence of proportional reasoning tasks.” Journal of Research in ScienceTeaching 22 (6 1985): 503-518.

Feher, Elsa. “Conception of Light and Color.” In American Association of PhysicsTeachers in Atlanta, ERIC1986.

Festinger. A Theory of Cognitive Dissonance. Evanston, IL: Row, Peterson andCompany, 1957.

Finley, Fred N. “Evaluating instructing: the complementary use of clinicalinterviews.” Journal of Research in Science Teaching 23 (1986): 635-660.

Page 215: The Initial Knowledge State of High School Astronomy ... · Belmont High School contributed by taping interviews with many of their students. This process revealed students’ ideas

07Fisher, Kathleen M., and Joseph I. Lipson. “Science education in othercountries—Issues and questions.” In Science Education in Global Perspective, ed.byMargrete Siebert Klein and F. James Rutherford. 1-11. Boulder, CO:Westview Press, 1985.

Freyberg, Peter and Roger Osborne. “Constructing a survey of alternativeviews.” In Learning in Science, The Implication of Childrens’ Science, ed. byRoger J. Osborne and Peter Freyberg. Auckland, New Zealand: Heineman,1985, pp. 166-167.

Friedman, Alan J., Lawrence F. Lowery, Steven Pulos, Dennis Schatz, and Cary I.Sneider. Planetarium Educator’s Workshop Guide. Berkeley, CA: InternationalPlanetarium Society/Lawrence Hall of Science, 1980.

Furuness, Linda Bishop, and Michael Cohen. “Children’s conception of theseasons: A comparison of three interview techniques.” Paper presented atNational Association for Research in Science Teaching in 1989.

Gardner, Howard. The Unschooled Mind. New York: Basic Books, 1991.Gilbert, John K. “The study of student misunderstandings in the physical

sciences.” Research in Science Education (1977): 165-171.Goodlad, John. A Place Called School. New York: McGraw-Hill, 1984.Guesne, Edith. “Light.” In Children’s Ideas in Science, ed. by Rosalind Driver, Edith

Guesne, and Andrée Tiberhien. Philadelphia: Open University Press, 1985, pp.10-32.

Gunstone, Richard F., and Richard T. White. “Understanding of gravity.” ScienceEducation 65 (3 1981): 291-299.

Halloun, Ibrahim Abu, and David Hestenes. “The initial knowledge state ofcollege physics students.” American Journal of Physics 53 (11 1985): 1043-1055.

Hambleton, Ronald K., H. Swaminathan, and H. Jane Rogers. Fundamentals ofItem Response Theory. Newbury Park, CA: Sage Publications, 1991.

Happs, John C., and Christine Coulstock. “What might parents be teaching theirchildren about astronomy? Adult understanding of basic astronomyconcepts.” Paper presented at Australian Science Education Research Associationin 1987.

Hardiman, Pamela T., Robert Dufresne, and William Gerace. “Physics novices’judgments of solution similarity: When are they based on principles?” InGIREP Conference, Cosmos — an Educational Challenge in Copenhagen, EuropeanSpace Agency, 1986, pp. 194-202.

Hoff, D. “Astronomy for the non-science student—A status report.” The PhysicsTeacher March (1982): 175.

Hofwolt, Clifford A. “Instructional strategies in the science classroom.” InResearch within Reach: Science Education, ed. by David Holdzkom and PamelaB. Lutz.Washington, DC: National Science Teachers Association, 1985, pp. 43-57.

Page 216: The Initial Knowledge State of High School Astronomy ... · Belmont High School contributed by taping interviews with many of their students. This process revealed students’ ideas

08Holton, Gerald. Introduction to Concepts and Theories in Physical Science.Princeton, NJ: Princeton University Press, 1985.

Hopkins, Kenneth D., and Julian C. Stanley. Educational and PsychologicalMeasurement and Evaluation, 6th ed., Englewood Cliffs, NJ: Prentice-Hall, 1981.

International Association for the Evaluation of Educational Achievement. ScienceAchievement in 17 Countries: A Preliminary Report. New York: TeachersCollege, Columbia University, 1988.

Janke, Delmar L. and Milton O. Pella. “Earth science concepts list for grades K-12curriculum construction and evaluation.” Journal of Research in ScienceTeaching 9 (3 1972): 223-230.

Jung, Walter. “Understanding students’ understandings: the case of elementaryoptics.” In 2nd International Seminar on Misconception and Educational Strategiesin Science and Mathematics in Ithaca, NY, ed.by J.D. Novak, Cornell UniversityPress, 1987, pp. 268-277.

Karplus, Robert, et al. Science Teaching and the Development of Reasoning: EarthScience, 2d ed., Berkeley, CA: Lawrence Hall of Science, 1978.

Karplus, Robert, Steven Pulos, and Elizabeth Stage. “Early adolescents’proportional reasoning on ‘rate’” problems.” Educational Studies inMathematics 14 (1983): 219-233.

Kelsey, Linda J. “The performance of college astronomy students on two ofPiaget’s projective infralogical grouping tasks and their relationship toproblems dealing with the phases of the Moon.” Ph.D. dissertation,University of Iowa, 1980.

Kenealy, Patrick. “A syntactic source of a common “misconception” aboutacceleration.” In 2nd International Seminar on Misconception and EducationalStrategies in Science and Mathematics in Ithaca, NY, ed. by Joseph D. Novak,Cornell University Press, 1987, pp. 278-292.

Kerlinger, Fred N. Foundations of Behavioral Research. New York: Holt, Rinehart,and Winston, 1986.

Keuthe, James L. “Science concepts: A study of sophisticated errors.” ScienceEducation 47 (4 1963): 361-364.

Klein, Carol A. “Children’s concepts of the Earth and the Sun: A cross culturalstudy.” Science Education 65 (1 1982): 95-107.

Klein, Margrete Siebert. “Two worlds of science learning: A look at theGermanies.” In Science Education in Global Perspective, ed. by Margrete SiebertKlein and F. James Rutherford. Boulder, CO: Westview Press, 1985, pp. 97-154.

Klopfer, Leopold. “Effectiveness and effects of ESSP astronomy materials—Anillustrative study of evaluation in a curriculum development project.” Journalof Research in Science Teaching 6 (1 1964a): 64-75.

Klopfer, Leopold. An evaluative study of the effectiveness and effects of astronomymaterials prepared by the University of Illinois Elementary-School Science Project.ERIC, 1964b. ED032221.

Page 217: The Initial Knowledge State of High School Astronomy ... · Belmont High School contributed by taping interviews with many of their students. This process revealed students’ ideas

09Kyle, William C., Jr. “Curriculum development projects of the 1960s.” InResearch within Reach: Science Education, ed. by David Holdzkom and PamelaB. Lutz. Washington, DC: National Science Teachers Association, 1985, pp. 3-24.

Langford, Peter. Children’s Thinking and Learning in Elementary School. Lancaster:Technomic, 1989.

Lightman, Alan, and Philip M. Sadler. “How can the Earth be round?” Science andChildren (February 1986): 24-26.

Lightman, Alan P., and Jon D. Miller. “Contemporary cosmological beliefs.”Social Studies of Science 19 (1989): 127-36.

Lightman, Alan P., Jon D. Miller, and B. J. Leadbeater. “Contemporarycosmological beliefs.” In Misconceptions and Educational Strategies in Scienceand Mathematics, ed. by Joseph Novak. Ithaca, NY: Cornell University Press,1987, pp. 309-321.

Lindberg, David C. Theories of Vision from Al-Kindi to Kepler. The Chicago Historyof Science and Medicine, ed. by Allen G. Debus. Chicago: University ofChicago Press, 1976.

Lohnes, P.R. “Factorial modeling in support of causal inference.” AmericanEducational Research Journal 16 (1979): 323-340.

Loria, A., M. Michelini, and V. Mascellani. “Teaching Astronomy to Pupils Aged11-13.” In GIREP - Cosmos - and Educational Challenge in Copenhagen, ed. by J.Hunt. European Space Agency, 1986, pp. 229-233.

Mali, G., and A. Howe. “Development of earth and gravity concepts amongNepali children.” Science Education 64 (2 1979): 213-221.

Marshall, Kim, and Oliver W. Lancaster. Science: Elementary and Middle SchoolCurriculum Objectives. Boston Public Schools, 1983.

McClosky, M. “Intuitive Physics.” Scientific American 248 (1983): 122-130.McDermott, Lillian C. “Research on conceptual learning in mechanics.” Physics

Today July (1984): 24-32.McKenzie, D.L., and M.J. Padilla. “The construction and validation of the test of

graphing in science (TOGS).” Journal of Research in Science Teaching 23 (1986):571-579.

Microsoft Corporation. Microsoft Word User’s Guide. Redmond, WA: MicrosoftCorporation, 1991.

Minstrell, J. “Conceptual development research in the natural setting of asecondary school classroom.” In Science for the 80’s, ed. by H. B. Rowe.Washington, DC: National Education Association, 1982a.

Minstrell, Jim. “Explaining the ‘at rest’ condition of an object.” The Physics Teacher(January 1982b): 10-14.

Narode, Ronald. “Standardized testing for misconceptions in basic mathematics.”In 2nd International Seminar on Misconception and Educational Strategies in

Page 218: The Initial Knowledge State of High School Astronomy ... · Belmont High School contributed by taping interviews with many of their students. This process revealed students’ ideas

10Science and Mathematics in Ithaca, NY, ed. by Joseph D. Novak, CornellUniversity Press, 1987, pp. 222-333.

National Assessment of Educational Progress. The Nation’s Report Card. Princeton,NJ: Educational Testing Service, 1989.

Newton, Sir Issac. Opticks. London: William and John Innys, 1721.Novak, Joseph D. A Theory of Education. Ithaca, NY: Cornell University Press,

1977.Novick, Shimshon, and Joseph Nussbaum. “Using interviews to probe

understanding.” The Science Teacher November (1978): 29-30.Nussbaum, Joseph. “Childrens’ conception of the Earth as a cosmic body: a cross

age study.” Science Education 63 (1 1979): 83-93.Nussbaum, Joseph. “Students perception of astronomical concepts.” In GIREP -

Cosmos - and Educational Challenge in Copenhagen, ed. by J. Hunt. EuropeanSpace Agency, 1986, pp. 87-97.

Nussbaum, Joseph. “The earth as a cosmic body.” In Children’s Ideas in Science,ed. by Rosalind Driver, Edith Guesne, and Andrée Tiberhien. Philadelphia:Open University Press, 1985, pp. 170-192.

Nussbaum, Joseph, and Joseph Novak. “Alternative frameworks, conceptualconflict and accommodation: Toward a principled teaching strategy.”Instructional Science 11 (1982): 183-200.

Nussbaum, Joseph, and Joseph Novak. “An assessment of childrens’ concepts ofthe earth utilizing structured interviews.” Science Education 60 (4 1976): 535-550.

Ogar, J. “Ideas about physical phenomena in spaceships among students andpupils.” In GIREP - Cosmos - and Educational Challenge in Copenhagen, EuropeanSpace Agency, 1986, pp. 375-378.

Osborne, R. “Children’s dynamics.” The Physics Teacher 22 (1984): 504-508.Osborne, Roger J., and Beverly F. Bell. “Science teaching and childrens’ views of

the world.” European Journal of Science Education 5 (1 1983): 1-14.Osgood, Charles E., George J Suci, and Percy H. Tannenbaum. The Measurement

of Meaning. Urbana: University of Illinois Press, 1957.Osterlind, Steven J. Constructing Test Items. Boston: Kluwer Academic, 1989.Piaget, Jean, and Bärbel Inhelder. The Child’s Conception of Space. Trans. by F.J.

Langdon and J.L. Lunzer. New York: W.W. Norton, 1929.Placek, Walter Anthony. “Preconceived knowledge of certain Newtonian

concepts among gifted and non-gifted eleventh grade physics students.” In2nd International Seminar on Misconception and Educational Strategies in Scienceand Mathematics in Ithaca, NY, ed. by Joseph D. Novak. Ithaca, NY: CornellUniversity Press, 1987, pp. 386-391.

Plato. Plato’s Cosmology: The Timaeus of Plato. Trans. by Francis M. Cornford.London: Loeb, 1937.

Page 219: The Initial Knowledge State of High School Astronomy ... · Belmont High School contributed by taping interviews with many of their students. This process revealed students’ ideas

11Posner, G. J., K. A. Strike, P. W. Hewson, and W. A. Gertzog. “Accommodationof a scientific conception: Toward a theory of conceptual change.” ScienceEducation 66 (2 1982): 211-227.

Prather, J. Preston. Philosophical Examination of the Problem of Unlearning of IncorrectScience Concepts. National Association for Research in Science Teaching, 1985.ERIC ED256570.

Reiner, Miriam, and Menahem Finegold. “Changing students explanatoryframeworks concerning the nature of light using real-time computer analysisof laboratory experiments and computerized explanatory simulation of e.m.radiation.” In 2nd International Seminar on Misconception and EducationalStrategies in Science and Mathematics in Ithaca, NY, ed. by Joseph D. Novak.Ithaca, NY: Cornell University Press, 1987, pp. 368-377.

Rhoneck, Christoph von, and Karl Grob. “Representation and problem solvingin basic electricity, predictors for successful learning.” In 2nd InternationalSeminar on Misconception and Educational Strategies in Science and Mathematicsin Ithaca, NY, ed. by J.D. Novak. Ithaca, NY: Cornell University Press, 1987,pp. 564-577.

Rollins, M. M., J. J. Denton, and D. L. Janke. “Attainment of Selected Earth ScienceConcepts by Texas High School Seniors.” Journal of Educational Research 77(1983): 81-88.

Roth, Kathleen J. “Conceptual change learning and processing of science texts.”Paper presented at American Educational Research Association in Chicago, 1985a.

Roth, K.J. “The effect of science texts on students misconceptions about food forplants.” Ph.D. dissertation, Michigan State University, 1985b.

Russo, Richard. “Shoot the Stars—Focus on the Earth’s Rotation.” The ScienceTeacher February (1988): 25-26.

Rutherford, F. James. “Lessons from Five Countries.” In Science Education inGlobal Perspective, ed. by Margrete Siebert Klein and F. James Rutherford.Boulder, CO: Westview Press, 1985, pp. 207-231.

Sadler, Philip M. “Misconceptions in Astronomy.” In 2nd International Seminar onMisconception and Educational Strategies in Science and Mathematics in Ithaca, NY,ed. by Joseph D. Novak. Ithaca, NY: Cornell University Press, 1987, pp. 422-425.

Sadler, Philip M., and William M. Luzader. “The Teaching of Astronomy.” InInternational Astronomical Union, Colloquium 105 in Williams College,Williamstown, MA, ed. by Jay M. Pasachoff and John R. Percy. New York:Cambridge University Press, 1988, 257-276.

Schatz, Dennis, Andrew Fraknoi, R. Robert Robbins, and Charles D. Smith.Effective Astronomy Teaching and Student Reasoning Ability. Berkeley, CA:Lawrence Hall of Science, 1978.

Schoon, Kenneth J. “Misconceptions in Earth and Space Sciences, A Cross-AgeStudy.” Ph.D. dissertation, Loyola University, 1988.

Page 220: The Initial Knowledge State of High School Astronomy ... · Belmont High School contributed by taping interviews with many of their students. This process revealed students’ ideas

12Shipstone, D. M., C. Rhoneck, W. Jung, C. Karrqvist, J.J. Dupin, S. Joshua, and P.Licht. “A Study of students’ understanding of electricity in five Europeancountries.” European Journal of Science Education (1987).

Shymansky, James A., William C. Kyle, Jr. and Jennifer M. Alport. “Howeffective were the hands-on science programs of yesterday?” Science andChildren, November/December (1982).

Slinger, Lucille A. Studying light in the fifth grade: A case study of text-based scienceteaching. Institute for Research on Teaching, Michigan State University, 1982.Research Series No. 129.

Smith, Deborah. “Primary teachers’ misconceptions about light and shadows.” In2nd International Seminar on Misconception and Educational Strategies in Scienceand Mathematics in Ithaca, NY, ed. by Joseph D. Novak. Ithaca, NY: CornellUniversity Press, 1987, pp. 461-476.

Sneider, Cary, and S. Pulos. “Childrens’ cosmographies: understanding theEarth’s shape and gravity.” Science Education 67 (2 1983): 205-222.

Sonnier, Isadore L. “A study of the number of selected ideas in astronomy foundin earth science curriculum project materials being taught in college anduniversity astronomy courses.” Ed. D. dissertation, Colorado State College,1966.

Stead, B.F., and R.J. Osborne. “Exploring science students concepts of light.”Australian Science Teachers Journal 26 (3 1980): 84-90.

Stead, B.F., K. E. and R. J. Osborne. “What is gravity? Some children’s ideas.”New Zealand Science Teacher 30 (1981): 5-12.

Targan, David. “A study of conceptual change in the target domain of the lunarphases.” In 2nd International Seminar on Misconception and Educational Strategiesin Science and Mathematics in Ithaca, NY, ed. by Joseph D. Novak. Ithaca, NY:Cornell University Press, 1987, pp. 499-511.

Thijs, Gerard D. “Conceptions of force and movement, intuitive ideas of pupils inZimbabwe in comparison with finding from other countries.” In 2ndInternational Seminar on Misconception and Educational Strategies in Science andMathematics in Ithaca, NY, ed. by J.D. Novak. Ithaca, NY: Cornell University,1987, pp. 501-513.

Touger, J. S. “Students’ conceptions about planetary motion.” Paper presented atAmerican Association of Physics Teachers in 1985.

Toumlin, Stephen, and June Goodfield. The Fabrics of Heavens. London:Hutchinson, 1967.

Treagust, David F. “Evaluating students’ misconceptions by means of diagnosticmultiple-choice items.” Research in Science Education (1986): 363-369.

Treagust, David F., and Clifton L. Smith. “Secondary students understanding ofthe solar system: implications for curriculum revision.” In GIREP - Cosmos -and Educational Challenge in Copenhagen, ed. by J. Hunt. European SpaceAgency, 1986, pp. 363-369.

Page 221: The Initial Knowledge State of High School Astronomy ... · Belmont High School contributed by taping interviews with many of their students. This process revealed students’ ideas

13Troost, Kay Michael. “Science education in contemporary Japan.” In ScienceEducation in Global Perspective, ed. by Margrete Siebert Klein and F. JamesRutherford. Boulder, CO: Westview Press, 1985, pp. 13-66.

Tuckman, Bruce W. Conducting Educational Research, 3d ed. New York: Harcourt,Brace, Jovanovich, 1988.

Velleman, Paul F. Data Desk Handbook. Northbrook IL: Odesta Corporation, 1988.Velleman, Paul F., and Hoaglin. Applications, Basics, and Computing of Exploratory

Data Analysis. Boston: Duxbury Press, 1981.Viglietta, M. L. “Earth, sky and motion. Some questions to identify pupil ideas.”

In GIREP - Cosmos - and Educational Challenge in Copenhagen, ed. by J. Hunt.European Space Agency, 1986, pp. 369-370.

Vincentini-Missoni, M. “Earth and gravity: Comparison between adults’ andchildren’s knowledge.” In Problems Concerning Students’ Representation ofPhysics and Chemistry Knowledge in University of Frankfort, ed. by W. Jung,1981.

Vosniadou, Stella, and William F. Brewer. “Theories of knowledge restructuringin development.” Review of Educational Research 57 (1 1987): 51-67.

Wandersee, James H. “Can the history of science help science educators antici-pate students’ misconceptions?” Journal of Research in Science Teaching (1986).

Watts, D. Michael. “Gravity—don’t take it for granted!” Physics Education 17(1982): 116-121.

Watts, D. Michael. “Student conceptions of light: a case study.” Physics Education20 (4 1985): 183-187.

Weintraub, S. “Reading graphs, charts, and diagrams.” Reading Teaching 20(1967): 345-349.

Weiss, Iris. Report of the 1985-86 National Survey of Science, Mathematics, and SocialStudies Education. 1987a.

Weiss, Iris R. Report of the 1985-86 National Survey of Science and MathematicsEducation. Research Triangle Institute, 1987b. RTI/2938/00-FR.

Welch, W. W., L. J. Harris, and R. E. Anderson. “How many are enrolled inscience?” The Science Teacher 51 (9): 1984.

Wise, K. C., and J. R. Okey. “A meta-analysis of the effects of various scienceteaching strategies on achievement.” Journal of Research in Science Teaching 20(1983): 419-435.

Za’rour, George I. “Interpretation of natural phenomena by Lebanese schoolchildren.” Science Education 60 (1976): 277-287.

Page 222: The Initial Knowledge State of High School Astronomy ... · Belmont High School contributed by taping interviews with many of their students. This process revealed students’ ideas

14X. BibliographyAikenhead, Glen Stirton. “The measurement of knowledge about science and

scientists: An investigation into the development of instruments forformative evaluation.” Ed.D., Harvard University Graduate School ofEducation, 1972.

Anderson, B., and C. Karrqvist. “How Swedish pupils, aged 12-15 years, understand light and its properties.” European Journal of Science Education 5 (4 1983a): 387-402.

Anderson, C. W., and E. L. Smith. “Teacher behavior associated with conceptual learning in science.” Paper presented at American Educational Research Association in Montreal, 1983.

Arons, A. B. “Addressing students’ conceptual and cognitive needs.” In Computers in Physics Instruction in Raleigh, NC, ed. by Edward F. Redish and John S. Risley. Reading, MA: Addison-Wesley, 1988, pp. 301-308.

Arons, Arnold. A Guide to Introductory Physics Teaching. New York: John Wiley & Sons, 1990.

Ausubel, David P. “An evaluation of the ‘Conceptual Schemes’ approach to science curriculum development.” Journal of Research in Science Teaching 3 (1965): 255-264.

Balaco, M.R. “Test development related to the understanding of basic chemistry and its application to societal problems: For ChemCom curriculum (pilot study).” Dissertation Abstracts International 46 (9 1986): 2647-A.

Bishop, Roy L. “Multiple-choice questions.” In International Astronomical Union, Colloquium 105 in Williams College, Williamstown, MA, ed. by Jay M. Pasachoff and John R. Percy. Cambridge University Press, 1988, pp. 83-87.

Blosser, Patricia E. Secondary School Students’ Comprehension of Science Concepts: Some Findings from Misconception Research. ERIC, 1987. ERIC/SMEAC Science Education Digest 2.

Bogdan, Robert C., and Sari Knopp Biklen. Qualitative Research for Education. Boston: Allyn and Bacon, 1982.

Bowers, Raold Walker. “Effects of natural science courses upon Harvard College freshmen.” Ed.D., Harvard University Graduate School of Education, 1952.

Bransford, J. D., and N. S. McCarrell. “A sketch of a cognitive approach to comprehension.” In Cognition and the Symbolic Processes, ed. by W. B. Weiner and D. S. Palermo. Hillsdale, NJ: Erlbaum, 1974.

Bruner, Jerome. Actual Minds, Possible Worlds. Cambridge: Harvard University Press, 1986.

Cain, Peggy W., and Daniel W. Welch. Astronomy Activities for the Classroom. South Carolina State Department of Education, 1980. Teaching Guide ED199062 SE034287.

Cangelosi, James S. Designing Tests for Evaluating Student Achievement. New York: Longman, 1990.

Carter, Carolyn, and George Bodner. “How student misconceptions of the nature of chemistry and mathematics influence problem solving.” In The Second International Seminar: Misconceptions and Educational Strategies in Science and Mathematics. Cornell University (Department of Education), 1987, pp. 69-83.

Cauldwell, Loren T. “A determination of earth science principles desirable for inclusion in science programs of general education in the secondary school.” Ph.D. dissertation, Indiana University, 1953.

Cavena, G. R., and W.H. Leonard. “Extending discretion in high school science curricula.” Science Education 69 (5 1985): 593-603.

Champagne, Audrey B., and Leslie E. Hornig. The Science Curriculum. American Association for the Advancement of Science, 1987.

Champagne, Audrey B., and Leopold E. Klopfer. “Research in science education: The cognitive psychology perspective.” In Research within Reach: Science Education, ed. by David Holdzkom and Pamela B. Lutz. Washington, DC: National Science Teachers Association, 1985, pp. 171-189.

Champagne, Audrey B., Richard F. Gunstone, and Leopold E. Klopfer. Effecting Changes in Cognitive Structures among Physics Students. ERIC, 1983. ED 229238.

Cohen, Edward G. Attitude toward Science and Astronomy. 1980a.

Cohen, Rosalie. “Conceptual styles, culture conflict, and nonverbal tests of intelligence.” American Anthropologist 71 (1969): 828-56.

Collis, K.F., and H.A. Davey. “A technique for evaluating skill in high school science.” Journal of Research in Science Teaching 23 (1986): 651-663.

Committee on Research in Mathematics Science and Technology Education. Interdisciplinary Research in Mathematics, Science, and Technology Education. Washington, DC: National Academy Press, 1987.

Crawley, F., and S. Arditzoglou. “Life and physical science misconceptions of preservice elementary teachers.” Paper presented at School Science and Mathematics Association in 1988.

Cronbach, Lee J. Essentials of Psychological Testing, 5th ed. New York: Harper & Row, 1990.

Czujko, Roman, and David Bernstein. Who Takes Science: A Report on Student Coursework in High School Science and Mathematics. American Institute of Physics, 1989. AIP: R-345.

Driver, Rosalind. “Pupils alternative framework in science.” European Journal of Science Education 3 (1 1981): 93-101.

Driver, Rosalind. “The pupil as scientist.” In Physics Teaching, GIREP in Philadelphia, ed. by Uri Ganiel. Balaban International Science Services, 1980, pp. 331-345.

Evans, Alan D. “Implementation and validation of a new course in introductory astronomy at the college level.” ERIC ED162649 (1978): 1-10.

Feldstine, J. N. “Concept mapping: A method for detection of possible student misconceptions.” In International Seminar on Misconception and Educational Strategies in Science and Mathematics in Ithaca, NY, ed. by Joseph D. Novak and H. Helm. Ithaca, NY: Cornell University Press, 1983.

Gabel, D. L. “Research interests of secondary science teachers.” Journal of Research in Science Teaching 23 (2 1986): 145-163.

Gee, Brian. “Astronomy in School Science.” School Science 82 (1979): 31.

Gibbs, Robert E. “Observing the sky.” Department of Physics, Eastern Washington University, 1989.

Gilbert, John K., and Roger J. Osborne. “Children’s Science and Its Consequences for Teaching.” Science Education 66 (4 1982): 225-633.

Gilman, D., I. Hernandez, and R. Cripe. “The correction of general science misconceptions as a result of feedback mode in computer assisted instruction.” In Proceedings of the National Association of Research in Science Teaching in Minneapolis, 1970.

Gorodetsky, Malka, and Esther Gussarsky. “The role of students and teachers in misconceptualization of aspects of ‘chemical equilibrium’.” In Second International Seminar: Misconceptions and Educational Strategies in Science and Mathematics. Cornell University (Dept. of Education), 1987, pp. 187-193.

Gregory, Bruce. Inventing Reality. New York: John Wiley & Sons, 1988.

Hale-Benson, Janice E. Black Children: Their Roots, Culture, and Learning Styles. Baltimore: Johns Hopkins University Press, 1986.

Happs, J. C., and L. Scherpenzeel. “Achieving long term change using the learner’s prior knowledge and a novel teaching setting.” In 2nd International Seminar on Misconception and Educational Strategies in Science and Mathematics in Ithaca, NY, ed. by Joseph D. Novak. Ithaca, NY: Cornell University Press, 1987.

Harris, D. “The place of Astronomy in schools.” Physics Education 17 (4 1982): 154-157.

Hawkins, David. “Critical Barriers to Science Learning.” Outlook 9 (1978): 3.

Healy, Mary K. “Writing in a science class: A case study of the connection between writing and learning.” Ph.D. dissertation, New York University, 1984.

Hill, Lon Clay Jr. “Spatial thinking and learning astronomy: The implicit visual grammar of astronomical paradigms.” In International Astronomical Union, Colloquium 105 in Williams College, Williamstown, MA, ed. by Jay M. Pasachoff and John R. Percy. Cambridge University Press, 1988, pp. 247-248.

Hoff, D. B., L. J. Kelsey, and J. S. Neff. Activities in Astronomy, 2d ed. Dubuque, IA: Kendall/Hunt, 1984.

Holdzkom, David, and Pamela B. Lutz. Research within Reach: Science Education. National Science Teachers Association, 1985.

Holton, Gerald, F. James Rutherford, and Fletcher G. Watson. Project Physics. New York: Holt, Rinehart and Winston, 1981.

Howe, Ann C., and Bessie Stanback. “ISCS in review.” Science Education 69 (1 1985): 25-37.

Idar, J., and U. Ganiel. “Learning difficulties in high school physics: Development of a remedial teaching method and assessment of its impact on achievement.” Journal of Research in Science Teaching 22 (2 1985): 127-140.

Jackson, D., B.J. Edwards, and C.F. Berger. “Teaching the design and interpretation of graphs through a computer aided graphical data analysis.” In National Association for Research in Science Education in Atlanta, GA, 1990.

Janke, Delmar L., and Milton O. Pella. “Earth science concepts list for grades K-12 curriculum construction and evaluation.” Journal of Research in Science Teaching 9 (3 1972): 223-230.

Lyman, Howard B. Test Scores and What They Mean. Englewood Cliffs, NJ: Prentice-Hall, 1978.

Malone, Thomas W. “Toward a theory of intrinsically motivating instruction.” Cognitive Science 4 (1981): 333-369.

Mathematics, National Science Board Commission on Precollege Education in. Educating Americans for the 21st Century. National Science Foundation, 1985.

McCarthy, Francis Wadsworth. “Age placement of selected science subject matter.” Ed.D., Harvard University Graduate School of Education, 1951.

McDermott, Lillian C., Mark L. Rosenquist, and Emily H. van Zee. “Student difficulties in connecting graphs and physics: Examples from kinematics.” American Journal of Physics 55 (6 1987): 503-513.

McNally, D. “Astronomy at school.” Physics Education 17 (4 1982): 157-160.

Miller, David, ed. Popper Selections. Princeton, NJ: Princeton University Press, 1985.

Miller, Patrick W., and Harley E. Erickson. How to Write Tests for Students. Washington, DC: National Educational Association, 1990.

Minstrell, Jim. “Explaining the ‘at rest’ condition of an object.” The Physics Teacher (January 1982): 10-14.

Moore, R., and F. Sutman. “The development, field test and validation of an inventory of scientific attitudes.” Journal of Research in Science Teaching 7 (1970): 85-94.

Munby, H. “Studies involving the Scientific Attitude Inventory: What confidence can we have in this instrument?” Journal of Research in Science Teaching 20 (1983): 141-162.

Nisbett, Richard E., Geoffrey T. Fong, Darrin R. Lehman, and Patricia W. Cheng. “Teaching Reasoning.” Science 238 (1987): 625-631.

Novak, Joseph D., and D. Bob Gowin. Learning How To Learn. Cambridge: Cambridge University Press, 1984.

Nussbaum, Joseph, and Joseph Novak. “Alternative frameworks, conceptual conflict and accommodation: Toward a principled teaching strategy.” Instructional Science 11 (1982): 183-200.

Nussbaum, Joseph, and Joseph Novak. “An assessment of children’s concepts of the earth utilizing structured interviews.” Science Education 60 (4 1976): 535-550.

Omar, Abdulaziz Saud. “The effect of using diagnostic-prescriptive teaching on achievement in science of Saudi Arabian high school students.” Ph.D. dissertation, University of Kansas, 1984.

Osborne, R. J. “Some aspects of the student’s view of the world.” Research in Science Education 10 (1980): 11-18.

Osborne, R. J., and J. K. Gilbert. “A technique for exploring students’ views of the world.” Physics Education 15 (1980): 376-379.

Osborne, Roger J., and Peter Freyberg. Learning in Science, The Implication of Children’s Science. Auckland, New Zealand: Heinemann, 1985.

Othman, Mazlan. “Influence of culture on understanding astronomical concepts.” In International Astronomical Union, Colloquium 105 in Williams College, Williamstown, MA, ed. by Jay M. Pasachoff and John R. Percy. Cambridge University Press, 1988, pp. 239-240.

Pearson, P.D., J. Hansen, and C. Gordon. “The effect of background knowledge on young children’s comprehension of explicit and implicit information.” Journal of Reading Behavior 11 (1979): 201-209.

Piaget, Jean. The Child’s Conception of the World. London: Routledge and Kegan Paul, 1929.

Schneps, Matthew H., and Philip M. Sadler. “A Private Universe.” Pyramid Films, 1988.

Sadler, Philip M. “Astronomy in U.S. High Schools.” In GIREP Conference, Cosmos—An Educational Challenge in Copenhagen, ed. by J. Hunt. European Space Agency, 1986, pp. 261-264.

Schatz, Dennis, and Anton E. Lawson. “Effective astronomy teaching: Intellectual development and its implications.” Mercury (July/August 1976): 6-13.

Scheffler, Israel. Science and Subjectivity. Indianapolis: Hackett Publishing, 1982.

Seeds, Michael A. Foundations of Astronomy, 2d ed. Belmont, CA: 1988.

Smith, E.L. “Teaching for conceptual change: Some ways of going wrong.” In Proceedings of the International Seminar on Misconceptions in Science and Mathematics in Cornell University, Ithaca, NY, ed. by H. Helm and J. Novak. Ithaca, NY: Cornell University Press, 1983.

Smith, Murray R. “Astronomy in the native-oriented classroom.” Journal of American Indian Education 23 (2 1984): 16-23.

Snydle, Richard W., and John F. Koser. “An activity-oriented astronomy course.” Science Activities 10 (3 1973): 16-18.

Solomon, Joan. “Messy, contradictory, and obstinately persistent: A study of out-of-school ideas about energy.” School Science Review 65 (23 1983): 225-229.

Stepans, J., and A. McCormack. “A study of scientific conceptions and attitudes toward science of prospective elementary teachers.” Paper presented at Northern Rocky Mountain Educational Research Association in Jackson Hole, WY, 1985.

Sunal, Dennis W., and V. Carol Demchik. Astronomy Education Materials Resource Guide. 3d ed. Morgantown: West Virginia University Bookstore, 1985.

The College Board. Academic Preparation in Science. Academic Preparation Series. New York: College Board Publications, 1990.

Tinkelman, S. “Planning the objective test.” In Educational Measurement, 2d ed., ed. by R. Thorndike. Washington, DC: American Council on Education, 1971.

Tremblath, R.J. “The frequencies and origins of scientific misconception.” Ph.D. dissertation, University of Texas at Austin, 1980.

Unger, Christopher Matthew. “Conceptual change in science instruction: How might interactive, computer-based models help?” Ed.D. Qualifying Paper, Harvard University, 1988.

Watson, Fletcher G. “Astronomy at the upper school level.” Annals of the New York Academy of Sciences 198 (1972): 173-77.

Welch, Wayne. “Twenty years of science curriculum development: A look back.” In Review of Research in Education, ed. by D. Berlinger. Washington, DC: American Educational Research Association, 1979.

Wittrock, M.C. “Learning as a generative process.” Educational Psychology 11 (1974): 87-95.

Zeilik, Michael II. “PSI Astronomy unit: Astrology—The space age science?” American Journal of Physics 42 (7 1974): 538-542.

Appendices

A. School Data
B. Pre-test Instrument
C. P-Value, D-Value Tables
D. Classical Test Theory Tables
E. Item Correlation Matrix
F. Chi-Square Analysis

Vitae

Philip Michael Sadler 10 Carver Street, Cambridge, MA 02138

Education:

Massachusetts Institute of Technology, Cambridge, MA B.S. Physics, 1973
Harvard Graduate School of Education, Cambridge, MA Ed.M., 1974
Harvard Graduate School of Education, Cambridge, MA Candidate for Ed.D., 1992

Professional Employment:

Instructor, Harvard Graduate School of Education 9/91-present
Frances W. Wright Lecturer on Navigation, Harvard University 1/90-present
Director, Education Department, Harvard-Smithsonian Center for Astrophysics 2/90-present
Project Director of these NSF Education Projects:
MicroObservatory, development of robotic telescope for school use. 3/90-present
InSIGHT, development of advanced simulations for introductory physics. 3/90-present
SPICA, summer institutes to train astronomy workshop leaders. 5/90-present
Project STAR, development of high school level astronomy course. 12/85-5/92
Vice President and Co-Founder, Peripheral and Software Marketing Inc., Newton, MA 10/82-12/85
Vice President and Co-Founder, Computer Products Marketing Inc., Newton, MA 8/81-12/85
President (presently on leave) and Founder, Learning Technologies Inc., Cambridge, MA 7/77-present
Science Teacher (grades 7 and 8) and Coordinator, Carroll School, Lincoln, MA 9/74-6/77
Staff Developer, Calculus Project, Education Development Center, Newton, MA 7/73-8/74
Staff Member, Mathematics Project, Education Research Center, MIT 9/71-6/73

Consulting Experience:

Bolt, Beranek, and Newman, Cambridge, MA 1991-present
Cambridge Public Schools, Science Advisory Board, Cambridge, MA 1988-present
Boston Children’s Museum, Boston, MA 1988-present
Science Museum of Virginia, Richmond, Virginia 1988-present
Lawrence Hall of Science, Berkeley, CA 1988-present
Apple Computer, Cupertino, CA. 1981-84

National Council of Science Museums, Calcutta, India. 1981
Children’s Television Workshop, New York, NY. 1977-78

Other Teaching Positions

Summer Science Institute, Independent Schools Association, Concord, MA 1987-90
Workshop Leader, National Air and Space Museum, Washington, DC 1987-89
Workshop Leader, National Science Resources Center Summer Institute, Smithsonian Institution, Washington, DC 1988
Amplification ’86, Mathematics Teaching Institute, Harvard Graduate School of Education, Cambridge, MA 1986

Honors and Awards

Margaret Noble Address, Middle Atlantic Planetarium Society May 1991
Representative, U.S.–Soviet Commission on Education, National Academy of Education May 1988
Executive Producer, “A Private Universe”, Pyramid Films
Blue Ribbon, American Film and Video Association 1990
Gold Medal, Documentary, Houston International Film Festival, Houston, TX 1988
Gold Plaque Award, Chicago International Film Festival 1989
Silver Apple, National Educational Film and Video Festival, Seattle, WA. 1987

Representative of the Year (Worldwide), Apple Computer 1982

Patents:

4,164,829 Inflatable Structure 8/21/79
4,178,701 Cylindrical Projector 12/18/79

Teacher Certification:

Massachusetts Certificate #183612, General Science, Physics, Mathematics grades 7-12 8/12/74-life

Professional Membership:

American Association for the Advancement of Science
American Association of Physics Teachers
American Astronomical Society
Association of Science Technology Centers
Astronomical Society of the Pacific
International Planetarium Society
National Science Teachers Association