Experimental Research


CHAPTER I
INTRODUCTION

A. BACKGROUND
Suppose teachers wished to determine which of two methods of reading instruction was more effective: one that involved 20 minutes of direct instruction in phonics each day throughout the academic year in grade 1, or one that involved the current practice of having the teacher read a book to the class for 20 minutes each day throughout the year in grade 1. Similarly, suppose they wished to determine whether children learn better in a small class (i.e., with 15 students) or a large class (i.e., with 30 students). Finally, suppose they wished to determine whether requiring students to take a short quiz during each meeting of a college lecture class would result in better performance on the final exam than not giving quizzes. The examples above are applications of experimental research. This paper therefore discusses experimental research in more depth in order to build an understanding of this kind of research.

B. PROBLEM STATEMENT
1. What are the definition and the purpose of experimental research?
2. What kinds of experimental research are there, and how are they applied?
3. What are their benefits and their limitations?

C. OBJECTIVES
1. To know the definition of experimental research and its purpose.
2. To know the kinds of experimental research and how to apply them.
3. To know the benefits and the limitations of experimental research.

CHAPTER II
EXPERIMENTAL RESEARCH

A. DEFINITION AND PURPOSE
Experimental research is the only type of research that can test hypotheses to establish cause-effect relationships. It represents the strongest chain of reasoning about the links between variables. You may recall from Chapter I that in experimental research the researcher manipulates at least one independent variable, controls other relevant variables, and observes the effect on one or more dependent variables. Manipulation of the independent variable is the one characteristic that differentiates experimental research from other types of research. The independent variable, also called the treatment, causal, or experimental variable, is the treatment or characteristic believed to make a difference. In educational research, independent variables that are frequently manipulated include method of instruction, type of reinforcement, arrangement of the learning environment, type of learning materials, and length of treatment. This list is by no means exhaustive. The dependent variable, also called the criterion, effect, or posttest variable, is the outcome of the study, the change or difference in groups that occurs as a result of the independent variable. It gets its name because it is dependent on the independent variable. The dependent variable may be measured by a test or some other quantitative measure (e.g., attendance, number of suspensions, time on task). The only restriction on the dependent variable is that it represents a measurable outcome.

Experimental research is the most structured of all research types. When well conducted, experimental studies produce the soundest evidence concerning cause-effect relationships. The results of experimental research permit prediction, but not the kind that is characteristic of correlational research. A correlational study predicts a particular score for a particular individual. Predictions based on experimental findings are more global and often take the form, "If you use approach X, you will probably get better results than if you use approach Y." Of course, it is unusual for a single experimental study to produce broad generalization of results, because any single study is limited in context and participants. However, replications of a study using different contexts and participants often produce cause-effect results that can be generalized widely.

The Experimental Process
The steps in an experimental study are basically the same as in other types of research: selecting and defining a problem, selecting participants and measuring instruments, preparing a research plan, executing procedures, analyzing the data, and formulating conclusions. An experimental study is guided by at least one hypothesis that states an expected causal relationship between two variables. The experiment is conducted to confirm (support) or disconfirm (refute) the experimental hypothesis. In an experimental study, the researcher is active from the very beginning: he or she selects the groups, decides which treatment will go to which group, controls extraneous variables, and measures the effect of the treatment at the end of the study. It is important to note that the experimental researcher controls both the selection and the assignment of the research participants. That is, the researcher randomly selects participants and then randomly assigns them to the different treatment conditions.
The ability to randomly select participants and randomly assign them to treatments, which is also called the researcher's manipulation of the treatments, is the distinguishing aspect of experimental research and the feature that separates it from causal-comparative research. Causal-comparative research involves only random selection, not random assignment, because causal-comparative participants are obtained from two already-existing populations; there can be no random assignment to a treatment from a single population in causal-comparative studies. The group that receives the new treatment is often called the experimental group, and the group that receives a different treatment or is treated as usual is called the control group. An alternative is to simply describe the treatments as comparison groups, treatment groups, or groups A and B. These terms are used interchangeably. A common misconception is that a control group always receives no treatment. This is not true and would hardly provide a fair comparison. For example, if the independent variable were type of reading instruction, the experimental group might be instructed with a new method, and the control group might continue instruction with the currently used method. The control group would still receive reading instruction; its members would not sit in a closet while the study was being conducted. Otherwise, you would not be evaluating the effectiveness of a new method as compared to that of a traditional method; you would be comparing a new method to no reading instruction at all. Any method of instruction is bound to be more effective than no instruction.

The groups that are to receive the different treatments should be equated on all variables that might influence performance on the dependent variable. For example, in the previous example, initial reading readiness should be very similar in each treatment group at the start of the study. The researcher must make every effort to ensure that the two groups start as equivalently as possible on all variables except the independent variable. The main way that groups are equated is through simple random or stratified random sampling. After the groups have been exposed to the treatment for some period, the researcher collects data on the dependent variable from the groups and determines whether there is a real, or significant, difference between their performance. In other words, the researcher uses statistical analysis to decide whether an observed difference is meaningful. For now, suppose that at the end of an experimental study one group had an average score of 29 on the dependent variable and the other group had an average score of 27. There clearly is a difference between the groups, but is a 2-point difference a meaningful or significant difference, or is it just a chance difference produced by measurement error? Statistical analysis helps answer this question, as the brief sketch below illustrates.
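The following is a minimal sketch, not part of the original text, of how such a comparison is commonly carried out: an independent-samples t test on the two groups' scores. The scores, group sizes, and the use of Python's scipy library are illustrative assumptions.

```python
# Hypothetical posttest scores for two groups whose means are 29 and 27,
# mirroring the example above; the individual scores are invented.
from scipy import stats

experimental = [31, 29, 27, 30, 28, 29, 30, 28, 29, 29]  # mean = 29
control = [27, 26, 28, 27, 25, 28, 27, 26, 28, 28]       # mean = 27

# Independent-samples t test: is the 2-point difference larger than what
# chance (measurement error) alone would plausibly produce?
t_statistic, p_value = stats.ttest_ind(experimental, control)
print(f"t = {t_statistic:.2f}, p = {p_value:.4f}")

# A small p value (conventionally below .05) suggests the difference is
# unlikely to be due to chance alone; a larger p value suggests it may be.
```

Whether a t test or some other procedure is appropriate depends, of course, on the design of the study and the scale of the dependent variable.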
Experimental studies in education often suffer from two problems: a lack of sufficient exposure to treatments and a failure to make the treatments substantially different from each other. In most cases, no matter how effective a treatment is, it is not likely to show an effect if students are exposed to it for only a brief period. To adequately test a hypothesis concerning the effectiveness of a treatment, the experimental group needs to be exposed to it over a period of time, so that the treatment is given a fair chance to work. Also of concern is the difference between treatments. In a study comparing team teaching and traditional lecture teaching, it would be vital that team teaching be operationalized in a manner that clearly differentiated it from the traditional method. If team teaching meant two teachers taking turns lecturing, it would not be very different from traditional teaching, and the researcher would be very unlikely to find a meaningful difference between the two study treatments. Also, if teachers using different treatments converse with and borrow from each other's treatments, the original treatments become diluted and similar to each other. These problems have detrimental effects on the outcome of the study.

Manipulation and Control
Direct manipulation by the researcher of at least one independent variable is the single characteristic that differentiates experimental research from other types of research. Manipulation of an independent variable is often a difficult concept to grasp. Quite simply, it means that the researcher decides what treatments will make up the independent variable and which group will get which treatment. For example, if the independent variable were the number of annual teacher reviews, the researcher might decide to form three groups: one group receiving no review, a second group receiving one review, and a third group receiving two reviews. In addition, having selected research participants from a single, well-defined population, the researcher would randomly assign participants to treatments. Thus, manipulation means being able to select the number and type of treatments and to randomly assign participants to them.

Independent variables in education are either manipulated (active variables) or not manipulated (assigned variables). You can manipulate such variables as method of instruction, number of reviews, and size of group. You cannot manipulate variables such as gender, age, or socioeconomic status. You can place participants into one method of instruction or another (active variable), but you cannot place participants into male or female categories, because they already are male or female (assigned variable). Although the design of an experimental study may or may not include assigned variables, at least one active variable must be present.

Control refers to the researcher's efforts to remove the influence of any variable other than the independent variable that might affect performance on the dependent variable. In other words, the researcher wants the groups to be as similar as possible, so that the only major difference between them is the treatment variable being manipulated. To illustrate the importance of research control, suppose you conducted a study to compare the effectiveness of student tutors versus parent tutors in teaching first graders to read. Student tutors might be older children from higher grade levels, and parent tutors might be members of the PTA. Suppose also that student tutors helped each member of their group for 2 hours per week for a month, while parent tutors helped each member of their group for only half that time. Would the comparison be fair? Certainly not. Participants with the student tutors would have received twice as much help as that provided to the parent group. Thus, one variable that would need to be controlled is the amount of tutoring. If this variable were not controlled, you could be confronted with a dilemma.
If the student tutors produced higher reading scores than the parent tutors, you would not know whether this result indicated that student tutors were more effective than parent tutors, that longer periods of tutoring were more effective than shorter periods, or that type and amount of tutoring combined were more effective. To make the comparison fair and interpretable, both students and parents would tutor for the same amount of time. Then time of tutoring would be controlled, and you could truly compare the effectiveness of student and parent tutors.

A researcher must consider many factors when attempting to identify and control extraneous variables. Some variables that need controlling may be relatively obvious: as the researcher in the preceding study, you would need to examine such variables as reading readiness and prior reading instruction, in addition to time spent tutoring. Some variables that need to be controlled may not be as obvious; for example, you would also need to ensure that both groups used similar reading texts and materials. Thus, there are really two different kinds of variables that need to be controlled: participant variables and environmental variables. A participant variable is one on which participants in the different groups in a study might differ; an environmental variable (such as learning materials) is a variable in the setting of the study that might cause unwanted differences between the groups. The researcher strives to ensure that the characteristics and experiences of the groups are as equal as possible on all important variables except the independent variable. If relevant variables can be controlled, group differences on the dependent variable can be attributed to the independent variable.

Control is not easy in an experiment, especially in educational studies, where human beings are involved; it certainly is a lot easier to control solids, liquids, and gases. Our task is not an impossible one, however, because we can concentrate on identifying and controlling only those variables that might really affect or interact with the dependent variable. (An interaction occurs when different values of the independent variable are differentially effective depending on the level of a control variable.) For example, if two groups had significant differences in shoe size or height, such differences would probably not affect the results of most educational studies. Techniques for controlling those extraneous variables that do matter will be presented later in this chapter.

Threats to Experimental Validity
As noted, any uncontrolled extraneous variables affecting performance on the dependent variable are threats to the validity of an experiment. An experiment is valid if the results obtained are due only to the manipulated independent variable and if they are generalizable to individuals or contexts beyond the experimental setting. These two criteria are referred to, respectively, as the internal validity and the external validity of an experiment. Internal validity is the degree to which observed differences on the dependent variable are a direct result of manipulation of the independent variable, not of some other variable. In other words, an examination of internal validity focuses on threats, or rival explanations, for the research results. In the tutoring study, for example, one such rival explanation would have been the difference in the amount of time the two groups tutored.
The degree to which experimental research results are attributable to the independent variable and not to some other rival explanation is the degree to which the study is internally valid.

External validity, also called ecological validity, is the degree to which study results are generalizable, or applicable, to groups and environments outside the experimental setting. In other words, an examination of external validity focuses on threats, or rival explanations, that would not permit the results of a study to be generalized to other settings or groups. A study conducted with groups of gifted ninth graders, for example, should produce results that are applicable to other groups of gifted ninth graders. If research results were never generalizable outside the experimental setting, then no one could profit from research; each and every finding would have to be reestablished over and over. An experimental study can contribute to educational theory or practice only if its results and effects are replicable and generalizable to other places and groups. If results cannot be replicated in other settings by other researchers, the study has low external, or ecological, validity.

So, all a researcher has to do in order to conduct a valid experiment is to maximize both internal and external validity. To maximize internal validity, the researcher must exercise very tight control over participants and conditions, producing a laboratory-like environment. However, the more a research situation is narrowed and controlled, the less realistic and generalizable it becomes. A study can contribute little to educational practice if its techniques, which have been shown to be effective in a highly controlled setting, are not also effective in a less controlled classroom setting. On the other hand, the more natural the experimental setting becomes, the more difficult it is to control extraneous variables. Thus, the researcher must strive for a balance between control and realism. If a choice is involved, the researcher should err on the side of control rather than realism, since a study that is not internally valid is worthless. A useful strategy to address this problem is to first demonstrate an effect in a highly controlled environment (with maximum internal validity) and then redo the study in a more natural setting (to examine external validity). In the final analysis, however, the researcher must seek a compromise between a highly controlled and a highly natural environment.

In the following pages we describe many threats to internal and external validity. Some extraneous variables are threats to internal validity, some are threats to external validity, and some may be threats to both. How potential threats are classified is not of great importance; what is important is that you be aware of their existence and know how to control for them. As you read, you may begin to feel that there are just too many threats for one little researcher to control. However, the task is not as formidable as it may at first appear, since there are a number of experimental designs that do control many or most of the threats you are likely to encounter. Also, remember that each threat is only a potential one; it may not be a problem in a particular study.

Threats to Internal Validity
Probably the most authoritative sources on experimental design and threats to experimental validity are the works of Donald Campbell and Julian Stanley, and of Thomas Cook and Donald Campbell.
They identified eight main threats to internal validity: history, maturation, testing, instrumentation, statistical regression, differential selection of participants, mortality, and selection-maturation interaction. However, before describing these threats to internal validity, we would like to note the role of experimental research in overcoming them. You are not rendered helpless when faced with them; quite the contrary, random selection of participants, the researcher's assignment of participants to treatments, and control of other variables are powerful approaches to overcoming the threats. As you read about the threats, note how the experimental researcher's random selection and assignment to treatments can control most of them.

History
When discussing threats to validity, history refers to any event occurring during a study that is not part of the experimental treatment but may affect the dependent variable. The longer a study lasts, the more likely it is that history will be a threat. A bomb scare, an epidemic of measles, or even general current events are examples of events that could produce a history effect. For example, suppose you conducted a series of in-service workshops designed to increase the morale of teacher participants. Between the time you conducted the workshops and the time you administered a posttest measure of morale, the news media announced that, due to state-level budget problems, funding to the local school district was going to be significantly reduced and promised pay raises for teachers would likely be postponed. Such an event could easily wipe out any effect the workshops might have had, and posttest morale scores might well be considerably lower than they otherwise would have been (to say the least).

Maturation
Maturation refers to physical, intellectual, and emotional changes that naturally occur within individuals over a period of time. In a research study, these changes may affect participants' performance on a measure of the dependent variable. Especially in studies that last a long time, participants may become older, more coordinated, unmotivated, anxious, or just plain bored. Maturation is more likely to be a problem in a study designed to test the effectiveness of a psychomotor training program on 3-year-olds than in a study designed to compare two methods of teaching algebra. Young participants would typically be undergoing rapid biological changes during the training program, raising the question of whether changes were due to the training program or to maturation.

Testing
Testing, also called pretest sensitization, refers to the threat of improved performance on a posttest being a result of having taken a pretest. Taking a pretest may improve participants' scores on a posttest, regardless of whether they receive any treatment or instruction in between. Testing is more likely to be a threat when the time between tests is short; a pretest taken in September is not likely to affect performance on a posttest taken in June. The testing threat to internal validity is more likely to occur in studies that measure factual information that can be recalled. For example, taking a pretest on algebraic equations is less likely to improve posttest performance than taking a pretest on multiplication facts would.

Instrumentation
The instrumentation threat refers to unreliability, or lack of consistency, in measuring instruments that may result in an invalid assessment of performance. Instrumentation may threaten validity in several different ways. A problem may occur if the researcher uses two different tests, one for pretesting and one for posttesting, and the tests are not of equal difficulty. If the posttest is more difficult than the pretest, it may mask improvement that is actually present. Alternatively, if the posttest is easier than the pretest, it may indicate improvement that is not really present. If data are collected through observation, the observers may not be observing or evaluating behavior in the same way at the end of the study as at the beginning; in fact, if they are aware of the nature of the study, they may see and record only what they know the researcher is hypothesizing. If data are collected through the use of a mechanical device, the device may be poorly calibrated, resulting in inaccurate measurement. Thus, the researcher must take care in selecting tests, observers, and mechanical devices to measure the dependent variable.

Statistical Regression
Statistical regression usually occurs in studies where participants are selected on the basis of their extremely high or extremely low scores. Statistical regression is the tendency of participants who score highest on a test (e.g., a pretest) to score lower on a second, similar test (e.g., a posttest), and of participants who score lowest on the pretest to score higher on the posttest. The tendency is for scores to regress toward a mean (average) or expected score: extremely high scorers regress (move lower) toward the mean, and extremely low scorers regress (move higher) toward the mean.

Differential Selection of Participants
Differential selection of participants is the selection of participants who have differences before the start of a study that may at least partially account for differences found on a posttest. The threat that the groups were different before the study even began is more likely when a researcher is comparing already-formed groups.

Mortality
Mortality, or attrition, refers to a reduction in the number of research participants that occurs over time as individuals drop out of a study. Mortality creates problems with validity particularly when different groups drop out for different reasons and with different frequency; the resulting change in the characteristics of the groups can have a significant effect on the results of the study. For example, participants who drop out of a study may be less motivated or less interested in the study than those who remain. This is especially a problem when volunteers are used or when a study compares a new treatment to an existing treatment. Participants rarely drop out of control groups or existing treatments, because few or no additional demands are made on them. However, volunteer participants using the new, experimental treatment may drop out because too much effort is required for participation. The experimental group that remains at the end of the study then represents a more motivated group than the control group. As another example of mortality, suppose Suzy Shiningstar (a high-IQ, all-around outstanding student) got the measles and dropped out of your control group. Before Suzy dropped out, she managed to infect her friends, who might also be high-IQ, all-around outstanding students.
The experimental group might end up looking pretty good when compared to the control group simply because many of the good students dropped out of the control group. The researcher cannot assume that participants drop out of a study in a random fashion and should, if possible, select a design that controls for mortality. A researcher can assess the mortality of groups by obtaining demographic information about the groups before the start of the study and then determining whether the makeup of the groups has changed by the end of the study. One way to reduce mortality is to provide some incentive for participants to remain in the study. Another approach is to identify the kinds of participants who drop out of the study and remove similar participants from the other groups.

Selection-Maturation Interaction and Other Interactive Effects
The effects of differential selection may also interact with the effects of maturation, history, or testing to cause a threat to internal validity. What this means is that if already-formed groups are used, one group may profit more (or less) from a treatment or have an initial advantage (or disadvantage) because of maturation, history, or testing factors. The most common of these interactive effects is selection-maturation interaction, which would exist if participants selected into the treatment groups matured at different rates during the study. To get a better idea of these effects, suppose, for example, that you received permission to use two of Ms. Hynee's English classes and that both classes were average and apparently equivalent on all relevant variables. Suppose, however, that for some reason Ms. Hynee had to miss one of her classes but not the other (maybe she had to have a root canal) and Ms. Almamater took over Ms. Hynee's class. As luck would have it, Ms. Almamater proceeded to cover much of the material now included in your posttest (remember history?). Unbeknownst to you, that group would have a definite advantage to begin with, and it might be this initial advantage, not the independent variable, that caused posttest differences in the dependent variable. Thus, a researcher must select a design that controls for potential problems such as this or make every effort to determine whether they are operating in the study.

Threats to External Validity
There are several major threats to external validity that can limit the generalization of experimental results to other populations. Building on the work of Campbell and Stanley, Bracht and Glass refined and expanded the discussion of threats to external validity, classifying them into two categories. Threats affecting "generalizing to whom," that is, threats affecting the groups to which research results can be generalized, make up threats to population validity. Threats affecting "generalizing to what," that is, threats affecting the settings, conditions, variables, and contexts to which results can be generalized, make up threats to ecological validity. The following discussion incorporates the contributions of Bracht and Glass into Campbell and Stanley's original conceptualizations.

Pretest-Treatment Interaction
Pretest-treatment interaction occurs when participants respond or react differently to a treatment because they have been pretested. Pretesting may sensitize or alert subjects to the nature of the treatment, potentially making the treatment effect different than it would have been had subjects not been pretested.
Thus, the research results would be generalizable only to other pretested groups; they would not even be generalizable to the unpretested population from which the sample was selected. The seriousness of the pretest-treatment interaction threat depends on the research participants, the nature of the independent and dependent variables, and the duration of the study. Studies involving self-report measures, such as attitude scales and interest inventories, are especially susceptible to this threat. Campbell and Stanley illustrate this effect by pointing out the probable lack of comparability between two groups: one that views the antiprejudice film Gentleman's Agreement right after taking a lengthy pretest dealing with anti-Semitism, and another that views the movie without a pretest. Individuals not pretested could conceivably enjoy the movie as a good love story and be unaware that it deals with a social issue. Pretested individuals would be much more likely to see a connection between the pretest and the message of the film. In contrast, taking a pretest on algebraic algorithms would probably have little sensitizing effect, and very young children would probably not see or remember a connection between a pretest and the subsequent treatment. Similarly, for studies conducted over a period of months or longer, the effects of the pretest would probably have worn off or been greatly diminished by the time a posttest was given. Thus, for some studies the potential interactive effect of a pretest is a more serious consideration than for others. In such cases, researchers should make use of unobtrusive measures, ways to collect data that do not intrude on, or require interaction with, research participants. For example, data can be gathered from school records, transcripts, and other written sources.

Multiple-Treatment Interference
Sometimes the same research participants receive more than one treatment in succession. Multiple-treatment interference occurs when carryover effects from an earlier treatment make it difficult to assess the effectiveness of a later treatment. Suppose you were interested in comparing two different approaches to improving classroom behavior, behavior modification and corporal punishment (admittedly an extreme example used to make a point). For 2 months, behavior modification techniques were systematically applied to the participants, and at the end of this period you found behavior to be significantly better than before the study began. For the next 2 months, the same participants were physically punished (with hand slappings, spankings, and the like) whenever they misbehaved, and at the end of the 2 months behavior was equally as good as after the 2 months of behavior modification. Could you then conclude that behavior modification and corporal punishment are equally effective methods of behavior control? Certainly not. In fact, the goal of behavior modification is to produce self-maintaining behavior, that is, behavior that continues after direct intervention is stopped. Thus, the good behavior exhibited by the participants at the end of the corporal punishment period could well be due to the effectiveness of previous exposure to behavior modification and exist in spite of, rather than because of, exposure to corporal punishment. If it is not possible to select a design in which each group receives only one treatment, the researcher should try to minimize potential multiple-treatment interference by allowing sufficient time to elapse between treatments and by investigating distinctly different types of independent variables.

Multiple-treatment interference may also occur when participants who have already participated in a study are selected for inclusion in another, apparently unrelated study. If the accessible population for a study is one whose members are likely to have participated in earlier studies, information about their previous participation should be collected and evaluated before subjects are selected for the current study. If any members of the accessible population are eliminated from consideration because of previous research activities, a note should be made of this limitation in the research report.

Selection-Treatment Interaction
Selection-treatment interaction, like the differential selection of participants problem associated with internal validity, mainly occurs when participants are not randomly selected for treatments. Interaction effects aside, the very fact that participants are not randomly selected from a population severely limits the researcher's ability to generalize, because what population the sample represents is in question. Even if intact groups are randomly selected, the possibility exists that the experimental group is in some important way different from the control group, from the larger population, or from both. When the use of nonrepresentative groups results in study findings that apply only to the groups involved and are not representative of the treatment effect in the extended population, this is selection-treatment interaction, another threat to population validity. This interaction occurs when actual study participants at one level of a variable react differently to a treatment than other potential participants in the population, at another level, would have reacted.
For example, a researcher might conduct a study on the effectiveness of microcomputer-assisted instruction on the math achievement of junior high students. The classes available to the researcher may represent an overall ability level at the lower end of the ability spectrum for all junior high students. If a positive effect is found, it may be one that would not have been found if the subjects were truly representative of the target population; similarly, if an effect is not found, it might have been found with a representative sample. Thus, extra caution must be taken in stating conclusions and generalizations based on studies involving existing, nonrandomized groups. Selection-treatment interaction is also an uncontrolled variable in designs involving randomization. For example, the accessible population is often quite different from the target population, and a problem arises when the researcher attempts to generalize results from the accessible population to the target population. Thus, the way a given population becomes available to a researcher may make the generalizability of findings questionable, no matter how internally valid an experiment may be. Suppose that, in seeking a sample, a researcher is turned down by 9 school systems and finally accepted by a 10th. The accepting system is very likely to be different from both the other 9 systems and the population of schools to which the researcher would like to generalize the results. Administrators and instructional personnel in the 10th system likely have higher morale, less fear of being inspected, and more zeal for improvement than personnel in the other 9 systems. The researcher should therefore describe in the research report how the sample was acquired, including the number of times he or she was turned down, so that the reader can judge the seriousness of a possible selection-treatment interaction.

Specificity of Variables
Like selection-treatment interaction, specificity of variables is a threat to the generalizability of research results regardless of the particular experimental design used. Any given study has specificity of variables; that is, the study is conducted under a specific set of circumstances. We have discussed the need to describe research procedures in sufficient detail to permit another researcher to replicate the study. Such detailed descriptions also permit interested readers to assess how applicable the findings are to their situation. Experimental procedures require operational definition of the variables. When studies that supposedly manipulated the same independent variable get quite different results, it is often difficult to determine the reasons for the differences, because the researchers have not provided clear, operational descriptions of their independent variables. When operational descriptions are available, they often reveal that two independent variables with the same name were in fact defined quite differently in the separate studies, thus explaining why results differed. Because such terms as discovery method, whole language, and computer-based instruction mean different things to different people, it is impossible to know what a researcher means by these terms unless they are operationally defined; without operational descriptions, it is not clear to what populations a study can be generalized. Generalizability of results is also tied to the clear definition of the dependent variable, although in most cases performance on a specific measure is the operational definition.
When there are a number of dependent variable measures to select from, questions about the comparability of these instruments must be raised.

Generalizability of results may also be affected by short- or long-term events that occur while the study is taking place. This threat is referred to as the interaction of history and treatment effects. It describes the situation in which events extraneous to the study alter the research results. Short-term, emotion-packed events, such as the firing of a superintendent, the release of district test scores, or the impeachment of a president, might affect the behavior of participants. Usually, however, the researcher is aware of such happenings and can assess their possible impact. The impact of longer-term events, such as wars and economic depressions, is more subtle and tougher to evaluate. Another threat to external validity is the interaction of time of measurement and treatment effect. This threat results from the fact that an effect evident on a posttest given immediately after treatment may not show up on a posttest given some time after treatment; conversely, a treatment may have a long-term, but not a short-term, effect. Thus, the only way to assess the generalizability of findings over time is to measure the dependent variable at various times following treatment. To deal with the threats associated with specificity, the researcher must operationally define variables in a way that has meaning outside the experimental setting.

Treatment Diffusion
Treatment diffusion occurs when different treatment groups communicate with and learn from each other. Knowledge of each other's treatment often leads the groups to borrow aspects from each other, so that the study no longer has two distinctly different treatments, but two overlapping ones. The integrity of each treatment is diffused. Often, it is the more desirable treatment (the experimental treatment or the treatment with additional resources) that is diffused into the less desirable treatment.

Experimenter Effects
Researchers themselves also present potential threats to the external validity of their own studies. A researcher's influences on participants or on study procedures are known as experimenter effects. Passive experimenter effects occur as a result of characteristics or personality traits of the experimenter, such as gender, age, race, anxiety level, and hostility level. These influences are collectively called experimenter personal-attributes effects. Active experimenter effects occur when the researcher's expectations of the study results affect her behavior and actually contribute to producing certain research outcomes. This effect is referred to as experimenter bias. Thus, an experimenter may unintentionally affect study results, typically in the direction desired by the researcher, simply by looking, feeling, or acting a certain way. One form of experimenter bias occurs when the researcher affects participants' behavior, or is inaccurate in evaluating their behavior, because of previous knowledge of the participants. Suppose a researcher hypothesizes that a new reading approach will improve reading skills. If the researcher knows that Suzy is in the group receiving the new approach, he or she may unintentionally give Suzy's reading skills a higher rating than they actually warrant.
This example illustrates another way a researcher's expectations may actually contribute to producing certain outcomes: knowing which participants are in the experimental and control groups may cause the researcher to unintentionally evaluate their performances differently. It is difficult to identify experimenter bias in a study, which is all the more reason for researchers to be aware of its consequences for the external validity of their study. The moral is that the researcher should strive to avoid communicating emotions and expectations to participants in the study. Experimenter bias effects can be reduced by blind scoring, in which the researcher does not know whose performance is being evaluated.

Reactive Arrangements
Reactive arrangements, also called participant effects, are threats to validity that are associated with the way in which a study is conducted and the feelings and attitudes of the participants involved. As discussed previously, in order to maintain a high degree of control and thus obtain internal validity, a researcher may create an experimental environment that is highly artificial and hinders generalizability to nonexperimental settings; this is a reactive arrangement. Another type of reactive arrangement results from participants' knowledge that they are involved in an experiment or their feeling that they are in some way receiving special attention. The effect that such knowledge or feelings can have on participants was demonstrated at the Hawthorne plant of the Western Electric Company in Chicago some years ago, where studies were conducted to investigate the relationship between various working conditions and productivity. As part of the study, researchers investigated the effect of light intensity on worker output. The researchers increased light intensity, and production went up. They increased it some more, and production went up some more. The brighter the place became, the more production rose. As a check, the researchers decreased the light intensity, and guess what? Production went up again; the darker it got, the more the workers produced. The researchers soon realized that it was the attention given to the workers, and not the illumination, that was affecting production. To this day, the term Hawthorne effect is used to describe any situation in which participants' behavior is affected not by the treatment per se, but by their knowledge of participating in a study.

A related reactive effect, known as compensatory rivalry or the John Henry effect, occurs when members of a control group feel threatened or challenged by being in competition with an experimental group and perform way beyond what would normally be expected. Folk hero John Henry, you may recall, was a steel-driving man who worked for a railroad. When he heard that a steam drill was going to replace him and his fellow steel drivers, he challenged, and set out to beat, the machine. Through tremendous effort he managed to win the ensuing contest, dropping dead at the finish line. When research participants are told that they will form the control group for a new experimental method, they may act like John Henry: they decide to challenge the new method by putting extra effort into their work, essentially saying, "We'll show them that our old ways are as effective as their newfangled ways." By doing this, however, the control group performs atypically, and its performance provides a rival explanation for results showing that the experimental group is not much better than the control group.
As an antidote to the Hawthorne and John Henry effects, educational researchers often attempt to achieve a placebo effect. The term comes from medical research, in which it was discovered that any medication, even sugar and water, could make subjects feel better; any beneficial effect caused by a person's expectations about a treatment, rather than by the treatment itself, became known as the placebo effect. To counteract this effect, a placebo approach was developed in which half of the subjects receive the true medication and half receive a placebo. The use of a placebo is, of course, not known by the participants; both groups think they are taking real medicine. The application of the placebo effect in educational research is that all groups in an experiment should appear to be treated the same. Suppose, for example, you have four groups of ninth graders, two experimental and two control, and the treatment is a film designed to promote a positive attitude toward vocational careers. While the experimental participants are excused to see the film, the control participants could be shown another film whose content is unrelated to the purpose of the study. As an added control, you might have all the participants told that there are two movies and that eventually everyone will see both movies. In other words, it should appear as if all the students are doing the same thing.

Another reactive arrangement, or participant effect, is the novelty effect, which refers to the increased interest, motivation, or engagement participants develop simply because they are doing something different. In other words, a treatment may be effective because it is different, not because it is better. To counteract the novelty effect, a researcher should conduct a study over a period of time long enough to allow the newness of the treatment to wear off. This is especially advisable if the treatment involves activities very different from the subjects' usual routine. Obviously, there are many internal and external threats to the validity of an experimental (or causal-comparative) study. You should be aware of likely threats and strive to nullify them. One main way to overcome many threats to validity is to choose a research design that controls for such threats. We examine some of these designs in the following sections.

Group Experimental Designs
The validity of an experiment is a direct function of the degree to which extraneous variables are controlled. If such variables are not controlled, it is difficult to interpret the results of a study and to identify the groups to which the results can be generalized. The term confounding is sometimes used to describe the intertwining of the effects of the independent variable with those of extraneous variables, which makes it difficult to determine the unique effects of each. This is what experimental design is all about: the control of extraneous variables. Good designs control many sources of invalidity; poor designs control few. If you recall, the two types of extraneous variables in need of control are participant variables and environmental variables. Participant variables include both organismic variables and intervening variables. Organismic variables are characteristics of the participant that cannot be altered but can be controlled for; gender is an example. Intervening variables intervene between the independent variable and the dependent variable and cannot be directly observed, but they can be controlled for; anxiety and boredom are examples.

Control of Extraneous Variables
Randomization is the best single way to simultaneously control for many extraneous variables.
Thus, randomization should be used whenever possible: participants should be randomly selected from a population and randomly assigned to treatment groups. To ensure random selection and assignment, researchers use methods that rely on pure chance, usually consulting a table of random numbers. Other randomization methods are also available. For example, a researcher could flip a coin or use odd and even numbers on a die to assign participants to two treatments: heads or an even number would signal assignment to treatment 1, and tails or an odd number would signal assignment to treatment 2. Randomization is effective in creating equivalent, representative groups that are essentially the same on all relevant variables. As noted, the use of randomly formed groups is not possible in causal-comparative research. The underlying rationale for randomization is that if participants are assigned at random to groups, there is no reason to believe that the groups will be greatly different in any systematic way. Thus, the groups would be expected to perform essentially the same on the dependent variable if the independent variable makes no difference. Therefore, if the groups perform differently at the end of the study, the difference can be attributed to the independent variable. It is important to remember that the larger the groups, the more confidence the researcher can have in the effectiveness of randomization. Randomly assigning 6 participants to two treatments is much less likely to equalize extraneous variables than assigning 50 participants to two treatments. In addition to equating groups on participant variables such as ability, gender, or prior experience, randomization can also equalize groups on environmental variables. Teachers, for example, can be randomly assigned to treatment groups so that one group does not get all the Carmel Kandee teachers or all the Hester Hartless teachers. Clearly, the researcher should use as much randomization as possible. If subjects cannot be randomly assigned to groups, then at least treatment conditions should be randomly assigned to the existing groups.
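As a simple illustration (not part of the original text), the sketch below shows how a pool of participants could be randomly assigned to two treatment groups by pure chance; the participant labels, group size, and use of Python's random module are illustrative assumptions.

```python
import random

# Hypothetical pool of 50 participants drawn from a single, well-defined population.
participants = [f"student_{i:02d}" for i in range(1, 51)]

random.shuffle(participants)              # order the pool by chance alone
midpoint = len(participants) // 2
treatment_1 = participants[:midpoint]     # e.g., the new instructional method
treatment_2 = participants[midpoint:]     # e.g., the currently used method

# Each group now has 25 members, and there is no reason to believe the
# groups differ systematically on any participant variable.
print(len(treatment_1), len(treatment_2))
```

In practice the same logic is applied with a table of random numbers or statistical software; the point is simply that chance, not the researcher's judgment, determines group membership.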

In addition to randomization, there are other ways to control for extraneous variables. Certain environmental variables, for example, can be controlled by holding them constant for all groups. Recall the example of the student tutor versus parent tutor study: in that example, help time was an important variable that had to be held constant, that is, made the same for both groups, for them to be fairly compared. Other such variables that might need to be held constant include learning materials, prior exposure, meeting place and time (students might be more alert in the morning than in the afternoon), and years of experience. Controlling participant variables is critical. If the groups are not the same to start with, you have not even given yourself a fighting chance to obtain valid, interpretable research results. Even if groups cannot be randomly formed, there are a number of techniques that can be used to try to equate groups. These include matching, comparing homogeneous groups or subgroups, using participants as their own controls, and analysis of covariance. Several of these concepts were introduced in the discussion of equating groups in causal-comparative studies.

Matching
Matching is a technique for equating groups on one or more variables, usually ones highly related to performance on the dependent variable. The most commonly used approach to matching involves the random assignment of pair members, one participant to each group. In other words, the researcher attempts to find pairs of participants similar on the variable or variables to be controlled. If the researcher is matching on gender, obviously the matched pairs must be of the same gender. However, if the researcher is matching on variables such as pretest, GRE, or ability scores, the pairing can be based on similarity of scores. Unless the number of participants is very large, it is unreasonable to try to make exact matches, or matches on more than one or two variables. Once a matched pair is identified, one member of the pair is randomly assigned to one treatment group and the other member to the other treatment group. A participant who does not have a suitable match is excluded from the study. The resulting matched groups are identical or very similar with respect to the variable being controlled. The major problem with such matching is that there are invariably participants who do not have a match and must be eliminated from the study. This may cost the researcher many participants, especially if matching is attempted on two or more variables. Of course, one way to combat the loss of participants is to match less stringently, so that participants whose scores are reasonably close constitute an acceptable match. This approach may increase the number of subjects, but it tends to defeat the purpose of matching. A related matching procedure is to rank all of the participants, from highest to lowest, based on their scores on the variable to be controlled. The two highest-ranked participants are the first pair: one member of the first pair is randomly assigned to one group and the other member to the other group. The next two highest-ranked participants are the second pair, and so on. The major advantage of this approach is that no participants are lost; the major disadvantage is that it is a lot less precise than pair-wise matching. Advanced statistical procedures, such as analysis of covariance, have greatly reduced researchers' use of matching.
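Below is a minimal sketch (an illustrative assumption, not the authors' procedure) of the ranking variant just described: participants are ranked on a pretest score, adjacent participants form pairs, and chance decides which member of each pair goes to which group. The names, scores, and use of Python are hypothetical.

```python
import random

# Hypothetical (participant, pretest score) data.
pretest = [("Ana", 52), ("Ben", 47), ("Cia", 51), ("Dan", 39), ("Eli", 48), ("Fay", 40)]

# Rank from highest to lowest on the variable to be controlled.
ranked = sorted(pretest, key=lambda p: p[1], reverse=True)

group_1, group_2 = [], []
for i in range(0, len(ranked) - 1, 2):
    pair = list(ranked[i:i + 2])
    random.shuffle(pair)              # chance decides who goes where within the pair
    group_1.append(pair[0])
    group_2.append(pair[1])

# Each group receives one member of every pair, so the groups are similar
# on the pretest variable and no participant has been discarded.
print(group_1)
print(group_2)
```

Pair-wise matching on exact or near-exact scores works the same way, except that unmatched participants are dropped rather than paired with their nearest-ranked neighbor.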

Comparing Homogeneous Groups or Subgroups
Another previously discussed way to control an extraneous variable is to compare groups that are homogeneous with respect to that variable. For example, if IQ were an identified extraneous variable, the researcher might select only participants with IQs between 85 and 115 (average IQ). The researcher would then randomly assign half of the selected participants to the experimental group and half to the control group. Of course, this procedure also lowers the number of available participants and additionally restricts the generalizability of the findings to participants with IQs between 85 and 115. As noted in the discussion of causal-comparative research, a similar, more satisfactory approach is to form different subgroups representing all levels of the control variable. For example, the available participants might be divided into high (116 or above), average (85 to 115), and low (84 and below) IQ subgroups. Half of the participants from each of the subgroups could then be randomly assigned to the experimental group and half to the control group. This procedure should sound familiar, since it describes stratified sampling. If the researcher is interested not just in controlling the variable but also in seeing whether the independent variable affects the dependent variable differently at different levels of IQ, the best approach is to build the control variable right into the design. Thus, the research design would have six cells, two treatments by three IQ levels. Draw the design for yourself, and label each cell with its treatment and IQ level.

Using Participants as Their Own Controls
Using participants as their own controls involves exposing a single group to different treatments, one treatment at a time. This strategy helps to control for participant differences, because the same participants get both treatments. Of course, this approach is not always feasible; you cannot teach the same algebraic concepts to the same group twice using two different methods of instruction (well, you could, but it would not make much sense). A problem with this approach is a carryover effect from one treatment to the next. To use a previous example, it would be very difficult to evaluate the effectiveness of corporal punishment in improving behavior if the group receiving corporal punishment was the same group that had previously been exposed to behavior modification. If only one group were available, a better approach, if feasible, would be to randomly divide the group into two smaller groups, each of which would receive both treatments but in a different order. Thus, the researcher could at least get some idea of the effectiveness of corporal punishment, because there would be a group that received it before behavior modification. In situations in which the effect of a treatment on the dependent variable disappears quickly after the treatment ends, or in which a single participant is the focus of the research, participants can be used as their own controls.

Analysis of Covariance
Analysis of covariance is a statistical method for equating randomly formed groups on one or more variables. It adjusts scores on a dependent variable for initial differences on some other variable, such as pretest scores, IQ, reading readiness, or musical aptitude. The covariate should be a variable related to performance on the dependent variable.
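To make the adjustment idea concrete, here is a minimal sketch, not from the original text, of an analysis of covariance expressed as a linear model with a treatment factor and a pretest covariate. The data and the use of Python's pandas and statsmodels libraries are illustrative assumptions.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical pretest and posttest scores for two small groups.
data = pd.DataFrame({
    "group": ["experimental"] * 5 + ["control"] * 5,
    "pretest": [48, 52, 50, 47, 53, 49, 51, 46, 50, 52],
    "posttest": [61, 66, 64, 60, 67, 55, 58, 52, 56, 59],
})

# ANCOVA as a linear model: the group effect is estimated after adjusting
# posttest scores for initial differences on the pretest covariate.
model = smf.ols("posttest ~ C(group) + pretest", data=data).fit()
print(model.summary())
```

The coefficient on the group term in the fitted model estimates the treatment effect after the pretest adjustment, which is the sense in which the technique "equates" the groups statistically.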
Although analysis of covariance can be used in studies where groups cannot be randomly formed, its use is most appropriate when randomization is used. In spite of randomization, it may be found that two groups still differ significantly in terms of pretest scores; analysis of covariance can be used in such cases to correct, or adjust, posttest scores for initial pretest differences. However, analysis of covariance is not universally useful. For example, the relationship between the covariate and the dependent variable must be linear (represented by a straight line); if the relationship is curvilinear, analysis of covariance is not appropriate. Also, analysis of covariance is often used when a study deals with intact groups, uncontrolled variables, and nonrandom assignment to treatments, all of which weaken its results. Calculation of an analysis of covariance is a complex procedure.

B. TYPES OF GROUP DESIGNS
The experimental design you select dictates, to a great extent, the specific procedures of your study. Selection of a given design influences factors such as whether there will be a control group, whether participants will be randomly selected and assigned to groups, whether the groups will be pretested, and how the data will be analyzed. In selecting a design, it is important first to determine which designs are appropriate for your study and for testing your hypothesis, and then to determine which of these are also feasible given the constraints under which you may be operating. There are two major classes of experimental designs: single-variable designs and factorial designs. A single-variable design is any design that involves one manipulated independent variable. A factorial design is any design that involves two or more independent variables, with at least one being manipulated. Single-variable designs are classified as pre-experimental, true experimental, or quasi-experimental, depending on the degree of control they provide for threats to internal and external validity. Pre-experimental designs do not do a very good job of controlling threats to validity and should be avoided. In fact, the results of a study based on a pre-experimental design are so questionable that they are not useful for most purposes except, perhaps, to provide a preliminary investigation of a problem. True experimental designs provide a very high degree of control and are always to be preferred. Quasi-experimental designs do not control as well as true experimental designs but do a much better job than the pre-experimental designs. If we were to give grades to these three classes of experimental designs, true experimental designs would get an A, quasi-experimental designs would get a B or C (some are better than others), and pre-experimental designs would get a D or E. This ordering is illustrated in the figure below:

Figure 2.1
Explanation:
True-experimental designs are better than quasi-experimental designs;
Quasi-experimental designs are better than pre-experimental designs;
Pre-experimental designs are better than not doing a study at all.

Factorial designs are basically elaborations of single-variable experimental designs except that they permit investigation of two or more variables, individually and in interaction with each other. After an independent variable has been investigated using a single-variable design, it is often useful to then study the variable in combination with one or more other variables. Some variables work differently when paired with different levels of another variable. The designs discussed next represent the basic designs in each category.

Pre-Experimental Designs
a) The One-Shot Case Study
The one-shot case study involves a single group that is exposed to a treatment (X) and then posttested (O). None of the sources of invalidity are controlled in this design.
b) The One-Group Pretest-Posttest Design
The one-group pretest-posttest design involves a single group that is pretested (O), exposed to a treatment (X), and posttested (O). The success of the treatment is determined by comparing pretest and posttest scores. This design controls some sources of invalidity not controlled by the one-shot case study, but a number of additional factors relevant to this design are not controlled. If participants do significantly better on the posttest than on the pretest, it cannot be assumed that the improvement is due to the treatment. History and maturation are not controlled: something may happen to the participants that makes them perform better the second time, and the longer the study takes, the more likely it is that something will threaten validity. Testing and instrumentation also are not controlled; the participants may learn something on the pretest that helps them on the posttest, or unreliability of the measures may be responsible for the apparent improvement. Statistical regression is likewise not controlled. Even if participants are not selected on the basis of extreme scores (high or low), it is possible that a group may do very poorly on the pretest just by poor luck. For example, participants may guess badly just by chance on a multiple-choice pretest and improve on the posttest simply because, this time, their guessing produces a score that is more in line with the expected score. The external validity threat of pretest-treatment interaction is also not controlled: the pretest may cause participants to react differently to the treatment than they would have if they had not been pretested.
c) The Static-Group Comparison
The static-group comparison involves at least two nonrandomly formed groups: one that receives a new or unusual treatment (the experimental treatment) and another that receives a traditional treatment (the control treatment). Both groups are posttested. However, it is probably more appropriate to call them both comparison groups, because each really serves as the comparison for the other. Each group receives some form of the independent variable (the treatments). For example, if the independent variable is type of drill and practice, the experimental group (X1) may receive computer-assisted drill and practice, and the control group may receive worksheet drill and practice. Occasionally, but not often, the experimental group may receive something while the control group receives nothing. The purpose of a control group is to indicate what the performance of the experimental group would have been if it had not received the experimental treatment. This purpose is fulfilled only to the degree that the control group is equivalent to the experimental group.
The static-group comparison design can be expanded to deal with any number of groups. For three groups, the design would take the following form:
X1   O
X2   O
X3   O
Which group is the control group? Basically, each group serves as a control or comparison group for the other two. As already emphasized, but worth repeating, the degree to which the groups are equivalent is the degree to which their comparison is reasonable.
In this design, participants are not randomly assigned to groups and there are no pretest data; thus, it is difficult to determine just how equivalent the groups are. That is, it is possible that posttest differences are due to initial group differences in maturation, selection, and selection interactions, rather than to treatment effects. Mortality is also a problem; if you lose participants from the study, you have no information about what you have lost because you have no pretest data. On the positive side, the presence of a comparison group does control for history, since it is assumed that events occurring outside the experimental setting will affect both groups equally.
Nevertheless, the static-group comparison design is occasionally employed in a preliminary or exploratory study. For example, one semester, early in the term, a teacher wondered whether the kind of test items given to educational research students affects their retention of course concepts. For the rest of the term, students in one section of the course were given multiple-choice tests, and students in another section were given short-answer tests. At the end of the term, group performances were compared. The group receiving short-answer test items had higher total scores than the multiple-choice group. On the basis of this exploratory study, a formal investigation of the issue was undertaken.
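In an exploratory comparison like this, the posttest scores of the two intact sections are often compared with an independent-samples t test. The Python sketch below (using the scipy library) is only illustrative: the scores are hypothetical, and because the groups were not randomly formed, a difference between them says nothing by itself about why the sections differ.

from scipy import stats

# Hypothetical end-of-term scores for the two intact sections.
short_answer    = [88, 92, 79, 85, 91, 84, 90, 87]
multiple_choice = [81, 77, 85, 80, 78, 83, 76, 82]

t, p = stats.ttest_ind(short_answer, multiple_choice)
print(f"t = {t:.2f}, p = {p:.3f}")  # a small p suggests the section means differ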

True-Experimental Designs
a) The Pretest-Posttest Control Group Design
The pretest-posttest control group design requires at least two groups, each of which is formed by random assignment; both groups are administered a pretest, each group receives a different treatment, and both groups are posttested at the end of the study. Posttest scores are compared to determine the effectiveness of the treatment. The pretest-posttest control group design may also be expanded to include any number of treatment groups. For three groups, for example, this design would take the following form:
R   O   X1   O
R   O   X2   O
R   O   X3   O
There are a number of ways in which the data from this and other experimental designs can be analyzed to test the research hypothesis regarding the effectiveness of the treatments. The best way to analyze these data is to compare the posttest scores of the treatment groups. The pretest is used to see whether the groups are essentially the same on the dependent variable at the start of the study. If they are, posttest scores can be directly compared using a statistic called the t test. If the groups are not essentially the same on the pretest (random assignment does not guarantee equality), posttest scores can be analyzed using analysis of covariance. Recall that analysis of covariance adjusts posttest scores for initial differences on any variable, including pretest scores. This approach is superior to using gain or difference scores (posttest minus pretest) to determine the treatment effects.
A variation of the pretest-posttest control group design involves random assignment of the members of matched pairs to the treatment groups, in order to more closely control for one or more extraneous variables. There is really no advantage to this technique, however, because any variable that can be controlled through matching can be better controlled using other procedures such as analysis of covariance.
b) The Posttest-Only Control Group Design
The posttest-only control group design is exactly the same as the pretest-posttest control group design except that there is no pretest; participants are randomly assigned to at least two groups, exposed to the different treatments, and posttested. Posttest scores are compared to determine the effectiveness of the treatment. As with the pretest-posttest control group design, the posttest-only control group design can be expanded to include more than two groups. The combination of random assignment and the presence of a control group serves to control for all sources of internal invalidity except mortality. Mortality is not controlled because of the absence of pretest data on participants. It may or may not be a problem, depending on the duration of the study. If it is not a problem, the researcher may report that although mortality is a potential threat to validity with this design, it did not prove to be a threat because the group sizes remained constant or nearly constant throughout the study. If the probability of differential mortality is low, the posttest-only design can be very effective. However, if there is any chance that the groups differ with respect to pretreatment knowledge related to the dependent variable, the pretest-posttest control group design should be used. Which design is best depends on the study. If the study is to be short, and if it can be assumed that neither group has any knowledge related to the dependent variable, then the posttest-only design may be the best choice.
If the study is to be lengthy, or if there is a chance that the two groups differ in initial knowledge related to the dependent variable, the pretest-posttest control group design may be the better choice.
c) The Solomon Four-Group Design
The Solomon four-group design involves random assignment of participants to one of four groups; two of the groups are pretested and two are not; one of the pretested groups and one of the unpretested groups receive the experimental treatment; and all four groups are posttested on the dependent variable. The correct way to analyze the data resulting from application of the Solomon four-group design is a 2 x 2 (two-by-two) factorial analysis, with treatment and control groups crossed with pretested and unpretested groups. There are two independent variables in this design: treatment/control and pretest/no pretest. The 2 x 2 factorial analysis tells the researcher whether the treatment is effective and whether there is an interaction between the treatment and the pretest.
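As an illustration of that 2 x 2 analysis, the following Python sketch (using pandas and statsmodels) codes each participant by treatment and by whether the group was pretested, then fits a two-way model with an interaction term. All of the data values are hypothetical and exist only to show the layout of the analysis.

import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical posttest scores for the four Solomon groups (3 participants each).
data = pd.DataFrame({
    "treatment": ["yes"] * 6 + ["no"] * 6,
    "pretested": (["yes"] * 3 + ["no"] * 3) * 2,
    "posttest":  [78, 82, 80, 77, 83, 79, 70, 68, 72, 69, 71, 67],
})

# 2 x 2 factorial analysis: main effects of treatment and pretesting, plus their interaction.
model = smf.ols("posttest ~ C(treatment) * C(pretested)", data=data).fit()
print(anova_lm(model, typ=2))  # the interaction row reflects pretest-treatment interaction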

Quasi-Experimental Designs
a) The Nonequivalent Control Group Design
The nonequivalent control group design looks very much like the pretest-posttest control group design, except that the nonequivalent control group design does not involve random assignment. The lack of random assignment raises the possibility of interactions between selection and variables such as maturation, history, and testing. Reactive effects are minimized.
b) The Time-Series Design
In the time-series design, one group is repeatedly pretested, exposed to a treatment, and then repeatedly posttested. If a group scores essentially the same on a number of pretests and then significantly improves following the treatment, the researcher has more confidence in the effectiveness of the treatment than if just one pretest and one posttest were administered. History is a problem with this design, as is pretest-treatment interaction.
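One simple way to look at time-series data is to compare the level and stability of the repeated pretest scores with the repeated posttest scores. The sketch below uses hypothetical weekly group means; it only summarizes the series and does not, by itself, rule out history effects.

import numpy as np

# Hypothetical weekly group means: four observations before and four after the treatment (X).
pretests  = np.array([61, 60, 62, 61])
posttests = np.array([70, 72, 71, 73])

print("pretest  mean:", pretests.mean(),  "sd:", round(pretests.std(ddof=1), 2))
print("posttest mean:", posttests.mean(), "sd:", round(posttests.std(ddof=1), 2))
# A flat, stable pretest series followed by a clear jump after X supports a treatment effect.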

c) The Counterbalanced Design
In a counterbalanced design, all groups receive all treatments but in a different order, the number of groups equals the number of treatments, and groups are posttested after each treatment. This design is usually employed when intact groups must be used and when administration of a pretest is not possible. A weakness of this design is potential multiple-treatment interference.
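One common way to construct the treatment orders is a simple cyclic rotation, so that with k treatments there are k groups and each group receives every treatment in a different position. The sketch below is a minimal illustration with made-up treatment labels; it does not balance for every possible carryover pattern.

# Cyclic rotation of treatment orders for a counterbalanced design with three treatments.
treatments = ["X1", "X2", "X3"]

orders = [treatments[i:] + treatments[:i] for i in range(len(treatments))]
for group, order in enumerate(orders, start=1):
    print(f"Group {group}: " + " -> ".join(order) + "  (posttest after each treatment)")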

Factorial Designs
Factorial designs involve two or more independent variables, at least one of which is manipulated by the researcher. They are basically elaborations of single-variable true experimental designs that permit investigation of two or more variables, individually and in interaction with each other. The term factorial refers to a design that has more than one independent variable, or factor. For example, method of instruction is one factor and student aptitude is another; if method of instruction has two levels, and the factor student aptitude also has two levels (high aptitude and low aptitude), the design has four cells. The purpose of a factorial design is to determine whether the effects of an independent variable are generalizable across all levels of the other factor or whether the effects are specific to particular levels. A factorial design also can demonstrate relationships that a single-variable design cannot: for example, a variable found not to be effective in a single-variable study may be found to interact significantly with another variable.
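To show what an interaction looks like in practice, the following sketch crosses a hypothetical two-level method factor with a two-level aptitude factor, prints the four cell means, and fits a two-way model. All names and scores are invented for illustration; in these made-up data the first method helps high-aptitude students more than low-aptitude students, which appears as an interaction.

import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical 2 x 2 data: method of instruction crossed with student aptitude.
data = pd.DataFrame({
    "method":   (["A"] * 4 + ["B"] * 4) * 2,
    "aptitude": ["high"] * 8 + ["low"] * 8,
    "score":    [80, 83, 79, 82, 70, 72, 69, 71,    # high-aptitude students
                 65, 67, 66, 64, 68, 70, 69, 71],   # low-aptitude students
})

print(data.groupby(["method", "aptitude"])["score"].mean())   # the four cell means

# Two-way model: a significant method x aptitude term means the effect of method
# depends on the aptitude level, i.e., the two factors interact.
model = smf.ols("score ~ C(method) * C(aptitude)", data=data).fit()
print(anova_lm(model, typ=2))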

CHAPTER III
CONCLUSIONS
Experimental research is the only type of research that can test hypotheses to establish cause-effect relationships. When well conducted, experimental studies produce the soundest evidence concerning cause-effect relationships. The results of experimental research permit prediction, but not the kind that is characteristic of correlational research.
There are two major classes of experimental designs: single-variable designs and factorial designs. Single-variable designs are classified as pre-experimental, true-experimental, or quasi-experimental. Pre-experimental designs can be applied through the one-shot case study, the one-group pretest-posttest design, and the static-group comparison. True-experimental designs can be implemented in several ways: the pretest-posttest control group design, the posttest-only control group design, and the Solomon four-group design. Quasi-experimental designs are applied in the nonequivalent control group design, the time-series design, and the counterbalanced design. Finally, factorial designs are basically elaborations of single-variable true experimental designs that permit investigation of two or more variables individually and in interaction with each other.
An experimental research design has several advantages and disadvantages. The advantages are twofold: the ability to establish causal relationships and the ability to precisely manipulate one or more variables of interest. The disadvantages are:
1. It is difficult to generalize to everyday life. The results of an experimental study cannot be applied directly to real, day-to-day situations, because the conditions of an experimental study are tightly controlled and therefore unlike real life.
2. It requires considerable time. This reason is not entirely correct, however, because an experimental study is sometimes conducted in a relatively short time compared to non-experimental research.
