Best Practice in Adjusting Administration Time on Employment Tests


Best Practice Series

Adjusting administration time on employment tests: best practice principles to ensure validity and fairness

Eugene Burke, Director of Science & Innovation, SHL Group


Introduction

This paper provides advice on the key issues that test users should address in considering any adjustment to the standard times recommended for the administration of an employment test. Such issues are most likely to arise when a candidate or job applicant is identified as possessing a special need such as a physical or cognitive disability.

This paper focuses on ability or cognitive tests where a candidate’s performance is scored against questions or tasks that are structured to have one correct answer (such as a verbal, numerical or inductive reasoning test) or that are scored to reflect a candidate’s maximum performance on a continuous task (such as a dexterity or hand-eye coordination test or a simulation requiring the candidate to respond to multiple tasks and objectives). It excludes self-report measures (such as measures of personality, motivation or values), as this class of assessment does not generally have a time constraint other than that used to guide candidates on how long the questionnaire is likely to take to complete, or that might be used for planning purposes in an assessment or development centre in which such measures can be included.

In addition to dealing with specific issues of time adjustments, this paper is also intended as a contribution to best practice in the use of tests, whether they be administered offline (such as paper-and-pencil or computer-based tests) or online (such as the Verify portfolio of tests [i]). In drafting this paper we have consulted many clients and several leading experts in the field of testing, including work and organisational psychologists, other test developers and those who specialise in providing legal advice on testing. What is apparent from these conversations is that this is an area that can give rise to confusion and poor practice, which may result in reputational and potentially legal risks to test users. SHL offers this paper to help test users avoid these risks.

Eugene Burke


The issue of fairness to all candidates.

One of the key issues that is often overlooked in adjusting test times is whether such an adjustment is fair to all candidates, including those for whom the test time has not and will not be adjusted. Candidates’ perceptions of an assessment process of which cognitive tests are a part will be largely driven by two aspects of perceived justice [ii]:

• Procedural justice relates to whether a process is seen as offering a fair opportunity for participants in that process to demonstrate their suitability for a position or role. The accuracy and stability of an instrument, allied with strong criterion and construct validity evidence, are critical elements of the scientific evidence supporting positive perceptions of procedural justice. Evidence that an instrument functions equally well for different candidate groups, and that it is free from any biases in its content and scoring, is also important in supporting these perceptions.

• Distributive justice relates to whether the outcomes of a process, such as decisions to hire or not to hire, are seen as fair. This is most often linked to disparate or adverse impact, as evaluated by measures such as the 4/5ths rule used in the US to assess whether a process or stage of a process may exhibit adverse impact against protected groups as defined by US employment laws (similar classifications are used in other countries) [iii].
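As an illustration only, the 4/5ths rule mentioned above can be expressed as a short calculation. The sketch below is written in Python; the function names (selection_rate, meets_four_fifths) are hypothetical and are not taken from any SHL or regulatory source. It assumes the simple form of the rule, in which one group's selection rate should be at least 4/5ths (80%) of the comparison group's rate, and it should be read as a screening check rather than a legal test.

```python
from fractions import Fraction


def selection_rate(selected: int, applicants: int) -> Fraction:
    """Proportion of a group's applicants who were selected, as an exact fraction."""
    if applicants <= 0 or not 0 <= selected <= applicants:
        raise ValueError("need applicants > 0 and 0 <= selected <= applicants")
    return Fraction(selected, applicants)


def meets_four_fifths(group_rate: Fraction, comparison_rate: Fraction) -> bool:
    """True if group_rate is at least 4/5 (80%) of comparison_rate."""
    return group_rate >= Fraction(4, 5) * comparison_rate


# Example: 50% of one group is selected and 40% of the other.
males = selection_rate(50, 100)
females = selection_rate(40, 100)
print(meets_four_fifths(females, males))  # True: 40% is exactly 4/5 of 50%
```

Exact fractions are used here so that a boundary case such as 40% against 50% is not affected by floating-point rounding.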

Adjusting the test time for any candidate should be based on a clear policy for such adjustments and should not be undertaken simply as a matter of course when a candidate declares that they should be afforded special treatment. As we will explore a little further below, one of the key reasons that times are set for cognitive tests is to provide a standardised set of conditions so that scores between candidates can be directly compared. When the test time is adjusted without a clear policy, and without checks to ensure that an adjustment is compliant with such a policy, it may provide an advantage to a candidate that might be seen as unfair by other candidates. Such an adjustment changes the conditions under which the test is administered to different candidates, and it also raises the issue of how the scores for those candidates can be compared. These issues will be addressed in the principles of best practice provided below.

Test users should consider the impact on all candidates when considering an adjustment to test times. More specifically, users should consider whether they are creating a risk of a claim of unfair treatment, whether that be disparate treatment or disparate impact as defined below and as taken from Burke (2006) [iv]:

• Disparate treatment hinges on whether the candidate was treated differently, whether different treatment can be shown to have been unfair, and whether that treatment was inappropriately related to the candidate’s ethnicity or race, religion, sex, age or disability. In a disparate treatment case, the applicant is required to show that the employer’s rationale for the employment practice lacks credibility and that the basis for the practice is discriminatory. To respond to such a claim, the employer is required to provide evidence of the logic behind the practice, and that logic needs to be backed up by data that shows the employment practice is not discriminatory.

• Disparate impact arises when an employer introduces a practice that, while not intentionally discriminatory, is claimed to exclude or adversely affect members of groups protected under employment law. This is the form of discrimination most closely associated with assessment and is often referred to as adverse impact. In a case of disparate impact, proof of the claim relies on the applicant showing that an alternative and equally valid process would have resulted in lower or no adverse impact. Responses to such claims may require the employer to provide statistical evidence that the process is not systematically biased, which takes us back to making sure that the science is good and that the evidence that it is good has been collected.


Why has a time been set for a test?

One of the reasons that times are set for tests has been mentioned: to establish a standardised condition so that scores can be directly compared across candidates (there are of course other conditions that help to standardise test administration, but time is a key one).

The other reason is that time may be a key aspect of what is being measured. For example, many tests used to assess fit to clerical and administrative roles involve fairly simple questions such as comparing strings of letters and numbers to identify similarities or differences. The key construct assessed by such tests is an ability referred to as perceptual speed, which, as its name implies, has time as an inherent element. It is often defined as the ability to make quick and accurate comparisons between objects to detect similarities and differences. Such a test could be shown to be relevant when the job or role requires quality control checks or review of documentation, and for roles such as Air Traffic Controller, in which a core part of the task is safely managing aircraft movements (i.e. recognition and verification of the information provided by various displays and other sources).

More generally, tests can be classified in terms of their functional characteristics into power tests and speed tests. The distribution of scores on a power test is driven by the difficulty of the test questions. Well-constructed power tests such as the Verify verbal, numerical and inductive reasoning tests are designed so that time is not a critical factor of the test score. The timing on these Verify tests was set where 80% of participants in trials completed all of the items in a test, and checks on subsequent live administrations have shown that over 90% of candidates complete all the questions in these Verify tests. What drives a person’s scores on these tests is the level of difficulty at which they are able to operate.
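The timing rule described above can be illustrated with a small calculation. The sketch below, in Python, is an assumption about how such a rule might be operationalised and is not a description of how the Verify time limits were actually derived. Given hypothetical trial data recording how many minutes each participant needed to finish every item, the hypothetical function time_limit_for_completion returns the shortest limit within which at least 80% of participants finished.

```python
def time_limit_for_completion(completion_minutes, target_percent=80):
    """Shortest observed time within which at least target_percent% of
    trial participants completed all items (illustrative only)."""
    if not completion_minutes:
        raise ValueError("completion_minutes must not be empty")
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in the range 1..100")
    ordered = sorted(completion_minutes)
    # Number of participants who must have finished: the ceiling of
    # target_percent% of the sample, computed with integer arithmetic.
    needed = (target_percent * len(ordered) + 99) // 100
    return ordered[needed - 1]


# Hypothetical trial data: minutes taken by ten participants to finish.
trial = [18, 20, 21, 22, 23, 24, 25, 26, 30, 35]
print(time_limit_for_completion(trial))  # 26: eight of the ten finished by then
```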

The questions or items in a speed test tend to be comparatively simple in structure and low in difficulty as they measure more fundamental cognitive processes such as speed and accuracy of perception. The factor of time is deliberately used as a design feature of the Verify Clerical Checking test. What drives a person’s score on such tests is the speed with which they are able to respond to a relatively simple task.

Clearly, the test user needs to consider the class of test for which they are considering a change in administration time. We strongly recommend that the time of a speed test is not adjusted, as this completely removes the key element of the test that drives the distribution of test scores and, thereby, the validity of the test. Indeed, where such tests are seen as measuring an ability that is critical for effective performance on the job, any consideration of adjusting test time is probably flagging a need to revisit the organisation’s policy on the basic requirements for the job or role. If this shows that a special need would legitimately exclude a candidate, and if this has not been mentioned in the information sent or provided to candidates, then the action is clearly to update that information and make such exclusions clearer.

Candidates have responsibilities too.

It is all too easy to put all the responsibility on to the test user when dealing with issues such as those discussed in this paper. However, candidates have responsibilities as well, and a good reference for this has been set out by the American Psychological Association [v]. One of the key responsibilities for the candidate is to let the test user know whether they do have special needs and the specifics of those needs. In most educational testing programmes in the US, where policy in relation to special needs and testing is perhaps the most developed, such policy generally recommends that there is a verification stage when any candidate declaring a special need is required to provide medical references that can be verified.


If this seems difficult, think of the risks to you.

Two risks are highlighted:

• Accepting an adjustment because the candidate received special treatment for educational tests or exams. A key difference between educational tests and employment tests is that educational tests focus on attainment and primarily on the knowledge gained by an individual in the course of an educational programme. As such, the objective of these tests is different to that of employment tests, which is to predict future performance in a job or role. A key assumption in writing this paper is that the test user has a clear understanding of what the requirements for a job are and, therefore, why specific cognitive tests are relevant to an assessment, whether that be for recruitment and selection, promotion or succession.

• Making an adjustment means that you are declaring a contract for the conditions of future employment. In making an adjustment in test times, it can be interpreted that you or the employing organisation are prepared to make the same adjustment to the time allowed to complete tasks if the person is hired. Take the extreme case of administering the test untimed. That could be interpreted as stating that a similar adjustment will be made in the tasks related to that ability should the candidate accept an offer of employment. That is, the employee will be allowed unlimited time to complete key tasks.

Reasonable actions.

The issues highlighted in this paper can be addressed proactively by taking some simple steps. These steps will also enable you to make adjustments and compare the scores across candidates, whether they have received an adjustment or have taken the test(s) under normal administrative conditions.

• Be clear about what the basic requirements for the job or role are and whether a reasonable adjustment can be or has been made to the conditions under which someone holding that job or role would be expected to perform.

• Communicate the basic requirements for the job or role to candidates, including anything that could reasonably exclude them.

• Communicate why tests are being used and how these tests relate to effective performance in the role or job.

• Inform candidates that, if they have special needs, they should notify you of these at the outset of any process in which tests are being used. Also, make it clear what form of evidence you require to verify the special need being declared.

• If an adjustment to the job or role can be made to accommodate a special need, then check how the actions or tasks that can be adjusted relate to the test(s) being used. Having a clear link between adjustments to the job or role and tests is critical to developing a defensible position.

• If the job requires an ability best measured by a speed test, then adjustments to the test’s time are inadvisable. If such a test measures a critical requirement for the job, then that suggests you have identified a basis for excluding candidates that needs to be explored and confirmed.

• If the test is a power test and an accommodation in working conditions can be made, consider how that accommodation relates to the time for test administration. For example, a practical rule of thumb used in the US is to add 50% to the standard time for those who declare, and are verified as having, an allowable need. This implies that it would be reasonable for such a candidate to be allowed 50% more time to complete certain tasks if they were successful in being made an employment offer.

• Make sure that any actions are in line with your special needs or disability and diversity policies.
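The timing guidance in the steps above can be summarised as a simple decision rule. The sketch below, in Python, is illustrative only: the names AssessmentType and adjusted_time_minutes are hypothetical, the 50% figure is the rule of thumb quoted above rather than a fixed standard, and the rule assumes that the special need has been verified and that a matching workplace accommodation is available under the organisation's own policy.

```python
from enum import Enum


class AssessmentType(Enum):
    POWER = "power"  # scores driven by question difficulty
    SPEED = "speed"  # scores driven by speed of response


def adjusted_time_minutes(standard_minutes, assessment_type,
                          need_verified, extra_fraction=0.5):
    """Return the administration time under the rule of thumb above.

    Speed tests keep the standard time, because adjusting them is
    inadvisable. Power tests are extended only for a verified, allowable
    special need.
    """
    if standard_minutes <= 0:
        raise ValueError("standard_minutes must be positive")
    if assessment_type is AssessmentType.SPEED or not need_verified:
        return standard_minutes
    return standard_minutes * (1 + extra_fraction)


# Example: a 20-minute power test for a candidate with a verified need.
print(adjusted_time_minutes(20, AssessmentType.POWER, need_verified=True))  # 30.0
```

Keeping extra_fraction as a parameter reflects the point that the adjustment should mirror whatever allowance the organisation would actually make in the job or role.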


Thinking through the issues can make things easier.

Let’s finish with an explanation of how these steps can simplify the issues we have explored. Let us take the problem of comparing scores across candidates when adjustments have been made to the test time for some but not all candidates. If an accommodation has been made in line with the principles set out above, then scores can be compared using, for example, existing norms.

The logic behind this is that, using the 50% adjustment as an example, you have declared that an accommodation can be made to allow 50% more time in the job or role for those with a verifiable and allowable special need. The test is being used to predict performance in the job. For those without an allowable and verifiable special need, normal job performance conditions would apply and, therefore, normal test time conditions could be reasonably argued to apply. For those given an additional 50% in the time allowed for taking the tests, the adjustment accords with expectations of future performance for those with an allowable special need.

As such, the accommodation in test time is in line with any allowances that would be made in the workplace, and, since the test is being used to predict performance, comparisons can be made using existing norms across candidates who have taken the test under normal time conditions and candidates who have been given an allowance in line with normal workplace policy.

Final thoughts.

Cognitive ability tests, and particularly power tests of reasoning, are the most consistent and strongest predictors of job performance. Any action that changes the performance of such a test may dilute the value and the perceived validity of a test to the point where scores do not reflect the experience of managers when they observe people’s performance in the workplace.

Tests are a fair way to evaluate people’s fit to a job or role, but they must also be seen to be fair by those who are asked to sit them. One of the cornerstones of perceptions of fairness is consistency, and this paper has described practical steps to demonstrate that consistency, which will also ensure that the value of cognitive tests in helping organisations acquire and develop the talent that will support their success is realised.


References

[i] For a broader discussion of issues related to online testing, see Burke, E. (2006). Better practice for unsupervised online assessment. Thames Ditton, UK: SHL. This is available as a free download from www.shl.com

[ii] For more information on perceived justice and fairness, please refer to Gilliland, S.W. and Hale, J. (2005). How do theories of organizational justice inform fair employee selection practices? In J. Greenberg and J.A. Colquitt (Eds.) Handbook of organizational justice: Fundamental questions about fairness in the workplace. Mahwah, NJ: Erlbaum.

[iii] The 4/5ths rule states that the proportion of a minority or protected group under employment law that is selected should not be less than 4/5ths, or 80%, of the proportion of the majority group that is selected (e.g. if 50% of males are selected, then a process would meet the 4/5ths rule if 40% of females were also selected).

[iv] See also Frank J. Landy, Employment discrimination litigation: Behavioural, quantitative, and legal perspectives, Jossey-Bass, 2005, which is referenced by the Burke paper.

[v] The American Psychological Association’s (APA) guidelines on the rights and responsibilities of test takers can be found at http://www.apa.org/science/ttrr.html


Guidelines for Best Practice in Adjusting administration time on employment tests: best practice principles to ensure validity and fairness

Whilst SHL has used every effort to ensure that these guidelines reflect best practice, SHL does not accept liability for any loss of whatsoever nature suffered by any person or entity as a result of placing reliance on these guidelines. Users who have concerns are urged to seek professional advice before implementing tests.

The reproduction of these guidelines by duplicating machine, photocopying process or any other method, including computer installations, is breaking the copyright law.

SHL is a registered trademark of SHL Group Limited, which is registered in the United Kingdom and other countries.

© SHL Group Limited, 2009

United Kingdom
The Pavilion
1 Atwell Place
Thames Ditton
Surrey KT7 0NE

Client Support Centre: 0870 070 8000
Fax: (020) 8335 7000
