In compliance with the Uniform Guidelines on Employee Selection Procedures (1978)
Document Number 29 CFR 1607
Section 1607.15 B Criterion-related Validation Studies
1. User: Effem Foods Inc.
Dates of Study: June 14-18, 1999
2. Problem and Setting: A small manufacturing company with a semi-skilled labor force (approximately 20 shop-floor personnel) seeks a test to qualify job applicants for work as machine operators. The company reviewed Benchmark Testware’s "Shop Apprentice" test and felt it was appropriate for the skill level required of its operators. To investigate the criterion-related validity of this instrument, each operator took the test and was separately rated on a number of job-related criteria by his supervisor. Statistical analysis was conducted to determine the correlation between test scores and subjective ratings.
3. Job Analysis Procedure: Benchmark Testware has been working with manufacturing companies in a variety of industries since 1995 to develop tests for skilled tradespeople. Benchmark’s Research Director, Martin Green, is a professional engineer with fifteen years of experience in plant maintenance. Through personal experience and discussion with numerous maintenance supervisors and workers, Benchmark has identified eight criteria that describe areas in which skilled or semi-skilled workers can be subjectively appraised. Note that 29 CFR 1607.14B(3) approves the use of criteria without the need for full job analysis:
"Certain criteria may be used without a full job analysis if the user can show the importance of the criteria to the particular employment context. These criteria include but are not limited to production rate, error rate, tardiness, absenteeism, and length of service. A standardized rating of overall work performance may be used where a study of the job shows that it is an appropriate criterion."
Basis for Selection of Criteria: Review of job information shows the worksite to be a manufacturing shop in which the general skill level required is greater than that of repetitive assembly-line work but less than that of skilled maintenance technicians. Based on the employer’s focus on the mechanical competencies involved in the job, we adopted Benchmark’s standard "Skilled Trades Evaluation Form" (see Appendix A).
4. Job Titles/Codes:
5. Bases for selection of criterion measure: The test instrument was designed to be part of an overall hiring procedure which would vary from one plant to another. The basis for selection of criterion measures was to provide an objective ranking, independent of the test instrument, against which the test results could be mathematically correlated. Therefore, criterion measures were chosen that could be determined by supervisory appraisals.
Appraisal Forms: Prior to seeing the scores of the sample group, the supervisor was provided with evaluation forms listing eight criteria on which the workers were to be rated. A sample copy of the appraisal form is shown in Appendix A. The supervisor completed these forms and submitted them to Benchmark Testware before he saw the results of the tests.
Measures to Ensure Fairness of Ratings: The fairness of the ratings was dependent on lack of personal biases on the part of the supervisor. Because the sample group was homogeneous (seventeen out of seventeen were white males), it was determined that any personal biases on the part of the supervisor could have little effect on the validation of the test, other than to artificially degrade the correlation. In other words, if the supervisor gave a poor rating to someone who deserved a good rating, and that individual subsequently performed well on the test, the effect would be simply to make the test look bad. It would not lead to a modification of the scoring scale which would bias the test against minorities. Since it was in the best interest of the supervisor to make the test look good, it would therefore be expected that he would attempt to make his ratings as objective as possible without regard to personal biases.
6. Sample Group Description: The sample group consisted of fourteen machine operators employed at Effem, plus three maintenance technicians at the next level of progression. The sample group was chosen so as to give the best available measure of the applicability of the test.
Demographic Make-up: None of the members of the sample group were reported to be members of protected classes.
7. Description of Selection Procedures: This report studies the test procedure known as "SkillsProfiler Shop Apprentice Tests" (copyright 1997 Benchmark Testware, 477 Jarvis Ave., Winnipeg, Canada). This test is designed to be used by employers as one component of an overall hiring procedure.
8. Techniques and Results: The employer’s subjective evaluations are tabulated by Benchmark Testware and an overall rating is determined for each employee by taking the sum of the individual criterion ratings. The members of the sample group are then ranked according to these ratings, and their raw test scores are placed in order beside these rankings.
A mathematical correlation value (between -1.0 and 1.0) is determined between the two columns of values, with a value of zero corresponding to random figures, i.e. no correlation. Subsequently, various sub-categories of the test questions are isolated, sub-scores calculated, and these sub-categories are then correlated with the supervisory ratings to determine which types of questions are most significant. It is these significant sub-categories, in addition to the raw scores, which are brought to the attention of the employer when examining the test results of job applicants in greater detail.
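As an illustration of the tabulation and correlation steps described above, the following Python sketch sums criterion ratings into an overall rating, ranks the employees, and computes a correlation coefficient against raw test scores. The report does not name the coefficient used; the Pearson product-moment formula shown here is an assumption. The function names and the three-employee figures are hypothetical and are not data from this study.

```python
from math import sqrt


def overall_ratings(ratings_by_employee):
    """Sum each employee's individual criterion ratings into one overall rating."""
    return {name: sum(ratings) for name, ratings in ratings_by_employee.items()}


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists (assumed coefficient)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 values each")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical figures for illustration only; NOT data from this study.
ratings = {"A": [4, 5, 4], "B": [2, 3, 3], "C": [3, 3, 4]}
test_scores = {"A": 41, "B": 30, "C": 33}

totals = overall_ratings(ratings)
# Rank employees from highest to lowest overall rating.
names = sorted(totals, key=totals.get, reverse=True)
r = pearson_r([totals[name] for name in names],
              [test_scores[name] for name in names])
print(names, round(r, 2))
```

The same `pearson_r` function could be applied to any sub-category score in place of the raw score, which is how the sub-category comparison described above might be reproduced.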
Statistical Results and Significance: A print-out of the spreadsheet analysis is provided in Appendix B. It may be noted that the raw score correlation obtained for the group was 0.47, and that the best correlation obtained by weighted scoring of sub-categories was that obtained in Category 3, "Easy Questions", where the correlation between scores and evaluations was 0.57. Based on the sample size (17) and the observed raw correlation, a statistical analysis was carried out to determine the significance. It was determined that there is a probability of less than .03 that a positive correlation at this level would have been obtained purely by chance.
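The significance figure can be checked with a short calculation. The report does not state which test was used; the sketch below assumes a one-tailed Student's t-test on a Pearson correlation, which matches the wording "a positive correlation ... purely by chance". The function names are hypothetical, and the tail probability is obtained by numerical integration so that only the Python standard library is needed.

```python
from math import sqrt, gamma, pi


def t_statistic(r, n):
    """t value for testing a correlation r from n pairs against zero (df = n - 2)."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * sqrt(n - 2) / sqrt(1 - r * r)


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)


def one_tailed_p(t, df, steps=20000):
    """Upper-tail probability P(T >= t), using Simpson's rule on [0, t]."""
    if t < 0:
        return 1 - one_tailed_p(-t, df, steps)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The distribution is symmetric, so P(T >= t) = 0.5 - P(0 <= T <= t).
    return 0.5 - total * h / 3


# Values reported in this study: r = 0.47, n = 17.
t = t_statistic(0.47, 17)
p = one_tailed_p(t, 17 - 2)
print(round(t, 2), p < 0.03)
```

Under these assumptions the one-tailed probability falls between .025 and .03, which is consistent with the figure reported above. A two-tailed test would roughly double that probability.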
Observed Differences according to Race and Gender: Because of the size and homogeneity of this sample group, no separate results are available for different minority groups.
Measures Taken to Ensure Fairness: The test is generally designed in accordance with professional test design practices. Multiple-choice questions are offered with clear alternatives. English usage has been reviewed to ensure that the vocabulary is chosen to be as simple as possible in order to convey the necessary meaning of the questions. All questions cover material which an individual with no specific technical training might have been able to learn through working in the home or garage, rather than knowledge which would be available only to those with special education or industrial experience.
9. Alternative Procedures: The employer has been working until now with the simple alternative of hiring on the basis of job interviews. Because of the numerous job factors involving the skill of the operator, such as machine set-up, maintenance, and quality control, the company feels that there is significant value in hiring workers who will perform the job at a greater-than-average competency level. The "do-nothing" alternative was therefore not acceptable to the employer; furthermore, the employer had no evidence that a "do-nothing" alternative would have less adverse impact on protected groups than implementation of a testing procedure.
10. Uses and Applications: The test procedure is for the employer to have all job applicants write the "Shop Apprentice" test on computer. The results are stored in a file, which is sent electronically to Benchmark Testware. The raw scores are tabulated and the detailed results are placed in a spreadsheet for breakdown of scores according to various sub-categories determined by Benchmark Testware. Based on this analysis, recommendations are made by Benchmark Testware to the employer concerning the abilities of the applicants. The final selection of applicants is made by the employer.
Evidence of Validity of Procedure: There are several elements of evidence which attest to the validity of the procedure. The most important is the significant correlation observed between test scores and supervisory ratings as measured on the sample group of existing Effem employees. Because this study was carried out on an existing workforce, one might question whether its validity applies only to people with experience on the job, or whether it is also applicable to new applicants ("concurrent" vs. "predictive" studies). There are three answers to this question:
1. The test measures a level of mechanical knowledge which is expected of entry-level technicians.
2. The knowledge attributes measured in this test are specifically chosen to consist of things that a person with no industrial experience might have readily learned at home, e.g. household plumbing, wiring, carpentry, minor car repair, as opposed to topics which would have required specific industrial training or experience, e.g. bearings, hydraulics, three-phase electrical power.
3. The Code of Federal Regulations recognizes the admissibility of concurrent validity studies where predictive studies are not feasible. Note 29 CFR 1607.14B(4):
"Representativeness of the sample. Whether the study is predictive or concurrent, the sample subjects should insofar as feasible be representative of the candidates normally available in the relevant labor market for the job or group of jobs in question, and should insofar as feasible include the races, sexes, and ethnic groups normally available in the relevant job market.
In determining the representativeness of the sample in a concurrent validity study, the user should take into account the extent to which the specific knowledges or skills which are the primary focus of the test are those which employees learn on the job."
11. Source Data: Information on the participants in the sample group is available from Effem.
12. Contact Person(s):
Effem Inc: Janice Hazlewood
Benchmark Testware: Martin Green
13. Accuracy and Completeness: I hereby certify that this validation study was carried out to the best of my professional capabilities, and that no relevant information concerning this study has been suppressed from this report.
Martin Green, M. Sc, P. Eng.
Research Director, Benchmark Testware