Test Validity - Employment Testing
A test validation study for a pre-employment assessment
instrument is simply objective evidence that the
test or personality assessment actually measures what it purports to measure.
Validation is not a stamp of approval by any governmental agency but
rather a study undertaken and directed by the test publisher in
accordance with certain professional standards.
The Achiever employment assessment has been established and validated in accordance
with the procedures described in "Standards for Educational and
Psychological Testing," which is referred to in paragraph (2)
1607.6, "Minimum Standards for Evaluation," Federal
Register Volume 35, dated Saturday, August 1, 1970. It is therefore
not discriminatory and is in compliance with E.E.O.C. and other
The Reliability and Validity Manual published by Candidate Resources, Inc.,
establishes the legal and written confirmation that this
employment test was professionally developed and validated in accordance
with both Construct and Criterion methods of validation. Candidate Resources, Inc.,
will defend the validation or content of the Achiever for
any company using this pre-employment assessment, but cannot assist a company with
problems that result from misuse or abuse of the Achiever.
There are five forms of validity:
- Construct validity refers
to the extent to which dimensions with similar names on
different tests relate to one another. Two things that correlate
highly on a personality test are not necessarily identical, but do provide reassurance
that they are related and are a "construct" or part of
the makeup (like honesty, dependability, sociability, etc.) of
an individual as related to actual job performance.
- Concurrent validity is
that approach whereby people who are successful within a given
job within a given company or industry are evaluated and
generally grouped Top Third, Middle Third, and
Bottom Third. The assessment scores of the people who fit
each of these ranges are then compiled and Job Benchmark
Standards of the Top Third are used to hire, train or
- Predictive validity,
a form of criterion-related validity, occurs
when the employer hires people for a job based on normal hiring
procedures (interviewing, reference checks,
education/experience, etc.) and at the same time has them
complete the pre-employment test, but does not utilize any data from it
in the hiring decision. After six months, or any other appropriate
period of time, the pre-employment assessment is scored, and benchmarks
are established from the scores of the new hires who are
still with the employer and whom the employer considers
successful. Job Benchmark Standards are thus established through
the Predictive approach.
- Content validity
represents job function testing, e.g., typing, mathematics,
design, CPA exams, physical work endurance, etc. Content
validity is an appropriate strategy when the job domain is defined
through job analysis by identifying the important behaviors, tasks, or knowledge and
the assessment or test is a representative sample of behaviors, tasks or knowledge
drawn from that domain. The Uniform Guidelines on Employee Selection Procedures
state that in order to demonstrate the content validity of a selection procedure,
a user should show that the behaviors demonstrated in the selection procedure
are a representative sample of the behaviors of the job in question or that the selection
procedure provides a representative sample of the work product of the job.
- Face validity
is the simplest form of validity. It tells us only that the
personality test or other assessment instrument appears (on the face of it) to measure what it is
supposed to measure. For example, a test composed
of accounting problems would have face validity as a measure of the ability
to succeed as an accountant. Face validity is not very sophisticated
because it is based only on the appearance of the measure. Be careful,
because the market is flooded with personality tests that have only face
validity.
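As a rough illustration of the concurrent approach described in the list above, the sketch below ranks current employees by a performance rating, takes the Top Third, and summarizes their assessment scores. The function name, the record layout and the low/mean/high summary are assumptions made for this example only; nothing here describes how Job Benchmark Standards are actually computed by any publisher.

```python
from statistics import mean


def top_third_benchmark(records):
    """Summarize assessment scores of the top-rated third of employees.

    records: list of (performance_rating, assessment_score) pairs.
    Returns (lowest, average, highest) score within the Top Third.
    This layout and summary are hypothetical, chosen for illustration.
    """
    # Rank employees from highest to lowest performance rating.
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    third = len(ranked) // 3
    if third == 0:
        raise ValueError("need at least three employees to form thirds")
    # Keep only the assessment scores of the Top Third.
    top_scores = [score for _, score in ranked[:third]]
    return min(top_scores), mean(top_scores), max(top_scores)


# Made-up data: (performance rating, assessment score)
employees = [(9, 80), (8, 70), (7, 75), (5, 60), (4, 55), (2, 50)]
low, avg, high = top_third_benchmark(employees)
print(low, avg, high)
```

A real benchmark would be built per job classification and per assessment dimension, and from far more than six people; the sketch only shows the grouping step.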
Candidate Resources, Inc., recommends that an organization establish and
utilize a consistent standard hiring process when making hiring
decisions. Information should be gathered in each step of the
standard hiring process to have specific and measurable data to
utilize in making a final hiring decision. The pre-employment assessment used
should count for no more than one-third of the hiring decision. The
preliminary interview, job history check, in-depth interview results
and evaluation of education, experience and other pertinent factors
should be considered as well.
Under the Uniform Guidelines on Employee Selection Procedures,
adopted in the 1970s, validation of any part of the hiring process
(assessments included) was no longer deemed necessary unless a
company was not meeting the 4/5th Rule in either hiring or
promotional practices. Consequently, there are three optional
approaches to using assessments:
- Establish your own successful employee Job Benchmark Standards
by conducting a concurrent validation by job classification. By
tying job-related criteria to the aptitudes and personality
dimensions of the assessment, you obtain the strongest evidence of validity and job
relatedness. Also, the Job Benchmark Standards
simplify the interpretation and use of the pre-employment assessment in the
hiring process, since they establish a model for hiring,
promotion and training purposes.
- Establish Job Benchmark Standards by job classification by
answering job-related questions on the requirements of the job.
Candidate Resources' PC software will then develop Job Benchmark
Standards based on the requirements of the job and the
behavioral traits and cognitive abilities required in the
individual in order to successfully perform the job.
- Use Job Benchmark Standards compiled from successful people
in jobs across the United States. Then, after a reasonable
period of time, compare the successful people selected to the
Benchmark Standards used for that job for confirmation of
correctness and/or modification of the benchmark standards.
The in-depth validation identified above is not necessary if you are
in compliance with the 4/5th Rule described below. This rule was
designated by the E.E.O.C. as a computational tool for
showing whether or not a company's hiring practices are having
an adverse impact.
EXAMPLE: Out of 120 job applicants (comprised of 80 white and 40
minority), 48 whites were hired and 12 minorities were hired.
48 out of 80 white applicants = 60%
12 out of 40 minority applicants = 30%
This hiring pattern results in adverse impact on minorities,
since minorities are hired at only 1/2 the rate of whites (30/60), whereas
the minority hiring rate must be at least 4/5 of the white hiring rate.
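The arithmetic in this example can be sketched in a few lines of Python. The function name and the dictionary layout are assumptions made for illustration; the 0.8 threshold is the 4/5 figure from the rule itself, and the comparison is made against the group with the highest selection rate.

```python
def four_fifths_check(hired, applicants, threshold=0.8):
    """Compare each group's selection rate with the highest group's rate.

    hired, applicants: dicts mapping a group label to a head count.
    Returns (rates, ratios, passes), where passes is True only if every
    group's rate is at least `threshold` times the highest rate.
    """
    rates = {}
    for group, count in applicants.items():
        if count <= 0:
            raise ValueError(f"no applicants recorded for group {group!r}")
        rates[group] = hired.get(group, 0) / count

    highest = max(rates.values())
    if highest == 0:
        raise ValueError("nobody was hired, so no ratio can be computed")

    # Impact ratio: each group's rate relative to the highest rate.
    ratios = {group: rate / highest for group, rate in rates.items()}
    passes = all(ratio >= threshold for ratio in ratios.values())
    return rates, ratios, passes


# Figures from the example: 80 white and 40 minority applicants.
rates, ratios, passes = four_fifths_check(
    hired={"white": 48, "minority": 12},
    applicants={"white": 80, "minority": 40},
)
print(rates)   # selection rates: 60% and 30%
print(ratios)  # minority rate is half the white rate
print(passes)  # False, since 0.5 is below 0.8
```

With these numbers, at least 20 of the 40 minority applicants would have to be hired to reach a 48% selection rate, which is 4/5 of 60%.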
Do validity studies guarantee accuracy?
No, they do not. Validity and reliability go hand in
hand. I have taken a number of assessments with
varied results. Many were very far off target but all of them
were supposedly validated instruments. Let's take a look at
how this frequently happens.
Let's say that a company has designed a test that measures communication
styles and that the personality assessment is very effective. The validation
studies for any assessment instrument are only an objective measure
that evidences that the test actually measures what it purports to
measure, and in this particular case it is communication styles.
Let's say that this particular personality test is later given certain
external modifications so that it can also be sold as a
pre-employment assessment. The personality test is still backed by validity
studies, but unless new validity studies are done, there are no
validity studies to support the use of the assessment for its
new purpose as a pre-employment assessment. In this
example, the original intended use is quite clear (to measure communication
styles).
The Uniform Guidelines on Employee Selection Procedures
specifically state that the evidence of validity and utility of the
selection procedure must support its operational use.
Now let's take another look at where validity studies can be
very misleading. All pre-employment assessments that measure behaviors are based on
certain theoretical models. Some of those models are very simplistic
because they are used more for training purposes than anything else.
Common sense tells us that human behavior is actually very complex
but for training type applications we need to keep things simple.
If we look at the interpretative manual for one of these
personality assessments, we will find out a little information about the
behavioral model that was used. "People that score high in
dominance are often very ingenious, highly competitive and are
generally very rigid in their thinking, extremely planful and have
strong ethical standards. Such people are often tenacious,
tough-minded types that lack empathy and are often
uncooperative." Could such a test be validated in relation to
the theoretical model? Yes it could.
This personality assessment may prove very effective in training situations
but its limitations are obvious when applied to a pre-employment
assessment context. If people that are high in dominance are very ingenious,
then we would also have to assume that submissive individuals would
be found to be lacking in mental ability. From a practical
standpoint, we know that there are no strong correlations between
cognitive ability and dominance. We also know that there are highly
dominant individuals that have low ethical standards and submissive
individuals that have high ethical standards. From a practical
standpoint we can also say that highly dominant people are not
necessarily tough-minded, competitive or planful.
I have seen a similar personality test used in a number of
pre-employment situations and I can tell you that the results are
often very misleading. In one situation, the personality assessment results were
indicating that all of a company's employees were highly dominant.
By watching the behaviors and listening to those employees, I could
tell that at least 50% of them were actually very low in dominance.
The amusing part was that those employees were all being put through
training designed to try to lessen the negative effects of their
supposedly high assertiveness. From what I could observe, it would
have been more effective to put them through assertiveness training!
I was able to test those same employees a short time later with a
validated pre-employment assessment (The Scoreboard) and the
assessment results confirmed my observations. Very few of the employees
were high in dominance; they mostly scored low to mid-range. The
obvious disadvantage of simplistic behavioral models is that a
group of separate behaviors is clumped together. In this case the
personality test was not measuring true dominance. It relied heavily on ethics
and competitiveness to measure dominance. In the preceding
situation, it turned out that all of the employees in that
particular job had very high character strength or ethics. What was
actually being measured was ethics (flexibility) and not dominance.
How valid is the validity study?
The fact that a personality test is backed by validity studies means very
little in itself. Some of the validation techniques are very weak.
Some personality assessments are very simple in that they contain a list
of descriptors (such as friendly, outgoing, agreeable,
competitive) and ask the respondent to circle each descriptor
that describes them. Such a technique is
obviously of limited value, but the validity studies may well
indicate that the test is 90% accurate and reliable. How is that
possible?
Actually, the whole process can be very simplistic, so let's take
a look at it from start to finish (totally hypothetical, of course). The
candidate circles the descriptors that he feels accurately
describe him. The testing company takes those descriptors,
expands on their definitions and then gives the report back to the
candidate. On the last page of the report is a questionnaire
that asks the candidate to rate the accuracy of the personality test
report and mail it back to the test publisher. The candidate circles the
accuracy range that is applicable (90 to 100%, 80 to 89% and so on).
Since this is basically a self-evaluation where the candidate has
described himself, how likely is he to say that the resulting test report
is less than 90% accurate, especially if the personality report only makes
positive statements about him? All of the responses that are
received are then entered into a database that is used as an ongoing
The main advantage of this hypothetical personality test is that on the
surface it is fast and cheap. While I would question the overall
effectiveness of such a program, it could offer a few advantages. It
would probably be a little more accurate and objective in most cases
than when an interviewer directs the applicant to "tell me a
little bit about yourself." It's certainly quicker. What I
would have to really question in relation to such a test would be
whether or not the validity studies would meet the requirements of
the "Uniform Guidelines On Employee Selection Procedures"
as they pertain to the professional standards for validity studies.
Was the Test Professionally Developed?
Validity studies are not easy to follow unless you
have a solid background in statistics. If you are
anything like most people, you are probably suspicious of statistics to begin
with. Start with one simple question: Were the
procedures used in validation consistent with generally accepted
professional standards such as those described in the
"Standards for Educational and Psychological Tests"? A
reputable test publisher will generally make such a statement
somewhere in its brochures or validity manuals.
Secondly, you should be aware of one very important fact.
The fact that a testing instrument was written by someone with a PhD
does not necessarily mean that the instrument was professionally
developed or that it will meet the generally accepted professional
standards that have been previously referenced. Be cautious of
any personality test that claims to have been written by a professional, and
then immediately tries to lead you to the conclusion that it was
professionally developed without referencing any validation or
reliability studies. The two concepts do not necessarily go
hand in hand.
I recently visited a web site that used this tactic very
effectively. It then followed up with a very long article
under the heading of Validity. I scrolled endlessly through
the long article, and it concluded without ever mentioning validity,
except in the title. Some people are slicker than cow guts on
a door knob!