Great Careers in Testing

JJ Zhu, Ph.D.

Dr. Zhu manages the clinical psychometric department responsible for providing psychometric services to the Pearson Clinical Assessment Division worldwide. His responsibilities include capacity planning, annual budget development, project-based psychometric budget estimation, providing technical training to psychometricians and research directors, developing and enforcing standard psychometric procedures, and conducting performance evaluations. He is also responsible for high-level psychometric and content consulting, conducting high-level data analysis, developing new methodology, reviewing research proposals related to international adaptation of US products, and the final psychometric review of Pearson clinical products published by international partners.

Previously, Dr. Zhu served as research director, senior research director, and Wechsler team manager, and worked as project director for several flagship products, such as WAIS-III, WPPSI-III, WISC-IV, WASI, ABAS, and Beta-III. He also served as lead psychometrician for WIAT-II US, WIAT-II LOTF, WISC-IV Canada, WPPSI-III Australia, DAS-II, CELF-IV, Bracken, CCC, ALL, and many other products.

Your Career

How and when did you first get interested in psychometrics? How did it happen that you chose it as the focus of your career?
In 1993, after receiving my doctorate in cognitive development (with a minor in statistics), I started my career at The Psychological Corporation (later renamed Harcourt Assessment, and currently Pearson) as a project director, working on the content test development of WAIS-III. After about 6 months, I discovered that if I could do the work of a psychometrician, I would be much more effective in test development, because it took a lot of effort to write the data analysis specs and to go over the specs with the psychometrician. If I did the analysis myself, I wouldn't need to write the detailed specs because I knew what I wanted and why I wanted it. From then on, I started learning psychometrics in my spare time and checking the work of psychometricians for the purpose of self-learning.

Soon, I found out that it was relatively easy for me to spot the errors made by psychometricians because, thanks to my content knowledge, I knew which results made sense and which did not. For instance, on achievement measures, children diagnosed with mental retardation are expected to have a flat profile across different tests. If their standard score on reading or writing is significantly higher than on other areas (such as math), something is wrong. In addition, because of my knowledge and experience in content test development, I could bridge the gap between the content test developer and the psychometrician. My skill growth in psychometrics (though very primitive) received so much positive feedback from my colleagues and supervisor that I was motivated to learn more. In 2000, due to a staff shortage, I was given an opportunity to temporarily fill the psychometrician role for a Wechsler project, which motivated me to learn the details of psychometrics required by the project. In 2002, I took over the management of the psychometric team.

What have been the most enjoyable areas of your career in testing? What do you most enjoy thinking about?
Problem solving and how to make the test more useful for clinicians, teachers, parents, and children. Now, how to take advantage of digital technology to develop tests that assess the process of learning.

What do you consider to be your greatest work achievement?
Bridging between content test development and psychometrics.

You have an extensive leadership background with the Wechsler series of tests. How do these tests compare with other intelligence tests? How has the test evolved over time?
The Wechsler intelligence scales are the most widely used intelligence tests around the world because they are easy to use, psychometrically superb (Kaufman, 2006), well-researched, and proven to be clinically useful. In addition, the Wechsler intelligence scales are not developed according to any single doctrine of intelligence theory. Like their predecessors, the modern Wechsler intelligence scales focus on the clinical and ecological validity of the tests while bridging ideas from several branches of intelligence theory. At each revision of the Wechsler intelligence scales, not only are the normative data updated, but the test blueprint is also carefully redesigned to reflect advances in the theoretical and practical foundations of cognitive and neuropsychological assessment.

What are the most important elements to consider when designing and developing tests and their supporting administration, scoring and other systems?
User-friendliness, ecological validity, and sufficient evidence of clinical validity.

Can you please provide a brief summary of the steps involved in publishing a test from conception to publication?

  1. What trait will be assessed? (Define the blueprint of the test, such as the construct and content coverage of the test.)
  2. What is the test used for? (What is the target population? Who will be tested? Who will administer and score the test? Who will interpret the test results? How will the test results be used in decision making, etc.?)
  3. How will the trait be assessed? (Decide how to sample a set of representative behaviors, how to observe, record, and score the behavior, etc.)
  4. How will the test be scaled? (What scores will be produced by the test? How will these scores be derived? And how are these scores interrelated?)
  5. Complete research design for test development (item writing, item tryout, test tryout, standardization, reliability and validity studies).
  6. Develop the test and execute the research plan (design the test items, administration instructions, and scoring rules; produce testing materials; collect data; score and enter data; analyze data; develop norms; validate norms using reliability and validity evidence).
  7. Write the test manuals (administration and scoring manual and technical manual).

What do you believe is the primary value of testing in our society?
To discover how individuals differ from each other and to make intelligent decisions about how to act toward an individual according to the information collected using standardized tests.

Psychometric Education and Careers

For people considering a career in psychometrics, what are important qualities, skills or talents they should possess to be successful?
A psychometrician must have good research skills supported by logical reasoning, problem solving, and knowledge of research design, statistics, and data analysis software.

What are the most rare and valued talents and skills in testing?
In the field of testing, we often have three types of experts: (1) content test developers who are good at content test development (designing items and tests, writing administration instructions and test manuals) but know very little about psychometrics; (2) clinicians who are good at the clinical application of tests; and (3) psychometricians who are good at research design and data analysis but know very little about content test development and about how the test is administered, scored, and interpreted in a clinical setting. The rarest and most valued talent is someone who is an expert in content test development, clinical application of the tests, and psychometrics.

What have you discovered to be the best ways to attract, compensate and work with extraordinary testing educators and professionals?
In our department, we make sure that everyone becomes a technical leader. Every individual is different, with different strengths and weaknesses. We help each individual develop leadership in their area of strength and establish a clear career path for everyone.

How are most test authors paid for their work in creating tests? Are most tests authored by a team of people or one person? What percentage of authors do you believe get a share of the revenue of the works they create?
In most cases, once tests are published, authors are paid royalties (usually a certain percentage of annual sales, according to the agreement between the author(s) and the publisher). For instance, if the royalty rate is 20% of annual sales, the author will be paid 2 million dollars if annual sales are 10 million dollars. Most tests are authored by one or two authors.

What are the most popular continuing education programs for psychometricians?
The NCME conference and workshops offered by different universities.

The Past, Present and Future of Testing

What are the 5 most important tests in all of history?

  • Wechsler intelligence tests
  • MMPI
  • Raven
  • SAT
  • GRE

What are the 5 most influential testing organizations worldwide?

  • Pearson
  • ETS
  • ACT
  • College Board
  • MESA

Who do you consider to be the 5 most influential psychometricians of all time?

  • Georg Rasch
  • Michael Kolen
  • Wim J. Van der Linden
  • William H. Angoff
  • Benjamin D. Wright

What percentage of high stakes tests are Computer Adaptive Tests in your estimation? What percentage do you estimate it will be in 5 – 10 years?
Now: about 5% or less; about 20% in 5-10 years.

What are the pros and cons of Computer Adaptive Tests vs. standard tests? What percent more time and money goes into a CAT test vs. a standard test?
The main pro is time saving. The main cons are: (1) when fewer items are administered, the examinee may feel the test is too hard because there are not enough easy items for him/her to warm up; (2) the cost of test development is too high.

If you had the power to change the world of assessment, what would be the first 3 things you would do?

  1. Test publishers need to show more evidence of validity for the suggested use and interpretation.
  2. Tests should be treated as something like medical equipment that must be carefully reviewed before publication.
  3. Reduce the amount of testing in rich countries and increase in poor countries.

Which sectors of testing are growing and which are declining?
Tests designed to just produce a score and classification are declining; tests that link to intervention and provide solutions are growing. In the past, most tests were used to produce a score (or a set of scores), which was used (often together with other information) in making diagnosis or placement decisions. However, this type of test seldom informs the test user, teachers, or parents about how to act toward the children. After testing, parents often ask: "Yes. I knew that my child has an average IQ, but he/she did poorly on tasks measuring working memory. But what should I do to help him/her? My child also did poorly on the math test. Are working memory and math performance related?" Now, more and more tests are adopting the RTI (response to intervention) approach, such as tests provided by AimsWeb. I am not an expert on RTI. A quick Web search can give you 95,200,000 items that are related to RTI.

How would you rank the top 3 intelligence tests and why?
The Wechsler Intelligence Scales, DAS2, and SB5 are the top 3 intelligence tests, due to their clinical utility, ecological validity, psychometric quality, research literature, and theoretical foundation. In my opinion, tests that are locked to a single theory and measure so-called “pure” cognitive functions put strong constraints on their clinical utility.

Is there any value in test preparation for cognitive tests? How do you feel about the value of the test preparation industry in general?
Yes. Research has shown that cognitive training does improve cognitive performance on transfer tasks. Practice on cognitive tasks is similar to physical exercise, which helps individuals fully realize their true cognitive potential. However, cognitive training is not the same as teaching the examinee how to beat the test. The key is to show that the effect is long lasting and transfers to tasks outside the training materials. In general, the training materials used by the current test preparation industry are not very well designed. Many of the items they use are flawed and misleading. I think the test publisher is in the best position to do the test preparation.

How do you envision the future of testing?
More digital, adaptive, and dynamic testing. By dynamic, I mean focusing not only on what ability or knowledge an individual currently has, but also on how well he/she can obtain new skills and discover new knowledge. This is related to the so-called 21st century skills that are critical for an individual to be successful in modern society. Some of the frequently cited 21st century skills in the literature are: (1) skills related to problem solving, critical thinking, and systems thinking; (2) skills related to creativity: generating new ideas, concepts, or associations between existing ideas or concepts; (3) social skills, such as EQ; and, most important, (4) meta-learning skills: learning to learn, unlearn, and relearn, plus intellectual curiosity and flexibility.
