Should These Tests Get a Failing Grade?

The stakes are high for students seeking to attend one of New York City’s specialized high schools, like the Bronx High School of Science: Admission is based solely on a single standardized test.

I haven’t had to take a standardized test in decades. Before I took the SAT, when I was a senior in high school, I had no coaching or test preparation. As far as I know, none was then offered in the small Midwestern city where I grew up. Still, I did well enough to gain admission to DePauw University, and later Harvard Law School, and I doubt I’d be where I am today were it not for my test scores.

So when The New York Times recently published a set of sample questions from New York City’s admissions test for its selective high schools, known as the SHSAT and administered by Pearson, the same company that runs the GMAT and a slew of other career-defining tests, I decided to try to answer them.

Suffice it to say I would not be attending Stuyvesant or Bronx High School of Science were I now an eighth grader.

In the great sorting process that can begin in preschool, there’s no question that standardized test scores matter. For admission to New York City’s elite high schools, like Stuyvesant or Bronx Science, they’re the only criterion. And the stakes get higher in the next rung of educational aspiration. Even at Harvard, which prides itself on its holistic approach to admissions, high test scores are vital, as a recent lawsuit alleging discrimination against Asian-Americans makes clear.

Elite schools, in turn, boast about the eye-popping average test scores of their students, which are a crucial factor in determining their U.S. News & World Report ranking.

Students at the top three U.S. News-ranked business schools (Harvard, Chicago and Wharton) have average GMAT scores of 731, 730 and 730 out of a possible 800. The average at fourth-ranked Stanford is even higher, at 737.

But the problems I encountered when taking the SHSAT online demonstrate how even one standardized test question might derail a promising student’s future.

In fact, I was thrown off by the very first question on the test:

To answer the question, I focused on the phrase “precise revision.” I took that to mean the revision that most precisely, exactly or accurately reproduces the original.

The question didn’t say how many people the reporter interviewed, and a reader has no way of knowing. So an accurate revision would need to be equally vague. Any revision that specified “three contestants” is not an accurate reproduction of the original, but an embellishment. That eliminated answers B and D.

Answer C refers to “some of the winners,” but doesn’t say winners of what. The original is explicit: “the contest.” And C embellishes “talked”: “discussed the contest.” The original doesn’t say what the reporter talked to the winners about. So C failed on two counts.

That left A, which is both vague and explicit in the same way the original is, and thus the most “precise revision.” I chose it and pushed the “submit” button and got an immediate response.


My confidence was badly shaken, and it was pretty much downhill from there. (Somewhat rattled, I got three of five questions wrong before finally throwing in the towel.)

Days later, I found myself still fretting over the question about the reporter and the cooking contest. I looked up the meaning of “precise”: “marked by exactness and accuracy.” At the least, I thought, my reasoning was legitimate.

So I sent the question to Mary Norris, author of “Between You and Me: Confessions of a Comma Queen” and a legendary copy editor at The New Yorker. If anyone understands revisions of English prose, it’s she. I didn’t tell her anything about my experience and asked her to answer the question and tell me what she thought.

“I got so confused!” she said when we spoke a few days later. “To revise that sentence precisely would be to make it as lame as the original.”

She said she was stumped immediately by the reference to “people” who “did the best in the contest.” Can multiple people be the “best”? Can there be more than one “winner”? What kind of “contest” would that be? “To say there are three people adds information that isn’t in the original,” she said. “And we have no way of knowing if that’s accurate.”

C was tempting. “It’s nice and vague, and in this context, vague equals precise,” she said. Nonetheless, she picked answer B. “At least it doesn’t say ‘winner,’” she reasoned.

Wrong again!

The “correct” answer, according to the New York City Department of Education, is D. “The top three” in that answer is more specific than “some people who did the best” in the original.

“I would never have picked D,” Ms. Norris said.

Daniel Koretz, a professor at the Harvard Graduate School of Education, the author of “Measuring Up” and “The Testing Charade,” and one of the country’s foremost experts on standardized tests, agreed that the question is, at best, ambiguous. “Problematic items do sometimes occur even in good tests, and that is one more reason it is never acceptable to make a consequential decision based on a single test score,” he said.

It’s hard to know how prevalent tainted questions are, since Pearson and other test administrators don’t disclose the data they collect on specific questions and answers.

Mr. Koretz said that several years ago, he was contacted by a parent whose son had been denied admission to Bronx High School of Science because of one question. The parent disputed that the “correct” answer was, in fact, correct.

“It was totally ambiguous,” Mr. Koretz said of the question, adding, “I got the wrong answer.” He contacted the New York City Department of Education, but was unable to obtain standard data on reliability and validity for the question. “I got nowhere,” he said.

Will Mantell, a spokesman for the New York City Department of Education, said Pearson investigates “any items with problematic or unusual results.”

“If an error is found,” he added, “the item is not scored.”

The risk of erroneous answers is reduced if students can take a test multiple times, as they can with the standard college admissions test. But students can take the SHSAT only once, except in unusual circumstances.

As for the question that stumped me and Ms. Norris, Mr. Mantell said Pearson had already spotted a problem and revised the question. (He added that the question was never used on an actual test.)

Here’s the new question:

Read this sentence. During a nightly news segment about a cooking contest, a reporter talked to some people who did the best in the contest.

Which revision uses the most precise language for the words “talked to some people who did the best in the contest”?

“Precise” now modifies “language,” not “revision,” which struck me as an improvement.

The answers remain the same, and D is still “correct.”

But neither Ms. Norris nor Mr. Koretz found this to be satisfactory. “I had to read this a few times, assuming incorrectly that they did something to fix the ambiguity inherent in the answer choices. They didn’t,” Mr. Koretz said.

A problem is that C is now more precise than D in saying that the people discussed “the contest.” And the confusion over how a contest can have multiple winners or people who “did the best” remains.

“I’m not at all sure I’d get the right answer,” Ms. Norris said. “I’d still have the same problem. The revision doesn’t make the case for improving the sentence at all.”

She pointed out that the original sentence would be much improved if the reporter interviewed some of “the losers” rather than “people who did the best in the contest,” since any contest involving three or more people would be expected to produce multiple losers. The answers would need to be revised accordingly.

A spokesman for Pearson, Scott Overland, said the company’s assessment team was examining the revised question in light of the issues I raised. He added that the test undergoes a “rigorous, multi-step development process” and that “the New York City Department of Education is involved in all aspects of the test development process and provides final approval before students take the SHSAT.”

Mr. Mantell declined further comment.

Mayor Bill de Blasio has caused an uproar by proposing to abolish the SHSAT, and admit students to elite schools based solely on grades and class rank. He said it was “insane” to rely on one taking of a single test. Some elite colleges are now making standardized tests optional.

My experience suggests they may have a point. Test scores, at least, shouldn’t be the sole basis for admission.

“I guess I wouldn’t get into an elite school today,” Ms. Norris said. “Maybe it just shows you don’t have to go to the best schools to succeed.”

© 2020 US News. All Rights Reserved.