The Wrong/Incomplete Data

Several years ago, an acquaintance made a comment that almost caused me to take his head off. He said, “Your wife has a really cushy job. She doesn’t even leave for work until 9:30 every morning.” I refrained from homicide and tried to explain that, because she is both a college professor and an opera director, as well as a performer, she seldom got home before nine or ten o’clock at night, and usually far later; that she worked four out of five weekends at the university; and that overtime compensation was non-existent. He replied by pointing out that she only had to work nine months out of the year. I just shook my head and walked away, because that wasn’t true, either. Generally, she only gets paid for nine months, but she works between eleven and twelve months a year, admittedly “only” about forty hours a week in the summer, to catch up on what won’t fit in the academic year, to research and often write the shows for the coming year, to conduct job searches, and to write the required scholarly articles. And for all that, with all of her graduate work and international expertise, and as a tenured full professor, she makes far less money than almost any of our offspring, only one of whom has more degrees.

I’m not writing this to say how down-trodden professors are — I do know some who truly skate by, although they’re a tiny minority, and that could be yet another example — but to offer the first instance of what might be called “data abuse.”

The second example is that of the Mars probe that crashed several years ago because its systems clashed: one had been programmed for “English” measurements, the other for metric. A third example is NASA itself, and the fact that manned space exploration has actually declined in scope and in accomplishments ever since the Apollo missions of more than 30 years ago.

A fourth example is the issue of school voucher programs, a proposal that was just defeated in Utah. Proponents argued that providing vouchers of roughly $3,000 a year per student for those who wished to go to private schools would actually leave more money for the students who remained. Mathematically, this would have been true, but the most salient points were minimized or never addressed in all the sound-bite coverage. First, even if every student received the maximum voucher amount, on average families would have to come up with an additional $4,000 per student. Exactly how many families making less than the Census Bureau’s “middle-class” income of $42,000 are going to be able to come up with an additional $8,000 in after-tax income [assuming two children in school]? Currently, only about 15% of all private school students receive financial aid, which suggests that schools cannot afford to grant significant additional aid, not without raising tuition. Second, a great many communities in the state have no private schools at all. Third, the program did not provide additional funding to pay for the vouchers, but would have diverted the money from existing [and inadequate] public school funds. So, in effect, the voucher program would not have benefited low-income students, or most middle-class students, but, for the most part, would have subsidized the tuition of those who could already afford such schools. Certainly, the program would have done little for the public school system, even though its supporters claimed otherwise.
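The arithmetic behind the first of those points is easy to check. The short Python sketch below uses only the figures quoted above; the function and variable names are my own, and measuring the gap against the full $42,000 is an assumption that, if anything, understates the burden, since the $8,000 has to come out of after-tax income.

```python
# Figures quoted in the voucher discussion; all names here are illustrative.
VOUCHER_PER_STUDENT = 3_000     # maximum voucher, dollars per year
SHORTFALL_PER_STUDENT = 4_000   # average amount a family would still owe per student
MIDDLE_CLASS_INCOME = 42_000    # Census Bureau "middle-class" income cited in the text


def family_shortfall(children: int) -> int:
    """Yearly out-of-pocket tuition left after the maximum voucher."""
    return SHORTFALL_PER_STUDENT * children


def share_of_income(children: int, income: int = MIDDLE_CLASS_INCOME) -> float:
    """Shortfall as a fraction of income (assumes the whole income is spendable)."""
    return family_shortfall(children) / income


if __name__ == "__main__":
    # Average tuition implied by the two per-student figures above.
    implied_tuition = VOUCHER_PER_STUDENT + SHORTFALL_PER_STUDENT
    print(f"Implied average tuition per student: ${implied_tuition:,}")
    print(f"Out-of-pocket for two children: ${family_shortfall(2):,}")
    print(f"Share of a ${MIDDLE_CLASS_INCOME:,} income: {share_of_income(2):.0%}")
```

On those figures, two children cost a family at the cited income level roughly a fifth of what it earns, before taxes are even considered.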

Another example is the “core” inflation version of the Consumer Price Index, which is supposed to measure the rate of price inflation, and is the index used by government to measure how inflation affects consumers. Several years ago, however, the changes in the prices of food and energy were removed because they were too “volatile.” Yet 67% of all petroleum products go to transportation, and the majority goes into the tanks of American cars. So, although we have seen a price increase of almost 60% over the past year or so, as measured by the cost of a barrel of oil, that increase doesn’t appear as part of inflation measurements. Thirty-three percent of all the petroleum we use goes into making industrial products, such as rubber, plastic, and chemicals. But those costs are reduced by “hedonics,” or implied quality improvements. If your new car has better disc brakes or cruise control, or automatic stability, the CPI auto component for durable goods is adjusted downward to reflect quality improvement. The only problem is that the price paid by the consumer doesn’t go down, but up, yet the statistics show a decline in the durable goods index.

These are all examples of what I’d loosely term “using the wrong data.” At times, as in the case of the Mars probe, such usage can be truly accidental. At other times, as in the case of my acquaintance, it occurs because the user fits the data at hand to a prejudice and doesn’t really want to seek out conflicting and more accurate data.

In other cases, as exemplified by the NASA budget, other data, chosen to serve other political priorities, take precedence. And, as illustrated by the voucher issue or the CPI measurements, all too often those with a political agenda have no real interest in using or examining the full and more accurate range of data.

What is often overlooked in all of these cases, however, is that in none of them did those involved use “incorrect” data. The figures used were accurate, if often selective. Yet in political and policy debates, in inter-office, intra-office, or departmental tussles over budgets and resource allocation, even in conversation, what people focus on all too often is whether the numbers are accurate, rather than whether they’re the numbers that they should be considering at all. Seeking accuracy in irrelevant data isn’t exactly a virtue.

It’s not just whether the data is accurate, but whether it’s the right data at all.