“This data” or “these data”: which is correct?

When I enrolled as a public policy analysis graduate student at the University of California, Berkeley, my goal was to learn how to better distinguish between good policy and bad. In school, I learned a lot about statistical methods, microeconomics, cost-benefit analysis, and policy analysis. But I also learned that some people have strong opinions about the word “data.”

The controversy centers on the word's use in a certain phrase: “this data.” Many academics strongly prefer “these data” to “this data,” claiming the former is grammatically “correct.”

Why do academics often insist that “these data” is “correct” grammar while “this data” is incorrect? The logic behind this argument is that “data” is actually the plural of the word “datum.” Historically, this is true. The words “data” and “datum” first appeared in the early 17th century, specifically for use in math. The original meaning of the word “data” was “facts given as the basis for calculation in mathematical problems.”

Why does “these data” sound so wrong to the average person, then? It may be because the word “datum” has almost completely dropped out of use in English. Over the past five years, there was only one week (late February/early March 2022) in which the word “datum” reached 1% of the peak popularity of the term “data” on Google Trends.

Despite the near extinction of the term “datum,” “these data” still has some life. According to Google Trends, the phrase “these data” gets one search for every two searches for “this data.” This means “these data” remains competitive but is the minority usage, much to the chagrin of many academics.

How can “this data” be justified linguistically? The phrase makes sense if the word “data” is treated as a mass noun like “money” or “food.” We all accept that “money” and “food” can refer to many things at once, but we would look askance at someone saying “look at all these money I made” or “I can’t believe I ate all these food.”

While most grammar references will tell you both usages are correct in modern English, the two competing usages put policy analysts in a hard place. People who have spent a lot of time in the academic world tilt strongly toward “these data,” based on storytelling within academia about the noble, dying “datum.” Policymakers, on the other hand, tend to exist in contexts where more standard English prevails. This can lead policymakers to prefer “this data” and to see “these data” as incorrect.

The rule of thumb I lean toward is this: if you can code-switch masterfully, use “this data” in policymaking contexts and “these data” in academic contexts. If, on the other hand, you are like most people and can’t reflexively change your grammar from context to context, pick the usage that better fits your context. If you work mostly with policymakers, use the standard mass noun phrasing of “this data.” If you work mainly in academic circles, use the count noun formulation “these data.”

And if you work with both? Well, suffer with the rest of us until academia gets on board with the public or finally convinces the public to buy into the good ol’ count noun formulation.