Data struggles with the social.
Data struggles with context.
Data creates bigger haystacks (misleading correlations).
Data favors memes over masterpieces (recognizing, but not predicting, human reactions to novelty).
Data obscures values (appearing disinterested, but skewed by value choices in construction and interpretation).
Brooks makes one more claim about data:
Big data has trouble with big problems. If you are trying to figure out which e-mail produces the most campaign contributions, you can do a randomized control experiment. But let’s say you are trying to stimulate an economy in a recession. You don’t have an alternate society to use as a control group. For example, we’ve had huge debates over the best economic stimulus, with mountains of data, and as far as I know not a single major player in this debate has been persuaded by data to switch sides.
The fact that you don't have a control group is not a problem with your data. Your data can be very good, and highly predictive of future events, without a control group. Or not. For some purposes, having a control group is very useful, but the fact that you don't always have a control group does not reveal a problem with the data itself. Moreover, having a control group won't help you decide whether or not to stimulate the economy in a recession, because you can't run your experiment, roll back the clock and start over based upon the outcome.
Brooks appears to be attempting to justify after the fact his incorrect assertions and assumptions about the economy, and his attacks on economists like Paul Krugman whose positions, in retrospect, have proved far more accurate than those favored by Brooks. But here's the thing: There are only so many ways for a government to stimulate the economy. The debate was not between those who wanted more stimulation of the economy and those who wanted less. Those who argued for more recognized pretty early that they had lost the debate, even if the data suggests that they were correct that the stimulus bill that passed was too small. The argument was between stimulating the economy and austerity, and all Brooks needs to do to find some pretty good points of comparison is to look at what happened in the U.S. alongside what happened in Europe and the U.K., where austerity proved counter-productive.
The argument that "as far as I know not a single major player in this debate has been persuaded by data to switch sides" is also not evidence of a failure of the data. It's evidence of epistemic closure. If your reaction to being presented with facts that contradict the positions you've taken for years, both in terms of the outcome of the U.S. stimulus bill and the U.K./European austerity measures, is "I don't care what the data says," the problem is with you. If you examine the data and find that the manner of its collection, issues of incompleteness, or similar factors leave certain questions unresolved, great. Let's find a way to collect better data to test your theories, independently or as compared to others, in the future. But if you attempt to justify a refusal to look at the data by arguing that it's impossible to draw any conclusions from past economic data without a control group, that you should not be judged by the fact that most or all of your predictions proved to be incorrect, and that people should not trust their lying eyes but should instead consider that austerity might have worked even better than the stimulus bill in a theoretical parallel universe... you may have a future writing science fiction shows, but you have no business writing about economics.
Update: LGM's Scott Lemieux reminds us of Brooks' criticism of Nate Silver, as well as Brooks' own past reliance upon incorrect data.