Catallaxy Files

Australia's leading libertarian and centre-right blog

Archive for February 22nd, 2010

Peer review

80 comments

You may have missed the excellent article by Frank Furedi published in Saturday’s Australian (see below) about the corruption of the peer review system.

For too long it has been asserted that the gold standard in scientific research is peer review.

That’s false. Peer review is second best to the true gold standard: where scientists undertake repeated research in controlled circumstances and present their analysis and full data for scrutiny. That is, that they seek to have their findings falsified. And the more times in which the analysis is subjected to rigorous testing and remains extant, the more confident we can be in the analysis. But it survives only until it is disproved.

Karl Popper is famous, among other things, for his falsifiability / refutability view. That is, something is not a theory if it cannot be refuted. My theory that all swans are white can be disproved by showing me a black swan. Obviously anthropogenic global warming is therefore not a theory – there is no evidence that could be presented that would be classed as refuting the AGW “theory”.

But it wasn’t Popper who first thought of this. It seems it was Galileo (although one cannot exclude the possibility it was some Ancient). In his Sidereus Nuncius, Galileo writes of his observations of the moon. He showed that it wasn’t perfect – it contained mountains, valleys and craters. Indeed, Galileo was able – using geometry – to calculate with remarkable precision the height of mountains by their distance from the terminator. He was surprised to find that the highest observable lunar mountains were around six kilometres high.

Of course this all offended the authorities, since the Heavens were clearly perfect. Therefore the moon had to be a perfect sphere.

Johann Georg Brengger of Bavaria, an astronomer who was contemporary to Galileo, said the moon appeared to have a rough surface, but it didn’t. In fact it was covered with a transparent crystal substance that filled every valley and crater. So the surface of the moon was perfectly smooth.

Galileo wrote:

“the hypothesis is pretty; its only fault is that it is neither demonstrated nor demonstrable. Who does not see that this is a purely arbitrary fiction that puts nothingness as existing and proposes nothing more than simple non contradiction?”

So Galileo sets out falsifiability.

And when quantum physics was postulated by Niels Bohr and Werner Heisenberg among others, it was opposed by Einstein, who famously quipped that ‘God does not play dice’. Einstein then posed question after question probing the quantum physicists in an attempt to falsify their theory. He failed.

Today’s climate scientists have much to learn from Galileo, Einstein and Popper.


Written by Samuel J

February 22nd, 2010 at 7:05 pm

Posted in Uncategorized

Oomph

124 comments

I am a big fan of Deirdre McCloskey. One of the things she’s always carrying on about is ‘How big is big?’. She argues that in much empirical analysis people confuse statistical significance with substantive significance. In a play on words, she describes this as being the standard error of empirical analysis. For readers who are not statistically literate, the standard error refers to the precision of the estimate that the analysis has produced. McCloskey argues that it isn’t enough for an estimated coefficient to have a small standard error (i.e. be estimated with a high degree of precision); it must also have ‘oomph’. I agree. A highly statistically significant relationship might actually have a very small effect and so not be of substantive importance. So it’s not really enough to just look at the statistical significance of any relationship; we also need to think about the size of the relationship. McCloskey talks about this in her book, joint with Stephen Ziliak, The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives, and an entire issue of the 2004 Journal of Socio-Economics (subscription required) is dedicated to discussing the issue.

In matrix form, this is the idea.

When doing an empirical test we should look for both substantive significance and statistical significance, not just one or the other. We all agree that when a result lacks both statistical significance and oomph we should reject whatever hypothesis we are investigating. Similarly, when we have statistical significance and oomph we should accept whatever hypothesis we’re investigating. (This language might annoy some purists: we don’t accept hypotheses, we fail to reject them, etc.) The Reject (1) category is what annoys McCloskey so much: a coefficient that has statistical significance but no oomph. The Reject (2) category, oomph without statistical significance, is more controversial, to my mind. McCloskey also makes the argument that our conventional t values for hypothesis testing are arbitrary – of course she is right, they are arbitrary. She seems to suggest that the appropriate threshold depends on the case. I do have some sympathy for that argument, but I am uncomfortable with the position. My view is closer to Jeffrey Wooldridge’s (2004) Journal of Socio-Economics position:

While I completely agree that statistical significance does not imply economic significance, I think pushing an economically large effect that is statistically insignificant is usually a stretch.

Results in this category shouldn’t be ignored; I think a case for much more work can be made for this category of result.
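For readers who prefer code to matrices, here is a rough Python sketch of the four categories discussed above. The function names, the `min_effect` threshold (the smallest effect the analyst would call substantively important) and the default critical t value of 1.96 are all assumptions made for illustration; reading Reject (2) as "oomph without statistical significance" is inferred from the Wooldridge quote rather than taken from McCloskey.

```python
def classify(statistically_significant: bool, has_oomph: bool) -> str:
    """Map the two yes/no judgements onto the four cells of the matrix."""
    if statistically_significant and has_oomph:
        return "Accept"
    if statistically_significant:
        return "Reject (1)"  # precise estimate of a trivially small effect
    if has_oomph:
        return "Reject (2)"  # large effect, imprecisely estimated (inferred label)
    return "Reject"


def classify_estimate(coef: float, std_err: float, min_effect: float,
                      critical_t: float = 1.96) -> str:
    """Classify a coefficient given its standard error.

    min_effect and critical_t are hypothetical, analyst-chosen thresholds.
    """
    t_stat = coef / std_err
    return classify(abs(t_stat) >= critical_t, abs(coef) >= min_effect)


# A tiny effect estimated very precisely: significant, but no oomph.
print(classify_estimate(coef=0.001, std_err=0.0001, min_effect=0.05))
# A large effect estimated imprecisely: oomph, but not significant.
print(classify_estimate(coef=0.5, std_err=0.4, min_effect=0.05))
```

The point of separating the two functions is that the hard part is not the bookkeeping but choosing `min_effect`, which is exactly the ‘How big is big?’ question.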

So why am I carrying on about this? In Phil Jones’ BBC interview we see an example of McCloskey’s standard error. Recall questions B and C.

B – Do you agree that from 1995 to the present there has been no statistically-significant global warming
Yes, but only just. I also calculated the trend for the period 1995 to 2009. This trend (0.12C per decade) is positive, but not significant at the 95% significance level. The positive trend is quite close to the significance level. Achieving statistical significance in scientific terms is much more likely for longer periods, and much less likely for shorter periods.

C – Do you agree that from January 2002 to the present there has been statistically significant global cooling?
No. This period is even shorter than 1995-2009. The trend this time is negative (-0.12C per decade), but this trend is not statistically significant.

In my previous discussion of this interview I made the point that Jones should have told us what his significance levels actually were (so a standard error or a t-stat or a p-value would have been very useful). I have guesstimated his analysis in EViews using data from the CRU website. I estimated the following equation:
Temp = constant + B*Time Trend + AR(1) + error
I included the AR(1) term to take care of any unit-root problems, and by using the Newey-West correction I was able to get results very similar to what Jones describes in his interview.
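For anyone who wants to try something similar without EViews, the following is a minimal Python sketch of a trend regression with a Newey-West standard error on the slope. It is not the regression reported in this post: it drops the AR(1) term, uses a normal approximation for the p-value, and runs on a made-up series rather than the CRU data. The function name `trend_with_newey_west` and the choice of one lag are assumptions for illustration only.

```python
import math


def trend_with_newey_west(y, lags=1):
    """OLS slope of y on a time index 0..n-1, with a Newey-West
    (Bartlett kernel) standard error for the slope.

    Simplifications: no AR(1) term, no small-sample correction,
    and a normal approximation for the two-sided p-value.
    """
    n = len(y)
    t = range(n)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sxx
    intercept = y_mean - slope * t_mean
    resid = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]

    # Score contributions for the slope: demeaned time index times residual.
    g = [(ti - t_mean) * ei for ti, ei in zip(t, resid)]

    # Long-run variance: lag-0 term plus Bartlett-weighted autocovariances.
    s = sum(gi * gi for gi in g)
    for lag in range(1, lags + 1):
        weight = 1 - lag / (lags + 1)
        s += 2 * weight * sum(g[i] * g[i - lag] for i in range(lag, n))

    std_err = math.sqrt(max(s, 0.0)) / sxx
    t_stat = slope / std_err
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))  # two-sided, normal approx.
    return slope, std_err, t_stat, p_value


# Made-up annual series for illustration only (NOT the CRU temperature data).
series = [0.3 + 0.01 * i + 0.1 * math.sin(i) for i in range(15)]
slope, std_err, t_stat, p_value = trend_with_newey_west(series, lags=1)
print("trend per decade:", slope * 10)  # assumes one observation per year
print("t-stat:", t_stat, "p-value:", p_value)
```

Reporting the slope, its standard error and the p-value together is the sort of disclosure asked for above; a reader can then apply whatever significance level they prefer.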

The regression corresponding to question B:

The regression corresponding to question C:

The coefficient I estimate for question B is slightly smaller than Jones’ estimate (0.11C per decade to his 0.12C per decade) and my coefficient for question C is also slightly smaller (-0.14C per decade to his -0.12C per decade) [-0.14 is a smaller number than -0.12, don’t get confused by the minus signs]. Neither of those coefficients is significant at the 95 percent significance level, as Jones says. But, as he says, the question B coefficient is very close to significance – it has a p-value of 0.0512. If we were to accept a 90 percent significance level it would be statistically significant (1 – 0.0512 = 94.88 percent). So why doesn’t Jones say that? I’ve often seen people making the argument that a 90 percent significance is okay.

I think the answer is in the question C coefficient. There he says the trend is not statistically significant. But look at the p-value: 0.0723. It is clearly not statistically significant at the 95 percent level, but it would be significant at the 90 percent level (1 – 0.0723 = 92.77 percent). In other words, Jones cannot claim that the answer to question B is statistically significant without then conceding that the answer to question C, the negative coefficient, is also statistically significant at the 90 percent level. (The p-value for question C is very sensitive to the Newey-West adjustment; without that adjustment it is statistically significant at the 95 percent level.) Under the standard error approach that McCloskey so hates, it would be game over.
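The arithmetic behind that comparison is simple enough to check in a few lines of Python. This is only a sketch: the helper name `significant_at` is made up for illustration, and the two p-values are the ones quoted above.

```python
def significant_at(p_value: float, level_percent: float) -> bool:
    """True if p_value clears the given level, e.g. level_percent=95
    means the p-value must be below 0.05."""
    alpha = (100 - level_percent) / 100
    return p_value < alpha


# p-values quoted in this post for the two regressions.
p_values = {"B": 0.0512, "C": 0.0723}

for question, p in p_values.items():
    print(
        "Question", question,
        "| 1 - p =", round((1 - p) * 100, 2), "percent",
        "| significant at 95:", significant_at(p, 95),
        "| significant at 90:", significant_at(p, 90),
    )
```

Both coefficients fall on the same side of each threshold, which is why relaxing the level to 90 percent for question B would drag question C along with it.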

So it looks to me that he is playing silly-buggers with significance levels. In his July 5, 2005 email Jones had indicated:

The scientific community would come down on me in no uncertain terms if I said the world had cooled from 1998. OK it has but it is only seven years of data and it isn’t statistically significant.

Well, maybe now it is; the trend coefficient from 2003 looks to be statistically significantly different from zero at the 90 percent significance level.

Some caveats: I am not an econometrician. I have guesstimated what Jones did. What I have done is very rough and ready. He may have done something very different and the significance tests in his analysis might be very different to those I have reported here. He should post his tests and the significance levels on the web so that we can all have a look at them.

Written by Sinclair Davidson

February 22nd, 2010 at 12:05 pm

Posted in Uncategorized

Get a job

6 comments

Retro is in fashion at the moment. The Age today has a retro-opinion piece. All the chestnuts so beloved by the left: Australia is a low-tax nation, taxation is good because it buys civilisation, etc. etc. etc.

Many of us are clearly infuriated by the thought that people we know might be shirking their fiscal responsibilities…

Indeed. The best way to not shirk your fiscal responsibility is to get off your bum and get a job and get off welfare.

Written by Sinclair Davidson

February 22nd, 2010 at 8:26 am

Posted in Uncategorized

The Biggest Losers

34 comments

Someone put the question the other day “What is the purpose of the Obama visit?”

I don’t recall the answer but it will provide an opportunity to compare and contrast the way that two men who swept into power on the back of landslide victories have visibly lost the plot in record time.

Of course it was apparent before they took office that they would be on the short list of the worst PMs/Presidents ever. This situation would be amusing except that people are being hurt and the fundamentals of good governance will remain under threat for some time to come (and that is an optimistic scenario).

The most disturbing aspect of the situation is the way that the mainstream media in both the US and Australia have played favorites and largely given up on the task of feeding straight news and commentary.

Progressive/left leaning intellectuals have done the same.

When the disastrous records of Obama and Rudd are written up by the historians, the working media and the left-leaning intelligentsia will have to take a large share of the blame.

Written by Rafe

February 22nd, 2010 at 7:31 am

Posted in Uncategorized