Did the stimulus work?

Treasury says yes. In the budget papers they produce a graph showing a nice upward sloping relationship between the size of the stimulus in 2009 and IMF forecasting errors.

Those countries that enacted large and timely fiscal stimulus packages, including China, Korea and Australia, performed much better than expected. Those countries with smaller packages, such as the US, Germany, Canada and France, tended to perform broadly in line with expectations. The relationship shown is highly statistically significant, with a t-statistic on the slope coefficient of 3.3.

Before we go too far into the analysis, I should point out that this confuses correlation and causation. In addition, the logic could be reversed: you could easily argue that some economies were panicked into enacting large stimulus packages by IMF forecasts that turned out to be very wrong.

Possum verifies the econometric result and comes to the same conclusion as does Treasury.

The size of stimulus packages mattered – the international evidence is in.

The Coalition needs to be questioned about its economic viewpoints – viewpoints which are far from mainstream economics, existing on the very fringes of economic debate and which are completely at odds, completely and utterly at odds, with the international empirical evidence.

You know what an argument without evidence is called?

Making shit up.

Data snooping is also ‘making shit up’. That is exactly what that graph is – data snooping. Treasury take their data from the IMF. They use the size of stimulus as a percentage of 2009 GDP (for 19 of the G20 economies – the EU is a member in its own right) from here (pg.36) and then calculate the forecast error by subtracting the 2009 forecast GDP growth from the 2009 actual GDP growth. I do have a minor quibble: some of the 2009 actuals are still forecasts (Possum looks like he has substituted an actual for Australia in his regressions). In the analysis below I use the IMF data as downloaded. Like Possum and the Treasury, I use the absolute value of the stimulus size and not the negative figure.
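For readers who want to see the data construction spelled out, here is a minimal sketch of the two variables described above. The country names and every number in it are hypothetical placeholders, not the IMF figures; only the arithmetic (actual minus forecast, and the absolute value of the stimulus size) comes from the description.

```python
# Hypothetical placeholder figures for illustration only; these are NOT the IMF numbers.
# Stimulus is entered as a negative share of GDP, mirroring the sign convention mentioned above.
stimulus_pct_gdp = {"Country A": -2.5, "Country B": -1.0}
forecast_growth = {"Country A": -1.0, "Country B": 0.5}  # 2009 forecast GDP growth, %
actual_growth = {"Country A": 0.5, "Country B": 0.0}     # 2009 actual GDP growth, %


def forecast_error(actual, forecast):
    """Forecast error as described in the post: actual growth minus forecast growth."""
    return actual - forecast


# One (x, y) pair per country: x = absolute stimulus size, y = forecast error.
rows = {
    country: (
        abs(stimulus_pct_gdp[country]),
        forecast_error(actual_growth[country], forecast_growth[country]),
    )
    for country in stimulus_pct_gdp
}

for country, (size, error) in rows.items():
    print(f"{country}: stimulus {size:.1f}% of GDP, forecast error {error:+.1f} points")
```

A positive forecast error here means the economy did better than forecast, which is the sense in which the budget papers talk about countries that "performed much better than expected".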

The problem with the Treasury analysis is that it only makes use of 11 observations, when they could have used 19 observations. The table below shows the data they could have used relative to the data they did use.

There is no obvious reason why those countries should have been excluded. The Treasury sample isn’t just advanced economies and isn’t just resource rich economies. What is the effect of using the truncated sample? First I show the graph as per the Treasury sample and then I show the graph as per the full sample.

As the Treasury suggests, the slope coefficient of the line of best fit is statistically significantly different from zero, with t = 3.3. That is what Possum finds, that is what I find.

This is the graph making use of the full dataset.

The slope coefficients are positive in both cases. The t-stat for the slope coefficient for the full data set, however, is 0.50, well below the generally accepted levels for a t-stat to indicate statistical significance. In layman’s terms, the slope coefficient is not statistically significantly different from zero and we cannot conclude that there is a relationship between the size of stimulus packages and the forecast error. To emphasise the difference between the results for the Treasury analysis on the truncated sample and the full sample, I reproduce the two lines of best fit together in the graph below.
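For anyone unsure where a slope t-stat comes from, the sketch below computes it for a simple one-variable least-squares line: the slope divided by its standard error. It is a textbook calculation written in plain Python, not the EViews output or Treasury's workings, and the four data points are a made-up toy example chosen so the numbers are easy to check by hand.

```python
import math


def slope_t_stat(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, t_stat), where t_stat is the slope divided by
    its standard error, using n - 2 degrees of freedom for the residual variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals around the fitted line.
    ssr = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return slope, intercept, slope / std_err


# Hypothetical toy data, not the stimulus data set.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 1.0, 2.0]
slope, intercept, t_stat = slope_t_stat(xs, ys)
print(f"slope={slope:.3f} intercept={intercept:.3f} t={t_stat:.3f}")
```

The t-stat is then compared with a critical value from the t distribution with n − 2 degrees of freedom. At the usual two-sided 5% level that is about 2.26 for 11 observations and about 2.11 for 19, which is why 3.3 counts as significant and 0.50 does not.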

What I’ve done is take the regression equations from the two graphs and substitute x values into them from 0.5 to 4 at 0.5 intervals. I then plotted the lines. There is a huge difference between the two – especially for the three economies (Australia, China and Korea) that Treasury nominate as having had early and large stimulus packages.
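The substitution step can be sketched as follows. The intercepts and slopes below are hypothetical placeholders, since the fitted equations are only shown on the graphs; swap in the actual coefficients to reproduce the comparison.

```python
def line_values(intercept, slope, xs):
    """Evaluate the fitted line intercept + slope * x at each x."""
    return [intercept + slope * x for x in xs]


# Stimulus sizes from 0.5 to 4.0 (% of GDP) in steps of 0.5.
xs = [0.5 * k for k in range(1, 9)]

# Hypothetical coefficients for illustration only, not the estimated ones.
truncated_fit = {"intercept": -1.0, "slope": 1.2}  # stands in for the 11-country line
full_fit = {"intercept": 0.5, "slope": 0.2}        # stands in for the 19-country line

truncated_line = line_values(truncated_fit["intercept"], truncated_fit["slope"], xs)
full_line = line_values(full_fit["intercept"], full_fit["slope"], xs)

# Gap between the two predicted forecast errors at each stimulus size.
for x, t_val, f_val in zip(xs, truncated_line, full_line):
    print(f"stimulus {x:.1f}: truncated {t_val:+.2f}, full {f_val:+.2f}, gap {t_val - f_val:+.2f}")
```

Because both lines are straight, the gap between them changes linearly with stimulus size, so whatever difference exists is largest at the ends of the range, which is where the large-stimulus economies sit.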

So far the discussion has been about the potential for outliers to confound the analysis and so on. But before we get to that, we should be looking at the data sample. The IMF reported data for 19 of the G20 economies, and Treasury have used a sub-sample of those economies in their analysis. If there is a good reason for having done so, they need to tell us.
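One way a reader could probe the sub-sample question, and this is my suggestion rather than anything Treasury or the analysis above did, is to draw many random 11-country subsets from the 19 and see how often a subset produces a slope t-stat as large as 3.3. The sketch below does that on synthetic data with no built-in relationship. The data, the seed, the number of draws and the function names are all assumptions for illustration.

```python
import math
import random


def slope_t(xs, ys):
    """t-statistic on the OLS slope of y on x (n - 2 degrees of freedom)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return slope / math.sqrt(ssr / (n - 2) / sxx)


def share_of_subsets_at_least(xs, ys, k, threshold, draws, seed=0):
    """Share of random size-k subsets whose slope t-stat is >= threshold."""
    rng = random.Random(seed)
    indices = range(len(xs))
    hits = 0
    for _ in range(draws):
        pick = rng.sample(indices, k)
        if slope_t([xs[i] for i in pick], [ys[i] for i in pick]) >= threshold:
            hits += 1
    return hits / draws


# Synthetic stand-in for the 19 observations: x is unrelated to y by construction.
data_rng = random.Random(1)
xs = [data_rng.uniform(0.5, 4.0) for _ in range(19)]
ys = [data_rng.gauss(0.0, 1.0) for _ in range(19)]

share = share_of_subsets_at_least(xs, ys, k=11, threshold=3.3, draws=2000)
print(f"share of random 11-of-19 subsets with t >= 3.3: {share:.3f}")
```

Run on the real 19 observations, a small share would suggest the 11 countries used are an unusually favourable selection, while a large share would suggest the choice of subset matters less than it appears. Either way it is a descriptive check, not a formal test.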

Update: Lurkers email to ask why this is being released today – these are the workings for Hey, what did I miss? If you don’t already have one, get an email subscription.

  1. I said my piece at looking behind the unemployment figures here:


    The risk is that we face hysteresis in labour markets that may inhibit our growth – (and budget forecasts)!

  2. daddy dave says:

    So they cherry-picked 11 data points that fitted the trend, and threw out the other data points that didn’t fit the trend.
    It’s fucking outrageous. That’s no different to just outright lying. It’s the same as just making the data up.

  3. asf says:

    Possum should give up on the statistical analysis. Whoever gave him that EViews licence has a lot to answer for.

    Mighty brave of him to publish the regression output and then claim the results are meaningful.

  4. Sinclair Davidson says:

    Let’s not launch into Possum – rather the Treasury have some explaining to do.

  5. JC says:

    I agree Dad it is outrageous. Brazil and South Africa would be interesting to see.

  6. JC says:


    Who in Treasury would be responsible for mining out the most friendly data? Is there a department for bullshit?

  7. asf says:

    Sorry Sinc, couldn’t help myself.

    You are correct, Treasury are the real problem. I wondered why they had excluded Russia, after all it is a G8 country and it seemed strange to have 7/8 present. Now I see why.

  8. JC says:


    Why would Treasury paste in Russia? It is after all primarily a commodity exporting nation and would look too much like us in a way 🙂

  10. Butterfield, Bloomfield & Bishop says:

    not hard to wonder why Russia was excluded or most of the others.

    try to think. not too hard you might get a headache

  11. “try to think. not too hard you might get a headache”

    Please tell us Homer. I’m dying to know why you think we can ignore the world’s largest natural resource owner and its hundreds of millions of residents on a 2nd world income.


  12. daddy dave says:

    not hard to wonder why Russia was excluded or most of the others.
    You can’t exclude on a case by case basis. That’s still cherry-picking. You need an a priori rule for inclusion in the analysis.
    try to think. not too hard you might get a headache
    That’s your response?
    Paraphrasing Homer: “There’s a reason but I won’t say what it is. But it’s a very good reason, and if I was to tell you (which I won’t), I would easily win the argument, because you would see what an excellent reason it is.”

  13. dover_beach says:

    Is Michael E. Mann now working in Treasury?

  14. Rococo Liberal says:

    There’s a structural deficit on page 21

  15. “There’s a reason but I won’t say what it is. But it’s a very good reason, and if I was to tell you (which I won’t), I would easily win the argument, because you would see what an excellent reason it is.”

    No, you’re forgetting that Homer is “eduating” us.

  16. Tim R says:

    Nice to see this analysis. I had this exact thought a few days back. ie: Selective data points were being used by the government analysis.
    This is bad bad science and it can happen even when it’s not deliberate.
    The problem can occur if you start with a hypothesis and then try to fit data to it instead of starting by considering ALL the data and then seeing if any hypotheses fit. Good scientists know the human brain tends to fall into the trap of confirmation bias – and they make sure to avoid this trap.
    Unfortunately, most people wouldn’t know good scientific method if it smacked them in the face.

  17. C.L. says:

    Possum… comes to the same conclusion as does Treasury

    What a surprise.

    His last big call was that Rudd’s insulation scheme was the best fire prevention innovation since the hydrant.

  19. Good work Sinc, I’ve linked you in

  20. asf says:

    Edward E. Leamer calls this the ‘Sherlock Holmes Inference’ or the tendency to theorise before all the evidence is in, resulting in biased judgments. Leamer says this is because “a theory constructed before seeing the facts can be disastrously inappropriate and psychologically difficult to discard.”

  21. Skuter says:

    Great work Sinc. Let’s hope a few Senators launch into the Wombat Whisperer at estimates for this one…

  22. Sinclair Davidson says:

    Yes. To be fair the budget papers are the responsibility of the Treasurer. He needs to answer questions too.

  23. Pedro says:

    Even the original bodgy chart does not evidence the claim unless you have grounds to say why stimulus below a certain percentage of GDP is not effective.

  24. rog says:

    Employment is up – higher than all the pundits predicted.

    What was the question again?

  25. jack says:

    It is called post normal science. That means you dream up a hypothesis and then hunt around until you find some data to fit it, even if you have to “adjust” the data. Then you give it to one of your mates who is in on the method. He agrees that the hypothesis is proven and they pass it off as peer reviewed.
    The dead give away is when they agree, “the science is in.”

  26. JC. says:

    What was the question again

    There was no question, Wodgie.

    We can appreciate it if you can’t follow the discussion as it is obviously way above your abilities.

    Sinc believed that the Treasury was gilding the numbers by selecting best fit data (follow so far).

    Unless you have an argument to show that it was best fit and why the other countries were left out, you really have no business being here and we can assume you’re just trolling to get attention.

    Now, back of the class please.

  27. daddy dave says:

    Employment is up – higher than all the pundits predicted.
    ie your point is, the economy is doing well, therefore the stimulus worked.
    I think you didn’t read Sinclair’s post above (either that or you didn’t understand it), which I guess you are admitting. (ie “what was the question” is an admission you don’t understand)

  28. JC. says:


    You think this blog would suit you a bit more?


    You’ll love it there and find like minded people.

  29. FDB says:

    “The dead give away is when they agree, “the science is in.””

    So how do you distinguish the “dead giveaway” (of scientific fraud – a very serious and legally actionable accusation) in one case, from the general agreement that accompanies any new finding in science?

    Let me guess.

    It’s when you don’t like the conclusion, isn’t it?

  30. JC. says:


    What’s with the new spring of confidence in your step I’ve been noticing of late? Where’s this coming from? Is it someone posing as you?

  31. FDB says:

    My back’s to the wall JC.

    Sometimes when everything tangible in your life unravels and wanders off, you see other things a little more clearly.

  32. rog says:

    Only JC would waste time wondering what the question was – and then answering it.


  33. daddy dave says:

    The dead give away is when they agree, “the science is in.”
    I agree with your sentiment, but unfortunately there is no dead giveaway.

  34. C.L. says:

    Employment is up – higher than all the pundits predicted.

    What was the question again?

    The question: under whom was full employment restored?

    Answer: John Howard.

  35. JC. says:

    Only JC would waste time wondering what the question was…..

    No Wodge, you forgot. You asked the question when there wasn’t any question and like the creepy little coward that you are now suggesting I was.

    Back of the class, Wodge. And stop getting emotional because you’re found to be a total laughable doofus.

  36. rog says:

    Over at possum JC shoots yet another hole in his foot

    The US had recessions (from memory) in early 70’s, late 70’s, early 80’s, mid/late 80’s, early 90’s and early 2000’s.

    In each of those recessions, with the exception of the last one (2008), stimulus spending was small to non-existent while the deficit never reached these humongous levels that it is now.

    The fact that the US has been in deficit for years and the cumulative effect of this deficit has been disastrous seems to have escaped the reality challenged JC.

  37. pedro says:

    So rog, the GFC was caused by the bush deficits? Is that what you are saying?

  38. JC. says:

    Nothing has escaped me Wodgie, you doofus.

    Read the comment again, Mr. Carpenter. I said that it was good that possum has understood Sinc’s argument in the above thread and evidently conceding to him (Sinc). This was the argument that you interpreted as a question, Wodge, which Dad and I picked you up on and where I suggested it would be better for you to hang around the carpenters blog because of your inability to grasp things here.

    The points that I made to possum is that even if he ignored Sinc’s argument correlation doesn’t follow causation as it is impossible to determine that the stimulus alone is responsible for the higher correlation as individual economies are far too fluid for such comparison and his summation is therefore too simplistic.

    (no Wodge, I am not talking about using sex toys, which a complete bogan like you would immediately think is what “stimulus” means)

    I suggested he take a look at all those recessions in the US during those periods where he would see there was next to no stimulus (not sex toys Wodge) and he would see in almost all those cases even faster growth spurts out of recession when the deficit was no larger than 3.5% mainly brought on by the automatic stabilizers doing their thing.

    Now back to the carpentry blog, Wodge. Chop chop.

  39. dover_beach says:

    So how do you distinguish the “dead giveaway” (of scientific fraud – a very serious and legally actionable accusation) in one case, from the general agreement that accompanies any new finding in science?

    Easy, you notice that they’ve been playing funny buggers with the data; it’s that simple.

  40. Andrew Carr says:

    Great post Sinclair. Have you contacted anyone at treasury about why they left out so many countries, or is there anything in the budget papers to suggest a reason for these exclusions ?

  41. Sinclair Davidson says:

    No and no. I wouldn’t even know who to contact at Treasury. At the same time, I’m reluctant to get into a relationship like that.

  42. “Employment is up – higher than all the pundits predicted.

    What was the question again?”

    The question is why you take things at face value but don’t want to look at the details:


    “Here’s what really happened:

    Net arrivals -12600

    Changes in hours worked – negative, equivalent to -1.4 unemployment or equivalent to 169 200 job losses

    Increase in discouraged workers, Sep 08-Sep-09: 37 900

    Increase to “not marginally attached to the labour force”, Sep-08 to Sep-09: 157 700”

  43. “So rog, the GFC was caused by the bush deficits? Is that what you are saying?”

    Sort of.

  45. Mount Isa Miner says:

    Every single time I drop by I learn something, and I usually don’t like it.

  46. JC. says:


    Drop by anytime and don’t be afraid to post what’s on your mind about a particular topic.

  47. Butterfield, Bloomfield & Bishop says:

    there are at least three errors in Mark’s analysis.
    can anyone help Mark as he doesn’t understand simple statistics

  48. Andrew says:

    Forgetting the outright fraud for a moment, let’s consider which direction causality works. Countries like AUS and China entered GFC with no debt and didn’t need to put 5% of GDP into bank rescues – they could therefore afford larger stimuli. Stands to reason they would avoid the bulk of the GFC, given that it was after all a debt / financial crisis! So if we regressed the 19 against banking system losses, or against public sector debt at FY 2007, would we find a more statistically significant explanation?

  49. Brian says:

    So you think our economy should be compared with Mexico, Russia, Indonesia, Turkey and Saudi Arabia etc? South Africa is the only one you could possibly squeeze in, and Argentina if they had not had such a long-standing problem with loans, banks etc. So whether possum is right or not, I do think that possum leaving out the countries he did was quite appropriate.

  50. Pedro says:

    Is Brian Homer’s new name?

  51. dover_beach says:

    Brian, if it can be compared with Brazil and China, why not with India, Mexico, or Russia?

    But, anyway, the point of the exercise was not to compare countries but to quantify the significance of the stimulus spending or not in each.

    Finally, Possum didn’t leave anything out, the Treasury did; Possum merely replicated what they found.

  52. jc says:


    it wasn’t Possum that left them out it was the treasury. That’s a start.

    Secondly, countless countries were advised by the IMF and large finance chancelleries around the world to participate in the stimulus spendathon, as it would be good for their economies and the rest of the world.

    Now if the IMF advised them and we want to see how stimulus performed you don’t go around picking and choosing which you want to compare to. In fact if you want to slim down the comparison you would want to compare at random by randomly picking the beauty contestants.

    that’s how things should have been done.

  53. sam says:

    Okay, both this article and possums article are oversimplifying things way too much.

    Possum argues “a dollar is a dollar is a dollar”. Well, no, because different dollars of stimulus will have different economic multipliers and different impact times. A cash handout for example, has a much lower economic multiplier than say, spending on infrastructure. The cash hand out is an immediate stimulus, whereas the infrastructure spending is medium-long term.

    Even within cash hand outs, the impact of say, a lump sum hand out vs a tax cut makes a big difference. There is plenty of argument about which is more likely to impact on spending, but the reason there is an argument is because it depends on a variety of other factors, such as the levels of household debt that may be in arrears, and the likelihood of a given country to save. This varies from country to country. In China there is a much larger savings culture, so the government spent most of its stimulus on nation building, whereas here, the government hoped a weak spending culture would mean more of the handouts would be spent straight away.

    Then you go to countries like Russia where a huge portion of trade is in black market goods, and the result is less measurable. This is also true in Indonesia, and Mexico. As such, financial stimulus in those countries is harder to measure. You also need to take into account how those countries were impacted by their regional neighbours. Mexico makes much of its money from manufacturing for US consumption, so the amount that they spent on stimulus for example was actually less important to their economy than the effectiveness of the US stimulus. The reverse could also be said here, that had China not spent so much on nation building infrastructure in its stimulus, we would not have done so well either.

    The bottom line is to simply take the amount of spending as a proportion of GDP, and the growth rate of that country, and try to make some sort of case by case rough analysis like this is really all just a bit silly. The bottom line is that government spending was supposed to fill the hole which was left by a drop in private spending, in order to keep growth going and prop up the economy. Whether or not we got value for money, and the debt was worth it, is just as much about how effectively a stimulus was planned as how much overall was spent.

    It’s worth pointing out that the WTO, IMF, and virtually every credible economic organisation was throughout the crisis calling on all governments to spend more to prop up the world economy. Now they’re calling for a careful set of measures to help manage the debt, without withdrawing too far. Debt is bad, but it was there for a reason, to keep people in jobs, thus paying taxes, and thus not draining public finances through welfare. It’s pretty straight forward stuff, so I don’t think we should be debating whether or not a stimulus was the right thing to do, so much as debating whether it was well managed.

  54. jc says:

    Cue ball gave a lecture at Oxford? Wow! I never realized standards for that fine university had fallen so low.


  55. rog says:

    Brian, what’s the a priori exclusion rule. When you figure that out, please come back and tell us. Without it, your blazing counter-attack on treasury’s behalf was ineffectual.

  61. John says:

    The concern we should all have re this manipulation of IMF data – cause that’s what it is – is that someone made a calculated decision to use the subset of data that created the perception that the Rudd Government’s stimulus package worked. Now the Budget papers are not Treasury papers, they actually belong to the Government. So in reality it’s down to Rudd, his advisers, Swan or his advisers. Worse still, it’s down to someone who knowingly misled the parliament and the people of Australia by selectively manipulating the IMF data.

  63. Geoff Sherrington says:

    There’s quality as well as quantity. There’s incentive and disincentive.

    We have seen the quality of the Treasury graph compromised to the extent that someone should be charged with fraud – unless there is a completely reasonable explanation that this was not blatant cherry picking. I have not heard of one.

    As to quantity, there was a large quantity of ceiling insulation in the figures, whose effect is now of more doubtful prediction. A lot seems to be protecting the floors of warehouses. How does one justify using employment figures for those workers valiantly and unsafely ripping out sub-grade insulation? This is the modern version of digging holes and filling them in again.

    Would a reasonable person, versed in the ways of the world, ever dream up such a stupid plan as the Pink Batts? Fraud again. Show me a large government incentive scheme and I’ll show you fraud. Some incentive!

    Disincentive works the other way – like the super tax on mining. We keep hearing the PM bleat “The minerals belong to the people”. In a way that is so; but if mining companies had not risked large amounts of skill and funds over the decades, there would be no minerals to export. In that sense, unfound minerals belong to the people, but mineral discoveries should belong to those who made them. That’s equitable.

    As one who spent over 30 years at the leading edge of mineral exploration and large resources, I can say that (oil and gas aside) the rate of new hard rock discoveries has fallen almost to zero since BHP-Billiton and Rio soaked up so many promising companies, converting a number of critical mass units to a bloated pseudo bureaucracy. We are feasting on the last of the many discoveries of my era. Even if a major decision to invigorate exploration was made today, it would be 10-20 years before the new discoveries became productive. Folks, we are on the downslide to a drought in mineral income. If I was still share trading, I’d be out.

    This country is becoming certifiable by insanity standards.

  64. Hey, this is a really interesting topic and I thank you for drawing it to people’s attention. The (certainly looks dodgy) issue of Treasury cherry-picking their data and (I guess obvious) reasons for doing so aside, why are you performing t-tests on the slope of the line? This isn’t sample data. You haven’t randomly chosen 19 countries out of all countries in the world. You’ve specifically and particularly chosen the G19 countries. So the slope is exactly what it is for those G19 countries. There’s no uncertainty about it. I suppose you could argue about the strength of the correlation (and obviously the correlation is weaker with the full dataset), but that’s about it.

    And maybe a linear association isn’t a best fit anyway. Did the assumptions of the regression hold? Were your residuals i.i.d and normally distributed? Maybe some other function would better model the relationship.

    Anyway, a minor (technical) criticism. The point that Treasury’s association between stimulus and outcome looks a lot better on paper when they take out particular data points is pretty clear.



  65. Sinclair Davidson says:

    Stan – I merely reproduced the Treasury analysis with the full data set. Nothing more. Those are all good questions best posed to Treasury.

  66. daddy dave says:

    You haven’t randomly chosen 19 countries out of all countries in the world.
    Statistical inference is sometimes done on small population sets. It’s a good question whether this is valid. My opinion without looking into it too deeply is yes, but that’s for another day.
    (the basic logic is that they are a subset of all “possible” countries).

  68. Mr. E says:

    Did the popping of the oil bubble have anything to do with Mexico and Russia’s performance?

    Throw out oil bubble countries and that regression line starts looking much, much better.

