Computer-based statistical and econometric packages

The advent of widely available and powerful computer-based econometric packages such as Shazam, EViews, SAS, SPSS, RATS, and Statistica (among others) has resulted in a boom in the use and application of econometric and statistical analysis across a whole host of problems.

Prior to these software packages, statistical and econometric analysis was left to a few hardened and highly qualified econometricians and statisticians who knew a great deal about the use and misuse of statistical and econometric techniques.

Now, however, everyone uses – or rather, misuses – these packages to mine data and produce conclusions that cannot be sustained under careful analysis. It’s a bit like using a calculator to add or multiply: users often do not realise the errors they have introduced and are unaware of making fundamental mistakes. Unfortunately, many referees are unable (or unwilling) to notice the errors, so they pass through to publication.

This pathetic, error-prone analysis has been the foundation of so-called ‘climate science’, which is riddled with basic analytical flaws such as mis-specification, omitted variables, multicollinearity, heteroskedasticity, measurement errors, autocorrelation, data mining to support prior conclusions, publication bias, and confusing correlation with causation.
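To see how cheaply data mining manufactures ‘findings’, consider a toy illustration (synthetic data, purely for exposition): regress an outcome that is pure noise on a couple of hundred noise regressors, and roughly five per cent of them will come out ‘significant’ at the 5 per cent level.

    # Toy illustration (synthetic data): why unguided data mining "finds" results.
    # Regress pure noise on pure noise many times; by construction about 5% of
    # regressors appear "significant" at the 5% level despite zero true effects.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    n, n_candidates = 100, 200            # observations, candidate regressors
    y = rng.normal(size=n)                # outcome is pure noise

    false_positives = 0
    for _ in range(n_candidates):
        x = rng.normal(size=n)            # regressor unrelated to y
        slope, intercept, r, p, se = stats.linregress(x, y)
        if p < 0.05:
            false_positives += 1

    print(f"{false_positives} of {n_candidates} noise regressors passed p < 0.05")

Report only the ‘significant’ ones, and you have a publishable result built entirely from noise.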

To this extent, at least, sophisticated computer-based data analysis programs may have caused more problems than they have delivered benefits.

By reducing the cost of such analysis, these tools have produced a revolution in its use to prove and disprove all sorts of propositions.

Unfortunately, in my personal observation, 99 per cent of these analyses are crap. The ‘researchers’ are not dispassionate and will do whatever it takes to reach their pre-determined conclusion. This bias is shielded behind seemingly sophisticated analytics which is simply punched out from a computer.

This is not science, but it has taken over many university social science departments, where so-called theories cannot be tested by the scientific method of conducting experiments whose predictions can be falsified.

Normally one would expect that such poor analysis would be critiqued and the authors embarrassed. But in this publication-rich environment, where the opportunity cost of reading boring papers is high, such papers go through to the keeper, with the media reporting false conclusions as if they were fact, especially when they align with the biases of the particular media organisation. The ABC, for example, breathlessly reports all sorts of so-called ‘research’ which confirms anthropogenic global warming. How tiresome. It is just impossible for sophisticated statisticians and econometricians to keep up with the mountain of crap.

Perhaps these computer packages are more of a curse than a blessing?

About Samuel J

Samuel J has an economics background and is a part-time consultant.

66 Responses to Computer-based statistical and econometric packages

  1. ChrisPer

    Andrew Leigh, call your office…

  2. Token

    A lot of fools are on their way to losing a lot of money for not understanding the risks of bad data analytics.

    Pity too much of it is public funds.

  3. Token

    Andrew Leigh, call your office…

    Zing

  4. Sinclair Davidson

    Oi – I’ve earned a good living out of correcting poor econometric analysis and interpretation over the last few years.

  5. .

    Unfortunately, in my personal observation, 99 per cent of these analyses are crap. The ‘researchers’ are not dispassionate and will do whatever it takes to reach their pre-determined conclusion. This bias is shielded behind seemingly sophisticated analytics which is simply punched out from a computer.

    This is not science, but it has taken over many university social science departments, where so-called theories cannot be tested by the scientific method of conducting experiments whose predictions can be falsified.

    The problem is that research is funded to support a conclusion. The expected style of writing may lead you to write a waffling conclusion and not hint at your suspicions, but if you set out to prove x and instead disprove x, proving that y and z are actually true (in general, not merely as a statistical test of null and alternatives), then you are considered a failure.

    This is not science. It’s not even a good scientific approach to economics or research in any other discipline.

    The level of poor form, general stupidity and lack of proper application of statistics in medicine and climatology is astounding.

    I am not an econometrician, but I know well enough to worry about specification error, self selection bias, the diagnostics mentioned above, time series issues and the correct regression method, power testing, the importance of economic (read: actual) significance…and so on.

    Most of the medical scientists, epidemiologists and climatologists are simply blind to these issues.

  6. .

    It’s not that bad, Gab. It can’t do time series too well; it can do choice modelling, but not all kinds of truncated/limited-variable stuff.

    It’s just clunky even though it looks slick.

  7. JC

    The ABC, for example, breathlessly reports all sorts of so-called ‘research’ which confirms anthropogenic global warming. How tiresome. It is just impossible for sophisticated statisticians and econometricians to keep up with the mountain of crap.

    On a related subject, just today I was listening to their ABC news (yeah, I ought to put it on the music station) interviewing some academic douchebag peddling the so-called success of plain packaging in reducing smoking. How did he reach this conclusion? Oh, by citing an amazing 80% increase in calls to the Quit line. That’s it.

    I’m not a statistician, but I do know that you couldn’t possibly reach the conclusion from that info unless you’re a propagandist, halfwit or both.

    The only way you could is to do some serious polling and compare it to estimated consumption in volume terms after the Roxonian packaging was introduced. Only then could you say there is or isn’t some sort of correlation, and even then it’s not 95%.

  8. Gab

    Clunky? I used to wear platforms that were less clunky. I also detest SAP, but that’s got nothing to do with statistical analysis.

  9. .

    I’m going to put it out there.

    I can’t freakin’ stand EViews.

    Quite frankly, I like Microfit! Now there’s a niche product that no one gives a crap about.

  10. Sinclair Davidson

    SAP!!! Is it still going? These days Stata is the package of choice – especially as it is now menu driven.

  11. Sinclair Davidson

    Microfit??? FFS. A time-series man like you should be doing his own programming in fortran. :)

  12. Gab

    Yep, there are still some saps out there that think SAP is the bees knees.

  13. sabrina

    Doesn’t this mean that the teaching of econometric analysis methods, and/or of the use of these packages, is not done properly?
    If you consider much of the published analysis to be crap, where does this leave the reviewers of those papers? Inept or lackadaisical?

  14. Sinclair Davidson

    Sabrina – in econometrics bullshit baffles brains.

  15. .

    you should be doing his own programming in fortran

    Good luck with that, pal.

    I was going to get a copy of GAUSS and then I was told… you need to programme that too.

    For God’s sake just get R if you want to code, it’s free. I don’t see the point of paying for a half-finished product which is meant to “boost” productivity.

    Microfit??? FFS

    Yes well.

  16. .

    Sabrina

    Econometrics should be relatively easy most of the time, almost “turnkey”.

    You are kind of right but I think the profession cleaned its act up at the beginning of the 1990s given a lot of professional complaint and the advent of time series work.

    You are pretty much right though – other than statisticians, most professionals or academics aren’t that good at stats unless they are economists.

    The editing of academic research is highly variable. Some is excellent; some is dumb and poorly informed; some is done by someone’s lackey and is wilfully ignorant, even deleterious to research, and downright rude. A lot is OK.

  17. Samuel J

    Fortran? What about assembly language or programming in binary?

  18. Steve of Glasshouse

    I still like GIGO. That’s from the days of pencilled-in programming cards in the mid 70s at school.

  19. Sinclair Davidson

    Punchcards! Those were the days.

  20. .

    This is quickly turning into a Four Yorkshiremen skit…

  21. dismissive

    Assembly is only for writing hardware interface drivers. Fortran is the bomb for stats.

  22. Gab

    R

    What a weird name. Sounds like a pirate.

  23. Andrew

    I think we’ve lost something. Long ago, researchers would painstakingly gather data, carefully formulate hypotheses and models, punch and run the cards through a computer, and estimate perhaps a single equation, once, even in a thesis. Even in my early days, we spent a lot of time understanding the concept of regression modelling through manual calculation before we ever ran one (on a mainframe, no less). Now, those with a minimal theoretical grounding can run hundreds of models, torturing the data until it provides their desired result. Students in many courses are expected, nay encouraged, to skip over large areas of theoretical knowledge. And SAS, Stata, EViews, etc. are as readily accessible to a new generation of researchers as Excel was in the 90s. Most of them probably understand as much of what’s really going on behind the scenes as the general population does with a microwave.

  24. ChrisPer

    Fortran? What about assembly language or programming in binary?

    The tool you have paid the introductory time wastage to learn is the cheapest tool to use. Fortran will show clearly whether you understand or black-box it – because the box ain’t black in Fortran.
    SAS and SPSS – pfft, Schrödinger’s cat would love ‘em.

    (Did those people asking if SAP still existed mean SAS?)

    Seriously, working in unfamiliar packages you are likely to deliver rubbish time and again.

  25. The Pugilist

    I’m with Martin. I was raised on EViews and have used SAS and Stata a little. I’ve started using R lately and now I wouldn’t use anything else. Incredibly versatile, easy to learn, and plenty of online support. The best part is, it’s open source and hence FREE…

  26. The Pugilist

    Having said that, I must say, Her Majesty’s Australian public service actually pays people to misuse/abuse these programs to torture datasets until they produce the desired results. I think a lot of public sector econometricians know what they’re doing is wrong, but hey, it puts organic food on the table and premium unleaded in the Lexus…

  27. Tel

    R

    What a weird name. Sounds like a pirate.

    And that’s the way we likes it!

    For God’s sake just get R if you want to code, it’s free. I don’t see the point of paying for a half-finished product which is meant to “boost” productivity.

    In R about 85% of what you want to do is already done by library functions, so the only “coding” you have to do is type the correct function name. If you can’t remember the name, it has online help with built-in search, and anyway there are heaps of examples on the web.

    If you want to do something unusual and clever you could choose to write some code (use FORTRAN if you feel like it, R is compatible with most languages), but very few jobs require that.
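    A rough Python analogue of that point (toy synthetic data; the R equivalent is a single lm() call): routine estimation is one library call, not real coding.

        # Rough analogue of the point above (synthetic data): routine
        # estimation is one library call, with diagnostics printed for free.
        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(0)
        x = rng.normal(size=50)
        y = 2.0 + 3.0 * x + rng.normal(size=50)

        model = sm.OLS(y, sm.add_constant(x)).fit()   # the entire "program"
        print(model.summary())                        # coefficients, t-stats, R², etc.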

  28. 2dogs

    Of course, Python is the way to go now, with all the stats add-ins it has.

    The ‘researchers’ are not dispassionate and will do whatever it takes to reach their pre-determined conclusion. This bias is shielded behind seemingly sophisticated analytics which is simply punched out from a computer.

    Generally, this can be picked by such things as low R² values or obvious but untested alternate hypotheses. I think the problem here is that, in applying the scientific method in recent times, there is a lack of emphasis on completeness. E.g. a low R² means go and look for more explanatory variables. Don’t just test your model against the null hypothesis; test it against every reasonable alternate model.
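    A minimal sketch of that last point (entirely made-up data and specifications): fit the reasonable candidate models and compare them on an information criterion, rather than only testing one model against the null.

        # Sketch (synthetic data): compare alternate specifications on AIC
        # instead of only testing a single model against the null hypothesis.
        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(1)
        n = 200
        x1, x2 = rng.normal(size=n), rng.normal(size=n)
        y = 1.0 + 2.0 * x1 + 0.5 * x2**2 + rng.normal(size=n)  # truth is nonlinear in x2

        candidates = {
            "x1 only":        np.column_stack([x1]),
            "x1 + x2":        np.column_stack([x1, x2]),
            "x1 + x2 + x2^2": np.column_stack([x1, x2, x2**2]),
        }
        for name, X in candidates.items():
            fit = sm.OLS(y, sm.add_constant(X)).fit()
            print(f"{name:16s}  R2={fit.rsquared:.3f}  AIC={fit.aic:.1f}")
        # The specification nearest the data-generating process wins on AIC --
        # something a lone test against the null would never reveal.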

  29. .

    Good call. I’ve done so, but mostly on model specification.

  30. dd

    Perhaps these computer packages are more of a curse than a blessing?

    They’re tools, so on balance, and in the long run, they will be a blessing.
    Right now they are new and can therefore be used to impress and baffle. They can be used badly or used well.

    The main problem is the faith that’s put into anything that comes out of them. That’s the result of perceptions not keeping up with reality.

  31. I am the Walrus, koo koo k'choo

    Shazam tried to look interesting by changing the colour of the screen while it ran a programme. Didn’t work.

    I remember Microfit. Menu driven – heaven sent for a computer illiterate undergrad!

    Mathematica was pretty good for more advanced statistical work.

    Just maximise the R-squared and make sure each parameter is statistically significant, and you’ll be right ;-)

  32. dismissive

    Not really new. We were using SAS in the 80s.

  33. Sinclair Davidson

    Yes sorry SAS. I’ve had to use SAP for other purposes. When the revolution comes we’re going to torture the SAP people on television as a warning to others.

  34. squawkbox

    Ah, good old Systems Against People…

  35. Leigh Lowe

    The skill in using these economic forecasting tools is judging exactly when to throw the silver ball onto the spinning wheel.

  36. entropy

    Also bloody expensive (SAP I mean)

  37. Steve of Glasshouse

    Sinc… let’s not forget reverse Polish notation on the first calculators…

  38. Noddy

    Reality is a hard taskmaster.
    If you chaps are so smart design a financial system without inflation and escalating debt.
    You would be doing mankind an exceptional service.

  39. .

    Noddy
    #1151101, posted on January 14, 2014 at 10:15 pm

    Reality is a hard taskmaster.
    If you chaps are so smart design a financial system without inflation and escalating debt.
    You would be doing mankind an exceptional service.

    Free banking tends towards zero inflation. So do the Swedes, even with a central bank, fiat money and fractional reserve banking.

    Whether debt escalates is neither here nor there. What matters is default and what can happen.

    If there are zero defaults, it doesn’t matter how much debt you have.

    With no debt, most people could never buy a home. (The lack of affordability is because of taxes – not debt pushing up values.)

    Now that you are informed with insightful answers, what is your opinion of LIMDEP or MAPLE?

  40. entropy

    Reality is a hard taskmaster.
    If you chaps are so smart design a financial system without inflation and escalating debt.
    You would be doing mankind an exceptional service.

    Quite so. If we had a system like that, all the proles would be kept in their place and not bother their betters.

  41. JohnA

    Gab #1150923, posted on January 14, 2014 at 8:47 pm

    I also detest SAP, but that’s got nothing to do with statistical analysis.

    Ruddigore (Act 2): That’s nothing – everybody does that: it’s expected of you!

  42. Sinclair Davidson

    I made a lot of money out of RPN when I was a student. At that time all new stockbrokers were given an HP (B10, I think it was) when they joined the market, and then they had to pass an exam on basic valuation. None of them could use the calculators, so I made money tutoring brokers in how to use them. More money than sense, they had.

  43. I ring up one of our stats people – they’ve forgotten more than I’ll ever know.

    The new model-testing R packages are mind-boggling; they will do in a few minutes what would have taken an entire PhD in my day. Julia is claimed to run even faster. Both are as useless as pen and paper if the data is crap.

    Run a mile when one of them insists on the word ‘projection’ instead of ‘prediction’.

  44. JohnA

    entropy #1151082, posted on January 14, 2014 at 10:05 pm

    Also bloody expensive (SAP I mean)

    Naturally – for two reasons:
    a) the package is built to do everything (it is therefore perfect – a proud example of the skills of German engineering)
    b) any incompatibility must therefore be a business structure problem.
    Therefore SAP people say “SAP does everything. So if there is a problem we must re-engineer your business to work with SAP.”

    I believe the Business Re-Engineering Division of SAP is the largest source of revenue and profit. :-)

  45. Gab

    “SAP does everything. So if there is a problem we must re-engineer your business to work with SAP.”

    Ain’t that the truth! In fact, I think it’s the first thing new recruits are taught on induction at SAP.

  46. .

    Has anyone here used LISREL, PC Give or PC Give’s competitor (can’t remember)?

    Can some smart person here give a rundown on what R can’t do, without coding, versus all of the packages mentioned here or commonly used?

    I promise to get better at coding. I’ve had to do a little, but Christ, that was tedious.

  47. Fess

    87% of statistics are made up on the spot to win an argument.
    An old joke, but it works on several levels.

  48. incoherent rambler

    Computer hardware and software is a tool (I have been saying this for 40 years). It is to pen and paper what the backhoe is to the shovel.
    In the hands of an idiot, a good backhoe can do a lot of damage in a short space of time.

  49. Phillipa

    You know what? Excel is a very powerful tool, and it’s underrated.

  50. Blogstrop

    True, Phillipa. We used to run our small business accounts on excel.

  51. Blogstrop

    From now on I’m looking out for ways to use the word heteroskedasticity. It could, for example, be the root cause of loopy leftism.

  52. Yes, 27 years later Excel can still surprise. Although the latest versions broke the charts.

  53. Andrew

    Apparently some dickhead economist once calculated that if you slap a 40% tax on mining on top of income taxes, nothing would change except you collect infinite revenue. The companies would be indifferent to paying it. And you can set the hurdle as low as 3% ROE.

  54. Cool Head

    Climate change and related modelling has been interesting to watch – totally corrupt. My company indirectly models the effects of climate on insurance portfolios, mainly in the US but also in Europe. We do a few things, but mainly we model the effects of weather as it relates to property damage. For the modellers reading this: we always do validation, i.e. construct a model, then apply the results of the model to data that was not used in the modelling process. Once we make models, insurers buy them and bet their companies on them. Currently our models are used in over 100 companies, and these insurers have bet several hundred billion dollars on the accuracy of these models.
    The models are built using historical data, with the past being a very good indicator of the future. There is NO AGW signal detectable in our models. Simple counts of cyclones, hurricanes, etc. confirm that things are not getting worse in terms of extreme weather events. The best quote I ever got from an actuary discussing AGW was “don’t throw away your claims experience just yet”, i.e. no AGW signal in their data (like ours).
    Making models that people have to bet the future of their company on, versus making models to publish in journals, are two totally different activities. One has to face reality; the other addresses a political and funding need.

  55. Tel

    Cool Head, your data is probably missing the necessary adjustments. Hire some Climate Scientists, they can fix it up for you.

  56. Ilan

    In my first months out of uni as an engineer, my lead told me that all graduates should be banned from using computer analysis programs until they understood what they were doing.

    “Crap in = Crap out”

    Anyone can get the answer they want when they don’t understand the data and tools they use. Fortunately we have a checking process that seems to be lacking in the field of “Climate-Science”.

  57. .

    How many years do you use to validate the model?

    I’ve seen Govt. departments use one year to validate a cross-sectional model (albeit built from trends observed or extracted through time series regression).

  58. Token

    “Crap in = Crap out”

    The little time most project people spend on data means they do not have the capability to know they are handling crap, let alone to notice they are feeding it into their over-complex models… though the smell should’ve given them a hint.

  59. PEB

    I have to agree with Samuel J. The vast majority of models are complete crap. It doesn’t matter what tool you use, or whether you write Fortran; no amount of statistical inference on the training data is going to be effective. Not R² (any of the various R² measures), AIC, BIC, the various deviances (squared error, chi-squared error, or asymmetric ones like Poisson, gamma and the like), or the various residuals. It always looks great on the training data. It is only when you keep hold-out data, carefully chosen, that you get to see if the model is any good.
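    A toy sketch of that point (entirely synthetic data): in-sample fit improves mechanically as the model becomes more flexible, so only the held-out data can expose a bad model.

        # Toy sketch (synthetic data): training error falls as model
        # complexity rises, while error on held-out data eventually worsens.
        import numpy as np

        rng = np.random.default_rng(7)
        x = rng.uniform(-1, 1, size=60)
        y = np.sin(3 * x) + rng.normal(scale=0.3, size=60)

        train, test = np.arange(40), np.arange(40, 60)   # simple hold-out split
        for degree in (1, 3, 6, 12):
            coefs = np.polyfit(x[train], y[train], degree)
            mse = lambda i: np.mean((np.polyval(coefs, x[i]) - y[i]) ** 2)
            print(f"degree {degree:2d}: train MSE={mse(train):.3f}  test MSE={mse(test):.3f}")
        # In-sample fit always "looks great" with enough flexibility; the
        # hold-out error is what tells you whether the model is any good.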

  60. Peter S

    What a fascinating discussion!

    Statistics packages will produce crap in the hands of those who do not understand the statistical theory they purport to use. The best crap producers are social scientists.

  61. Louis Hissink

    The climate people have now discovered “kriging” and use it to fill in missing data at the North Pole. Their problem is that they have actually calculated the global temperature of a spherical plane defined by latitude and longitude – in other words, the temperature of a 2D object, which is physically complete nonsense (but I repeat myself).

    Except I don’t know whether it is wilful ignorance or simply plain garden-variety ignorance. Stupidity also needs to be factored in.
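    As a rough illustration of the technique (toy one-dimensional data; ordinary kriging is closely related to Gaussian-process regression, which stands in for it here): predict unobserved locations from sparse observations, with an uncertainty attached.

        # Sketch (synthetic 1-D data): kriging-style interpolation via a
        # Gaussian process -- estimate values at unobserved locations from
        # sparse "station" readings, plus an uncertainty for each estimate.
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel

        rng = np.random.default_rng(3)
        stations = rng.uniform(0, 10, size=(15, 1))      # sparse observation sites
        temps = np.sin(stations).ravel() + rng.normal(scale=0.1, size=15)

        kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
        gp = GaussianProcessRegressor(kernel=kernel).fit(stations, temps)

        grid = np.linspace(0, 10, 5).reshape(-1, 1)      # unobserved locations
        mean, std = gp.predict(grid, return_std=True)
        for g, m, s in zip(grid.ravel(), mean, std):
            print(f"x={g:4.1f}: estimate {m:+.2f} +/- {s:.2f}")
        # The interpolation reports uncertainty, but it cannot conjure
        # information where there are no observations.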

  62. .

    Please explain Kriging in more detail.

Comments are closed.