RE: st: Adventures in electoral forecasting (was Stata vs SPSS)

From	"Clive Nicholas" <[email protected]>
To	[email protected]
Subject	RE: st: Adventures in electoral forecasting (was Stata vs SPSS)
Date	Mon, 16 Oct 2006 23:50:20 +0100 (BST)

Nick Cox wrote:

> Surprisingly or not, I believe that
> the attitude this person was expressing
> is defensible, although I would naturally
> express it differently.
>
> It does nobody in statistical science any
> good long term if the possibilities of prediction
> from time series models are oversold.
>
> For example, I guess that a standard marketing
> question is predicting the likely sales of a new product.
>
> In such cases, there is typically little (relevant) data,
> a poor understanding of the underlying principles,
> and very high sensitivity to future unknowns.
>
> I doubt that the smartest analysts with the smartest
> models can do much more than make a dent on an
> intrinsically hard problem.

Fair points, all of them. Like you, I couldn't criticise her for _not_
using formal statistical methods to make market predictions; rather, it was
what I perceived to be her dismissive refusal to accept the possibility
that such methods _might_ improve how market analysts analyse, and the
predictions they make. Of course, any analyst in any field faces, as a
certain Defense Secretary put it, "known unknowns" and "unknown unknowns"
and they plague any such exercise. But that doesn't invalidate the value
of using statistical techniques in the aid of making (possibly!) better
forecasts, _so long as the analyst is explicit about the limitations of
the methods, their data and their theories_.

> After all, Clive, what is your prediction for the results
> of the next British election and how far is that
> based on time series modelling?

I'm very glad you asked! The short answer is that it depends upon what
data you use (survey estimates or constituency-level votes?) and precisely
what question it is you're asking (are you talking about the outcome in
seats, votes or both?).

The long answer starts now. At the (very high) risk of losing most
Statalisters at this point (goodbye!), much of it is based on modelling
and some of it isn't. I'm currently fitting some OLS and ARIMA time-series
models using Gallup and YouGov monthly opinion polls. My data begins in
January 1946 and ends at the British general election of May 2005, but the
series has gaps throughout (however, N=639). The response variable
is voting intention (% saying they'd vote for party X). I've managed to
build a very comprehensive model, as there are plenty of "known knowns"
denoting the past: lagged shares; time-trends; changes of party leader and
Prime Minister; months when Britain is at war; macroeconomic measures;
and a battery of variables indicating by-elections won and
lost by government and opposition parties (my main interest).
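One practical wrinkle with lagged shares in a monthly series that has gaps: the lag should be the previous calendar month, not simply the previous row. A minimal sketch of that idea in Python, assuming the polls are held as a mapping from (year, month) to a share; the function names and the three poll values are made up purely for illustration and are not taken from the dataset:

```python
from typing import Dict, Optional, Tuple

Month = Tuple[int, int]  # (year, month), e.g. (1946, 1) for January 1946


def previous_month(m: Month) -> Month:
    """Return the calendar month immediately before m."""
    year, month = m
    return (year - 1, 12) if month == 1 else (year, month - 1)


def lagged_share(series: Dict[Month, float], m: Month) -> Optional[float]:
    """Share observed in the calendar month before m, or None if that month is a gap."""
    return series.get(previous_month(m))


# Hypothetical values, for illustration only: March 1946 is missing.
polls: Dict[Month, float] = {(1946, 1): 40.0, (1946, 2): 41.5, (1946, 4): 39.0}

# Each row pairs a month's share with its one-month lag (None where the lag falls in a gap).
rows = [(m, share, lagged_share(polls, m)) for m, share in sorted(polls.items())]
```

Here April 1946 gets no lag because March is absent, so that observation would drop out of any model that includes the lagged share rather than silently borrowing February's value.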

Leaving aside the fact that no analyst could sensibly deal with "unknown
unknowns" - other than to consign them to the error term and pray that
none of the RHS variables are correlated with it - the trouble with such
models, of course, is that they cannot factor in "known unknowns" (or are
they "unknown knowns"?), such as future events that will exert positive
and/or negative effects upon voting intentions, which we're pretty certain
will happen _before_ the next election. For these, we don't know when
they'll happen or how big their impact will be until they do. One
such 'event' right now on the British political scene - with the arrival
of David Cameron as the new leader of the Conservatives and the certainty
of a new Labour PM arriving a year from now - is the growing belief that,
with a Labour government perceived to be fading, failing and running out
of steam against a slowly revitalising Tory party, the next general
election will be very, very close: so close, in fact, that there may be a
hung parliament. If this happens, it will fall to the Liberal Democrats to
decide which of the two bigger parties will be their coalition partner.

But there are too many "known unknowns" here to factor into my present
model. Who is going to be Labour's next leader? (Gordon Brown? John Reid?
Jade Goody?) Will Cameron continue to improve or will he plateau? (Indeed,
has he and have they already?) Would voting intentions change if a hung
parliament were imminent? (Hard to answer if the question isn't asked.)
Just occasionally, wetting your finger and checking for wind direction can
be appealing even to political econometricians.

Anyway, you wanted a prediction. Well, fitting several OLS models with
Newey-West standard errors (some 'net' - e.g., Con minus Lab; some
'gross' - e.g., % 'voting' Lib Dem) to make a combined prediction, and
making rather large assumptions about the values of certain key variables
- viz.: that the current term will be as long as the last (47 months);
that it will witness the same number of by-election wins for Labour (3)
and the opposition (2); that it will see a change in PM (which is now
certain); and that it will witness the same average annual rise in
inflation as last time (2.41%) - whilst keeping the rest at their means,
I obtain the following scores on those different coloured doors:

Conservatives - 45; Labour - 33; Liberal Democrats - 18 (nobody cares very
much about the Nationalists on a UK scale; I certainly don't).
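For anyone unfamiliar with the "set a few variables, hold the rest at their means" step, it amounts to evaluating the fitted linear equation at a chosen scenario. A rough sketch in Python follows; the scenario values are the assumptions stated above, but the variable names, the extra `unemployment` regressor, and every coefficient, mean and intercept are placeholders invented for illustration, not estimates from the actual models:

```python
from typing import Dict


def scenario_prediction(
    coefs: Dict[str, float],
    means: Dict[str, float],
    scenario: Dict[str, float],
    intercept: float = 0.0,
) -> float:
    """Evaluate a fitted linear model with scenario values overriding sample means."""
    values = {**means, **scenario}  # scenario entries take precedence over means
    missing = set(coefs) - set(values)
    if missing:
        raise KeyError(f"no value supplied for: {sorted(missing)}")
    return intercept + sum(coefs[name] * values[name] for name in coefs)


# Scenario assumptions as stated in the text.
scenario = {
    "term_months": 47.0,
    "lab_byelection_wins": 3.0,
    "opp_byelection_wins": 2.0,
    "pm_change": 1.0,
    "inflation": 2.41,
}

# PLACEHOLDERS ONLY: hypothetical coefficients and means, not fitted results.
coefs = {
    "term_months": 0.05,
    "lab_byelection_wins": -0.5,
    "opp_byelection_wins": 0.5,
    "pm_change": 1.0,
    "inflation": 0.2,
    "unemployment": 0.1,
}
means = {"unemployment": 6.0}

net_lead = scenario_prediction(coefs, means, scenario, intercept=0.0)
```

Each 'net' or 'gross' model would get its own call with its own coefficients; how the separate predictions are then combined is not specified here, so the sketch stops at a single equation.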

If I'm right, there won't be a hung parliament, and the Tories would have
a sizeable House of Commons majority (though _not_ as big as the one
Labour enjoyed when they trounced the Tories by as big a margin in 1997,
due to the exotic idiosyncrasies of the single-member plurality-rule
voting system, the way it interacts with Britain's electoral geography,
and the success with which Labour has played the system to its great
advantage by ruthlessly targeting marginal seats it did and didn't hold).
I'm sure a much more celebrated election forecaster such as Michael
Lewis-Beck at Iowa would make a much better stab at this than me. But you
get the drift.

The full results, and the (reshaped) dataset that went with them, have
just been released back to its originator, Pippa Norris of Harvard, and to
Harold Clarke of the University of Texas at Dallas. They are also
available to anyone else on request.

CLIVE NICHOLAS        |t: 0(044)7903 397793
Politics              |e: [email protected]
Newcastle University  |http://www.ncl.ac.uk/geps

Wherever you go and whatever you do, just remember this. No matter how
many like you, admire you, love you or adore you, the number of people
turning up to your funeral will be largely determined by local weather
conditions.
