
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Interpolation [was: Re: st: From: Sadia Khalid ...]


From   Nick Cox <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Interpolation [was: Re: st: From: Sadia Khalid ...]
Date   Thu, 9 Jan 2014 18:22:56 +0000

[I changed the thread title. Sadia: Giving a sensible title to your
posts is one of several things you should please note.]

I have some comments on _interpolation_ (usual word). Arguably,
interpolation (wide sense) means predicting values from neighbouring
values, while interpolation (narrow sense) means doing that _within
the range of the data_ and extrapolation means doing that beyond the
range of the data. This range could be in time, space or with
reference to any other coordinates.

Interpolation has a centuries-old history but has been over-shadowed
in recent years within statistical science by much more elaborate
techniques of imputation, even when interpolation offers a simpler,
but different, solution to the same problem. (In general,
interpolation and imputation are not identical problems, however.)

This can be seen in specific terms: official Stata offers only linear
interpolation through -ipolate-, although other techniques exist and
some are available in Stata as user-written commands (e.g. -cipolate-,
-csipolate-, -pchipolate-, -nnipolate- from SSC).
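
As a minimal sketch of what linear interpolation does (in Python, purely for illustration; this is not the code behind -ipolate-, and the function name and the example data are invented), the version below refuses to go beyond the range of the data. That matches the default behaviour of -ipolate-, which extrapolates only if the -epolate- option is specified.

```python
from bisect import bisect_left

def linear_interpolate(xs, ys, x_new):
    """Linearly interpolate at x_new from known points (xs, ys).

    xs must be strictly increasing. Returns None outside the range of
    xs, so nothing is ever extrapolated.
    """
    if x_new < xs[0] or x_new > xs[-1]:
        return None  # beyond the data: extrapolation, deliberately refused
    i = bisect_left(xs, x_new)
    if xs[i] == x_new:
        return ys[i]  # a known value is returned unchanged
    # x_new lies strictly between xs[i - 1] and xs[i]
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x_new - x0) / (x1 - x0)

# Hypothetical data: values known for three years only.
years = [2000, 2002, 2005]
values = [10.0, 14.0, 20.0]
filled = {yr: linear_interpolate(years, values, yr) for yr in range(1999, 2007)}
```

Here `filled` has guesses for 2001, 2003 and 2004, the known values for 2000, 2002 and 2005, and `None` for 1999 and 2006, which lie outside the data.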

Statistical people are often sceptical about or even hostile to
interpolation, perhaps on the following grounds.

1. As David Hoaglin emphasised, it is easy (for naive users) to forget
that you aren't really producing new and valid data, or even replacing
old and invalid data, except with guesses. So, it is cautions all the
way up, including not fooling yourself about how much to believe
(e.g.) model goodness of fit or significance levels obtained from the
data, including interpolations.

2. Interpolation inevitably understates variability to the extent that
the real but unknown series will usually be rougher than the
interpolated series. Statistical people in various fields are often
trained to regard large variance of unknown magnitude as what you have
to accept but large bias of unknown magnitude as the work of the
devil.

3. Interpolation has poor or undefined statistical properties insofar
as the simplest techniques offer no way of assessing associated error
and/or are based on naive or poorly defined ideas of generating
processes. (There are conversely exceptions at more advanced levels,
e.g. within spatial statistics.)

4. Through some tribal or traditional division of labour,
interpolation is often taught (very briefly or briskly) under some
heading such as numerical analysis or mathematical methods for
scientists or engineers. It is often regarded as too trivial or
elementary to be worth much attention even there, and is often omitted
from statistical teaching. Here's a challenge: identify a book or
course on statistical or data analysis that includes serious coverage
of interpolation. There are some, but not I think many.
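
Point 2 above can be made concrete with a toy example (Python, with an invented series and an arbitrary choice of roughness measure, the sum of squared second differences). It is a deliberately extreme case, not evidence about any real data: every second value is treated as missing and filled in linearly.

```python
def roughness(series):
    """Sum of squared second differences: 0 for a straight line."""
    return sum((a - 2 * b + c) ** 2
               for a, b, c in zip(series, series[1:], series[2:]))

# Toy "true" series, invented for illustration.
true_series = [0.0, 3.0, 1.0, 4.0, 2.0, 5.0, 3.0, 6.0, 4.0]

# Pretend every second value was never observed, then fill each gap
# with the midpoint of its neighbours (linear interpolation on an
# equally spaced grid).
filled = list(true_series)
for i in range(1, len(filled) - 1, 2):
    filled[i] = (true_series[i - 1] + true_series[i + 1]) / 2

rough_true = roughness(true_series)
rough_filled = roughness(filled)
```

In this example the filled series happens to be an exact straight line, so `rough_filled` is 0 while `rough_true` is 175: all the zigzag in the hidden values has been smoothed away.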

Nevertheless, interpolation remains central to what we do with data.
It is perhaps worth emphasising that much graphical interpretation
depends on mental interpolation, for example.

Here are some things that could be done, but according to my patchy
reading are often not done:

A. Keeping things graphical. A graph of data and interpolation is
essential to keep track of whether results are plausible or
trustworthy.

B. Test an interpolation method by assessing its ability to reproduce
_known_ data.

C. Use two or more interpolation methods to see how far they (dis)agree.

D. Be especially cautious about extrapolation.
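
Suggestions B and C can be combined in a few lines. The sketch below (Python, with invented data and hypothetical function names) hides each interior known value in turn, predicts it from the remaining values by two methods, and compares the errors. The end points are left alone, as predicting them would be extrapolation (D).

```python
def linear(xs, ys, x):
    """Linear interpolation between the two known points bracketing x."""
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the range of the data")

def nearest(xs, ys, x):
    """Nearest-neighbour interpolation: copy the value at the closest x.
    Ties go to the earlier point, an arbitrary choice made here."""
    return min(zip(xs, ys), key=lambda p: abs(p[0] - x))[1]

def holdout_errors(xs, ys, method):
    """Hide each interior point in turn, predict it from the rest, and
    return the absolute errors."""
    errors = []
    for i in range(1, len(xs) - 1):  # end points would need extrapolation
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        errors.append(abs(method(rest_x, rest_y, xs[i]) - ys[i]))
    return errors

# Invented data: a roughly linear series with small wobbles.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]

# Mean absolute error of each method on the hidden values.
mae = {}
for name, method in [("linear", linear), ("nearest", nearest)]:
    errs = holdout_errors(xs, ys, method)
    mae[name] = sum(errs) / len(errs)
```

For these made-up data `mae["linear"]` is much smaller than `mae["nearest"]`, which is no surprise for a nearly linear series. With real data the comparison could go either way, which is the reason for making it.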

Finally, most interpolation methods are easy to understand and
implement. Thus they are quick to apply. Even when they don't work
well, you have acquired some suitable caution about your data.

Nick
[email protected]


On 9 January 2014 16:04,  <[email protected]> wrote:
> Thanks
>
> @ David Hoaglin
>
>
> for your valuable comments.
>
> Actually I am not filling (extrapolating) all the missing values. I am
> only extrapolating the missing values if there are five or fewer
> missing values.
>
> If I do not extrapolate the data, the number of observations will be smaller.
>
> What are your comments on intrapolation? Are people comfortable with
> intrapolation (filling in the in-between missing values)?
>