



RE: st: impute missing values


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: impute missing values
Date   Thu, 6 May 2010 17:35:40 +0100

It is difficult to advise here because this kind of problem usually has
space and time dimensions in addition to all the usual messiness. Thus
one possible imputation is just to interpolate linearly (or using cubic
splines, or whatever) within a station's own history. 
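
As a rough illustration of interpolating within one station's own
history, here is a minimal Python sketch. It assumes evenly spaced daily
observations with gaps recorded as None; the function name and the
sample values are hypothetical, not taken from the original data. In
Stata itself, -ipolate- performs linear interpolation of this kind.

```python
from typing import List, Optional


def interpolate_gaps(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior gaps (None) by linear interpolation between the
    nearest observed neighbours. Leading and trailing gaps stay missing,
    because there is nothing on one side to interpolate from."""
    filled = list(values)
    # Positions of the days that were actually observed.
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        # Constant daily change between two consecutive observed days.
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


# Hypothetical daily temperatures for a single station.
daily_temp = [None, 10.0, None, None, 16.0, 15.0, None]
print(interpolate_gaps(daily_temp))
# [None, 10.0, 12.0, 14.0, 16.0, 15.0, None]
```

Note that this uses no information from other stations, so a long gap
is simply bridged by a straight line.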

The trade-off between using contemporaneous (or lagged!) data from other
stations, time series data for each station, or both is not easy to
judge and will depend on the kind of climate, the spacing of the
stations, etc. Nor is it even self-evident that the closest station is
the best match. That could often be wildly wrong.

Even worse: the kind of imputation that is physically sensible will
depend on the climatic element. Thus to a fair first approximation
temperature differs additively between stations but rainfalls differ
multiplicatively. 
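
The additive versus multiplicative distinction can be sketched in
Python as follows. This is only one simple way to act on it, under
stated assumptions: a single neighbouring station, a constant offset
(temperature) or constant ratio (rainfall) estimated from the days on
which both stations report, and gaps recorded as None. All names and
sample values are hypothetical.

```python
from typing import List, Optional, Tuple

Series = List[Optional[float]]


def _overlap(target: Series, neighbour: Series) -> List[Tuple[float, float]]:
    """Pairs of values for days on which both stations reported."""
    return [(t, n) for t, n in zip(target, neighbour)
            if t is not None and n is not None]


def impute_additive(target: Series, neighbour: Series) -> Series:
    """Temperature-style fill: neighbour value plus the mean difference."""
    pairs = _overlap(target, neighbour)
    if not pairs:
        return list(target)  # no common days, nothing to estimate from
    offset = sum(t - n for t, n in pairs) / len(pairs)
    return [t if t is not None else (n + offset if n is not None else None)
            for t, n in zip(target, neighbour)]


def impute_multiplicative(target: Series, neighbour: Series) -> Series:
    """Rainfall-style fill: neighbour value times the ratio of totals."""
    pairs = _overlap(target, neighbour)
    neighbour_total = sum(n for _, n in pairs)
    if not pairs or neighbour_total == 0:
        return list(target)  # ratio undefined, leave gaps as they are
    ratio = sum(t for t, _ in pairs) / neighbour_total
    return [t if t is not None else (n * ratio if n is not None else None)
            for t, n in zip(target, neighbour)]


# Hypothetical temperatures: station A runs 2 degrees warmer than B.
print(impute_additive([12.0, None, 14.0, None], [10.0, 11.0, 12.0, None]))
# [12.0, 13.0, 14.0, None]

# Hypothetical rainfall: station A receives twice as much as B.
print(impute_multiplicative([4.0, None, 0.0, 6.0], [2.0, 5.0, 0.0, 3.0]))
# [4.0, 10.0, 0.0, 6.0]
```

Days that are missing at both stations remain missing under either
rule, and a constant offset or ratio is itself a strong assumption.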

On this and other grounds, it would be risky to proceed without
consultation with a meteorologist or climatologist who knows your
region. 

Nick 
n.j.cox@durham.ac.uk 

Federico Belotti

Have you tried -mi-? It is a structured suite of commands for multiple
imputation, available from Stata 11 (-impute- is a simple, limited, and
older command).

Anyway, I need some details before trying to code a solution for your
problem. For instance, how many variables have missing values at each
station? How many observations are completely missing? Do all the
datasets have the same pattern of missing values?

On 6 May, at 01:47, Giancarlo Musto wrote:

> I have daily weather data for several private weather stations near a
> zone of interest. I have some gaps in the data and sometimes these
> gaps are overlapping. To be precise, there are some days in which I
> have missing values for all the weather stations. I would like to
> consider the closest weather station and impute the missing values on
> the basis of the other stations. I tried to use the command "impute",
> but in this way I still have some missing values. Do you know how I
> could solve this problem?
