
st: Imputation vs. Reducing Dataset?

From   John Simpson <>
Subject   st: Imputation vs. Reducing Dataset?
Date   Mon, 13 Jul 2009 11:53:49 -0600

Hello Statarians,

I have a very large dataset of population counts generated by a computer simulation. To speed processing, populations that grew beyond 15,000 within the 100-generation limit were pulled from the simulation. As a result, numerous populations now have missing data, making my panels unbalanced.

I am curious how best to fit a model to these data given what is missing. In particular, I have two worries:

1. That unless I do something, the missing values will cause any procedure to misrepresent the actual situation, since the smaller values that remain toward the end of the time period will skew the mean. I am also curious whether this is a problem for populations that died off early (do I need to carry the 0 through all the remaining generations?).
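To make the second half of this worry concrete, here is a minimal sketch of carrying a 0 forward once a population has died off. The variable names (popid, generation, count) are hypothetical stand-ins for my actual data, and this assumes the panel has first been balanced so that the post-extinction rows exist:

```stata
* Balance the panel so every population has all 100 generations
tsset popid generation
tsfill, full

* Within each population, carry a 0 forward: -replace- works row by
* row within the by-group, so the zero cascades to the end of the panel
bysort popid (generation): replace count = 0 ///
    if missing(count) & count[_n-1] == 0
```

Note this only fills in the extinct populations; the populations pulled for exceeding 15,000 would still be missing afterward, which is the subject of worry 2.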

2. I am unsure whether imputation (with -ice-?), chopping the dataset, or both is the best way to proceed. I know that -ice- needs values that are missing at random, but is there some way to impute the missing values given that I know how they are structured?
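For reference, the sort of call I had in mind is along these lines (a sketch only; covar1 and covar2 are hypothetical covariates, and I may well have the options wrong):

```stata
* Multiple imputation of the missing counts via -ice- (Royston's
* user-written command), producing m = 5 imputed datasets
ice count covar1 covar2 using imputed, m(5)
```

My concern is that because the missingness here is triggered by the count crossing 15,000, the values are missing precisely because they are large, which seems to violate the missing-at-random assumption that -ice- relies on.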

Thank you.


John Simpson
Department of Philosophy
University of Alberta
