st: Imputation vs. Reducing Dataset?
Hello Statarians,
I have a very large dataset of population counts generated by a
computer simulation.  To speed up processing, populations that grew
beyond 15000 within the 100-generation limit were pulled from the
simulation.  As a result, numerous populations now have missing data,
which makes my panels unbalanced.
I am curious how best to fit a model to these data given what is
missing.  In particular, I have two worries:
1. That unless I do something, the missing values will cause any
procedure to misrepresent the actual situation, because the smaller
values that remain towards the end of the time period will skew the
mean downwards.  I am also curious whether this is a problem for
populations that died off early (do I need to carry the 0 through all
the remaining generations?).
2. I am unsure whether imputation (with ice?), chopping the dataset,
or both is the best way to proceed.  I know that ice needs data that
are missing at random, but is there some way to impute the missing
values if I know how the missingness is structured?
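To make the first worry concrete, here is a toy sketch in Python
rather than Stata.  Every number in it is invented for illustration,
and it assumes that a population's count is still recorded in the
generation where it crosses the cap and is missing afterwards; the
real simulation may differ on that point.

```python
# Toy illustration only: these trajectories are made up, not simulation output.
CAP = 15000  # populations that grow beyond this size are pulled

# Three hypothetical populations observed over five generations.
trajectories = {
    "fast":    [100, 1000, 8000, 20000, 50000],  # crosses CAP in generation 4
    "slow":    [100, 200, 400, 800, 1600],
    "extinct": [100, 50, 0, 0, 0],               # zeros carried forward
}


def censor(series, cap=CAP):
    """Mark every value after the first one above cap as missing (None).

    Assumption: the crossing value itself is still recorded.
    """
    out, pulled = [], False
    for value in series:
        out.append(None if pulled else value)
        if value > cap:
            pulled = True
    return out


def mean_by_generation(panel):
    """Mean over the non-missing populations in each generation."""
    means = []
    for values in zip(*panel):
        observed = [v for v in values if v is not None]
        means.append(sum(observed) / len(observed))
    return means


full = list(trajectories.values())
censored = [censor(series) for series in full]

true_means = mean_by_generation(full)
observed_means = mean_by_generation(censored)

# Same censored panel, but with the extinct population's zeros treated
# as missing instead of being carried through the remaining generations.
no_zeros = [
    censor(trajectories["fast"]),
    trajectories["slow"],
    [100, 50, None, None, None],
]
means_without_zeros = mean_by_generation(no_zeros)

for gen, (t, o, z) in enumerate(
    zip(true_means, observed_means, means_without_zeros), start=1
):
    print(f"generation {gen}: full = {t:.1f}, censored = {o:.1f}, "
          f"censored without zeros = {z:.1f}")
```

In this toy panel the generation-5 mean is 17200 with the full data
but 800 once the fast population is pulled, and it becomes 1600 if
the extinct population's zeros are dropped as well.  So both choices
move the mean, in opposite directions.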
Thank you.
-John
John Simpson
Department of Philosophy
University of Alberta
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/