Statalist



Re: re: st: Imputation vs. Reducing Dataset?


From   John Simpson <john.simpson@ualberta.ca>
To   statalist@hsphsun2.harvard.edu
Subject   Re: re: st: Imputation vs. Reducing Dataset?
Date   Mon, 13 Jul 2009 17:53:28 -0600

Hi David,

That's a good point. I'm not sure if I can use a censored or truncated model though because I only know some of the Xs and maybe that's not enough.

There are two categories of Xs. The ones I know are the static environmental variables (things like how much it costs an agent to survive into the next round or to breed). The ones I don't know are the dynamic environmental variables, such as the distributions of the various behaviours that the agents can display. These change over time and so are also Ys when considered on their own.

Ideally I'd like to build a model (or set of models) that can account for both population size and the distribution of behaviours across the population, even though these are not observed after a population reaches 15000 members.

Is it possible to get away with using a censored or truncated model in this case without biasing the model towards the non-censored cases? My worry is that by censoring I'll lose as much as 30% of my panels/observations.

-John



From: David Airey <david.airey@Vanderbilt.Edu>
Subject: re: st: Imputation vs. Reducing Dataset?
Date: Mon, 13 Jul 2009 14:34:01 -0500



Should you also consider a censored model, because you know your Xs
but not the Y for those populations that got larger than your cutoff?

-Dave
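
To sketch what that could look like (an illustration only, with made-up variable names: panel id -popid-, time variable -gen-, outcome -popsize-, covariates -cost_live- and -cost_breed-, and a hypothetical indicator -pulled- marking runs removed at the cap), one option is to record the pulled runs at the limit and fit a random-effects tobit:

    . xtset popid gen
    . replace popsize = 15000 if pulled
    . xttobit popsize cost_live cost_breed, ul(15000)

-xttobit- then treats observations at or above the upper limit as right-censored rather than dropping them, so the capped runs still contribute to the likelihood instead of being discarded.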

> Hello Statarians,
> I have a very large set of data featuring population counts
> generated by a computer simulation. In order to speed processing,
> populations that grew beyond 15000 within the 100-generation limit
> were pulled from the simulation. As a result there are numerous
> populations that now have missing data, making my panels unbalanced.
> I am curious how to best fit a model to this data given what is
> missing. In particular, I have two worries:
> 1. That unless I do something, the missing values will cause any
> procedure to misrepresent the actual situation, as the smaller values
> that remain towards the end of the time period will skew the mean. I
> am curious if this is a problem for populations that have died off
> early as well (do I need to carry the 0 through all the
> remaining generations?).
> 2. I am unsure whether imputation (with -ice-?), chopping the
> dataset, or both is the best way to proceed. I know that -ice- needs
> variables that are missing at random, but is there some way to
> impute the missing values if I know how they are structured?
> Thank you.
>
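
(For reference, the -ice- route mentioned in the quoted post would look something like the sketch below, again with made-up variable names; note that -ice- assumes the values are missing at random, which censoring at a known cutoff violates, so the censored-model route may be the safer one here:

    . ice popsize cost_live cost_breed, m(5) saving(imputed, replace)
    . use imputed, clear

)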
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

