
From: "fjc fjc" <fjc120@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: pweights in panel data with increasing sample size
Date: Sat, 6 Sep 2008 10:11:19 -0300
Dear Statalisters:

I have three waves of panel data with an increasing sample size. For each wave I have sampling weights and identifiers for strata and clusters (I describe the data in more detail below). I'd like to use this dataset in two ways: 1) pool the three waves (e.g., to estimate an ordered logit model); 2) use the panel structure (e.g., to estimate a standard linear fixed-effects model).

I'm having trouble deciding how best to account for the survey design. I've read the FAQs and previous threads on this topic, but I still can't make up my mind.

For the pooled data I'd like to use the svy commands (e.g., svy: ologit), but I'm not sure how to construct the weights, or whether the pooling would have any effect on the identifiers for clusters and strata (the identifiers are such that individuals observed more than once always belong to the same stratum and cluster). In the archives I found the suggestion to "weight the weights" (http://www.stata.com/statalist/archive/2004-12/msg00655.html), but I'm not sure whether this is OK when the same individual is observed over time.

I have a similar problem for the panel models: I don't know how to construct proper weights that remain constant within panel. In addition, since xtreg does not work with the svy prefix, I can use the cluster option, but then I won't be able to account for the effects of stratification. Is this correct, or am I missing something that would allow me to do it? Maybe I could include dummies for the strata?

I'm using Stata/SE 10.1 on Windows XP. I'd really appreciate any help on these issues.

Best,
Francisco.

Data description:

The first wave has a sample size of 1100, 184 clusters, and 8 strata, and represents a population of almost 8 million people. The second wave has a sample size of 1500, 250 clusters, and 10 strata, and represents a population of almost 11 million people. The third wave has a sample size of 2500, 420 clusters, and 10 strata, and represents a population of almost 12 million people.

Strata are defined by two criteria: region and socioeconomic category. For wave 1 there are two regions and four socioeconomic categories, which gives the 8 strata mentioned above. For waves 2 and 3 there are two regions but five socioeconomic categories (the original four plus a new one not included in wave 1), which gives 10 strata.

The increase in the sample size from wave 1 to wave 2 has two sources: 300 observations come from the two new strata (150 obs. from each), and the other 100 come from an increase in the sample size of two old strata, 50 from each of them (same socioeconomic category, different region). The increase in the sample size from wave 2 to wave 3 comes from an increase in the sample size of region 2, evenly distributed across the five socioeconomic categories.

Some individuals are observed only once, others twice, and others three times. My understanding of the data documentation is that the weights provided are fine for cross-section analysis of each wave separately, but that no weights were constructed specifically for use with the panel structure of the data.
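As a quick consistency check on the data description above, the sample-size and strata figures can be tallied in a few lines. This is only a sketch in Python: the dictionary layout and variable names are illustrative and are not part of the dataset, and the numbers are the ones quoted in the message.

```python
# Figures quoted in the data description; names are illustrative only.
waves = {
    1: {"n": 1100, "clusters": 184, "regions": 2, "ses": 4, "strata": 8},
    2: {"n": 1500, "clusters": 250, "regions": 2, "ses": 5, "strata": 10},
    3: {"n": 2500, "clusters": 420, "regions": 2, "ses": 5, "strata": 10},
}

# Strata are the cross of region and socioeconomic category.
strata_ok = all(d["regions"] * d["ses"] == d["strata"] for d in waves.values())

# Wave 1 -> wave 2: two new strata with 150 obs. each,
# plus 50 extra obs. in each of two old strata.
added_w2 = 2 * 150 + 2 * 50
w2_ok = waves[1]["n"] + added_w2 == waves[2]["n"]

# Wave 2 -> wave 3: the whole increase is in region 2,
# spread evenly over its five socioeconomic categories.
added_w3 = waves[3]["n"] - waves[2]["n"]
per_stratum_w3 = added_w3 // 5

# Average number of observations per cluster in each wave.
obs_per_cluster = {w: d["n"] / d["clusters"] for w, d in waves.items()}

print("strata counts consistent:", strata_ok)
print("wave 2 total consistent:", w2_ok)
print("added in wave 3:", added_w3, "->", per_stratum_w3, "per region-2 stratum")
for w, avg in obs_per_cluster.items():
    print(f"wave {w}: {avg:.2f} obs. per cluster")
```

The check says nothing about how the weights should be built. It only confirms that the stated increments add up to the wave totals and that each wave averages about six observations per cluster.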

