
From: wgould@stata.com (William Gould, StataCorp LP)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: multilevel
Date: Tue, 18 Mar 2008 13:32:05 -0500

Daniel Macneil <macneida@gse.harvard.edu> writes,

> I am stumped on what i thought was an easy problem. I have children
> (id_code) within families(household_code) who go to school (1) or not(0).
> How do I find the mean values (or even table) at the household level for how
> many children are in school? I know I can use xtsum to get the std dev
> within and between, but about just finding out: the number of families with
> ALL children in school, SOME children in school and NO children in school,
> e.g.
>
>      +--------------------------------------------+
>      | househ~e   id_code      sex   age   school |
>      |--------------------------------------------|
>   4. |        1         4   Female     6        0 |
>   9. |        2         3   Female    12        0 |
>  10. |        2         4     Male     8        1 |
>  14. |        3         3   Female    12        0 |
>  15. |        3         4     Male    10        1 |
>      |--------------------------------------------|
>  16. |        3         5     Male    14        1 |
>  23. |        4         3   Female    14        0 |
>  24. |        4         4   Female    12        1 |
>  25. |        4         5   Female     6        0 |
>  32. |        5         5   Female    13        . |
>      |--------------------------------------------|

First, let's get the number of children in school in the last observation
of each household.  In other observations, the new variable will be missing.
We will have new variable n:

     +-------------------------------------------------+
     | househ~e   id_code      sex   age   school    n |
     |-------------------------------------------------|
  4. |        1         4   Female     6        0    . |
  9. |        2         3   Female    12        0    . |
 10. |        2         4     Male     8        1    1 |
 14. |        3         3   Female    12        0    . |
 15. |        3         4     Male    10        1    . |
     |-------------------------------------------------|
 16. |        3         5     Male    14        1    2 |
 23. |        4         3   Female    14        0    . |
 24. |        4         4   Female    12        1    . |
 25. |        4         5   Female     6        0    1 |
     |-------------------------------------------------|

        . sort household
        . by household: gen n = cond(_n==_N, sum(school), .)

That may be too tricky, so here's another way following the same logic:

        . sort household
        . by household: gen n = sum(school)
        . by household: replace n = . if _n<_N

Now we can obtain the average number of children that households have in
school:

        . summarize n

Now let's get the number and fraction of families with ALL children in
school, SOME but not ALL in school, and NO children in school:

        . by household: gen all  = (n==_N)       if _n==_N
        . by household: gen some = (n<_N & n>0)  if _n==_N
        . by household: gen none = (n==0)        if _n==_N

With that, we can get the fractions via

        . summarize all some none

or we can get counts via

        . tabulate all
        . tabulate some
        . tabulate none

In all of the above, I went to extra work to ensure that the variables were
defined in only the last observation of each household.  That ensured the
means and counts were properly weighted to represent households.

Now let's assume that Daniel wants the variables defined for every
observation.  Perhaps he wants to fit a regression in which the explanatory
variables are the fraction in school, or dummies for all, some, or none.
First, I'll get the fraction.  Within each by-group, _N is the number of
children in the household, and n is nonmissing only in the last observation,
so I compute f there and then copy it to the household's other observations:

        . by household: gen f = n/_N
        . by household: replace f = f[_N]

Now I'll just fill the dummies:

        . by household: replace all  = all[_N]
        . by household: replace some = some[_N]
        . by household: replace none = none[_N]

-- Bill
wgould@stata.com

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
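
The by-group steps in this reply can be cross-checked with a short do-file that uses -egen- instead of _n and _N.  This is only a sketch, not the approach shown in the post.  It assumes the data are just the ten rows listed in the question (the full dataset is not shown), and n_in, n_known, and hh are hypothetical variable names.  One thing it makes explicit: sum() and egen's total() treat a missing value as zero, so a household whose only child has school missing (household 5 in the listing) would otherwise be classed as NONE.  Here such a household is left missing instead.

```stata
* Sketch only: the ten rows listed in the question, not the full dataset.
clear
input household_code id_code school
1 4 0
2 3 0
2 4 1
3 3 0
3 4 1
3 5 1
4 3 0
4 4 1
4 5 0
5 5 .
end

* Children in school, and children with a nonmissing answer, per household.
* total() treats missing as 0; count() counts nonmissing values only.
egen n_in    = total(school), by(household_code)
egen n_known = count(school), by(household_code)

* Household-level dummies, defined in every observation of the household.
* They are left missing when no child in the household has a recorded answer.
gen byte all  = (n_in == n_known)           if n_known > 0
gen byte some = (n_in > 0 & n_in < n_known) if n_known > 0
gen byte none = (n_in == 0)                 if n_known > 0

* Flag exactly one observation per household so each household counts once.
egen hh = tag(household_code)

summarize n_in all some none if hh
tabulate all  if hh
tabulate some if hh
tabulate none if hh
```

Because the variables are filled in for every child, they can go straight into a regression, and the hh flag restricts summaries and counts to one observation per household.  Whether a missing school value should instead count as "not in school" is a substantive decision that the sketch does not settle.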
