
From: "Carlo Lazzaro" <carlo.lazzaro@tin.it>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: R: RE: Simple tab needed but multiple records+How do people learn Stata
Date: Sun, 7 Oct 2007 23:49:08 +0200

Dear Nick,

you are, as usual, right. I tried to reply to Joseph's thread making a (probably too) basic assumption. However, at the beginning of the current year, when I started using Stata (as Statalisters may know, I am an Italian health economist, not that experienced in statistics, by the way), I was not able even to figure out such a trivial posting.

Even though it may sound like an advertising stunt, using Stata and the invaluable chance to address some questions to more experienced Statalisters changed to a remarkable extent the way I now approach research and consulting in health economics. Besides, I do enjoy using Stata, as well as the "thinking outside the box" (with respect to my recent "quantitative past") that Stata supports.

May my humble experience contribute to the surely endless story on "How do people learn Stata?".

Kind Regards,
Carlo

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Nick Cox
Sent: Sunday, 7 October 2007 19:51
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: Simple tab needed but multiple records

Maarten is right. If data are like this

id  female  likes_cats  likes_dogs
--  ------  ----------  ----------
 1       0           0           0
 2       1           1           0
 3       0           1           0
...

in which each person is represented by only one observation (record), then it's easy to count how many people satisfy two (or indeed more) different conditions, e.g.

-count if female & likes_cats & likes_dogs-

Nor are indicators (dummy, logical, Boolean variables) essential, as we can always use explicit true-or-false conditions instead. This kind of structure is, I think, also assumed by Carlo Lazzaro in his posting in this thread.

However, this is not the structure Joseph has, and it would be unnatural to force his dataset into a different structure given the irregularity of dates that he presumably has. Hence Kit's proposal is closer to, indeed on, the mark.
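For the one-observation-per-person layout described earlier in Nick's message, a minimal sketch might look like the following. The first three rows are those shown in the message; the fourth person is hypothetical, added only so that the count is not zero.

```stata
* Toy data in the one-observation-per-person layout.
* Rows 1-3 come from the message; row 4 is a hypothetical addition.
clear
input id female likes_cats likes_dogs
1 0 0 0
2 1 1 0
3 0 1 0
4 1 1 1
end

* Indicator variables can be combined directly with & ...
count if female & likes_cats & likes_dogs

* ... or the same conditions can be written out explicitly.
count if female == 1 & likes_cats == 1 & likes_dogs == 1
```

Both commands count observations, which here is the same as counting people only because each person occupies exactly one record.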
What's more, this is essentially the same problem as that posted by Paul O'Brien just the same day and already replied to with code

<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.0710/date/article-161.html>

The class of problem is this:

1. There is some kind of grouping, most obviously into panel or longitudinal data. For concreteness, we'll talk "panels" and remember that the idea is more general. (Indeed, no kind of time basis, regular or irregular, is essential here.)

2. Hence, multiple observations for each panel are likely.

3. Some question arises about panels that requires comparison of different observations.

4. For each observation, we can say whether it satisfies some condition. That is a true-or-false calculation.

5. We need to summarise that true-or-false result over all observations in each panel. This can be done with -egen, by(<panelid>)- or -by <panelid>: egen- or -by <panelid>: gen-.

6. Then we need to combine information on different conditions using logical operators such as &, | and !.

7. Finally, we must count panels, not individuals.

Nick
n.j.cox@durham.ac.uk

Kit Baum
---------------------------------------------
I think this should work, without the necessity of reshaping:

bysort id: gen early = inrange(age, 17, 25)
by id: gen late = age > 30
by id: gen both = cond(_n==_N, (sum(early) & sum(late)) , .)
count if both == 1

To test,

set obs 1000
g id=mod(_n,100)+1
g age=40*uniform()

Maarten Buis
--------------------------------------------
This kind of problem usually becomes a lot easier when you first use -reshape- to put the data into wide format.

Joseph Wagner
--------------------------------------------
> I have a dataset of x-ray records with multiple records per
> patient. The records consist of id, age, and sex and I need
> to know how many persons had an x-ray when they were between
> the age of 17 and 25 AND when they were over 30.
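Steps 4 to 7 of Nick's list can also be sketched with -egen- rather than running sums. This is only one possible reading of those steps, not code posted in the thread. The variable names any_early, any_late and first are hypothetical, -set seed- is added so the run is reproducible, -runiform()- is used as the current name for the -uniform()- function in Kit's test, and the !missing(age) guard is an extra assumption about how missing ages should be treated.

```stata
* Simulated data along the lines of Kit's test: 100 ids, 10 records each.
clear
set seed 12345
set obs 1000
generate id = mod(_n, 100) + 1
generate age = 40 * runiform()

* Step 4: a true-or-false result for each observation.
generate byte early = inrange(age, 17, 25)
generate byte late  = age > 30 & !missing(age)

* Step 5: summarise over each panel (1 if any record qualifies).
egen byte any_early = max(early), by(id)
egen byte any_late  = max(late), by(id)

* Step 6: combine the panel-level results with a logical operator.
generate byte both = any_early & any_late

* Step 7: count panels, not observations, by tagging one record per id.
egen byte first = tag(id)
count if both & first
```

The maximum of a 0/1 indicator within a panel is 1 exactly when at least one record in that panel satisfies the condition, so this should give the same count as Kit's running-sum approach while leaving the panel-level answer on every record of the panel.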
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

References: st: RE: Simple tab needed but multiple records (From: "Nick Cox" <n.j.cox@durham.ac.uk>)
