
From: "Ilya Beylin" <ilya.beylin@bateswhite.com>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: Counts of different values in one variable by another variable
Date: Thu, 25 Mar 2004 09:43:02 -0500

Donnell,

Perhaps your question has already been answered. If not, these lines will do what you're looking for:

// after this command, dup_flag stores the number of other
// observations with the same HHID. Where there is only
// one unique entry per household ID, dup_flag is set to 0. Where
// there are two (e.g. a married couple has been sampled) dup_flag = 1
// and so on.
duplicates tag HHID, gen(dup_flag)

// to see how many are in each "bin":
tab dup_flag

// if you want to list/browse by bin just type li/br if
// dup_flag == X where X is the bin you wish to list/browse

I hope this helps,
Ilya

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Donnell Butler
Sent: Thursday, March 25, 2004 6:10 AM
To: statalist@hsphsun2.harvard.edu
Subject: Counts of different values in one variable by another variable

Good Day,

I am trying to do something which I imagine must be easy to do in Stata, but I can't find the solution in the manuals, help books, or FAQ online. Clearly, I am not thinking clearly, because this seems like a simple request. Perhaps I just don't know how to phrase the question correctly in my search for the answer. Nevertheless, I am hoping that someone can help or direct me to an existing response that may answer my question, with the Statalist archive number or month/year.

Here is a simplified version of my dilemma:

I have a data set with multiple id numbers. There is always one primary id (hhid), but sometimes there is more than one subsidiary id (persid). The persid is simply two digits more than the hhid. For example, hhid=12345 and persid=1234501 (or, in the cases where there is more than one, persid=1234501, 1234502, 1234503, etc.). The records are structured such that for every action on a given date, there is a record.
For example:

HHID    PERSID    ACTION     DATE
12345   1234501   EAT        1/1/2003
12345   1234501   DRINK      1/2/2003
12345   1234501   DRINK      1/3/2003
12345   1234501   BE MERRY   1/4/2003
12345   1234502   DRINK      1/1/2003   <-Note new person id, but same hhid
12345   1234502   EAT        1/3/2003
12345   1234503   BE MERRY   1/2/2003   <-Note new person id, but same hhid
12346   1234601   BE MERRY   1/1/2003   <-Note new hhid
... and so on.

So, here is my dilemma, I am trying to find a command or commands that will do two things:

(1) For the entire data set, across all households, how many times are there 1,2,3,...N numbers of unique PERSIDs within a household? That is, how many households have 1,2,3,... N persons.

(2) Display the HHID for households that have X number of persons? That is, for households with X number of unique PERSIDS within a household, list the HHIDS.

It seems so simple, but the count command can't count within variables. The egen command can't work with by commands. Clearly, there is an obvious answer but I can't seem to figure it out. Please help.

Thanks,
Donnell

Donnell Butler
Ph.D. Candidate
Princeton University
125 Wallace Hall
Princeton, NJ 08540
609-419-1311

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
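
Note that in the example data each person has several records, so `duplicates tag HHID` counts records per household rather than persons. The following is a minimal sketch of one way to count distinct PERSIDs within each HHID instead. It assumes the variables are named HHID and PERSID exactly as in the example; the names person_tag, n_persons, and hh_tag are hypothetical and chosen only for illustration, and X = 3 is an arbitrary choice.

```stata
* Sketch only: assumes variables HHID and PERSID, one record per action.
* total() is the egen function name in Stata 9 and later; in Stata 8 and
* earlier the same function was called sum().

* Flag exactly one record for each distinct HHID-PERSID pair
egen byte person_tag = tag(HHID PERSID)

* Sum the flags within each household: number of distinct persons
bysort HHID: egen n_persons = total(person_tag)

* Flag one record per household so each household is counted once
egen byte hh_tag = tag(HHID)

* (1) How many households have 1, 2, 3, ... N persons
tabulate n_persons if hh_tag

* (2) HHIDs of the households with exactly X persons (X = 3 here)
list HHID if hh_tag & n_persons == 3
```

Restricting both the table and the listing to `hh_tag` means each household contributes a single row, however many action records it has.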

**Follow-Ups**: **st: Re: RE: Counts of different values in one variable by another variable** *From:* "Michael Blasnik" <michael.blasnik@verizon.net>
