bysort person_id: egen interview_count=sum(question1!=-5)

should work.
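For anyone who wants to try this out, here is a minimal sketch on made-up data. The variable wave, the toy values, and the cutoff of 3 are assumptions for illustration only (the question asks for 6 or more). On Stata 9 or later the egen function is documented as total(); sum() is its older name.

```stata
* Toy data: three people, four survey waves each (hypothetical values).
clear
input person_id wave question1
1 1  3
1 2 -5
1 3  2
1 4  1
2 1 -5
2 2 -5
2 3  4
2 4 -5
3 1  1
3 2  2
3 3  3
3 4  4
end

* Within each person, add up the waves where question1 is not -5.
* The logical expression is 1 when true and 0 when false, so the
* total is the number of interviews for that person.
bysort person_id: egen interview_count = total(question1 != -5)

* Drop everyone below the cutoff (3 here; 6 in the original question).
drop if interview_count < 3

list person_id wave question1 interview_count, sepby(person_id)
```

This is also a likely reason the count() attempts do not give the intended number: egen's count() counts nonmissing values of its argument, and a logical expression is always 0 or 1, so every observation gets counted. One caveat: a missing question1 also satisfies question1 != -5, so add a condition such as !missing(question1) if missing values should not count as interviews.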

Michael Blasnik

michael.blasnik@verizon.net

----- Original Message -----

From: "Jay McCarthy" <jay.mccarthy@gmail.com>

To: <statalist@hsphsun2.harvard.edu>

Sent: Tuesday, November 09, 2004 11:28 AM

Subject: st: Marking/deleting subsets of observations related to some condition on that subset

Dear STATALIST,

I have observation data for a survey. The survey has been conducted 10 times with some of the same people. Each person has an id that is used each time. I would like to delete all observations related to people who have not participated 6 or more times.

My strategy has been to first create a new variable equal to the total number of interviews that person has had. So, the 1998 data for person 1 will be "6" if they've been in 6 surveys, and so will the 2000 data.

I first tried:

    by person__id : egen interview__count = count( person__id == person__id & question1 != -5

The strategy here is that "question1" will be "-5" if the person was not interviewed.

Then I tried:

    forvalues pid = 1(1)12686 {
        egen interview__count = count( person__id == `pid' & question1 != -5 ) if person__id == `pid'
    }

The first failed for unknown reasons. The second succeeded for "pid=1" but then failed on later iterations because the variable already existed. I would have used "replace" instead, but the "count" function is not available with replace.

My next try will be to use the above format to create "interview__count$i" for each person, then run replace to set each person's "interview_count" to "interview_count$i", but I think this is a poor way to do it and I'm looking for suggestions.

Thank you,

Jay McCarthy

--
Jay McCarthy <jay.mccarthy@gmail.com>

