Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: creating panel of household surveyed in different year

From: Nick Cox
To: "statalist@hsphsun2.harvard.edu"
Subject: Re: st: creating panel of household surveyed in different year
Date: Wed, 22 May 2013 08:48:19 +0100

```Neither solution makes any assumptions about the number of survey
years. But if you add more data, you will need to recalculate. Note
also that both solutions regard a household measured in a single year
as entering and exiting in that year. If that doesn't match what you
want, you will need to say what you want.
Nick
njcoxstata@gmail.com

On 22 May 2013 08:19, Prakash Singh <prakashbhu@gmail.com> wrote:
> Thanks Nick and Chamara for the help
>
> I hope it will also work well when more survey years are added.
>
> Prakash
>
> On Wed, May 22, 2013 at 12:26 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> As Chamara says, there are different ways to do this. Here is a
>> slightly different approach.
>>
>> bys hhid (survey_year): gen numsurvey = _N
>> bys hhid (survey_year): gen entry = _n == 1
>> bys hhid (survey_year): gen exit = _n == _N
>>
>> Nick
>> njcoxstata@gmail.com
>>
>>
>> On 22 May 2013 06:46, Chamara Anuranga <kcanuranga@gmail.com> wrote:
>>> Dear Prakash
>>>
>>> There are different ways to do the same thing in Stata. If your
>>> dataset is organized like this, I suggest doing the following:
>>> bys hhid: gen numsurvey=_N
>>> bys hhid: egen minyear=min(survey_year)
>>>
>>> gen entry=survey_year==minyear
>>> bys hhid: egen maxyear=max(survey_year)
>>> gen exit=survey_year==maxyear
>>>
>>>
>>> numsurvey gives the number of times each household appears in the survey.
>>> entry is 1 if the household appears in the survey for the first time, and 0 otherwise.
>>> exit is 1 if the household is surveyed for the last time, and 0 otherwise.
>>> Then define value labels for the entry and exit variables:
>>>
>>> label define yesno 1 "Yes" 0 "No"
>>> label val entry yesno
>>> label val exit yesno
>>>
>>> If a household is surveyed only once, entry and exit are both 1 (yes). However,
>>> you can change the variables however you prefer, based on the numsurvey
>>> variable.
>>>
>>> Hope this helps.
>>>
>>> Thanks,
>>> Chamara
>>>
>>> On Wed, May 22, 2013 at 10:19 AM, Prakash Singh <prakashbhu@gmail.com> wrote:
>>>> Thanks Chamara and Nick
>>>>
>>>> Nick, I am providing below the ids of the first ten households surveyed
>>>> in 1997 and in 2002.
>>>>
>>>> hhid    survey_year     entry   exit
>>>> 181004  1997
>>>> 181007  1997
>>>> 181113  1997
>>>> 181801  1997
>>>> 182003  1997
>>>> 182601  1997
>>>> 182615  1997
>>>> 182711  1997
>>>> 182716  1997            yes
>>>> 182803  1997            yes
>>>> 181001  2002    yes
>>>> 181004  2002
>>>> 181007  2002
>>>> 181113  2002
>>>> 181801  2002
>>>> 182003  2002
>>>> 182201  2002    yes
>>>> 182601  2002
>>>> 182615  2002
>>>> 182711  2002
>>>>
>>>> Now if you look at the ids, households 181001 and 182201 were not
>>>> part of the 1997 survey, and households 182716 and 182803 did not
>>>> participate in the 2002 survey.
>>>>
>>>> My interest is first to generate one variable that identifies
>>>> households that participated in all the surveys; second, a variable
>>>> identifying new households in the survey; and finally, a third variable
>>>> identifying households that did not participate in a survey.
>>>> There are two more rounds of data, which I am still extracting.
>>>>
>>>> I hope I have made progress in expressing my query.
>>>>
>>>>
>>>>
>>>> Prakash
>>>>
>>>> On Tue, May 21, 2013 at 5:30 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>> My own guess is that Prakash's previous post was ignored because it
>>>>> was too vague about precise data structure and the revised post
>>>>> doesn't add much.  At least that is why I deleted it. A specific
>>>>> example showing what you have is usually preferable to a long verbal
>>>>> discussion.
>>>>>
>>>>> The solution below seems unnecessarily complicated. Splitting the
>>>>> dataset into three and then -merge-ing them back again is only
>>>>> possible if there is some identifier in the dataset that tells you
>>>>> which survey round is being referred to. Why not just do it in place?
>>>>>
>>>>> At its simplest the number of rounds in which each household
>>>>> participated may just be the number of times the household appears in
>>>>> the dataset. Otherwise there should be some round identifier. There
>>>>> seems little point in speculating about variables, as Prakash can
>>>>>
>>>>> Same applies to entry and exit: show us how the data are held, and
>>>>> specific suggestions are then much easier.
>>>>>
>>>>> Nick
>>>>> njcoxstata@gmail.com
>>>>>
>>>>>
>>>>> On 21 May 2013 11:42, Chamara Anuranga <kcanuranga@gmail.com> wrote:
>>>>>> Dear Prakash,
>>>>>>
>>>>>> Keep the id variable in each survey and create a new variable to identify each survey.
>>>>>> For dataset 1:
>>>>>> gen svyname1="survey1"
>>>>>> For dataset 2:
>>>>>> gen svyname2="survey2"
>>>>>>
>>>>>> etc.
>>>>>> Now you have 3 datasets. Merge them based on id, then check for missing values in the svyname variables:
>>>>>>
>>>>>> egen totmiss=rowmiss(svyname*)
>>>>>>
>>>>>> If totmiss is 0, the household appears in all 3 rounds.
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Chamara
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, May 21, 2013 at 3:57 PM, Prakash Singh <prakashbhu@gmail.com> wrote:
>>>>>>> Hello everyone,
>>>>>>> I sent this mail earlier, but perhaps the subject was not appropriate
>>>>>>> to draw attention, so I am sending it again with a revised subject.
>>>>>>>
>>>>>>> I am working with a survey dataset of more than three rounds. The
>>>>>>> identification code for each household is the same in all the rounds.
>>>>>>>
>>>>>>> Some households are surveyed in all the years, and some are
>>>>>>> surveyed in some years but not in others (they did not participate
>>>>>>> in the survey). I want to map the households so that
>>>>>>> I know which households are surveyed in more than two rounds, and I
>>>>>>> also want to create a panel of households that are surveyed in all the
>>>>>>> years.
>>>>>>>
>>>>>>> I am also interested in entry and exit of households, where entry
>>>>>>> means a new household coming into a subsequent round of the survey and
>>>>>>> exit means a household leaving the survey in a subsequent round.
>>>>>>>
>>>>>>> Please suggest how I should work out this problem.
>>>>> *
>>>>> *   For searches and help try:
>>>>> *   http://www.stata.com/help.cgi?search
>>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>>> *   http://www.ats.ucla.edu/stat/stata/
```
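The three indicators Prakash asked for (surveyed in every round, new in a later round, dropped before the last round) can be sketched in place, in the spirit of the bysort approach in the thread. This is only an illustration under stated assumptions: the data are in long form with one observation per household per survey year, the toy data are a four-row subset of the ids posted above, and the names roundtag, nrounds, firstyear, lastyear, allrounds, newhh and lefthh are hypothetical, not taken from the thread. Unlike the solutions above, this variant does not flag households first seen in the earliest survey year as new, nor households last seen in the latest survey year as leaving.

```stata
* Toy data: a small subset of the ids posted in the thread (illustration only)
clear
input long hhid int survey_year
181004 1997
182716 1997
181001 2002
181004 2002
end

* Assumption: one observation per household per survey year
isid hhid survey_year

* Count distinct survey years in the whole dataset
egen byte roundtag = tag(survey_year)
egen nrounds = total(roundtag)

* Earliest and latest survey year overall
egen int firstyear = min(survey_year)
egen int lastyear = max(survey_year)

* Number of rounds in which each household appears
bysort hhid (survey_year): gen numsurvey = _N

* 1 if the household was surveyed in every round
gen byte allrounds = numsurvey == nrounds

* 1 in the year a household first appears, if that is after the first round
bysort hhid (survey_year): gen byte newhh = _n == 1 & survey_year > firstyear

* 1 in the year a household is last seen, if that is before the last round
bysort hhid (survey_year): gen byte lefthh = _n == _N & survey_year < lastyear

list hhid survey_year numsurvey allrounds newhh lefthh, sepby(hhid)
```

As Nick notes, nrounds, firstyear and lastyear depend on the data currently in memory, so the indicators need to be recalculated after further survey rounds are appended.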