Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Eric Booth <eric.a.booth@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Generating dummy variable with information of household survey from different observations |
Date | Mon, 7 May 2012 08:35:04 -0500 |
On May 7, 2012, at 1:13 AM, Sumiko Hayasaka wrote:

> Everything works out until I get to the "foreach" command. It says the
> expression is too long [r(130)]. What should I do?
> Thank you again!

The r(130) error comes from the -inlist()- part of the -generate- command I showed because, at some point, it has too many elements. This means you have a lot of father_row* variables after the initial -reshape-, probably because you don't have individual_id's like {1,2,3…} as you show, but individual_id's like {99998,99917,…} that are unique across all (or most) individuals.

One way to get around this would be to generate individual_id's within the household using the -egen- function 'group()' or:

bys household_id (individual_id): g i = _n

and then use "i" in place of individual_id in my example (but you'd need to remember to carry 'individual_id' through the -reshape-).

That will get around the too-many-values issue, assuming you don't have many hundreds of people in a household. (inlist()'s limit appears to be 250, though it's not in -help limits-, so I don't know if that limit is the same across all versions/flavors of Stata. I've got MP, and 250 is the limit I've encountered.)

Of course, NJC's examples that loop over individuals are resilient against this type of issue with my code, but I wanted to follow up to explain where/why my example failed.

- Eric

__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
ebooth@ppri.tamu.edu
Office: +979.845.6754

On May 7, 2012, at 1:13 AM, Sumiko Hayasaka wrote:

> Thanks Eric!
>
> Everything works out until I get to the "foreach" command. It says the
> expression is too long [r(130)]. What should I do?
>
> Thank you again!
>
>
> On Sun, May 6, 2012 at 11:44 PM, Eric Booth <eric.a.booth@gmail.com> wrote:
>> <>
>>
>> ***************!
>> clear
>> inp household_id individual_id father_row
>> 1011 1 .
>> 1011 2 .
>> 1011 3 1
>> 1011 4 1
>>
>> 1012 1 2
>> 1012 2 .
>>
>> 1013 1 .
>> 1013 2 .
>> 1013 3 2
>> 1013 4 1
>> 1013 5 1
>> end
>>
>>
>> levelsof individual_id, loc(a)
>> reshape wide father_row, i(household_id) j(individual_id)
>> ds father_row*
>> loc checklist `r(varlist)'
>> loc checklist:subinstr loc checklist " " ", " , all
>> foreach n in `a' {
>>     g father`n' = cond(inlist(`n', `checklist'), 1, 0, .)
>> }
>> reshape long father_row father, i(household_id) j(individual_id)
>>
>>
>> ***************!
>> - Eric
>>
>> __
>> Eric A. Booth
>> Public Policy Research Institute
>> Texas A&M University
>> ebooth@ppri.tamu.edu
>> +979.845.6754
>>
>> On May 6, 2012, at 10:34 PM, Sumiko Hayasaka wrote:
>>
>>> I am trying to generate a dummy variable, with information from a
>>> household survey, which can tell if a member of the household is a
>>> father or not. I have a household id, an individual id (per
>>> household), and a variable that tells me which individual id is marked
>>> as being a father (members of the family are asked if their father
>>> lives in the household and to give their father's individual id).
>>> Therefore, I need to assign a 1 at the row in which someone at the
>>> household said that was a father. To illustrate this, the data is
>>> something like this (I am trying to get the "father" variable):
>>>
>>> household_id   individual_id   father_row   father
>>> ------------------------------------------------------------------------
>>> 1011           1               .            1
>>> 1011           2               .            0
>>> 1011           3               1            0
>>> 1011           4               1            0
>>>
>>> 1012           1               2            0
>>> 1012           2               .            1
>>>
>>> 1013           1               .            1
>>> 1013           2               .            1
>>> 1013           3               2            0
>>> 1013           4               1            0
>>> 1013           5               1            0
>>>
>>>
>>> So, for example, members number 3 and 4 of household number 1011
>>> stated that their father is individual number 1 in that household.
>>> This means that I have to put the 1 of "father" (meaning the household
>>> member is a father) at the row where father_row indicates (no matter
>>> how many times this is done).
>>>
>>> Is there any way in which I can do this? I really appreciate your
>>> help! Thank you!
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
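
For readers who hit the same r(130) limit, here is a minimal sketch of one way to sidestep -inlist()- altogether. It is not the code posted in this thread and not NJC's looping approach; it swaps in a lookup-and-merge technique. It assumes that household_id and individual_id together uniquely identify each row, and that father_row holds the individual_id of the father within the same household. The tempfile name `fathers' is a hypothetical name chosen for illustration.

```stata
* Example data copied from the thread (the "father" column is what we want to build)
clear
input household_id individual_id father_row
1011 1 .
1011 2 .
1011 3 1
1011 4 1
1012 1 2
1012 2 .
1013 1 .
1013 2 .
1013 3 2
1013 4 1
1013 5 1
end

* Build a lookup of (household_id, individual_id) pairs that someone named as a father.
* ASSUMPTION: father_row refers to an individual_id in the same household.
preserve
keep household_id father_row
drop if missing(father_row)
duplicates drop
rename father_row individual_id
generate byte father = 1
tempfile fathers            // hypothetical name, for illustration only
save `fathers'
restore

* Attach the flag to the original rows; rows with no match were never named as a father.
* ASSUMPTION: household_id and individual_id uniquely identify rows (required by merge 1:1).
merge 1:1 household_id individual_id using `fathers', keep(master match) nogenerate
replace father = 0 if missing(father)

list, sepby(household_id) noobs
```

Because this never builds a long argument list, the number of distinct individual_id values should not matter, and no -reshape- is needed. Run it as a single do-file (or one selection), since the local macro behind the tempfile does not persist across separately executed chunks. On the example data it should reproduce the "father" column from the original question.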