Re: st: New method to avoid looping over each observation and across a variable list


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: New method to avoid looping over each observation and across a variable list
Date   Tue, 22 Jan 2013 20:46:25 +0000

Jeph's excellent advice aside, this wide structure is less convenient
for most data analysis with Stata than a long structure.

You are going to be writing these loops across variables again and again.

Nick
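
[Editorial sketch, not part of the original message.] To illustrate the long structure recommended above: a minimal example, assuming a cut-down, hypothetical version of Jeremy's data with only three monthly variables. The data values and the names yymm and var_value are chosen here for illustration. After -reshape long-, the event-month value can be picked out without any loop across variables.

```stata
* Hypothetical cut-down data: three months instead of 24
clear
input str5 id str4 eventyymm var0201 var0202 var0203
A 0203 0 0 1
B 0201 . 5 5
C 0202 1 3 3
end

* Wide -> long: one row per person-month.
* The string option keeps the leading zeros in the yymm suffix.
reshape long var, i(id) j(yymm) string

* Value of var in the event month, copied to every row of that person.
* cond() yields missing outside the event month; max() ignores missings.
egen var_value = max(cond(yymm == eventyymm, var, .)), by(id)

list id eventyymm yymm var var_value, sepby(id)
```

With these made-up values, var_value should be 1 for A, missing for B (the event-month value is itself missing) and 3 for C. Once the data are long, later questions about particular months become -if- conditions or by-group calculations rather than further loops over variable names.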

On Tue, Jan 22, 2013 at 8:30 PM, Jeph Herrin <[email protected]> wrote:
> Why are you looping over observations?
>
>  gen var_value=.
>
>  foreach V of varlist var0201 - var0312 {
>         replace var_value = `V' if eventyymm == subinstr("`V'", "var", "", 1)
>  }
>
>
> should do it.
>
>
>
> On 1/22/2013 2:57 PM, Jeremy Page wrote:
>>
>> Hello Everybody,
>>
>> I would like some advice about how to change some code that currently
>> loops across a long list of variables for each observation in my data
>> set.
>>
>> My data set has one record per person and there are monthly
>> occurrences of variables that have a suffix of yymm (two digit year
>> and two digit month) to record monthly information.  I also have a
>> string variable (eventyymm) which contains the year and month of an
>> event, also given as "yymm". I would like to produce a new variable
>> which gives the information in var0201-var0312 during the month of the
>> event (eventyymm).
>>
>> The example below contains an example data set and my current code.
>> The code produces the correct result but in my actual data set I have
>> millions of observations and about 15 years of yymm variables to loop
>> over. My current method will take an extremely long time to process.
>>
>> Best,
>> Jeremy
>>
>> ******begin example***********
>> clear all
>> input str5 id str4 eventyymm ///
>>        var0201 var0202 var0203 ///
>>        var0204 var0205 var0206 ///
>>        var0207 var0208 var0209 ///
>>        var0210 var0211 var0212 ///
>>        var0301 var0302 var0303 ///
>>        var0304 var0305 var0306 ///
>>        var0307 var0308 var0309 ///
>>        var0310 var0311 var0312
>>        A 0203 0 0 0 0 0 0 1 1 1 1 1 1 ///
>>                   1 1 . . . 0 0 0 0 2 2 2
>>        B 0301 . . . . . 0 0 0 0 0 0 0 ///
>>                   0 0 0 0 9 9 9 1 1 1 1 1
>>        C 0210 1 1 1 1 1 1 1 1 1 1 1 1 ///
>>                   1 1 1 3 3 3 3 3 3 3 3 3
>>        D 0212 0 0 0 0 0 0 . . . . . . ///
>>                   . 9 0 0 0 1 1 1 1 1 1 1
>>        E 0310 3 3 3 3 3 3 3 3 3 3 3 3 ///
>>                   3 3 3 3 3 3 3 3 3 3 3 3
>> end
>>
>> ***generate variable to match with variable name
>> gen varyymm_string = "var" + eventyymm
>>
>> ***generate empty variable to populate in the loop
>> gen var_value = .
>>
>> ***loop over observations
>> foreach i of num 1(1)5 {
>>     ***loop across variables
>>     foreach x of varlist var0201 - var0312 {
>>        replace var_value = `x' if varyymm_string == `"`x'"' & _n == `i'
>>     } /* close loop across variables */
>> } /* close loop over observations */
>>
>> ******end example***********