
From: Daniel Sabath <sabathd@u.washington.edu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Structure for making line by line changes?
Date: Fri, 25 Apr 2003 17:49:12 -0700 (PDT)

Hi David,

Thank you. A lot of what you mentioned makes sense. Please see my previous reply to Fred and Nick, as I think I explained myself a little more clearly there. Now on to the nitty gritty...

On Fri, 25 Apr 2003, David Kantor wrote:

> At 12:06 PM 4/25/2003 -0700, Dan Sabath wrote:
> >Hello,
> >
> >I am very new to Stata and am having some difficulty wrapping my brain
> >around Stata's methods of data processing.
> [...]
> >I wrote a "program" to calculate the value based on args passed into it.
> >"checkfoo" returns a 1 or 0 depending if one of the arglist matches the
> >first arg.
> >so r(checked) = 1 if `k' or `l' is equal to `j'
> >otherwise r(checked) = 0.
> >
> >/*******vastly simplified**********/
> >local j = 1
> >gen k = 2
> >gen l = 3
> >
> >while `j' < 6 {
> >    checkfoo `j' `k' `l'  /* r(checked) returned equal to 1 when j = 2 or 3 */
> >    replace k = 4 if r(checked) == 1
> >    local j = `j' + 1
> >}
> >/***********************************/
> >I would like it to replace "k" on the 2nd and 3rd time through the loop
> >but not at any other time.
> >
> >I would be happy if I could just do
> >/* pseudocode */
> >replace k = 4 if checkfoo `j' `k' `l' /* with checkfoo evaluating true or
> >false */
> >[...]
>
> Nick Cox has replied to this, but I would like to add some comments.
>
> Your code generates variables k and l. Then you pass macros `k' and `l' to
> checkfoo. Variables and macros are different kinds of entities. If you
> haven't defined these macros (and they are not defined in your code sample)
> then they are empty, and you are only passing one argument (`j') to checkfoo.

Ideally I would be passing the value of k and j on that row into checkfoo. Perhaps something like k[_n] would work? I'm beginning to see that the implicit loop through the dataset exists in a different location than where I thought it did. My previous email explains it better.
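The variable/macro distinction discussed above can be sketched as follows. This is only an illustration on a hypothetical five-row dataset (the values and the name hit are assumptions, not taken from the thread): a macro is substituted as text before the command runs, while a variable named in an expression is evaluated once per observation, so k and k[_n] mean the same thing there.

```stata
* Hypothetical toy data: five observations with row-varying k and l
clear
set obs 5
gen k = _n              /* variable: 1, 2, 3, 4, 5 */
gen l = _n + 1          /* variable: 2, 3, 4, 5, 6 */

local j = 3             /* macro: a single value, pasted in as text */

* No macro named k has been defined, so `k' expands to nothing
display "j=`j' k=[`k']"

* The variables k and l are evaluated row by row inside the expression;
* after macro substitution the command reads (k == 3) | (l == 3)
gen byte hit = (k == `j') | (l == `j')
list k l hit
```

In this sketch hit is 1 only in rows 2 and 3, the rows where l or k equals 3.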
>
> Note, also that
> gen k = 2
> gen l = 3
>
> set the variables k and l to 2 and 3 -- for all observations in the entire
> dataset.

Yes, that was intentional. The actual data is a little more complicated and varies on a row-by-row basis, but for the example I used this.

> When your loop comes to...
> replace k = 4 if r(checked) == 1
>
> then k will be replaced with 4 -- again, for all observations in the entire
> dataset, since r(checked) is a scalar quantity.

This behavior I was not expecting. I was expecting r(checked) to change with the values from each row.

> (Actually, this is one place where it would be equivalent to write...
> if r(checked) {
>     replace k = 4
> }
> but in general, there is a big difference between the -if- statement and
> the -if- qualifier. There is a FAQ on this subject.)

It was quite a surprise to find out that the -if- statement evaluates its condition only once and not on each row. As a result, I'm not sure when it would be useful.

> Since this replace k = 4 will affect every observation, there seems little
> point to doing it. Presumably there will be other code that you have
> omitted. But, since this -replace- affects all observations equally, it
> might better have been a scalar or a macro.
> But if, as I might suspect, you are thinking of looping through the
> observations, then your code is not correct. But then, most likely, there
> is no point to correcting it as such; what you want to do is probably
> easily done in a few statements, once you get the idea of how Stata
> works. In fact, your "pseudocode" sample is almost (but not quite) a
> correct Stata statement -- if you are thinking of replacing k in some
> observations and not others.

That is exactly what I was aiming for.

> Your pseudocode sample will not work, because in...
> replace k = 4 if checkfoo `j' `k' `l' /* with checkfoo evaluating true or
> false */
> you cannot create your own function (checkfoo) that can be referenced in an
> expression.
To my great disappointment :(

> You can, on the other hand, create a variable to carry the info that you
> want. You can also write a program to generate that variable. It is not
> clear whether you intended checkfoo to be such a program. (As shown in your
> example, it would appear that it yields scalar information, but you may
> have had something else in mind.)

checkfoo is actually an .ado file:

/*****************
Checkfoo checks arglist[i] against arglist[0];
returns 1 if match and 0 if not match.
usage: checkfoo primary_var check1_var check2_var ...
returns r(checked) = 0 || 1
******************/
program define checkfoo, rclass
    local checked = 0
    local i = `1'
    while "`2'" ~= "" {
        if `i'==`2' {
            local checked = 1
        }
        macro shift
    }
    return scalar checked = `checked'
end

> Overall I would suggest these points:
>
> 1: Understand the difference between variables, scalars and
> macros. (Scalars and macros are similar in that they have a single value.
> Variables have a set of values: one for each observation. Note, also that
> if a program returns something in r(), that returned value is a scalar or
> macro.)

At what point are scalars and macros evaluated? Can you reset the value in the middle of the run depending on other calculations? I.e., x = 0; replace y = z if x < 10, x++

> 2: Most Stata statements that operate on the data do so on the whole
> dataset at once. (Actually, there is a sequential aspect to the action that
> processes the statement, but you usually don't need to think about it.) It
> may help to remember that, for example, in your code...
> gen k = 2
> gen l = 3
>
> first, k is created and set to 2 for all observations; then l is created
> and set to 3 for all observations.

I believe that this is one of the fundamental differences (and a hard one to get your head around) between Stata and other stats languages. The implicit loop through the data exists on each *line* of the do-file and not around the program as a whole.
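The "one pass over the data per command" idea can be sketched like this. It is a minimal illustration on hypothetical row-varying data (the dataset and the range of j are assumptions, not the poster's real data): the outer loop steps a macro through values, and each -replace- inside it makes its own complete pass over the observations, testing the -if- qualifier on every row. A macro can be reset between commands, but not between observations within a single command.

```stata
* Hypothetical toy data: k = 1..5, l = 2..6
clear
set obs 5
gen k = _n
gen l = _n + 1

* The loop runs over values of the macro j, not over observations.
* On each pass `j' is substituted as text, then -replace- visits every
* row and changes k only where that row's k or l matches.
forvalues j = 2/3 {
    replace k = 4 if (k == `j') | (l == `j')
}

list k l
```

With these assumed values, the pass for j = 2 changes rows 1 and 2, and the pass for j = 3 changes row 3 (row 2 matches again but already holds 4), so k ends as 4, 4, 4, 4, 5.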
Other languages work on the data one row at a time and allow you to make as many calculations / modifications as you like before proceeding to the next row. Please correct me if I am missing something. (See http://www.cpc.unc.edu/services/computer/presentations/sas_to_stata/basic.html for more examples of the differences.)

> 3: Understand the difference between the -if- statement and the -if- qualifier.
>
> 4: Looping is useful for actions that occur at a level that is logically
> higher than the individual observations. You almost never need to loop
> through the observations. If you are attempting to write code to loop
> through the observations, you probably are not thinking about the problem
> correctly. (Sometimes it is necessary, and I have done it -- *very* rarely.)

And this is exactly why I'm asking. I need to get my head adjusted to think about problems in this manner. I really have appreciated how helpful you guys have been. Thank you!

> I hope this helps.

It certainly has. Thanks again!

-dan

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
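Points 3 and 4 in the message above can be sketched on a hypothetical five-row dataset (the variable x and all values are assumptions for illustration). The -if- qualifier is tested on every observation; the -if- statement is tested once, and a bare variable name in it refers to the first observation. The last part shows the rarely needed explicit loop over observations.

```stata
clear
set obs 5
gen k = _n
gen x = 0

* -if- qualifier: the condition is checked row by row
replace x = 1 if k > 3

* -if- statement: the condition is checked once; k here means k[1],
* which is 1, so the block is skipped entirely
if k > 3 {
    replace x = 2
}

* Explicit loop over observations (seldom the right tool in Stata)
local N = _N
forvalues i = 1/`N' {
    if k[`i'] == 2 {
        replace x = 9 in `i'
    }
}

list k x
```

Under these assumptions x ends as 0, 9, 0, 1, 1. The final loop is equivalent to the single command replace x = 9 if k == 2, which is the form Stata is built around.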

Follow-Ups:
  Re: st: Structure for making line by line changes?
  From: David Kantor <dkantor@jhu.edu>

References:
  Re: st: Structure for making line by line changes?
  From: David Kantor <dkantor@jhu.edu>
