
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: computation across rows


From   Hewan Belay <[email protected]>
To   [email protected]
Subject   RE: st: computation across rows
Date   Fri, 11 Feb 2011 16:38:39 -0800 (PST)

Oops, embarrassing that I missed Nick's suggestion to create the variable obs, sorry! Thanks Sarah for pointing me to that.

Thanks for the alternative recommendation to that of Nick: I had been using this method (creating additional temporary variables and then using these to replace across vars), but I also like Nick's approach. The sorting issue is not a problem for the latter, since -g obs = _n- would immediately precede the operations in my do-file, so it doesn't matter how I had sorted the data prior to that; it would always do the right thing.
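
For the record, here is a minimal sketch of Nick's approach applied to the toy dataset from my first message. It assumes each id occurs exactly once; the loops over ids and variables are my own shorthand for illustration, not part of Nick's post.

```stata
* Toy data as posted in the thread
clear
input id Y Z W
81 4 1 3
82 . 0 9
85 2 4 1
87 3 1 4
89 6 2 5
end

* Observation number, created right before use so any earlier sort is irrelevant
gen long obs = _n

* Find where ids 82, 85 and 89 sit (assumes each id occurs exactly once)
foreach i in 82 85 89 {
    summarize obs if id == `i', meanonly
    local obs`i' = r(min)
}

* Overwrite Y and W for id 82 with the sums over ids 85 and 89
foreach v of varlist Y W {
    replace `v' = `v'[`obs85'] + `v'[`obs89'] in `obs82'
}

list, noobs
```

Since the observation numbers are looked up immediately before the -replace-, the result does not depend on any sort done earlier in the do-file.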

Thanks to both of you!
Hewan
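
P.S. A rough sketch of the grouping approach Sarah describes, again on the toy data. The variable name old_dist follows her message; the mapping of 85 and 89 back to 82, the helper variables sum_Y and sum_W, and the use of -egen, total()- are my assumptions for illustration only.

```stata
* Toy data as posted in the thread
clear
input id Y Z W
81 4 1 3
82 . 0 9
85 2 4 1
87 3 1 4
89 6 2 5
end

* old_dist maps each split district back to its parent (assumed mapping)
gen old_dist = id
replace old_dist = 82 if inlist(id, 85, 89)

* Within each old district, sum Y and W over the successor districts only,
* then copy the sum into the parent row (id 82)
foreach v of varlist Y W {
    bysort old_dist: egen double sum_`v' = total(cond(id != old_dist, `v', .))
    replace `v' = sum_`v' if id == 82
}
drop sum_*

list, noobs
```

Because the grouping rests on old_dist rather than on observation numbers, re-sorting the data does not affect the result.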

--- On Fri, 2/11/11, Sarah Edgington <[email protected]> wrote:

> From: Sarah Edgington <[email protected]>
> Subject: RE: st: computation across rows
> To: [email protected]
> Date: Friday, February 11, 2011, 11:14 PM
> .
> It does assume a variable called obs, which Nick suggested
> creating using 
> > gen long obs = _n
> 
> It sounds like for this issue you might be best off creating a variable
> that matches the old district number and using that to group observations
> for your calculations.
> To take your example, if you have a variable district that takes the
> values 85 and 89, you could then have another variable old_dist that is
> 82 for both observations.  Then you can do many of your calculations
> using bysort old_dist to group the observations across which you want to
> do the calculations.
> Doing it with observation numbers instead of some set group identifier
> will create problems if you change the sort order of the data.
> 
> -Sarah
> 
> 
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]
> On Behalf Of Hewan Belay
> Sent: Friday, February 11, 2011 2:55 PM
> To: [email protected]
> Subject: Re: st: computation across rows
> 
> Dear Nick,
> I am a bit confused: Your suggested commands presume that there is a
> variable in the dataset called obs, don't they? I nevertheless tried
> it out, and indeed my presumption was right, as I get the error message
> 
> . su obs if id==82, meanonly
> variable obs not found
> r(111);
> 
> Unless I am fully misunderstanding your suggestion? As to the other parts
> of your response: Yes, I certainly don't want to undertake these
> operations manually using the editor, as I may need to later revise the
> operations when I get more information about the data.
>
> (To give a bit of an idea, my observations in my panel data are
> districts, and some of the districts have split as of a certain year,
> and I seek to aggregate back the characteristics for these split
> districts. So in the toy example, think of district #82 as having split
> into two (#85 and 89) in one of my panel years.)
>
> In light of the above-mentioned error, please let me know if I
> misunderstood your suggestion.
> 
> Hewan
>   
> --- On Fri, 2/11/11, Nick Cox <[email protected]>
> wrote:
> 
> > From: Nick Cox <[email protected]>
> > Subject: Re: st: computation across rows
> > To: [email protected]
> > Date: Friday, February 11, 2011, 1:10 AM
> >
> > I find it hard to see a general pattern under your question. Your toy
> > example would seem easiest to solve by mental arithmetic in the Data
> > Editor, but you wouldn't be asking if that were true of your real
> > problem.
> > 
> > Naturally you can just find the subscripts for each observation and
> > use those, but again I assume from your question you know that you can
> > do that.
> >
> > Some problems a bit like this benefit from a variable containing the
> > observation number:
> > 
> > gen long obs = _n
> > 
> > Then you can find the observation number for -id- 85, etc.
> > 
> > su obs if id == 82, meanonly
> > local obs82 = r(min)
> > su obs if id == 85, meanonly
> > local obs85 = r(min)
> > su obs if id == 89, meanonly
> > local obs89 = r(min)
> > 
> > The assumption here is that each identifier occurs once only so you
> > can indifferently pick up r(min), r(max) or r(mean) after -summarize-.
> > Then you can do things like
> > 
> > replace Y = Y[`obs85'] + Y[`obs89'] in `obs82'
> > 
> > See also
> > 
> > SJ-6-4  dm0025  Stata tip 36: Which observations? Erratum
> >         N. J. Cox
> >         Q4/06   SJ 6(4):596   (no commands)
> >         correction of example code for Stata tip 36
> >
> > SJ-6-3  dm0025  Stata tip 36: Which observations?
> >         N. J. Cox
> >         Q3/06   SJ 6(3):430--432   (no commands)
> >         tip for identifying which observations satisfy some
> >         specified condition
> > 
> > Nick
> > 
> > On Thu, Feb 10, 2011 at 8:17 PM, Hewan Belay <[email protected]>
> > wrote:
> > > Dear Statalist,
> > >
> > > I am trying to do something I expected to be very simple, but I'm
> > > not finding a straightforward way to do this. Essentially, I would
> > > like to do discrete computations across rows/observations (i.e.
> > > within variables). Here is an example of what I mean, consider this
> > > toy dataset (I hope the table is easily visible):
> > >
> > > id    Y    Z    W
> > > 81    4    1    3
> > > 82    .    0    9
> > > 85    2    4    1
> > > 87    3    1    4
> > > 89    6    2    5
> > >
> > > For the id #82, I want the variables Y and W to take on the value
> > > that results when adding their respective values for IDs #85 and 89.
> > > In the above toy example, that means that the missing value would
> > > become an 8, and the value of 9 would change to 6. I definitely don't
> > > want to xpose or reshape my data, as I have several other operations
> > > I am doing on the data given its current structure.
> > >
> > >
> > > So generally speaking, my question is how to do computations across
> > > selected rows. I only have info on this with regard to computations
> > > X rows above or below the concerned row, that is using subscripts
> > > such as [_n+1], or when getting statistics for groups of rows using
> > > the -by- command.
> > >
> > 
> > *
> > *   For searches and help try:
> > *   http://www.stata.com/help.cgi?search
> > *   http://www.stata.com/support/statalist/faq
> > *   http://www.ats.ucla.edu/stat/stata/
> > 


 


