Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Hewan Belay <hewan_belay@yahoo.com>
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: computation across rows
Date: Fri, 11 Feb 2011 16:38:39 -0800 (PST)

Oops, embarrassing that I missed Nick's suggestion to create the variable obs, sorry! Thanks Sarah for pointing me to that.

Thanks also for the alternative to Nick's recommendation: I had been using this method (creating additional temporary variables and then using these to replace across vars), but I also like Nick's approach. The sorting issue is not a problem for the latter, since the g obs=_n would immediately precede the operations in my do file, so it doesn't matter how I had sorted the data prior to that; it would always do the right thing.

Thanks to both of you!

Hewan

--- On Fri, 2/11/11, Sarah Edgington <sedging@ucla.edu> wrote:

> From: Sarah Edgington <sedging@ucla.edu>
> Subject: RE: st: computation across rows
> To: statalist@hsphsun2.harvard.edu
> Date: Friday, February 11, 2011, 11:14 PM
>
> It does assume a variable called obs, which Nick suggested
> creating using
>
> gen long obs = _n
>
> It sounds like for this issue you might be best off creating a
> variable that matches the old district number and using that to
> group observations for your calculations.
> To take your example, if you have a variable district that is equal
> to 85 and 89 you could then have another variable old_dist that is
> 82 for both observations. Then you can do many of your calculations
> using bysort old_dist to group the observations across which you
> want to do the calculations.
> Doing it with observation numbers instead of some set group
> identifier will create problems if you change the sort order of the
> data.
>
> -Sarah
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Hewan Belay
> Sent: Friday, February 11, 2011 2:55 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: computation across rows
>
> Dear Nick,
> I am a bit confused: Your suggested commands presume that there is
> a variable in the dataset called obs don't they? I nevertheless
> tried it out, and indeed my presumption was right, as I get the
> error message
>
> . su obs if id==82, meanonly
> variable obs not found
> r(111);
>
> Unless I am fully misunderstanding your suggestion? As to the other
> parts of your response: Yes, I certainly don't want to undertake
> these operations manually using the editor, as I may need to later
> revise the operations when I get more information about the data.
>
> (To give a bit of an idea, my observations in my panel data are
> districts, and some of the districts have split as of a certain
> year, and I seek to aggregate back the characteristics for these
> split districts--so in the toy example, think of district #82 as
> having split into two (#85 and 89) in one of my panel years.)
>
> In light of above mentioned error, please let me know if I
> misunderstood your suggestion.
>
> Hewan
>
> --- On Fri, 2/11/11, Nick Cox <njcoxstata@gmail.com> wrote:
>
> > From: Nick Cox <njcoxstata@gmail.com>
> > Subject: Re: st: computation across rows
> > To: statalist@hsphsun2.harvard.edu
> > Date: Friday, February 11, 2011, 1:10 AM
> >
> > I find it hard to see a general pattern under your question. Your
> > toy example would seem easiest to solve by mental arithmetic in
> > the Data Editor, but you wouldn't be asking if that were true of
> > your real problem.
> >
> > Naturally you can just find the subscripts for each observation
> > and use those, but again I assume from your question you know
> > that you can do that.
> >
> > Some problems a bit like this benefit from a variable containing
> > the observation number:
> >
> > gen long obs = _n
> >
> > Then you can find the observation number for -id- 85, etc.
> >
> > su obs if id == 82, meanonly
> > local obs82 = r(min)
> > su obs if id == 85, meanonly
> > local obs85 = r(min)
> > su obs if id == 89, meanonly
> > local obs89 = r(min)
> >
> > The assumption here is that each identifier occurs once only so
> > you can indifferently pick up r(min), r(max) or r(mean) after
> > -summarize-. Then you can do things like
> >
> > replace Y = Y[`obs85'] + Y[`obs89'] in `obs82'
> >
> > See also
> >
> > SJ-6-4  dm0025 . . . . Stata tip 36: Which observations? Erratum
> >         . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
> >         Q4/06   SJ 6(4):596                     (no commands)
> >         correction of example code for Stata tip 36
> >
> > SJ-6-3  dm0025 . . . . . . . Stata tip 36: Which observations?
> >         . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
> >         Q3/06   SJ 6(3):430--432                (no commands)
> >         tip for identifying which observations satisfy some
> >         specified condition
> >
> > Nick
> >
> > On Thu, Feb 10, 2011 at 8:17 PM, Hewan Belay <hewan_belay@yahoo.com> wrote:
> > > Dear Statalist,
> > >
> > > I am trying to do something I expected to be very simple, but
> > > I'm not finding a straightforward way to do this. Essentially,
> > > I would like to do discrete computations across
> > > rows/observations (ie within variables). Here is an example of
> > > what I mean, consider this toy dataset (I hope the table is
> > > easily visible):
> > >
> > > id  Y  Z  W
> > > 81  4  1  3
> > > 82  .  0  9
> > > 85  2  4  1
> > > 87  3  1  4
> > > 89  6  2  5
> > >
> > > For the id #82, I want the variables Y and W to take on the
> > > value that results when adding their respective values for IDs
> > > #85 and 89. In the above toy example, that means that the
> > > missing value would become an 8, and the value of 9 would
> > > change to 6. I definitely don't want to xpose or reshape my
> > > data, as I have several other operations I am doing on the data
> > > given its current structure.
> > >
> > > So generally speaking, my question is how to do computations
> > > across selected rows. I only have info on this with regard to
> > > computations X rows above or beyond the concerned row, that is
> > > using the operation [n+1], or when getting statistics for
> > > groups of rows using the -by- command.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
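
Sarah's suggestion in the thread above (a variable holding the pre-split district number, used to group observations) is described without code. A minimal Stata sketch of that idea follows, using the toy dataset from the original question. The variable name old_dist comes from Sarah's message; the temporary names tot_Y and tot_W are hypothetical, and the mapping of ids 85 and 89 back to 82 is assumed from the toy example only. It is one possible way to do this, not necessarily what either poster had in mind.

```stata
* Toy data from the original question
clear
input id Y Z W
81 4 1 3
82 . 0 9
85 2 4 1
87 3 1 4
89 6 2 5
end

* Assumed mapping: districts 85 and 89 were split from district 82
gen old_dist = id
replace old_dist = 82 if inlist(id, 85, 89)

* Within each old district, sum Y and W over the split districts only
* (the parent row's own value is excluded), then write the sum back
* into the parent row. tot_Y and tot_W are hypothetical temporary names.
foreach v of varlist Y W {
    egen double tot_`v' = total(cond(inlist(id, 85, 89), `v', .)), by(old_dist)
    replace `v' = tot_`v' if id == 82
    drop tot_`v'
}

list, clean
```

For id 82 this should give Y = 8 and W = 6, the values asked for in the original question. Because the grouping relies on old_dist rather than on observation numbers, the result does not depend on how the data happen to be sorted.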

**References**: **RE: st: computation across rows**, *From:* "Sarah Edgington" <sedging@ucla.edu>
