# Re: -by:- is sweet [was: Re: Re: st: Creating a new variable with information from other observations]

 From "Davide Cantoni" To statalist@hsphsun2.harvard.edu Subject Re: -by:- is sweet [was: Re: Re: st: Creating a new variable with information from other observations] Date Mon, 19 May 2008 19:05:15 +0200

```
Thank you very much, Nick. This is elegant indeed, and congratulations
for retrieving that piece of high poetry by Sta Ta.

> A more cautious approach slaps an extra condition on the second statement
>
> & is_capital[_N]

Why would you need this? To make sure that the -bysort countryid
(is_capital)- command works and does indeed put capital cities at the
end of the block?

But if I do

bysort countryid (is_capital) : gen latitude_capital = latitude[_N] &
is_capital[_N]

I obtain latitude_capital equal to 1 for all observations, instead of
the desired result (which I get if I do not add "& is_capital[_N]").

Davide

2008/5/19 n j cox <n.j.cox@durham.ac.uk>:
> .
>
> A more mundane solution uses -by:-.
>
> gen is_capital = capitalid == cityid
> bysort countryid (is_capital) : gen latitude_capital = latitude[_N]
>
> The indicator (dummy) is 1 when a city is the capital and 0 otherwise.
>
> If you sort each capital city to the end of the block of observations for a
> country, then you can just pick up its value for the new variable.
>
> A more cautious approach slaps an extra condition on the second statement
>
> & is_capital[_N]
>
> So, no loops necessary at all. Or, more precisely, Stata does the loop
> required automatically as a consequence of -by:-.
>
> The following poem [by one Sta Ta?] fell into my hands recently.
>
> Something to repeat?
> Seek a method neat.
> Loops are lovely,
> -by:- is sweet.
>
> The style leaves much to be desired, but the content is good.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Teresio Poggio
>
> From your dataset I'd build a capitals-only dataset:
> - select just the capitals (drop if cityid !=capitalid)
> - in the new dataset keep just capitalid and latitude
> - rename latitude into latitude_capital
> - sort the data by capitalid and save it
>
> then open your original dataset and sort it by capitalid,
> merge it with the new "just capital dataset" using capitalid as a key
> and the option uniqmaster
> (help merge for details)
>
> Davide Cantoni
>
>> I am having a rather intricate problem in creating a new variable in a
>> panel dataset, and I appreciate any help you could offer. I hope the
>> problem can potentially be of general interest.
>>
>> I have a panel dataset of cities and their characteristics in
>> different countries. I know the latitude of each one of these cities,
>> but now I want to create an additional variable reflecting the
>> latitude of the capital city of the country a given city lies in. So
>> for example: for the cities of New York, Chicago, etc., I want this
>> new variable to contain the latitude of Washington, DC.
>>
>> Here is a description of the dataset's structure: it is a panel in
>> long form, with cities in different countries, observed over different
>> years. Each city has a unique numeric identifier, "cityid". Then there
>> is a country identifier, called "countryid". Finally, there is a
>> variable that repeats the capital city's cityid for each city in a
>> given country, "capitalid". For instance, if the cityid of London was
>> 135, all cities in the dataset that are in the UK would get a value of
>> 135 in the variable "capitalid". Finally, there is a variable called
>> "latitude" that reflects the latitude of each city.
>>
>> How would I now proceed to create this new variable, call it
>> "latitude_capital", by using the variables above?
>>
>> Basically, the problem I'm having is
>> - tell Stata to look up for each city its capitalid
>> - browse the dataset until you find a city that has the cityid equal
>> to this capitalid
>> - find out the latitude of this capital city
>> - go back to the original city and replace "latitude_capital" with the
>> latitude you've just retrieved
>>
>> The additional problem I encounter while trying to construct something
>> with "foreach..." (that, at least, is what I was trying so far) is
>> that the values that the capitalid variable takes are of course not a
>> clean numlist (like "1(1)100"), but rather a sequence of numbers
>> without any regularity, such as 11 12 50 54 60 131... and so on.
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
```
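A possible reading of the "extra condition" discussed above, offered as a sketch rather than as what Nick necessarily intended: the condition may belong in an -if- qualifier and not inside the expression. In Stata, `latitude[_N] & is_capital[_N]` is a logical expression, so it evaluates to 1 or 0, which would explain the result of 1 for all observations. The toy data below (identifiers and latitudes) and the variable name `both_true` are made up for illustration; only `countryid`, `cityid`, `capitalid`, `latitude`, `is_capital` and `latitude_capital` come from the thread.

```stata
* Toy data: all values are invented for illustration only
clear
input countryid cityid capitalid latitude
1 10 12 40.75
1 11 12 41.875
1 12 12 38.875
2 20 20 51.5
2 21 20 53.5
end

* Indicator: 1 when the city is its country's capital, 0 otherwise
gen is_capital = capitalid == cityid

* Extra condition as an -if- qualifier: copy the latitude of the last
* observation in each country block only if that observation is a capital
bysort countryid (is_capital) : gen latitude_capital = latitude[_N] if is_capital[_N]

* For comparison: with & inside the expression, the result is the truth
* value of "both nonzero", that is, 1 or 0, not a latitude
bysort countryid (is_capital) : gen both_true = latitude[_N] & is_capital[_N]

list, sepby(countryid)
```

With the -if- qualifier, a country whose block contains no capital would get missing values in `latitude_capital` instead of the latitude of an arbitrary city, which is presumably the point of the more cautious form.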