
A more mundane solution uses -by:-.

gen is_capital = capitalid == cityid

bysort countryid (is_capital) : gen latitude_capital = latitude[_N]

The indicator (dummy) is 1 when a city is the capital and 0 otherwise.

If you sort each capital city to the end of the block of observations for a country, then you can just pick up its value for the new variable.

A more cautious approach adds an extra condition to the second statement, so that a value is copied only when the last observation in each block really is a capital:

if is_capital[_N]

So, no loops necessary at all. Or, more precisely, Stata does the loop required automatically as a consequence of -by:-.
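As a minimal sketch of the two statements above, including the cautious condition, here is a toy dataset one could run them on. All values are hypothetical: the identifiers borrow the irregular numbers mentioned in the question, the latitudes are made up, and country 3 is an assumed case in which the capital is not in the data at all.

```stata
* Toy panel: cities within countries, some observed in two years (hypothetical values)
clear
input countryid cityid capitalid year latitude
1  11  12 2000 40.7
1  11  12 2001 40.7
1  12  12 2000 38.9
1  12  12 2001 38.9
1  50  12 2000 41.9
2  54  60 2000 53.5
2  60  60 2000 51.5
2  60  60 2001 51.5
3 131 130 2000 48.2
end

* Indicator: 1 if the observation is its country's capital, 0 otherwise
gen byte is_capital = capitalid == cityid

* Sort capitals to the end of each country block and copy the last latitude,
* but only when the last observation really is a capital
bysort countryid (is_capital) : gen latitude_capital = latitude[_N] if is_capital[_N]

list, sepby(countryid)
```

In country 3 no observation is a capital, so the condition leaves latitude_capital missing instead of silently copying the latitude of some other city.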

The following poem [by one Sta Ta?] fell into my hands recently.

Something to repeat?

Seek a method neat.

Loops are lovely,

-by:- is sweet.

The style leaves much to be desired, but the content is good.

Nick

n.j.cox@durham.ac.uk

Teresio Poggio

From your dataset I'd build a capitals-only dataset:

- select just the capitals (drop if cityid != capitalid)

- in the new dataset keep just capitalid and latitude, with one observation per capital (the panel repeats each capital across years)

- rename latitude into latitude_capital

- sort the data by capitalid and save it

Then open your original dataset and sort it by capitalid,

merge it with the new capitals-only dataset using capitalid as the key

and the option uniqusing (capitalid is unique in the capitals dataset, not in the original one)

(help merge for details)
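The steps above can be sketched as follows. This is one possible version, not necessarily what was intended: it assumes Stata 11 or later, where merge m:1 replaces the older uniqueness options, and it uses a temporary file (the macro name capitals is arbitrary) rather than a saved dataset. The toy data are hypothetical.

```stata
* Toy panel (hypothetical values); country 3 has no capital in the data
clear
input countryid cityid capitalid year latitude
1  11  12 2000 40.7
1  12  12 2000 38.9
1  12  12 2001 38.9
2  54  60 2000 53.5
2  60  60 2000 51.5
3 131 130 2000 48.2
end

* Build the capitals-only dataset without losing the original
preserve
keep if cityid == capitalid          // capitals only
keep capitalid latitude
duplicates drop                      // one observation per capital across years
rename latitude latitude_capital
tempfile capitals
save `capitals'
restore

* Many cities in the master data match one capital in the using data
merge m:1 capitalid using `capitals', keep(master match) nogenerate

list, sepby(countryid)
```

If a capital had different latitudes in different years, duplicates drop would leave more than one observation for it and merge m:1 would stop with an error, which is a useful check on the data.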

Davide Cantoni

> I am having a rather intricate problem in creating a new variable in a

> panel dataset, and I appreciate any help you could offer. I hope the

> problem can potentially be of general interest.

>

> I have a panel dataset of cities and their characteristics in

> different countries. I know the latitude of each one of these cities,

> but now I want to create an additional variable reflecting the

> latitude of the capital city of the country a given city lies in. So

> for example: for the cities of New York, Chicago, etc., I want this

> new variable to contain the latitude of Washington, DC.

>

> Here is a description of the dataset's structure: it is a panel in

> long form, with cities in different countries, observed over different

> years. Each city has a unique numeric identifier, "cityid". Then there

> is a country identifier, called "countryid". Finally, there is a

> variable that repeats the capital city's cityid for each city in a

> given country, "capitalid". For instance, if the cityid of London was

> 135, all cities in the dataset that are in the UK would get a value of

> 135 in the variable "capitalid". Finally, there is a variable called

> "latitude" that reflects the latitude of each city.

>

> How would I now proceed to create this new variable, call it

> "latitude_capital", by using the variables above?

>

> Basically, the problem I'm having is

> - tell Stata to look up the capitalid of each city

> - browse the dataset until you find a city that has the cityid equal

> to this capitalid

> - find out the latitude of this capital city

> - go back to the original city and replace "latitude_capital" with the

> latitude you've just retrieved

>

> The additional problem I encounter while trying to construct something

> with "foreach..." (that, at least, is what I was trying so far) is

> that the values that the capitalid variable takes are of course not a

> clean numlist (like "1(1)100"), but rather a sequence of numbers

> without any regularity, such as 11 12 50 54 60 131... and so on.

