I'll try to explain this one last time, and I hope things are clear.

Below are two rows of data, each representing one observation in a STATA dataset (NOT raw data). The first number is the person ID. The first word is a variable recording where that person lived at age 0 (that is, where she was born): person 1 was born in the central region, person 2 in the western region. The number after it is the age at which the spell started, and the next number is the age at which the spell ended. So person 1 lived in the central region until she was 10 years old, and person 2 lived in the western region until she was 30. The next word is where that person moved next, if she ever moved. Person 1 moved to Accra at age 11 and stayed there until she was 20, then moved to the Ashanti region at 21 and stayed there until 27, which is her current age at the time of the survey.

Each of these columns is a variable, and the variables come in series of three (word, number, number). Since person 2 has only two residence spells, she is missing on the later series (which is why you see nothing for person 2 where person 1 shows her Ashanti move) until we come to the next attribute (rural/urban). A person could have moved up to 10 times, so if she moved fewer times than that, she is missing on all subsequent variables and series until the next attribute. So person 1 is missing on the series for the 4th residence, 5th residence, etc., until we come to the ten series for rural/urban life. Person 1 lived in a rural area from age 0 to 3, person 2 lived in an urban area from 0 to 30 and a rural area from 31 to 32, and so forth.

1  central   0  10  accra    11  20  ashanti  21  27  rural   0   3  urban   4   6  rural   7  15  urban  16  20  rural  21  27
2  western   0  30  central  31  32                   urban   0  30  rural  31  32

The words are string variables and the numbers are numeric. In case the layout gets garbled in e-mail, the column breaks come after the person ID, the word, the start age, the end age, and so on (each column is a new variable). Again, once we get past the first variable, which is person ID, the sequence of variables is: place of residence, age started at that place, age ended at that place, 2nd place of residence, age started at that place, age ended at that place, 3rd place of residence, etc.
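
To make the layout unambiguous, here is how the two rows might be typed in with -input-. The variable names (place#, start#, stop# for residences; type#, tstart#, tstop# for rural/urban) are made up for illustration and are not the names in the real file. Only 3 residence series and 5 rural/urban series are shown, not all ten of each.

```stata
* Hypothetical names; "" marks a missing string, . a missing number.
clear
input id str8 place1 start1 stop1 str8 place2 start2 stop2 str8 place3 start3 stop3 str5 type1 tstart1 tstop1 str5 type2 tstart2 tstop2 str5 type3 tstart3 tstop3 str5 type4 tstart4 tstop4 str5 type5 tstart5 tstop5
1 "central" 0 10 "accra"   11 20 "ashanti" 21 27 "rural" 0 3  "urban" 4  6  "rural" 7 15 "urban" 16 20 "rural" 21 27
2 "western" 0 30 "central" 31 32 ""        .  .  "urban" 0 30 "rural" 31 32 ""      . .  ""      .  .  ""      .  .
end

* Show the residence series only: person 2 is missing on the third one.
list id place1-stop3, noobs
```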

The big question is: How can STATA take this spell file and change it into a person-year file?
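
One route that might work, offered as a rough sketch rather than a solution tested on the real file, is to -reshape- the wide series to one row per spell and then -expand- each spell to one row per year of age. The variable names (place#, start#, stop#) are hypothetical, only the residence series is handled, and ages are assumed to be whole years with no gaps or overlaps between spells.

```stata
* Hypothetical names; only three of the ten residence series are shown.
clear
input id str8 place1 start1 stop1 str8 place2 start2 stop2 str8 place3 start3 stop3
1 "central" 0 10 "accra"   11 20 "ashanti" 21 27
2 "western" 0 30 "central" 31 32 ""        .  .
end

* Wide to long: one row per person and residence spell.
reshape long place start stop, i(id) j(spell)

* Drop the empty series of people with fewer spells.
drop if missing(start)

* One copy of each spell per year of age it covers.
expand stop - start + 1

* Number the copies within each spell to recover age.
bysort id spell: generate age = start + _n - 1

sort id age
list id age place if inrange(age, 9, 12), sepby(id)
```

The rural/urban series could presumably be expanded the same way in a separate pass and then combined with this file using -merge 1:1 id age-.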

