[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

From |
"Nick Cox" <n.j.cox@durham.ac.uk> |

To |
<statalist@hsphsun2.harvard.edu> |

Subject |
st: RE: data manipulation |

Date |
Tue, 2 Jul 2002 19:35:05 +0100 |

pmitra@mindspring.com > I have a data set like the following. > > zipcode region quantity revenue flag month year > "37075","11",1,0,"EXP","03","2002 > "37075","04",53,0,"EXP","03","2002" > "38242","04",14,0,"EXP","03","2002" > "38242","11",1,0,"EXP","03","2002" > "70503","06",25,0,"EXP","03","2002" > "70503","11",1,0,"EXP","03","2002" > > I would like to replace the value of region by its value corresponding to > higher quantity by zipcode. For example, for zipcode "37075", I > would like to > replace the region's value of "11" by "04" since region "04" has higher > quantity. The new data set will look like the following: > > "37075","04",1,0,"EXP","03","2002 > "37075","04",53,0,"EXP","03","2002" > "38242","04",14,0,"EXP","03","2002" > "38242","04",1,0,"EXP","03","2002" If you -sort zipcode quantity- then the last value within each -zipcode- will have the highest quantity, so you can assign the corresponding -region- to all observations for that -zipcode-. This is all achievable within one line: bysort zipcode (quantity) : replace region = region[_N] However, if any -quantity- is missing, this will mess up things. So we need to protect against that: gen present = !missing(quantity) bysort zipcode (present quantity): replace region = region[_N] Taking that more slowly, missing(quantity) is 1 if quantity is missing and 0 otherwise !missing(quantity) flips that round: it is 1 is quantity is non-missing and 0 otherwise. The -sort- is hierarchical. * First, we -sort- on -zipcode-. * Within -zipcode- we sort on -present-. Thus all the missing values of -quantity- go first. * Within -present- we sort on -quantity-. Thus the last value of -quantity- will be the highest non-missing value of -quantity- for that -zipcode-. We look across the observation, pick up the value of -region- and smear it across all the observations for that -zipcode-. Nick n.j.cox@durham.ac.uk * * For searches and help try: * http://www.stata.com/support/faqs/res/findit.html * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**References**:**st: data manipulation***From:*<pmitra@mindspring.com>

- Prev by Date:
**st: data manipulation** - Next by Date:
**st: RE: Multiple graphs** - Previous by thread:
**st: data manipulation** - Next by thread:
**st: RE: Multiple graphs** - Index(es):

© Copyright 1996–2014 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |