
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: Dividing a variable
Date: Wed, 23 Oct 2002 20:22:37 +0100

Hoetker, Glenn
>
> Hoping someone can help me with a problem involving dividing up a
> variable. My data consists of patent numbers and inventors and looks
> like this:
>
> nmi
> wku
> Schmitt, Ty; Gandre, Jerry
> 5586003
> Sato, N. Albert; Baker, David C.; Waldron, Christie J.
> 5586324
> Swamy, N. Deepak
> 5587885
>
> I would like it to look like this:
>
> nmi
> wku
> Schmitt, Ty
> 5586003
> Gandre, Jerry
> 5586003
> Sato, N. Albert
> 5586324
> Baker, David C.
> 5586324
> Waldron, Christie J.
> 5586324
> Swamy, N. Deepak
> 5587885
>
> That is, I want to create a record containing each inventor and his or
> her associated patent number. If Ty Schmitt had five patents, he should
> show up in five records. The number of inventors per patent varies from
> one to many.
>
> I've looked for egen functions (and their extensions) and done some
> experimenting, but am floundering. Any help would be very
> appreciated!

The "nmi wku" stuff I don't understand. I am going to be optimistic
and assume it is a preamble you can strip off.

My suggestion is to use -split- from SSC and -reshape-.

For -split-,

. ssc inst split

For -reshape-, we're using the Third Law of Reshaping:

* You may need two -reshape-s to get where you want to be *.

Here's my log:

. l

       whatever
  1.   Schmitt, Ty; Gandre, Jerry
  2.   5586003
  3.   Sato, N. Albert; Baker, David C.; Waldron, Christie J.
  4.   5586324
  5.   Swamy, N. Deepak
  6.   5587885

First we set up row and column identifiers for a -reshape-:

. egen id = seq(), b(2)

. egen field = seq(), t(2)

. l

       whatever                                                 id   field
  1.   Schmitt, Ty; Gandre, Jerry                                1       1
  2.   5586003                                                   1       2
  3.   Sato, N. Albert; Baker, David C.; Waldron, Christie J.    2       1
  4.   5586324                                                   2       2
  5.   Swamy, N. Deepak                                          3       1
  6.   5587885                                                   3       2

Now we map each pair of observations into one:

. reshape wide whatever, i(id) j(field)

. l

Observation 1
      id  1    whatev~1  Schmitt, Ty;..    whatev~2  5586003

Observation 2
      id  2    whatev~1  Sato, N. Alb..    whatev~2  5586324

Observation 3
      id  3    whatev~1  Swamy, N. De..    whatev~2  5587885

. rename whatever1 who

-split- works on some separator. Here it's a semi-colon:

. split who, p(;)
variables created as string:
who1  who2  who3

We have the original -who- and the parts -who?-. The original
will just be in the way:

. drop who

Now the finish is in sight:

. reshape long who, i(id)

. drop if who == ""

. compress

. l

       id   _j   whatever2   who
  1.    1    1     5586003   Schmitt, Ty
  2.    1    2     5586003   Gandre, Jerry
  3.    2    1     5586324   Sato, N. Albert
  4.    2    2     5586324   Baker, David C.
  5.    2    3     5586324   Waldron, Christie J.
  6.    3    1     5587885   Swamy, N. Deepak

Nick
n.j.cox@durham.ac.uk

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
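
The steps in the log can be collected into one do-file. What follows is a minimal sketch, not part of the original message. It assumes the data sit in a single string variable named whatever, with names and patent numbers on alternating observations, and that the "nmi wku" preamble has already been removed. The -input- block only recreates the six example lines from the post, and the trim() step is an added precaution in case the split pieces keep a leading blank after each semi-colon. As far as I know, later Stata releases ship -split- as an official command, in which case the -ssc install- step would not be needed.

```stata
* Sketch only: rebuild the six example observations from the post.
clear
input str60 whatever
"Schmitt, Ty; Gandre, Jerry"
"5586003"
"Sato, N. Albert; Baker, David C.; Waldron, Christie J."
"5586324"
"Swamy, N. Deepak"
"5587885"
end

* Row and column identifiers: id = 1 1 2 2 3 3, field = 1 2 1 2 1 2.
egen id = seq(), block(2)
egen field = seq(), to(2)

* First reshape: put each names line beside its patent number.
reshape wide whatever, i(id) j(field)
rename whatever1 who

* Split the inventor list at each semi-colon into who1, who2, ...
split who, parse(;)
drop who

* Second reshape: one observation per inventor per patent.
reshape long who, i(id)

* Added precaution (not in the original log): remove stray blanks.
replace who = trim(who)
drop if who == ""
compress

list id _j whatever2 who, sepby(id)
```

The full option names block(), to() and parse() are the unabbreviated forms of the b(), t() and p() used in the log.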

**References**: **st: Dividing a variable**, *From:* "Hoetker, Glenn" <ghoetker@uiuc.edu>
