
st: RE: SV: RE: RE: Data transformation

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: SV: RE: RE: Data transformation
Date	Mon, 12 Nov 2007 18:00:37 -0000

If I understand correctly, you are saying that you have some multiple-vehicle (> 2) accidents. For anything in that territory, you might 
find this FAQ of use or interest: 

How do I create variables summarizing for each individual properties of the other members of a group?
http://www.stata.com/support/faqs/data/members.html

Whenever you have missing values in one observation, they 
will get carried across in the way you would want. In the 
simplest kind of case, 

	    foo 
obs 1:    2000
obs 2:      . 

. gen otherfoo = foo[3 - _n] 

Consider obs 1. The other value of -foo- is -foo[3 - 1]- or 
-foo[2]- which is missing. All the code being used 
is out in the open. 

In any case, you can try this out for yourself. You 
need not depend on an answer from the list. 
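A minimal sketch for trying it, assuming a Stata session with nothing in memory that you need to keep (-clear- drops the current data). The two values of -foo- are the ones shown above; nothing else is assumed.

```stata
* Rebuild the two-observation example by hand
clear
input foo
2000
.
end

* Each observation looks at the other one: 3 - 1 = 2 and 3 - 2 = 1
gen otherfoo = foo[3 - _n]

* obs 1 receives foo[2], which is missing; obs 2 receives foo[1], which is 2000
list
```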

This is all in contrast to -egen, total()- which ignores 
missings. Often that is a feature, but in your case 
the direct approach has benefits. The condition -if _N == 2-
makes explicit that the trick is for two-vehicle accidents. 

bys Accidentnr : gen othersweight = weight[3 - _n] if _N == 2 
bys Accidentnr : gen otherscost = cost[3 - _n] if _N == 2 
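
To see the contrast between the two approaches, here is a sketch using the four observations from the original post. One weight is set to missing purely as an assumption for illustration; it is not in the posted data. The names -sumweight- and -viasum- are made up for this example, and adding -(vehiclenr)- to the sort is my own addition to fix the order within each accident.

```stata
* The long-format data from the original post
clear
input Accidentnr vehiclenr weight cost
1 0 1000 35000
1 1 1500 150000
2 0 1200 150000
2 1 1700 750000
end

* Assumption for illustration only: pretend one weight is unknown
replace weight = . if Accidentnr == 2 & vehiclenr == 1

* Direct approach: pick up the other observation within each accident
bysort Accidentnr (vehiclenr) : gen othersweight = weight[3 - _n] if _N == 2
bysort Accidentnr (vehiclenr) : gen otherscost = cost[3 - _n] if _N == 2

* Sum-and-subtract approach for comparison; -egen, total()- ignores missings
bysort Accidentnr : egen sumweight = total(weight)
gen viasum = sumweight - weight

* In accident 2, othersweight is missing for vehicle 0 and 1200 for vehicle 1,
* whereas viasum is 0 for vehicle 0 and missing for vehicle 1
list, sepby(Accidentnr)
```

The subtraction route puts the missing value on the wrong vehicle and invents a zero for the other, which is the problem that the extra -replace- was trying to patch.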

Nick
[email protected] 


Lina Jonsson


To satisfy your curiosity: I have arranged the data so that this subset consists only of two-vehicle accidents. 

Another question, for my own curiosity: will your way of doing things take care of the missing-values problem that I face using Keith's code? Because of missing values I had to add a command after generating the "other" variables, namely:

replace othervar=. if othervar==var

This only works when I have variables that never take the value zero, so not for dummies without further transformations (which I have done). The command above also makes the "other" variable missing in cases where we actually have data on the "other" variable but not on the "own" variable. That is not a problem for me, since I drop these observations in the regression anyway, but it would be nice to know how to solve this problem, just for my curiosity.


/Lina Jonsson

  

> -----Original message-----
> From: [email protected] 
> [mailto:[email protected]] On behalf of Nick Cox
> Sent: 12 November 2007 13:29
> To: [email protected]
> Subject: st: RE: RE: Data transformation
> 
> I jumped to an incorrect reading of your problem through not 
> reading it carefully enough. 
> 
> Point of curiosity: What about accidents involving three or 
> more vehicles? Just not in the data? 
> 
> Point of technique: Keith's code was this 
> 
> bys Accidentnr: egen othersweight=sum(weight)
> bys Accidentnr: egen otherscost=sum(cost)
> replace othersweight=othersweight-weight
> replace otherscost=otherscost-cost
> rename weight ownweight
> rename cost owncost
> 
> namely: other value = sum of two values - this value 
> 
> Given that there are two, and only two, cars for each 
> accident, you can get there in this way too: 
> 
> bys Accidentnr : gen othersweight = weight[3 - _n]
> bys Accidentnr : gen otherscost = cost[3 - _n] 
> 
> The trick is simply a flip or reflection, exploiting the fact 
> under -by:- the subscript _n is determined within groups. 
> Thus if _n is 1, 3 - _n is 2, and vice versa. 
> 
> Actually, if there is just one car in any accident, a call to 
> observation 3 - _n will yield missing values, which is 
> appropriate too.  
> 
> Lina Jonsson
> 
> I have a dataset concerning accidents involving two vehicles 
> that I have in two formats, wide and long like this:
> 
> long:
> 
> Accidentnr	vehiclenr	weight 	cost
> 1		0		1000		35000
> 1		1		1500		150000
> 2		0		1200		150000
> 2		1		1700		750000
> 
> wide:
> 
> Accidentnr	weight0	weight1	cost0		cost1
> 1		1000		1500		35000		150000
> 2		1200		1700		150000	750000
> 
> Now I would like to transform the data to a long format but 
> with information also concerning the other vehicle involved in 
> each accident, like this:
> 
> Accidentnr	vehiclenr	ownweight	othersweight	owncost	otherscost
> 1		0		1000		1500		35000		150000
> 1		1		1500		1000		150000	35000
> 2		0		1200		1700		150000	750000
> 2		1		1700		1200		750000	150000
