Statalist



st: SV: RE: RE: Data transformation


From   "Lina Jonsson" <[email protected]>
To   <[email protected]>
Subject   st: SV: RE: RE: Data transformation
Date   Mon, 12 Nov 2007 17:22:17 +0100

Dear Nick

To satisfy your curiosity: I have arranged the data so that this subset consists only of two-vehicle accidents.

Another question, out of curiosity: will your approach take care of the missing-values problem that I face when using Keith's code? Because of missing values I had to add a command after generating the "other" variables, namely:

replace othervar=. if othervar==var

This only works for variables that never take the value zero, so it does not work for dummies without further transformation (which I have done). The command also sets the "other" variable to missing in cases where we actually have data on the "other" variable but not on the "own" variable. That is not a problem for me, since I drop these observations in the regression anyway, but it would be nice to know how to solve it, just out of curiosity.
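
To make the question concrete, here is a minimal sketch on made-up data, with one weight deliberately set to missing. It applies the subscript approach from the quoted message below. The toy values, and the explicit sort on vehiclenr within accident, are assumptions added for illustration and are not part of the quoted code.

```stata
* Toy data: two accidents, two vehicles each; one weight is missing.
* All values are invented for illustration.
clear
input Accidentnr vehiclenr weight cost
1 0 1000  35000
1 1 1500 150000
2 0    . 150000
2 1 1700 750000
end

* Within each accident, sort by vehiclenr so _n is 1 or 2 in a known order,
* then copy the value held by the other observation of the pair (3 - _n).
bysort Accidentnr (vehiclenr): gen othersweight = weight[3 - _n]
bysort Accidentnr (vehiclenr): gen otherscost   = cost[3 - _n]

rename weight ownweight
rename cost owncost

list, sepby(Accidentnr)
```

Because the subscript copies the other observation's value instead of computing it from a sum, a missing "own" value leaves the "other" value intact, and the "other" value is missing only where the other vehicle's value really is missing. No comparison with zero is involved, so dummies should need no extra transformation under this approach.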


/Lina Jonsson

  

> -----Original message-----
> From: [email protected] 
> [mailto:[email protected]] On behalf of Nick Cox
> Sent: 12 November 2007 13:29
> To: [email protected]
> Subject: st: RE: RE: Data transformation
> 
> I jumped to an incorrect reading of your problem through not 
> reading it carefully enough. 
> 
> Point of curiosity: What about accidents involving three or 
> more vehicles? Just not in the data? 
> 
> Point of technique: Keith's code was this 
> 
> bys Accidentnr: egen othersweight=sum(weight)
> bys Accidentnr: egen otherscost=sum(cost)
> replace othersweight=othersweight-weight
> replace otherscost=otherscost-cost
> rename weight ownweight
> rename cost owncost
> 
> namely: other value = sum of two values - this value 
> 
> Given that there are two, and only two, cars for each 
> accident, you can get there in this way too: 
> 
> bys Accidentnr : gen othersweight = weight[3 - _n]
> bys Accidentnr : gen otherscost = cost[3 - _n]
> 
> The trick is simply a flip or reflection, exploiting the fact 
> that under -by:- the subscript _n is determined within groups. 
> Thus if _n is 1, 3 - _n is 2, and vice versa. 
> 
> Actually, if there is just one car in any accident, a call to 
> observation 3 - _n will yield missing values, which is 
> appropriate too.  
> 
> Lina Jonsson
> 
> I have a dataset concerning accidents involving two vehicles 
> that I have in two formats, wide and long like this:
> 
> long:
> 
> Accidentnr	vehiclenr	weight 	cost
> 1		0		1000		35000
> 1		1		1500		150000
> 2		0		1200		150000
> 2		1		1700		750000
> 
> wide:
> 
> Accidentnr	weight0	weight1	cost0		cost1
> 1		1000		1500		35000		150000
> 2		1200		1700		150000	750000
> 
> Now I would like to transform the data to a long format but 
> with information also concerning the other vehicle involved in 
> each accident, like this:
> 
> Accidentnr  vehiclenr  ownweight  othersweight  owncost  otherscost
> 1           0          1000       1500          35000    150000
> 1           1          1500       1000          150000   35000
> 2           0          1200       1700          150000   750000
> 2           1          1700       1200          750000   150000
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 



