# st: SV: RE: RE: Data transformation

From: "Lina Jonsson" | Subject: st: SV: RE: RE: Data transformation | Date: Mon, 12 Nov 2007 17:22:17 +0100

```
Dear Nick

To answer your point of curiosity: I have arranged the data so that this subset consists only of two-vehicle accidents.

Another question, for my own curiosity: will your approach take care of the missing-values problem that I face when using Keith's code? Because of missing values I had to add a command after generating the "other" variables, namely:

replace othervar=. if othervar==var

This only works for variables that never take the value zero, so not for dummies without further transformation (which I have done). The command above also sets the "other" variable to missing in cases where we actually have data on the "other" variable but not on the "own" variable. That is not a problem for me, since I drop these observations in the regression anyway, but it would be nice to know how to solve this, just out of curiosity.

/Lina Jonsson

> -----Original message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Nick Cox
> Sent: 12 November 2007 13:29
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: RE: Data transformation
>
> I jumped to an incorrect reading of your problem through not
>
> Point of curiosity: What about accidents involving three or
> more vehicles? Just not in the data?
>
>
> bys Accidentnr: egen othersweight=sum(weight)
> bys Accidentnr: egen otherscost=sum(cost)
> replace othersweight=othersweight-weight
> replace otherscost=otherscost-cost
> rename weight ownweight
> rename cost owncost
>
> namely: other value = sum of two values - this value
>
> Given that there are two, and only two, cars for each
> accident, you can get there in this way too:
>
> bys Accidentnr : gen othersweight = weight[3 - _n]
> bys Accidentnr : gen otherscost = cost[3 - _n]
>
> The trick is simply a flip or reflection, exploiting the fact
> that under -by:- the subscript _n is determined within groups.
> Thus if _n is 1, 3 - _n is 2, and vice versa.
>
> Actually, if there is just one car in any accident, a call to
> observation 3 - _n will yield missing values, which is
> appropriate too.
>
> Lina Jonsson
>
> I have a dataset concerning accidents involving two vehicles
> that I have in two formats, wide and long like this:
>
> long:
>
> Accidentnr  vehiclenr  weight  cost
> 1           0          1000    35000
> 1           1          1500    150000
> 2           0          1200    150000
> 2           1          1700    750000
>
> wide:
>
> Accidentnr  weight0  weight1  cost0   cost1
> 1           1000     1500     35000   150000
> 2           1200     1700     150000  750000
>
> Now I would like to transform the data to a long format, but
> with information also concerning the other vehicle involved in
> each accident, like this:
>
> Accidentnr  vehiclenr  ownweight  othersweight  owncost  otherscost
> 1           0          1000       1500          35000    150000
> 1           1          1500       1000          150000   35000
> 2           0          1200       1700          150000   750000
> 2           1          1700       1200          750000   150000
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

```
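
The flip described in the thread, and the missing-values question raised in the reply, can be tried on the four example rows. What follows is a minimal Stata sketch, not code from the thread: accident 3 and its missing weight are hypothetical additions, and the explicit sort on vehiclenr and the -assert- guard are assumptions added for safety.

```stata
* Sketch only: the four example rows from the thread, plus a
* hypothetical accident 3 in which one vehicle's weight is missing.
clear
input Accidentnr vehiclenr weight cost
1 0 1000  35000
1 1 1500 150000
2 0 1200 150000
2 1 1700 750000
3 0    . 200000
3 1 1400  90000
end

* Guard (an added assumption): the flip relies on exactly two
* vehicles per accident, so stop if any accident breaks that.
bysort Accidentnr (vehiclenr): assert _N == 2

* Within each accident _n is 1 or 2, so 3 - _n points at the partner row.
* The subscript copies the partner's value as it is, missing or not.
bysort Accidentnr (vehiclenr): gen othersweight = weight[3 - _n]
bysort Accidentnr (vehiclenr): gen otherscost   = cost[3 - _n]

rename weight ownweight
rename cost   owncost

list, sepby(Accidentnr) noobs
```

Because the subscript simply copies the partner's value, vehicle 1 in the hypothetical accident 3 gets a missing othersweight while vehicle 0 gets 1400. Under this approach no extra -replace- step should be needed, and variables that legitimately take the value zero (such as dummies) are left alone.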