# Re: st: Reshape problem

 From hendratno@dnet.net.id To statalist@hsphsun2.harvard.edu Subject Re: st: Reshape problem Date Fri, 5 May 2006 06:48:03 +0700 (WIT)

```In this case, We have to create a routine program because it's not
standard format. Here is the example :

/* Open the datasets */
/* The example here is original.dta that contain :
+---------------------------------+
|               v1   v2   v3   v4 |
|---------------------------------|
1. |              Sex    M    M    M |
2. |              Age   47   66   56 |
3. |         Left eye                |
4. |        Right eye    Y    Y    Y |
5. |     Lower eyelid    Y    Y    Y |
6. |     Upper eyelid                |
7. |  Lateral canthus                |
8. |   Medial canthus    Y    Y      |
9. | Recurrent lesion                |
10. |   Primary lesion    Y    Y    Y |
+---------------------------------+

The result of process like this :
+-----------------------------------------------------------------------------------------------------------------------------+
| Sex    Age  Left_eye   Right_eye  Lower_eyelid Upper_eyelid
Lateral_canthus Medial_canthus Recurrent_lesion  Primary_lesion |
|-----------------------------------------------------------------------------------------------------------------------------|
|   M     47         .           Y             Y            .
.              Y                .                 |
|   M     66         .           Y             Y            .
.              Y                .               Y |
|   M     56         .           Y             Y            .
.                               .               Y |
+-----------------------------------------------------------------------------------------------------------------------------+

*/

use original,clear

/* Fill in the number of variable you have */
/* In this sample, the local numvar is having 4 variables */
/* The result of this process is destination.dta */

local numvar=4

local vartot=_N
local i = 1

while `i' <= `vartot' {
local j=1

while `j' <= `numvar' {
use original in `i',clear
local varx = v`j'
local nvar = trim("`varx'"[`j'])
local varv = "`varx'"

drop _all

if `j'==1 {
if `i'>1 {
use destination,clear
}
local varname = subinstr("`nvar'"," ","_",1)
qui gen  `varname'=""
}
else {
local obs=`j'-1
use destination,clear
if `j' >= 2 {
if `i'==1 {
qui set obs `obs'
}
qui replace `varname'="`varv'" in `obs'
}
}
qui save destination,replace
local ++j
}
local ++i
}

/* In case contain numerik variables */
qui destring,replace

----- Original Message -----
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Sent: Friday, May 05, 2006 2:57 AM
Subject: RE: st: Reshape problem

>I agree with Radu that this is a problem requiring
> a transpose of the data.
>
> As Radu hints, -xpose- requires all variables
> to be numeric. But -sxpose- on SSC is a rough-and-ready
> transpose for string variables.
>
> So I had a go:
>
> . sxpose, clear force
>
> . l
>
>     +-------------------------------------------------------------------------------------------------------+
>  1. | _var1 | _var2 |    _var3 |     _var4 |        _var5 |        _var6
|           _var7 |          _var8 |
>     |     2 |     3 |        4 |         5 |            6 |            7
|               8 |              9 |
>     |-------------------------------------------------------------------------------------------------------|
>     |                             _var9                  |
_var10                  |
>     |                                10                  |
11                  |
>     +-------------------------------------------------------------------------------------------------------+
>
>     +-------------------------------------------------------------------------------------------------------+
>  2. | _var1 | _var2 |    _var3 |     _var4 |        _var5 |        _var6
|           _var7 |          _var8 |
>     |   Sex |   Age | Left eye | Right eye | Lower eyelid | Upper eyelid
| Lateral canthus | Medial canthus |
>     |-------------------------------------------------------------------------------------------------------|
>     |                             _var9                  |
_var10                  |
>     |                  Recurrent lesion                  |
Primary lesion                  |
>     +-------------------------------------------------------------------------------------------------------+
>
>     +-------------------------------------------------------------------------------------------------------+
>  3. | _var1 | _var2 |    _var3 |     _var4 |        _var5 |        _var6
|           _var7 |          _var8 |
>     |     M |    47 |          |         Y |            Y |
|                 |              Y |
>     |-------------------------------------------------------------------------------------------------------|
>     |                             _var9                  |
_var10                  |
>     |                                                    |
Y                  |
>     +-------------------------------------------------------------------------------------------------------+
>
>     +-------------------------------------------------------------------------------------------------------+
>  4. | _var1 | _var2 |    _var3 |     _var4 |        _var5 |        _var6
|           _var7 |          _var8 |
>     |     M |    66 |          |         Y |            Y |
|                 |              Y |
>     |-------------------------------------------------------------------------------------------------------|
>     |                             _var9                  |
_var10                  |
>     |                                                    |
Y                  |
>     +-------------------------------------------------------------------------------------------------------+
>
>     +-------------------------------------------------------------------------------------------------------+
>  5. | _var1 | _var2 |    _var3 |     _var4 |        _var5 |        _var6
|           _var7 |          _var8 |
>     |     M |    56 |          |         Y |            Y |
|                 |                |
>     |-------------------------------------------------------------------------------------------------------|
>     |                             _var9                  |
_var10                  |
>     |                                                    |
Y                  |
>     +-------------------------------------------------------------------------------------------------------+
>
> . drop in 1
> (1 observation deleted)
>
> . foreach v of var _var* {
>  2.         label var `v' "`= `v'[1]'"
>  3. }
>
> . drop in 1
> (1 observation deleted)
>
> . destring, replace
> _var1 contains non-numeric characters; no replace
> _var2 has all characters numeric; replaced as byte
> _var3 has all characters numeric; replaced as byte
> (3 missing values generated)
> _var4 contains non-numeric characters; no replace
> _var5 contains non-numeric characters; no replace
> _var6 has all characters numeric; replaced as byte
> (3 missing values generated)
> _var7 has all characters numeric; replaced as byte
> (3 missing values generated)
> _var8 contains non-numeric characters; no replace
> _var9 has all characters numeric; replaced as byte
> (3 missing values generated)
> _var10 contains non-numeric characters; no replace
>
> . d
>
> Contains data from spiedel.dta
>  obs:             3
> vars:            10                          4 May 2006 20:39
> size:           183 (99.9% of memory free)
> -------------------------------------------------------------------------------
>              storage  display     value
> variable name   type   format      label      variable label
> -------------------------------------------------------------------------------
> _var1           str3   %9s                    Sex
> _var2           byte   %10.0g                 Age
> _var3           byte   %10.0g                 Left eye
> _var4           str9   %9s                    Right eye
> _var5           str12  %12s                   Lower eyelid
> _var6           byte   %10.0g                 Upper eyelid
> _var7           byte   %10.0g                 Lateral canthus
> _var8           str14  %14s                   Medial canthus
> _var9           byte   %10.0g                 Recurrent lesion
> _var10          str14  %14s                   Primary lesion
> -------------------------------------------------------------------------------
> Sorted by:
>     Note:  dataset has changed since last saved
>
> . l
>
>     +--------------------------------------------------------------------------------+
>     | _var1   _var2   _var3   _var4   _var5   _var6   _var7   _var8
_var9   _var10 |
>     |--------------------------------------------------------------------------------|
>  1. |     M      47       .       Y       Y       .       .       Y
.        Y |
>  2. |     M      66       .       Y       Y       .       .       Y
.        Y |
>  3. |     M      56       .       Y       Y       .       .
.        Y |
>     +--------------------------------------------------------------------------------+
>
>
> Naturally, I can't see what problems lurk in the rest of the data, but so
> far, so good.
>
> Nick
> n.j.cox@durham.ac.uk
>
>
>> i don't know if this is the best way, but you could use the
>> -xpose- command.
>>
>> but before that you need to make all you variables numerical.
>> for example:
>>
>> /*coding males as 1 females as 0, Y as 1 no(missing) as 0 */
>> /*coding age as numeric age*/
>>
>> forval i = 2/256 {
>> gen real_v`i' = 1 if v`i' == "M" | v`i' == "Y"
>> replace real_v`i' = 0 if v`i' == "F" | v`i' == ""
>> replace real_v`i' = real(v`i') in 2
>> }
>>
>> /*now you no longer really need id and v1, as long as you make note of
>> what each row means*/
>>
>> drop id v1
>>
>> /*now you can transpose*/
>> xpose, clear
>>
>> /*now you rename to get the variables name back*/
>> rename v1 sex
>> rename v2 age
>> ...
>>
>> /*now you label values*/
>> label define yesno 1 yes 2 no
>> label define gender 1 male 0 female
>> ...
>>
>> hope this helps. there are probably more elegant solutions though.
>
> Thomas Speidel
>
>> > I have a dataset that I need to re-organize in a long format.
>> > Here is a sample:
>> >
>> > +--------------------------------------+
>> > | id                 v1   v2   v3   v4 |
>> > |--------------------------------------|
>> > |  2                Sex    M    M    M |
>> > |  3                Age   47   66   56 |
>> > |  4           Left eye                |
>> > |  5          Right eye    Y    Y    Y |
>> > |  6       Lower eyelid    Y    Y    Y |
>> > |--------------------------------------|
>> > |  7       Upper eyelid                |
>> > |  8    Lateral canthus                |
>> > |  9     Medial canthus    Y    Y      |
>> > | 10   Recurrent lesion                |
>> > | 11     Primary lesion    Y    Y    Y |
>> > +--------------------------------------+
>> >
>> > There are 255 observations (v2-v256) occupying the columns.
>>  I need each
>> > observation to be a distinct row.  Almost all of the
>> entries are string,
>> > (missing means "NO"). I am having difficulties using the
>> reshape command
>> > to achieve my goal, possibly because of the strings.  Any
>> suggestion on
>> > how to approach this?  Should I create a loop to encode all
>> variables?
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
>

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```