Statalist The Stata Listserver


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

RE: st: Reshape problem


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Reshape problem
Date   Fri, 5 May 2006 10:47:44 +0100

Hendratno is replying to my solution, but 
does not comment on it. 

Here is my code again, but cut to 
the essentials and revised:

sxpose, clear force destring 
drop in 1
foreach v of var _var* {
	label var `v' "`= `v'[1]'"
}
drop in 1

Among other details, I note that this
is shorter. 

Nick 
n.j.cox@durham.ac.uk 

hendratno@dnet.net.id
 
> In this case, We have to create a routine program because it's not
> standard format. Here is the example :
> 
> /* Open the datasets */
> /* The example here is original.dta that contain :
>         +---------------------------------+
>      |               v1   v2   v3   v4 |
>      |---------------------------------|
>   1. |              Sex    M    M    M |
>   2. |              Age   47   66   56 |
>   3. |         Left eye                |
>   4. |        Right eye    Y    Y    Y |
>   5. |     Lower eyelid    Y    Y    Y |
>   6. |     Upper eyelid                |
>   7. |  Lateral canthus                |
>   8. |   Medial canthus    Y    Y      |
>   9. | Recurrent lesion                |
>  10. |   Primary lesion    Y    Y    Y |
>      +---------------------------------+
> 
> The result of process like this :
> +-------------------------------------------------------------
> ----------------------------------------------------------------+
> | Sex    Age  Left_eye   Right_eye  Lower_eyelid Upper_eyelid
> Lateral_canthus Medial_canthus Recurrent_lesion  Primary_lesion |
> |-------------------------------------------------------------
> ----------------------------------------------------------------|
> |   M     47         .           Y             Y            . 
>             
>  .              Y                .                 |
> |   M     66         .           Y             Y            . 
>             
>  .              Y                .               Y |
> |   M     56         .           Y             Y            . 
>             
>  .                               .               Y |
> +-------------------------------------------------------------
> ----------------------------------------------------------------+
> 
> */
> 
> use original,clear
> 
> /* Fill in the number of variable you have */
> /* In this sample, the local numvar is having 4 variables */
> /* The result of this process is destination.dta */
> 
> local numvar=4
> 
> local vartot=_N
> local i = 1
> 
> while `i' <= `vartot' {
>    local j=1
> 
>    while `j' <= `numvar' {
>       use original in `i',clear
>       local varx = v`j'
>       local nvar = trim("`varx'"[`j'])
>       local varv = "`varx'"
> 
>       drop _all
> 
>       if `j'==1 {
>          if `i'>1 {
>              use destination,clear
>          }
>          local varname = subinstr("`nvar'"," ","_",1)
>          qui gen  `varname'=""
>       }
>       else {
>          local obs=`j'-1
>          use destination,clear
>          if `j' >= 2 {
>              if `i'==1 {
>                 qui set obs `obs'
>              }
>              qui replace `varname'="`varv'" in `obs'
>          }
>       }
>       qui save destination,replace
>       local ++j
>    }
>    local ++i
> }
> 
> /* In case contain numerik variables */
> qui destring,replace

Nick Cox 

> >I agree with Radu that this is a problem requiring
> > a transpose of the data.
> >
> > As Radu hints, -xpose- requires all variables
> > to be numeric. But -sxpose- on SSC is a rough-and-ready
> > transpose for string variables.
> >
> > So I had a go:
> >
> > . sxpose, clear force
> >
> > . l
> >
> >     
> +-------------------------------------------------------------
> ------------------------------------------+
> >  1. | _var1 | _var2 |    _var3 |     _var4 |        _var5 | 
>        _var6
> |           _var7 |          _var8 |
> >     |     2 |     3 |        4 |         5 |            6 | 
>            7
> |               8 |              9 |
> >     
> |-------------------------------------------------------------
> ------------------------------------------|
> >     |                             _var9                  |  
>             
>            _var10                  |
> >     |                                10                  |  
>             
>                11                  |
> >     
> +-------------------------------------------------------------
> ------------------------------------------+
> >
> >     
> +-------------------------------------------------------------
> ------------------------------------------+
> >  2. | _var1 | _var2 |    _var3 |     _var4 |        _var5 | 
>        _var6
> |           _var7 |          _var8 |
> >     |   Sex |   Age | Left eye | Right eye | Lower eyelid | 
> Upper eyelid
> | Lateral canthus | Medial canthus |
> >     
> |-------------------------------------------------------------
> ------------------------------------------|
> >     |                             _var9                  |  
>             
>            _var10                  |
> >     |                  Recurrent lesion                  |  
>             
>    Primary lesion                  |
> >     
> +-------------------------------------------------------------
> ------------------------------------------+
> >
> >     
> +-------------------------------------------------------------
> ------------------------------------------+
> >  3. | _var1 | _var2 |    _var3 |     _var4 |        _var5 | 
>        _var6
> |           _var7 |          _var8 |
> >     |     M |    47 |          |         Y |            Y | 
>             
> |                 |              Y |
> >     
> |-------------------------------------------------------------
> ------------------------------------------|
> >     |                             _var9                  |  
>             
>            _var10                  |
> >     |                                                    |  
>             
>                 Y                  |
> >     
> +-------------------------------------------------------------
> ------------------------------------------+
> >
> >     
> +-------------------------------------------------------------
> ------------------------------------------+
> >  4. | _var1 | _var2 |    _var3 |     _var4 |        _var5 | 
>        _var6
> |           _var7 |          _var8 |
> >     |     M |    66 |          |         Y |            Y | 
>             
> |                 |              Y |
> >     
> |-------------------------------------------------------------
> ------------------------------------------|
> >     |                             _var9                  |  
>             
>            _var10                  |
> >     |                                                    |  
>             
>                 Y                  |
> >     
> +-------------------------------------------------------------
> ------------------------------------------+
> >
> >     
> +-------------------------------------------------------------
> ------------------------------------------+
> >  5. | _var1 | _var2 |    _var3 |     _var4 |        _var5 | 
>        _var6
> |           _var7 |          _var8 |
> >     |     M |    56 |          |         Y |            Y | 
>             
> |                 |                |
> >     
> |-------------------------------------------------------------
> ------------------------------------------|
> >     |                             _var9                  |  
>             
>            _var10                  |
> >     |                                                    |  
>             
>                 Y                  |
> >     
> +-------------------------------------------------------------
> ------------------------------------------+
> >
> > . drop in 1
> > (1 observation deleted)
> >
> > . foreach v of var _var* {
> >  2.         label var `v' "`= `v'[1]'"
> >  3. }
> >
> > . drop in 1
> > (1 observation deleted)
> >
> > . destring, replace
> > _var1 contains non-numeric characters; no replace
> > _var2 has all characters numeric; replaced as byte
> > _var3 has all characters numeric; replaced as byte
> > (3 missing values generated)
> > _var4 contains non-numeric characters; no replace
> > _var5 contains non-numeric characters; no replace
> > _var6 has all characters numeric; replaced as byte
> > (3 missing values generated)
> > _var7 has all characters numeric; replaced as byte
> > (3 missing values generated)
> > _var8 contains non-numeric characters; no replace
> > _var9 has all characters numeric; replaced as byte
> > (3 missing values generated)
> > _var10 contains non-numeric characters; no replace
> >
> > . d
> >
> > Contains data from spiedel.dta
> >  obs:             3
> > vars:            10                          4 May 2006 20:39
> > size:           183 (99.9% of memory free)
> > 
> --------------------------------------------------------------
> -----------------
> >              storage  display     value
> > variable name   type   format      label      variable label
> > 
> --------------------------------------------------------------
> -----------------
> > _var1           str3   %9s                    Sex
> > _var2           byte   %10.0g                 Age
> > _var3           byte   %10.0g                 Left eye
> > _var4           str9   %9s                    Right eye
> > _var5           str12  %12s                   Lower eyelid
> > _var6           byte   %10.0g                 Upper eyelid
> > _var7           byte   %10.0g                 Lateral canthus
> > _var8           str14  %14s                   Medial canthus
> > _var9           byte   %10.0g                 Recurrent lesion
> > _var10          str14  %14s                   Primary lesion
> > 
> --------------------------------------------------------------
> -----------------
> > Sorted by:
> >     Note:  dataset has changed since last saved
> >
> > . l
> >
> >     
> +-------------------------------------------------------------
> -------------------+
> >     | _var1   _var2   _var3   _var4   _var5   _var6   _var7 
>   _var8  
> _var9   _var10 |
> >     
> |-------------------------------------------------------------
> -------------------|
> >  1. |     M      47       .       Y       Y       .       . 
>       Y     
>  .        Y |
> >  2. |     M      66       .       Y       Y       .       . 
>       Y     
>  .        Y |
> >  3. |     M      56       .       Y       Y       .       . 
>             
>  .        Y |
> >     
> +-------------------------------------------------------------
> -------------------+
> >
> >
> > Naturally, I can't see what problems lurk in the rest of 
> the data, but so
> > far, so good.
> >
> > Nick
> > n.j.cox@durham.ac.uk
> >
> > Radu Ban
> >
> >> i don't know if this is the best way, but you could use the
> >> -xpose- command.
> >>
> >> but before that you need to make all you variables numerical.
> >> for example:
> >>
> >> /*coding males as 1 females as 0, Y as 1 no(missing) as 0 */
> >> /*coding age as numeric age*/
> >>
> >> forval i = 2/256 {
> >> gen real_v`i' = 1 if v`i' == "M" | v`i' == "Y"
> >> replace real_v`i' = 0 if v`i' == "F" | v`i' == ""
> >> replace real_v`i' = real(v`i') in 2
> >> }
> >>
> >> /*now you no longer really need id and v1, as long as you 
> make note of
> >> what each row means*/
> >>
> >> drop id v1
> >>
> >> /*now you can transpose*/
> >> xpose, clear
> >>
> >> /*now you rename to get the variables name back*/
> >> rename v1 sex
> >> rename v2 age
> >> ...
> >>
> >> /*now you label values*/
> >> label define yesno 1 yes 2 no
> >> label define gender 1 male 0 female
> >> ...
> >>
> >> hope this helps. there are probably more elegant solutions though.
> >
> > Thomas Speidel
> >
> >> > I have a dataset that I need to re-organize in a long format.
> >> > Here is a sample:
> >> >
> >> > +--------------------------------------+
> >> > | id                 v1   v2   v3   v4 |
> >> > |--------------------------------------|
> >> > |  2                Sex    M    M    M |
> >> > |  3                Age   47   66   56 |
> >> > |  4           Left eye                |
> >> > |  5          Right eye    Y    Y    Y |
> >> > |  6       Lower eyelid    Y    Y    Y |
> >> > |--------------------------------------|
> >> > |  7       Upper eyelid                |
> >> > |  8    Lateral canthus                |
> >> > |  9     Medial canthus    Y    Y      |
> >> > | 10   Recurrent lesion                |
> >> > | 11     Primary lesion    Y    Y    Y |
> >> > +--------------------------------------+
> >> >
> >> > There are 255 observations (v2-v256) occupying the columns.
> >>  I need each
> >> > observation to be a distinct row.  Almost all of the
> >> entries are string,
> >> > (missing means "NO"). I am having difficulties using the
> >> reshape command
> >> > to achieve my goal, possibly because of the strings.  Any
> >> suggestion on
> >> > how to approach this?  Should I create a loop to encode all
> >> variables?

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index