
Notice: On April 23, 2014, Statalist moved from an email list to a forum.


st: Re: data transformation

From   "Joseph Coveney" <[email protected]>
To   <[email protected]>
Subject   st: Re: data transformation
Date   Mon, 4 Mar 2013 12:35:03 +0900

Seok-Woo Kwon wrote:

I do not know how to describe my problem in general terms. So let me
use an example to describe it. (I am using Stata 12 for Windows.)

I have about 10,000 observations and 1,000 variables on individuals,
but 4 observations and 3 variables will be enough to show the problem.
My data looks like this:

     id     X     Y     Z
    001     1     2     X
    002     2     3     Y
    003     3     5     2
    004     4     6     Y

I need data that looks like this:

     id     X     Y     Z     Z_transformed
    001     1     2     X                 1
    002     2     3     Y                 3
    003     3     5     2                 2
    004     4     6     Y                 6

That is, some values of the variable "Z" are variable names (like
"X" or "Y"). I would like to replace each such name with the value
that the named variable takes in the same observation.
For example, the value of Z for id 001 is X. Instead of X, I would
like to show the value of X for id 001 (which is 1). Is there a way to
program this in Stata?


Yes.  See below.  You might want to look upstream to see whether there is a way
to get the data the way you need it in the first place.

Joseph Coveney

. clear *

. set more off

. input str3 id byte (X Y) str1 Z

            id         X         Y          Z
  1.           "001"      1      2    "X"
  2.           "002"      2      3    "Y"
  3.           "003"      3      5    "2"
  4.           "004"      4      6    "Y"
  5. end

. *
. * Save id, X and Y so they can be merged back for the final listing
. *
. preserve

. drop Z

. tempfile tmpfil0

. quietly save `tmpfil0'

. restore

. *
. * Set aside observations whose Z is already numeric
. *
. preserve

. quietly drop if missing(real(Z))

. keep id Z

. quietly destring Z, generate(Z_transformed)

. tempfile tmpfil1

. quietly save `tmpfil1'

. restore

. quietly keep if missing(real(Z))

. *
. * Reshape target variables long
. *
. preserve

. keep id Z

. tempfile tmpfil2

. quietly save `tmpfil2'

. restore

. drop Z

. foreach var of varlist _all {
  2.         if "`var'" == "id" continue
  3.         rename `var' Z_transformed`var'
  4. }

. quietly reshape long Z_transformed, i(id) j(Z) string

. *
. * Let -merge- select the values
. *
. merge m:1 id Z using `tmpfil2', assert(match master) keep(match) ///
>     nogenerate noreport

. *
. * Reassembly
. *
. append using `tmpfil1'

. *
. * Merge X and Y back in and tidy up for the listing
. *
. merge 1:1 id using `tmpfil0', assert(match) nogenerate noreport

. order id X Y Z

. sort id

. list, noobs separator(0) abbreviate(20)

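  +---------------------------------+
  |  id   X   Y   Z   Z_transformed |
  |---------------------------------|
  | 001   1   2   X               1 |
  | 002   2   3   Y               3 |
  | 003   3   5   2               2 |
  | 004   4   6   Y               6 |
  +---------------------------------+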

. exit

end of do-file
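
For comparison, the same result can probably be reached with a plain loop instead of the reshape-and-merge route shown in the log. The following is only a sketch and is not part of the original reply. It assumes that every variable name appearing in Z refers to a numeric variable, that Z is long enough to hold the full variable names, and that looping over all numeric variables is acceptable for the size of the data. The local macro name numvars is made up for this illustration.

```stata
clear
input str3 id byte (X Y) str1 Z
"001" 1 2 "X"
"002" 2 3 "Y"
"003" 3 5 "2"
"004" 4 6 "Y"
end

* Collect the numeric variables before the new variable exists
* (numvars is a hypothetical name chosen for this sketch).
ds, has(type numeric)
local numvars `r(varlist)'

* Start from the values of Z that are already numbers;
* real() returns missing for strings such as "X" or "Y".
generate double Z_transformed = real(Z)

* Where Z holds a variable's name, copy that variable's value
* from the same observation.
foreach var of local numvars {
    quietly replace Z_transformed = `var' if Z == "`var'"
}

list, noobs separator(0) abbreviate(20)
```

Any observation whose Z is neither a number nor the name of a numeric variable keeps a missing Z_transformed, so counting the missing values afterwards is a cheap check on the input.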
