Re: st: Arranging variables across rows
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Arranging variables across rows
Date: Tue, 26 Jun 2012 23:30:59 +0100
-rowsort- is from the Stata Journal (SJ). I think you are correct: it
does not help here at all, nor does it purport to.
My main advice is to restructure to a long structure as fast as
possible. With the present wide structure, this will be only the first
of several awkward problems, and not even the most difficult of those.
I created some similar data and did this:
. list
     +-------------------------------------------------+
     |  A1    A2    A3    A4    B1    B2   B3   family |
     |-------------------------------------------------|
  1. | 101   102   103   104   102   104    .    alpha |
  2. | 201   202   203   204   203     .    .     beta |
     +-------------------------------------------------+
. keep family A*
. reshape long A, i(family)
(note: j = 1 2 3 4)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                        2   ->       8
Number of variables                   5   ->       3
j variable (4 values)                     ->   _j
xij variables:
                           A1 A2 ... A4   ->   A
-----------------------------------------------------------------------------
. drop if A == .
(0 observations deleted)
. drop _j
. gen treated = 0
. rename A person
. save Afile, replace
file Afile.dta saved
. use rowprob
. keep family B*
. reshape long B, i(family)
(note: j = 1 2 3)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                        2   ->       6
Number of variables                   4   ->       3
j variable (3 values)                     ->   _j
xij variables:
                               B1 B2 B3   ->   B
-----------------------------------------------------------------------------
. rename B person
. drop if person == .
(3 observations deleted)
. drop _j
. gen treated = 1
. list
     +---------------------------+
     | family   person   treated |
     |---------------------------|
  1. |  alpha      102         1 |
  2. |  alpha      104         1 |
  3. |   beta      203         1 |
     +---------------------------+
. append using Afile
. collapse (max) treated, by(family person)
. list
     +---------------------------+
     | family   person   treated |
     |---------------------------|
  1. |  alpha      101         0 |
  2. |  alpha      102         1 |
  3. |  alpha      103         0 |
  4. |  alpha      104         1 |
  5. |   beta      201         0 |
     |---------------------------|
  6. |   beta      202         0 |
  7. |   beta      203         1 |
  8. |   beta      204         0 |
     +---------------------------+
Here's the code in case it's useful.
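list

* family members (A1, A2, ...) in long form, flagged as untreated
keep family A*
reshape long A, i(family)
* drop empty member slots
drop if A == .
drop _j
gen treated = 0
rename A person
save Afile, replace

* treated individuals (B1, B2, ...) in long form, flagged as treated
use rowprob
keep family B*
reshape long B, i(family)
rename B person
* drop empty treatment slots
drop if person == .
drop _j
gen treated = 1
list

* stack both lists; anyone who appears in B ends up with treated == 1
append using Afile
collapse (max) treated, by(family person)
list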
I think this is messier than a single -reshape- because you have
different numbers of A and B variables and they don't map on to one
another. There would be a -merge- solution as well, for sure.
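A minimal sketch of what a -merge- route might look like follows. It is
an illustration under assumptions, not tested on the original poster's
data: it recreates the small example listed above instead of reading
rowprob.dta, it assumes each person appears at most once per family in
the A variables and at most once in the B variables, and the names slot,
D, wide and treatedlist are made up for the example. Unlike the code
above it keeps the slot number, so the result can go back to one row per
family with D1, D2, ... alongside A1, A2, ...

```stata
* Sketch only: recreates the small example data shown in the listing above
clear
input A1 A2 A3 A4 B1 B2 B3 str5 family
101 102 103 104 102 104 . "alpha"
201 202 203 204 203 . . "beta"
end
tempfile wide treatedlist
save `wide'

* long list of treated people, one row per (family, person)
keep family B*
reshape long B, i(family) j(slot)
drop if missing(B)
rename B person
keep family person
save `treatedlist'

* long list of all family members, keeping the slot number
use `wide', clear
keep family A*
reshape long A, i(family) j(slot)
drop if missing(A)
rename A person

* flag members who also appear in the treated list
merge 1:1 family person using `treatedlist', keep(master match)
gen byte D = (_merge == 3)
drop _merge

* back to one row per family: A1 D1 A2 D2 ...
rename person A
reshape wide A D, i(family) j(slot)
list
```

Because of the temporary files, this needs to be run as a whole from a
do-file. Empty A slots are dropped before the -merge-, so after the final
-reshape wide- they come back as missing in both A and D.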
On Tue, Jun 26, 2012 at 10:37 PM, samuel gyetvay <[email protected]> wrote:
> I have two sets of variables, let's call them A1, A2, ... A19 and B1,
> B2, ... B8.
>
> A1, A2, ... A19 give identification numbers for up to 19 individuals
> per family. Each family occupies a row in the data set.
>
> B1, B2, ... B8 list identification numbers of up to 8 individuals who
> have received treatment.
>
> I need to preserve the order and placement of variables in A1, ... A19
> and would like to create a dummy variable equal to 1 whenever an
> individual has received treatment. Basically, I need to go from
> something like this:
>
> A1  A2  A3  ... A19
> 101 102 103 ... 119
>
> B1  B2  B3  ... B8
> 103 .   .   ... .
>
> To something like this
>
> A1  A2  A3  ... A19
> 101 102 103 ... 119
>
> D1  D2  D3  ... D19
> 0   0   1   ... 0
>
> I am aware of the command rowsort, but it does not solve this
> particular problem. rowsort would turn
>
> B1  B2  B3  ... B8
> .   .   102 ... .
>
> into
>
> B1  B2  B3  ... B8
> 102 .   .   ... .
>
> when what I need is
>
> B1  B2  B3  ... B8
> .   102 .   ... .
>
> I could create a dummy variable equal to 1 when A is equal to B
>
>
> Hopefully this question is clearly phrased, and there exists a simple
> solution. Please let me know if you have any suggestions or if
> anything is unclear.
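
If the wide layout really must be kept throughout, the dummies described
in the question could in principle be built by comparing every A variable
with every B variable in a pair of nested loops. The following is only a
sketch under assumptions: it uses the small example data from the reply
(4 A variables, 3 B variables) instead of the real file, and the local
macro names nA and nB are made up for the illustration. For the data in
the question they would be 19 and 8.

```stata
* Sketch only: recreates the small example data from the reply
clear
input A1 A2 A3 A4 B1 B2 B3 str5 family
101 102 103 104 102 104 . "alpha"
201 202 203 204 203 . . "beta"
end

* number of A and B variables (19 and 8 in the question's data)
local nA 4
local nB 3

forvalues i = 1/`nA' {
    * start at 0 for occupied slots; empty slots stay missing
    gen byte D`i' = 0 if !missing(A`i')
    forvalues j = 1/`nB' {
        * the !missing() guard matters: missing == missing is true in Stata
        replace D`i' = 1 if A`i' == B`j' & !missing(A`i')
    }
}
list family A* D*
```

This leaves A1, A2, ... untouched and in place, at the cost of one
-replace- per pair of variables. The local macros mean it should be run
as a whole from a do-file.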