
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: reshape query


From   Richard Herron <[email protected]>
To   [email protected]
Subject   Re: st: reshape query
Date   Fri, 3 Jan 2014 15:20:56 -0500

You can do the -reshape- in one pass. Also, the -id- variable isn't
necessary in this case, and I think I would keep the Yes/No variable as
a string so you get more meaningful variable names (I think if you
used -encode- then 1 is N and 2 is Y, but I would probably mix those up).

Does the following help?

* * * * *

clear

* some data
input age     str1 invsurg2_string        id      age_
 17      Y               1               1
 18      Y               1               1
 19      Y               1               2
 20      Y               1               2
 21      Y               1               1
 22      Y               1               3
 23      Y               1               3
 24      Y               1               9
 25      Y               1               12
 26      Y               1               19
 27      Y               1               26
 28      Y               1               32
 29      Y               1               41
 30      Y               1               65
 31      Y               1               86
 32      Y               1               75
 17      N               2               1
 18      N               2               1
 19      N               2               2
 20      N               2               2
 21      N               2               1
 22      N               2               3
 23      N               2               3
 24      N               2               9
 25      N               2               12
 26      N               2               19
 27      N               2               26
 28      N               2               32
 29      N               2               41
 30      N               2               65
 31      N               2               86
 32      N               2               75
end
encode invsurg2_string, generate(invsurg2)
replace age_ = age_ + 1 if (invsurg2_string == "N")
drop invsurg2_string
list

* reshape
* no need for -id-, -reshape wide- can use -age- directly
drop id

* I would use -invsurg2- as string to get clearer varnames
decode invsurg2, generate(YN)
drop invsurg2

* now reshape to wide, using the -string- option because -YN- is a string
reshape wide age_, i(age) j(YN) string
renpfix age_ // strips the age_ prefix; -renpfix- ships with Stata 11, no SSC install needed
order age Y N
list

. list
     +---------------+
     | age    Y    N |
     |---------------|
  1. |  17    1    2 |
  2. |  18    1    2 |
  3. |  19    2    3 |
  4. |  20    2    3 |
  5. |  21    1    2 |
     |---------------|
  6. |  22    3    4 |
  7. |  23    3    4 |
  8. |  24    9   10 |
  9. |  25   12   13 |
 10. |  26   19   20 |
     |---------------|
 11. |  27   26   27 |
 12. |  28   32   33 |
 13. |  29   41   42 |
 14. |  30   65   66 |
 15. |  31   86   87 |
     |---------------|
 16. |  32   75   76 |
     +---------------+
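
If you would rather keep -invsurg2- numeric, the following is a minimal sketch of the same one-pass -reshape- without the -decode- step. It is not part of the code above. It uses the first three ages from the desired output in the original question and assumes the numeric codes come from -encode-, which numbers the strings alphabetically (N = 1, Y = 2). If your real variable is coded differently, check with -label list- and adjust the two -rename- lines.

```stata
clear

* three ages taken from the desired output in the question
input age str1 invsurg2_string age_
 17      Y       1
 18      Y       1
 19      Y       2
 17      N       0
 18      N       0
 19      N       0
end

* assumption: codes come from -encode-, so N = 1 and Y = 2
encode invsurg2_string, generate(invsurg2)
label list invsurg2
drop invsurg2_string

* a numeric j() yields age_1 and age_2, so rename them by hand
reshape wide age_, i(age) j(invsurg2)
rename age_1 N
rename age_2 Y
order age Y N
list
```

The string version gives you the Y and N names for free, which is why it is less error-prone than remembering which code is which.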

On Fri, Jan 3, 2014 at 12:01 PM, Tim Evans <[email protected]> wrote:
> Hi all,
>
>
> I'm using Stata 11.2 and trying to reshape a dataset to an output that I would like.
>
> I am trying to get my data in the format (note that invsurg2 is spurious at the moment and will be dropped when I get to this point)
>
> age     y       n
> 17      1       0
> 18      1       0
> 19      2       0
> 20      2       0
> 21      1       0
> 22      3       0
> 23      3       1
> 24      9       0
> 25      12      5
> ~~
> ~~
> 94      22      108
> 95      9       87
> 96      12      65
> 97      2       44
> 98      4       16
> 99      1       11
>
> y and n relate to invsurg2 which is shown below
>
> My initial data look like this:
>
> age     invsurg2        id      age_
> 17      Y               1               1
> 18      Y               1               1
> 19      Y               1               2
> 20      Y               1               2
> 21      Y               1               1
> 22      Y               1               3
> 23      Y               1               3
> 24      Y               1               9
> 25      Y               1               12
> 26      Y               1               19
> 27      Y               1               26
> 28      Y               1               32
> 29      Y               1               41
> 30      Y               1               65
> 31      Y               1               86
> 32      Y               1               75
>
> Where invsurg2 is a Y/N variable (numeric behind a label), id has been generated using:
> bysort age: g id=_n
> and age_ currently holds a count, but for reasons of reshaping I found it helpful to call the variable age_
>
> This run of data runs to age 99, and then is repeated, but for invsurg2==N
>
> I reshape the data using the following:
>
> reshape wide age_, i(id inv) j(age)
>
> Which returns the following:
>
> id      invsurg2        age_17  age_18  age_19  age_20  age_21  age_22  age_23  age_24  age_25  age_94  age_95  age_96  age_97  age_98  age_99
> 1       Y               1       1       2       2       1       3       3       9       12      22      9       12      2       4       1
> 2       N               0       0       0       0       0       0       1       0       5       108     87      65      44      16      11
>
> What I have been struggling with is how to complete my final reshape to provide this desired output:
>
> age     y       n
> 17      1       0
> 18      1       0
> 19      2       0
> 20      2       0
>
> As indicated earlier, invsurg2
>
> I would appreciate any help here.
>
> Best wishes
>
> Tim
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/


