Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: reshape query
From: Tim Evans <[email protected]>
To: "[email protected]" <[email protected]>
Subject: RE: st: reshape query
Date: Mon, 6 Jan 2014 10:04:17 +0000
Hi Richard, Cam,
Thanks for your responses. I do still want to manipulate the data within Stata, so I still need to do some work with -reshape-.
Richard, I've followed your code and this did the trick - many thanks for your help.
Best wishes
Tim
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Richard Herron
Sent: 03 January 2014 20:21
To: [email protected]
Subject: Re: st: reshape query
You can do the -reshape- in one pass. Also, the -id- variable isn't necessary in this case, and I think I would keep the Yes/No variable as a string so you get more meaningful variable names (I think if you used -encode- then 1 is N and 2 is Y, but I would mix that up).
Does the following help?
* * * * *
clear
* some data
input age str1 invsurg2_string id age_
17 Y 1 1
18 Y 1 1
19 Y 1 2
20 Y 1 2
21 Y 1 1
22 Y 1 3
23 Y 1 3
24 Y 1 9
25 Y 1 12
26 Y 1 19
27 Y 1 26
28 Y 1 32
29 Y 1 41
30 Y 1 65
31 Y 1 86
32 Y 1 75
17 N 2 1
18 N 2 1
19 N 2 2
20 N 2 2
21 N 2 1
22 N 2 3
23 N 2 3
24 N 2 9
25 N 2 12
26 N 2 19
27 N 2 26
28 N 2 32
29 N 2 41
30 N 2 65
31 N 2 86
32 N 2 75
end
encode invsurg2_string, generate(invsurg2)
replace age_ = age_ + 1 if (invsurg2_string == "N")
drop invsurg2_string
list
* reshape
* no need for -id-, -reshape wide- can use -age- directly
drop id
* I would use -invsurg2- as string to get clearer varnames
decode invsurg2, generate(YN)
drop invsurg2
* now reshape to wide, using the -string- option because -YN- is a string
reshape wide age_, i(age) j(YN) string
renpfix age_ // -renpfix- is an official command in Stata 11 (not from SSC)
order age Y N
list
. list
+---------------+
| age Y N |
|---------------|
1. | 17 1 2 |
2. | 18 1 2 |
3. | 19 2 3 |
4. | 20 2 3 |
5. | 21 1 2 |
|---------------|
6. | 22 3 4 |
7. | 23 3 4 |
8. | 24 9 10 |
9. | 25 12 13 |
10. | 26 19 20 |
|---------------|
11. | 27 26 27 |
12. | 28 32 33 |
13. | 29 41 42 |
14. | 30 65 66 |
15. | 31 86 87 |
|---------------|
16. | 32 75 76 |
+---------------+
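If the Yes/No variable is kept numeric instead (in the original data it is described as numeric behind a label), the same one-pass -reshape- should still work; the new variables just get numeric suffixes that need renaming afterwards. The lines below are only a minimal sketch under stated assumptions: the coding 1 = Y, 2 = N is hypothetical (check the real coding with -label list-), and the three ages are taken from the desired output in the original question.

```stata
clear
* small illustrative dataset: one row per age and invsurg2 level
* ASSUMPTION: invsurg2 is coded 1 = Y, 2 = N (hypothetical coding)
input age invsurg2 age_
17 1 1
17 2 0
18 1 1
18 2 0
19 1 2
19 2 0
end
* one pass: -age- identifies rows, -invsurg2- supplies the suffixes
reshape wide age_, i(age) j(invsurg2)
* age_1 and age_2 now hold the Y and N counts under the assumed coding
rename age_1 y
rename age_2 n
list
```

Keeping the variable as a string, as in the code above, avoids having to remember which number stands for Y and which for N.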
On Fri, Jan 3, 2014 at 12:01 PM, Tim Evans <[email protected]> wrote:
> Hi all,
>
>
> I'm using Stata 11.2 and trying to reshape a dataset to an output that I would like.
>
> I am trying to get my data in the format (note that invsurg2 is
> spurious at the moment and will be dropped when I get to this point)
>
> age y n
> 17 1 0
> 18 1 0
> 19 2 0
> 20 2 0
> 21 1 0
> 22 3 0
> 23 3 1
> 24 9 0
> 25 12 5
> ~~
> ~~
> 94 22 108
> 95 9 87
> 96 12 65
> 97 2 44
> 98 4 16
> 99 1 11
>
> y and n relate to invsurg2 which is shown below
>
> My initial data look like this:
>
> age invsurg2 id age_
> 17 Y 1 1
> 18 Y 1 1
> 19 Y 1 2
> 20 Y 1 2
> 21 Y 1 1
> 22 Y 1 3
> 23 Y 1 3
> 24 Y 1 9
> 25 Y 1 12
> 26 Y 1 19
> 27 Y 1 26
> 28 Y 1 32
> 29 Y 1 41
> 30 Y 1 65
> 31 Y 1 86
> 32 Y 1 75
>
> Where invsurg2 is a Y/N variable (numeric behind a label), id has been generated using:
> bysort age: g id=_n
> and age_ currently relates to a count, but for reasons of reshaping,
> found it helpful to call the variable age_
>
> This run of data runs to age 99, and then is repeated, but for
> invsurg2==N
>
> I reshape the data using the following:
>
> reshape wide age_, i(id inv) j(age)
>
> Which returns the following:
>
> id invsurg2 age_17 age_18 age_19 age_20 age_21 age_22 age_23 age_24 age_25 age_94 age_95 age_96 age_97 age_98 age_99
> 1 Y 1 1 2 2 1 3 3 9 12 22 9 12 2 4 1
> 2 N 0 0 0 0 0 0 1 0 5 108 87 65 44 16 11
>
> What I have been struggling with is how to complete my final reshape to provide this desired output:
>
> age y n
> 17 1 0
> 18 1 0
> 19 2 0
> 20 2 0
>
> As indicated earlier, invsurg2
>
> I would appreciate any help here.
>
> Best wishes
>
> Tim
>
> **************************************************************************
> The information contained in the EMail and any attachments is
> confidential and intended solely and for the attention and use of the
> named addressee(s). It may not be disclosed to any other person
> without the express authority of Public Health England, or the
> intended recipient, or both. If you are not the intended recipient,
> you must not disclose, copy, distribute or retain this message or any
> part of it. This footnote also confirms that this EMail has been swept
> for computer viruses by Symantec.Cloud, but please re-sweep any
> attachments before opening or saving. http://www.gov.uk/PHE
> **************************************************************************
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/