
st: RE: reshape with j split


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: reshape with j split
Date   Mon, 15 Dec 2003 09:58:20 -0000

I think you're moving in the right direction
for a -reshape- to wide. As for the ANOVA,
your original data structure looks better.
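
On staying in the original layout: the ratio can also be computed in long form without relying on row positions. This is a minimal sketch, not a tested recipe for the full data. It assumes the baseline is the mean of the s1level == 0 rows within each animal; the variable name base is made up for illustration, and the rows are typed in from the quoted listing, whose displayed values may be rounded.

```stata
* Sketch only: six rows typed in from the listing quoted below
clear
input s1level s1s2delay str6 animal s2peakvalue
 0  50 "1_1_0F"  773.75
 0 100 "1_1_0F" 1001.63
75  50 "1_1_0F"  472.5
75 100 "1_1_0F"  927.875
85  50 "1_1_0F"  611.375
85 100 "1_1_0F"  654.375
end

* Baseline per animal: mean of the s1level == 0 rows, wherever they sit
* (base is a hypothetical name)
bysort animal: egen base = mean(cond(s1level == 0, s2peakvalue, .))

* Percent reduction relative to baseline; left missing on baseline rows
gen ppi2 = (base - s2peakvalue)/base*100 if s1level != 0

list, sep(6)
```

Because the baseline is selected by value rather than by [_n-step], a missing or out-of-order row changes the mean for that animal but does not shift which rows are used.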

egen j = concat(s1*), p("_")
drop s1*
reshape wide s2peak, i(animal) j(j) string
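
A self-contained sketch of the same steps, carried through to the ratios, might look like this. The assumptions are mine: the measurement is renamed to s2peak first, so the wide names come out as s2peak0_50 and so on without depending on variable abbreviation; base and the ppi* names are invented for illustration; the rows are typed in from the quoted listing.

```stata
* Sketch only: twelve rows typed in from the quoted listing
clear
input s1level s1s2delay str6 animal s2peakvalue
 0  50 "1_1_0F"  773.75
 0 100 "1_1_0F" 1001.63
75  50 "1_1_0F"  472.5
75 100 "1_1_0F"  927.875
85  50 "1_1_0F"  611.375
85 100 "1_1_0F"  654.375
 0  50 "1_1_1F" 1116.88
 0 100 "1_1_1F" 1101.38
75  50 "1_1_1F"  544.875
75 100 "1_1_1F"  567.875
85  50 "1_1_1F"  443.875
85 100 "1_1_1F"  466
end

* Assumption: rename so the stub passed to -reshape- is the full variable name
rename s2peakvalue s2peak

* Build a string j such as "0_50" or "75_100" from the two factors
egen j = concat(s1level s1s2delay), punct("_")
drop s1level s1s2delay
reshape wide s2peak, i(animal) j(j) string

* Baseline: mean of the two s1level == 0 conditions (base is hypothetical)
gen base = (s2peak0_50 + s2peak0_100)/2

* Percent reduction for each of the four remaining conditions
foreach c in 75_50 75_100 85_50 85_100 {
    gen ppi`c' = (base - s2peak`c')/base*100
}

list animal ppi*
```

Keeping the factor values inside j means the wide variable names carry the treatment information, which a numeric group code from -egen, group()- would not.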

Nick
n.j.cox@durham.ac.uk

David Airey wrote:

> I have a reshape question. I find this one of the hardest commands to
> remember how to use.
>
> I cannot find a help example that exactly parallels my situation. I
> have an identifier that is split between variables. This situation is
> common in ANOVA where treatment cells may be identified by more than
> one factor.
>
> I have data like:
>
> . list, sep(6)
>
>       +----------------------------------------+
>       | s1level   s1s2de~y   animal   s2peak~e |
>       |----------------------------------------|
>    1. |       0         50   1_1_0F     773.75 |
>    2. |       0        100   1_1_0F    1001.63 |
>    3. |      75         50   1_1_0F      472.5 |
>    4. |      75        100   1_1_0F    927.875 |
>    5. |      85         50   1_1_0F    611.375 |
>    6. |      85        100   1_1_0F    654.375 |
>       |----------------------------------------|
>    7. |       0         50   1_1_1F    1116.88 |
>    8. |       0        100   1_1_1F    1101.38 |
>    9. |      75         50   1_1_1F    544.875 |
>   10. |      75        100   1_1_1F    567.875 |
>   11. |      85         50   1_1_1F    443.875 |
>   12. |      85        100   1_1_1F        466 |
>       |----------------------------------------|
>   13. |       0         50   1_1_2F      309.5 |
>   14. |       0        100   1_1_2F    336.286 |
>   15. |      75         50   1_1_2F    442.625 |
> etc.
>
> where the first two variables s1level and s1s2delay define 6 treatment
> conditions from which s2peakvalue was measured for each animal. I
> would like to reshape this data to calculate a ratio from the
> conditions within each animal. I would like to get a data set that
> looks like:
>
> animal s2peak0_50 s2peak0_100 s2peak75_50 s2peak75_100 s2peak85_50 s2peak85_100
>
> in order to calculate the ratio of each of variables 4-7 to the
> average of variables 2 and 3. I can do this directly in the long form
> with the following code:
>
> egen step = seq(), from(0) to(5) block(1)
> gen ppi2 = ((s2peak[_n-step]+s2peak[_n-step+1])/2 - s2peak[_n]) ///
>     /((s2peak[_n-step]+s2peak[_n-step+1])/2)*100
> drop if s1level == 0
>
>       +----------------------------------------------------+
>       | s1level   s1s2de~y   animal   s2peak~e        ppi2 |
>       |----------------------------------------------------|
>    1. |      75         50   1_1_0F      472.5    46.77181 |
>    2. |      75        100   1_1_0F    927.875   -4.527213 |
>    3. |      85         50   1_1_0F    611.375    31.12723 |
>    4. |      85        100   1_1_0F    654.375    26.28318 |
>       |----------------------------------------------------|
>    5. |      75         50   1_1_1F    544.875    50.87344 |
>    6. |      75        100   1_1_1F    567.875    48.79973 |
>    7. |      85         50   1_1_1F    443.875    59.97971 |
>    8. |      85        100   1_1_1F        466     57.9849 |
>       |----------------------------------------------------|
>    9. |      75         50   1_1_2F    442.625   -37.08108 |
>   10. |      75        100   1_1_2F        265    17.92943 |
>   11. |      85         50   1_1_2F      264.5    18.08428 |
>   12. |      85        100   1_1_2F    192.375    40.42141 |
>       |----------------------------------------------------|
>   13. |      75         50   1_1_3F    448.875    50.06605 |
>   14. |      75        100   1_1_3F    462.143     48.5901 |
>   15. |      85         50   1_1_3F    576.875    35.82702 |
> etc.
>
> but I'm wondering if reshape to wide and then back to long would not be
> more reliable. As long as data are not missing, I currently have no
> problems. Must I, before I go for wide, say something like,
>
> . egen treatment = group(s1level s1s2delay), label
> . drop s1level s1s2delay
>
>       +------------------------------+
>       | animal   s2peak~e   treatm~t |
>       |------------------------------|
>    1. | 1_1_0F     773.75       0 50 |
>    2. | 1_1_0F    1001.63      0 100 |
>    3. | 1_1_0F      472.5      75 50 |
>    4. | 1_1_0F    927.875     75 100 |
>    5. | 1_1_0F    611.375      85 50 |
>    6. | 1_1_0F    654.375     85 100 |
>       |------------------------------|
>    7. | 1_1_1F    1116.88       0 50 |
>    8. | 1_1_1F    1101.38      0 100 |
>    9. | 1_1_1F    544.875      75 50 |
>   10. | 1_1_1F    567.875     75 100 |
>       +------------------------------+
> etc.
>
> and only then,
>
> . reshape wide s2peakvalue, i(animal) j(treatment)
>
>         animal  s2peak~1  s2peak~2  s2peak~3  s2peak~4  s2peak~5  s2peak~6
>    1.   1_1_0F    773.75   1001.63     472.5   927.875   611.375   654.375
>    2.   1_1_1F   1116.88   1101.38   544.875   567.875   443.875       466
>    3.   1_1_2F     309.5   336.286   442.625       265     264.5   192.375
> etc.
>
> but then I lose my way back to the proper long format for ANOVA as well
> as the factor labels, etc.
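
On the way back to long: when the wide names carry both factor values (s2peak0_50 and so on), the two factors can be recovered after -reshape long- by splitting the string j. This is a sketch under that assumption. The wide rows are typed in from the quoted listing, and s1 is an arbitrary stub name for the pieces produced by -split-.

```stata
* Sketch only: wide data typed in from the quoted listing
clear
input str6 animal s2peak0_50 s2peak0_100 s2peak75_50 s2peak75_100 s2peak85_50 s2peak85_100
"1_1_0F"  773.75 1001.63 472.5   927.875 611.375 654.375
"1_1_1F" 1116.88 1101.38 544.875 567.875 443.875 466
end

* Back to long: j holds the suffixes "0_50", "0_100", ...
reshape long s2peak, i(animal) j(j) string

* Split j at the underscore into numeric pieces s11 and s12
split j, parse("_") generate(s1) destring
rename s11 s1level
rename s12 s1s2delay
drop j

sort animal s1level s1s2delay
list, sep(6)
```

This restores the two numeric factors for ANOVA. Value labels are not recreated; here the factor values themselves serve as the labels.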



