Statalist



st: RE: reshape vs reshape


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: reshape vs reshape
Date   Thu, 11 Dec 2008 19:55:59 -0000

Ashim: 

It is always best to start with a totally concrete and specific report
about 

1. Your data 
2. Exactly what you did in exactly the same sequence 
3. Exactly what happened. 

And then to separate out whatever puzzlement, speculations, guesses you
want to add. 

Of these, your dataset (1) is very clear. So, I -reshape-d wide in the
way I guess you did. I then find that I can -reshape long- and -reshape
wide- back and forth without getting any problem. 

Thus I don't see that you have a reproducible problem. 

See also 

<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.0812/date/article-400.html>

for a very recent thread which reported a -reshape- problem that no one
but the sender could repeat. Whether the eventual diagnosis in that
thread matches your situation I can't say. 

It is important to realise that the -reshape- code has been banged on
many, many times. That is no proof of absence of bugs, but if it fell
over or gave poor results because of missing data that would have been
noticed a long, long time ago. 
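
A minimal, self-contained sketch of that kind of check follows. The toy values are made up for illustration (they are not the posted data), and the -assert- lines are only there so the run stops if the number of observations is not what a clean round trip should give.

```stata
* Toy long-form data with missing values ("." is missing); values are invented
clear
input i j med avg10
1 1 -3.7    .
1 2  0      .
1 3  0      .
2 1 -3.1  -36.1
2 2  0      4.0
2 3  .      .
end

* Full syntax: long -> wide, naming the stubs, i() and j() explicitly
reshape wide med avg10, i(i) j(j)
assert _N == 2

* Bare syntax: reuse the settings -reshape- stored with the dataset
reshape long
assert _N == 6

reshape wide
assert _N == 2

* Report the settings -reshape- is currently remembering
reshape query
```

If all three -assert- statements pass, the missing values have not stopped the bare -reshape long- and -reshape wide- from reproducing the same layout as the full syntax, at least for data shaped like this.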

Code follows my signature. 

Nick 
n.j.cox@durham.ac.uk 

. reshape wide yearmonth med avg10, i(i) j(j)
(note: j = 1 2 3 4 5 6)

Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                      120   ->      20
Number of variables                   5   ->      19
j variable (6 values)                 j   ->   (dropped)
xij variables:
                              yearmonth   ->   yearmonth1 yearmonth2 ... yearmonth6
                                    med   ->   med1 med2 ... med6
                                  avg10   ->   avg101 avg102 ... avg106
-----------------------------------------------------------------------------

. reshape long
(note: j = 1 2 3 4 5 6)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                       20   ->     120
Number of variables                  19   ->       5
j variable (6 values)                     ->   j
xij variables:
   yearmonth1 yearmonth2 ... yearmonth6   ->   yearmonth
                     med1 med2 ... med6   ->   med
               avg101 avg102 ... avg106   ->   avg10
-----------------------------------------------------------------------------

. reshape wide
(note: j = 1 2 3 4 5 6)

Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                      120   ->      20
Number of variables                   5   ->      19
j variable (6 values)                 j   ->   (dropped)
xij variables:
                              yearmonth   ->   yearmonth1 yearmonth2 ... yearmonth6
                                    med   ->   med1 med2 ... med6
                                  avg10   ->   avg101 avg102 ... avg106


Ashim Kapoor

I always thought the following two approaches were equivalent:

gen i = ....
gen j=.....

reshape long stubnameS , i(i) j(j)

do some stuff.
_______________________________________

After this, I expected to get the same result whether I typed -reshape
wide- or -reshape wide stubnameS, i(i) j(j)-.
--------------------------------------------------------------------

My guess is that something goes wrong when some of the data are missing,
and -reshape- drops the corresponding values.

Consider the following data :

i	j	yearmonth	med	avg10
1	1	Nov 1987	-3.768014	
1	2	Dec 1987	0	
1	3	Jan 1988	0	
1	4	Feb 1988	0	
1	5	Mar 1988	0	
1	6	Apr 1988	0	
2	1	Nov 2008	-3.085494	-36.09174
2	2	Dec 2008	0	3.961556
2	3			
2	4			
2	5			
2	6			
3	1	Oct 2008	-2.641026	-34.1902
3	2	Nov 2008	-.4579781	-3.798143
3	3	Dec 2008	0	3.961556
3	4			
3	5			
3	6			
4	1	Mar 2008	-1.758796	-16.16949
4	2	Apr 2008	0	6.934669
4	3	May 2008	0	1.170446
4	4	Jun 2008	-1.387379	-10.02304
4	5	Jul 2008	0	-2.129111
4	6	Aug 2008	0	-1.896609
5	1	Jan 2008	-1.756086	-18.59356
5	2	Feb 2008	0	1.012038
5	3	Mar 2008	0	-3.576064
5	4	Apr 2008	0	6.934669
5	5	May 2008	0	1.170446
5	6	Jun 2008	-1.387379	-10.02304
6	1	Dec 2008	-1.72031	-21.32613
6	2			
6	3			
6	4			
6	5			
6	6			
7	1	Sep 2001	-1.694896	-22.79853
7	2	Oct 2001	.585517	5.033921
7	3	Nov 2001	.8310239	8.23893
7	4	Dec 2001	0	2.686081
7	5	Jan 2002	0	-2.899273
7	6	Feb 2002	0	2.088569
8	1	Jan 1995	-1.505338	-24.29245
8	2	Feb 1995	0	-4.121847
8	3	Mar 1995	0	.230197
8	4	Apr 1995	0	4.277149
8	5	May 1995	0	1.438767
8	6	Jun 1995	0	.0061265
9	1	Aug 1998	-1.427644	-19.1855
9	2	Sep 1998	0	2.650797
9	3	Oct 1998	0	8.566883
9	4	Nov 1998	0	5.821879
9	5	Dec 1998	0	.9273936
9	6	Jan 1999	0	1.503127
10	1	Feb 2008	-1.310205	-12.71627
10	2	Mar 2008	0	-3.576064
10	3	Apr 2008	0	6.934669
10	4	May 2008	0	1.170446
10	5	Jun 2008	-1.387379	-10.02304
10	6	Jul 2008	0	-2.129111
11	1	Nov 2000	-1.172016	-15.50871
11	2	Dec 2000	0	1.43953
11	3	Jan 2001	0	5.089918
11	4	Feb 2001	-.9371673	-6.773209
11	5	Mar 2001	0	-7.839975
11	6	Apr 2001	0	5.265727
12	1	Sep 2008	-1.157544	-14.82977
12	2	Oct 2008	-2.907829	-21.48954
12	3	Nov 2008	-.4579781	-3.798143
12	4	Dec 2008	0	3.961556
12	5			
12	6			
13	1	Feb 2003	-1.083756	-13.52327
13	2	Mar 2003	0	-1.865085
13	3	Apr 2003	0	6.748137
13	4	May 2003	.6192567	4.772079
13	5	Jun 2003	0	3.747078
13	6	Jul 2003	.5929043	4.696925
14	1	Oct 1997	-1.082076	-19.66709
14	2	Nov 1997	0	.0263549
14	3	Dec 1997	0	2.08433
14	4	Jan 1998	0	-1.861335
14	5	Feb 1998	0	6.526412
14	6	Mar 1998	0	2.53162
15	1	Jul 2002	-1.053805	-18.01329
15	2	Aug 2002	0	1.502194
15	3	Sep 2002	-1.086407	-11.71621
15	4	Oct 2002	0	4.853646
15	5	Nov 2002	.720527	5.278154
15	6	Dec 2002	0	-4.561733
16	1	Sep 2002	-1.036257	-18.30384
16	2	Oct 2002	0	4.853646
16	3	Nov 2002	.720527	5.278154
16	4	Dec 2002	0	-4.561733
16	5	Jan 2003	0	-6.756925
16	6	Feb 2003	0	-2.204613
17	1	Oct 2001	-1.029956	-13.01497
17	2	Nov 2001	.8310239	8.23893
17	3	Dec 2001	0	2.686081
17	4	Jan 2002	0	-2.899273
17	5	Feb 2002	0	2.088569
17	6	Mar 2002	0	3.244352
18	1	Oct 1981	-1.014946	
18	2	Nov 1981	.3329222	
18	3	Dec 1981	0	
18	4	Jan 1982	0	
18	5	Feb 1982	0	
18	6	Mar 1982	0	
19	1	Aug 2008	-.9911072	-14.04875
19	2	Sep 2008	-1.803169	-10.80405
19	3	Oct 2008	-2.907829	-21.48954
19	4	Nov 2008	-.4579781	-3.798143
19	5	Dec 2008	0	3.961556
19	6			
20	1	Aug 2001	-.9734117	-11.69779
20	2	Sep 2001	-1.575722	-12.46415
20	3	Oct 2001	.585517	5.033921
20	4	Nov 2001	.8310239	8.23893
20	5	Dec 2001	0	2.686081
20	6	Jan 2002	0	-2.899273

__________________________________________________

With this dataset, if I type -reshape wide- I get ONLY 2 values for j in
the reshaped data, whereas if I type -reshape wide yearmonth med avg10,
i(i) j(j)- I get ALL 6 values for j.

Can someone tell me whether this difference is caused by the missing
data values?
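
Some checks that might help narrow down a difference like this are sketched below. This is only a sketch on made-up toy data, not a diagnosis: nothing in the post establishes what caused the difference. It looks at which values of j are present, whether i and j still identify the observations, and what settings -reshape- has remembered from earlier calls.

```stata
* Toy long-form data; values are invented for illustration
clear
input i j med
1 1 -3.7
1 2  0
2 1 -3.1
2 2  .
end

* Which values of j are actually present, and how often?
tabulate j, missing

* Do i and j together still uniquely identify the observations?
isid i j

* What has -reshape- remembered from earlier calls, if anything?
reshape query

* Forget any remembered settings, then spell everything out again
reshape clear
reshape wide med, i(i) j(j)
assert _N == 2
```

If -reshape query- reports settings that differ from the ones intended, clearing them and giving the full syntax, as in the last two commands, sidesteps whatever was stored earlier.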



