Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: RE: Problems with the reshape command


From   Syed Basher <syed.basher@yahoo.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: Problems with the reshape command
Date   Wed, 19 Jan 2011 19:46:04 -0800 (PST)

Thank you so much, Rebecca. Adding item to -by port ...- visually improves the wide 
structure. I tried the -collapse- command before using -reshape-, which leaves 
only one observation (the mean, by default) by port, but I needed to show the price 
of all entries by port and item in a wide structure, so your modification to 
the original post was really helpful.

Syed


----- Original Message ----
From: "POPE, REBECCA" <RPOPE@uams.edu>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Sent: Thu, January 20, 2011 2:52:42 AM
Subject: RE: st: RE: Problems with the reshape command

Since you mention that you are giving this data to someone else who might prefer 
the wide form, I'm going to expand on my original answer a bit. Nick's valid 
points about data structure aside, I can certainly understand having to supply 
data to another person in a format they like.

"line" is admittedly arbitrary, but my intent was to illustrate a dirty 
work-around if you absolutely had to convert to wide form. Ideally, if you were 
going to attempt to reshape your data, there would be some _meaningful_ variable 
to add to "port" (e.g. date of shipment - you're outside my industry, so this is 
a "best guess") that might enhance any analysis, whether done by you or someone 
else. Something like "line" would be a variable of last resort if there is no 
logical option in your data.

As noted in my original reply, it is possible to further consolidate the data 
after reshaping. However, since I didn't know your ultimate objectives, I 
left that rather vague. I'm sorry if that caused any confusion. As Nick 
mentioned in a previous reply to this post, one powerful tool in Stata is the 
-collapse- command. If you wanted to get, for example, average price at each 
port for each item you could run -collapse- before or after -reshape- (or just 
not run reshape). If you need to maintain individual price observations, this 
won't be a good choice.
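
A minimal sketch of the -collapse- route described above, using the hypothetical data from the original post. Taking the mean is an assumption for illustration; any other -collapse- statistic could be substituted.

```stata
clear
input port item price
3 4029 27.62
3 4029 15.47
1 1006 37.55
3 2045 15.18
1 2045 16.21
1 4061 92.79
2 8041 12.55
2 2011 89.68
3 7031 68.01
2 2011 13.13
end

* one mean price per port-item pair; individual prices are lost
collapse (mean) price, by(port item)

* item is now unique within port, so i(port) alone is enough
reshape wide price, i(port) j(item)
list, noobs
```

This gives one row per port, at the cost of averaging the repeated prices (for example, the two prices for item 4029 at port 3).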

Note: I don't know how useful it will be for your real data set, but just in 
case you find it helpful, to replicate the last table in your original post, the 
code is:
. by port item, sort: generate line = _n
. reshape wide price, i(port line) j(item)
and optionally 
. list port price*, noobs
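
To run the commands above end to end, the following sketch enters the hypothetical data from the original post first. The (price) sort key and the sepby() option are additions not in the post; the sort key only makes the numbering of repeated prices reproducible from run to run.

```stata
clear
input port item price
3 4029 27.62
3 4029 15.47
1 1006 37.55
3 2045 15.18
1 2045 16.21
1 4061 92.79
2 8041 12.55
2 2011 89.68
3 7031 68.01
2 2011 13.13
end

* number the repeated prices within each port-item pair,
* lowest price first so that the result is reproducible
bysort port item (price): generate line = _n

* (port, line) identifies a row; item supplies the new column suffixes
reshape wide price, i(port line) j(item)

list port price*, noobs sepby(port)
```

With these data the result has five rows: one for port 1 and two each for ports 2 and 3.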

Regards,
Rebecca

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu 
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Syed Basher
Sent: Wednesday, January 19, 2011 3:52 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Problems with the reshape command

In my case, the long structure is as informative as the wide structure, or more so. 
In fact, with the wide structure I get numerous empty cells, which are visually 
uncomfortable. I guess I will leave the choice between long and wide structure 
to the end-user in my office by supplying them both structures. I appreciate 
Nick's meddling on this matter.

Syed



----- Original Message ----
From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Sent: Wed, January 19, 2011 10:19:58 PM
Subject: Re: st: RE: Problems with the reshape command

Rebecca is clearly right in the sense that if you create a
sufficiently fine identifier, -reshape- will oblige. But what is
useful about the data structure created? It rules out as many analyses
as it allows because -line- is arbitrary and separates things you
might want to compare.

But even if this is what Syed is asking for, there is a deeper
question: With this structure, why -reshape- at all? Almost all
analysis questions are easier to answer with a long structure.
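
As a rough illustration of working from the long structure, the sketch below produces a port-by-item table of prices without any -reshape-. The data are the hypothetical values from the original post; the choice of -tabulate, summarize()- is an assumption about what kind of summary is wanted.

```stata
clear
input port item price
3 4029 27.62
3 4029 15.47
1 1006 37.55
3 2045 15.18
1 2045 16.21
1 4061 92.79
2 8041 12.55
2 2011 89.68
3 7031 68.01
2 2011 13.13
end

* mean price in each port-by-item cell, straight from the long data
tabulate port item, summarize(price) means

* every individual price, grouped by port
sort port item price
list port item price, sepby(port) noobs
```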

Nick

On Wed, Jan 19, 2011 at 5:33 PM, POPE, REBECCA <RPOPE@uams.edu> wrote:
> Hi Syed,
> I'm a bit confused by your use of the term "cross-tab", but since you are using
> reshape, I'm going to assume you are just trying to get the prices for the
> different goods to become variables. If so, do you have some other additional
> identifying variable that you could use in your reshape command? If you have
> multiple prices for the same item at the same port, might the shipments be from
> different suppliers or have arrived on different dates? If so, you could use
> something like the following:
>
> . reshape wide price, i(port date) j(item)
>
> I'm guessing this won't give you exactly what you want because there will still
> be multiple lines per port (at least if your real data looks like the
> hypothetical data), but you'll have gotten around reshape's objections and can
> use other commands to consolidate after that. Other users might have more
> elegant solutions, but I hope this helps.
>
> If you don't have another logical ID variable to add to port, you can generate
> a fake one by doing the following:
>
> . by port, sort: generate line = _n
> . reshape wide price, i(port line) j(item)
>
> port   line   pri~1006   pri~2011   pri~2045   pri~4029   pri~4061   pri~7031   pri~8041
> ----------------------------------------------------------------------------------------
> 1      1             .          .          .          .      92.79          .          .
> 1      2         37.55          .          .          .          .          .          .
> 1      3             .          .      16.21          .          .          .          .
> 2      1             .          .          .          .          .          .      12.55
> 2      2             .      13.13          .          .          .          .          .
> ----------------------------------------------------------------------------------------
> 2      3             .      89.68          .          .          .          .          .
> 3      1             .          .          .      27.62          .          .          .
> 3      2             .          .      15.18          .          .          .          .
> 3      3             .          .          .          .          .      68.01          .
> 3      4             .          .          .      15.47          .          .          .
>
>
> Regards,
> Rebecca
>
> Rebecca A. Pope
> Program Manager
> UAMS CCTR Health Services Research
> Fay W. Boozman College of Public Health
> Dept. of Health Policy and Management
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu 
>[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Syed Basher
> Sent: Wednesday, January 19, 2011 10:49 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: Problems with the reshape command
>
> Dear Statalist,
>
> I am using Stata 11.1.  I have the following hypothetical data:
>
>      +-----------------------+
>      | port   item     price |
>      |-----------------------|
>   1. |    3   4029     27.62 |
>   2. |    3   4029     15.47 |
>   3. |    1   1006     37.55 |
>   4. |    3   2045     15.18 |
>   5. |    1   2045     16.21 |
>      |-----------------------|
>   6. |    1   4061     92.79 |
>   7. |    2   8041     12.55 |
>   8. |    2   2011     89.68 |
>   9. |    3   7031     68.01 |
>  10. |    2   2011     13.13 |
>      +-----------------------+
>
> I would like to reshape the data to wide format using:
> . reshape wide price, i(port) j(item)
>
> This is of course problematic in Stata since "item" is not unique within
> "port".  Eventually I would like to obtain the following cross-tab (in wide
> format):
>
> port |     1006     2011     2045     4029     4061     7031     8041
> ---------------------------------------------------------------------
> 1    |    37.55             16.21             92.79
> 2    |             89.68                                        12.55
> 2    |             13.13
> 3    |                      15.18    27.62             68.01
> 3    |                               15.47
>
> I have been consulting Stata's FAQs on this issue
> (http://www.stata.com/support/faqs/data/reshape3.html) without much success.
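
The objection from -reshape- can be checked directly before reshaping. A minimal sketch, assuming the hypothetical data above; the variable name dup is made up for illustration.

```stata
clear
input port item price
3 4029 27.62
3 4029 15.47
1 1006 37.55
3 2045 15.18
1 2045 16.21
1 4061 92.79
2 8041 12.55
2 2011 89.68
3 7031 68.01
2 2011 13.13
end

* count how many (port, item) pairs occur more than once
duplicates report port item

* flag the repeated pairs and list them
duplicates tag port item, generate(dup)
sort port item price
list port item price if dup > 0, sepby(port item) noobs
```

Any pair flagged here is one that -reshape wide price, i(port) j(item)- cannot place in a single cell, which is why an extra identifier such as date or line is needed.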

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



      