[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

From |
Ning Li <N.Li@ms.unimelb.edu.au> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: from long to wide |

Date |
Thu, 01 Nov 2007 12:29:36 +1100 |

Garry, Marten, Nick, Austin and Bill,

Thanks very much for your responses to my question.

1) I explained to and requested RADL several times

2) STATA commands are very useful, any restriction on it causes huge inconvenience

3) I will try Austin's codes and take Bill's advice

Cheers

Ning

At 01:40 AM 11/1/2007, you wrote:

Ning <N.Li@ms.unimelb.edu.au> asked,

> How to reshape a data set from long form to wide form without using the

> command reshape?

and went on to explain why,

> Recently I am working on a data set from a remote access data library. The

> RADL receives STATA codes submitted by its users, it runs jobs and returns

> outputs via its web interface. For confidentiality reasons, not all the

> STATA commands are allowed in the RADL. One of these commands is reshape, we

> know reshape a dataset won't reveal extra information though. I tried to

> program but failed.)

Nick Cox <n.j.cox@durham.ac.uk> responded rather negatively to Ning's

question because, as he said, "-reshape- is no threat to anyone." I agree.

Nonetheless, if Ning is getting data from a remote library and they have

turned off -reshape-, what is Ning to do?

> HID PID INJDAM

> A 1 a

> A 2 b

> A 2 c

> B 1 b

> B 2 a

>

> where HID and PID represent household and personal ids respectively, and

> INJDAMC the damage type (eg, a=burns, b=bruising, c=..)in a person's most

> recent injury.

Here's how to convert this wide data to long form:

// setup

. use original, clear

. sort HID PID

. by HID PID: assert _N==1 // checking assumptions

. save original, replace // save sorted by HID

// do PID==1

. use original, clear

. keep if PID==1

. drop PID

. rename INJDAM INJDAM1

. save long, replace

// do PID==2

. use original, clear

. keep if PID==2

. drop PID

. rename INJDAM INJDAM2

. merge HID using long

. drop _merge

. sort HID

. save long, replace

If there was a PID==3, one would continue

// do PID==3

. use original, clear

. keep if PID==3

. drop PID

. rename INJDAM INJDAM3

. merge HID using long

. drop _merge

. sort HID

. save long, replace

An outline of the process is,

1. Check assumptions. HID PID should uniquely identify the

observations. In addition, resave original.dta sorted by

at least HID, because we are going to need it that way.

2. Determine the values of PID. Interactively, the easy way to

that is -tabulate PID-. Write down all the values that PID

takes on.

3. Do the following for each value of PID:

a. Use original.dta and keep just the observations for the

specific PID being processed right now.

Then drop variable PID.

b. rename all the other variables except HID to have a

suffix indicating the PID value. In the example, we

had one such variable: INJDAM.

c. If this is the first PID value, save the beginnings

of the dataset in long form: -save long-. For subsequent

PID values, you must merge by HID the data in memory with

long.dta, drop _merge, re-sort by HID, and resave long.dta.

I admit it's a lot of work.

For everyone's information, Stata was the first package to have a

-reshape- command, and the terms reshape, long, and wide were

introduced by us. In my past as an analyst, I had reshaped data

by hand many times and it was always a tedious experience. We were

very proud the day we realized that Stata's programming language was

up to mechanizing the task because many programming languages are not.

-- Bill

wgould@stata.com

*

* For searches and help try:

* http://www.stata.com/support/faqs/res/findit.html

* http://www.stata.com/support/statalist/faq

* http://www.ats.ucla.edu/stat/stata/

* * For searches and help try: * http://www.stata.com/support/faqs/res/findit.html * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**References**:**Re: st: from long to wide***From:*wgould@stata.com (William Gould, StataCorp LP)

- Prev by Date:
**Re: st: dictionary file** - Next by Date:
**Re: st: 2007 West Coast Users Group meeting- will more detailed materialsbe available?** - Previous by thread:
**Re: st: from long to wide** - Next by thread:
**st: suppressing constant in qreg** - Index(es):

© Copyright 1996–2015 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |