


RE: st: generate

From   Nick Cox <>
To   "''" <>
Subject   RE: st: generate
Date   Thu, 7 Oct 2010 12:17:45 +0100

It's the same data, wide or long. Which limit, observations or variables, do you imagine will bite first? Look at -help limits- for your version of Stata (not stated here). 

Before you replied, I was going to reinforce Dimitriy's advice. I would reach for -reshape- in this instance and I would keep the data in long form, at least on the information you have given. 
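
As a minimal sketch of what staying in long form could look like, assuming toy data with only g1-g5 and a cutoff of 2 values per hid in place of g1-g100 and 20 (the variable names which_g and t are borrowed from the code quoted below):

```stata
* Toy data standing in for hid and g1-g100 (assumption: only g1-g5 here)
clear
input hid g1 g2 g3 g4 g5
1 0 3 0 7 2
2 5 0 0 0 9
end

* One observation per hid and g variable
reshape long g, i(hid) j(which_g)

* Discard the zeros, then number the survivors in their original order
drop if g == 0
bysort hid (which_g): generate t = _n

* Keep the first 2 nonzero values per hid (20 with the real data)
keep if t <= 2
list, sepby(hid)
```

In this layout the "first 20 nonzero values" are simply the observations with t <= 20, so no second -reshape- or -merge- is needed unless a later step really requires wide data.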

In a concurrent thread, I have commented:

Some things are easier with a wide structure but most things are easier otherwise.

There is much more discussion in 

SJ-9-1  pr0046  . . . . . . . . . . . . . . . . . . .  Speaking Stata: Rowwise
        (help rowsort, rowranks if installed) . . . . . . . . . . .  N. J. Cox
        Q1/09   SJ 9(1):137--157
        shows how to exploit functions, egen functions, and Mata
        for working rowwise; rowsort and rowranks are introduced

Although that column shows that you can do many things rowwise, the underlying theme is that it isn't usually trivial. 
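
For comparison, here is a hedged sketch of one rowwise route that avoids -reshape- altogether. It is not taken from the column cited above; it is a plain loop, shown on toy data with g1-g5 and x1-x2 standing in for g1-g100 and X1-X20. The counter nfound and the locals ng and nx are names made up for this illustration, and treating missing values like zeros is an assumption.

```stata
* Toy data standing in for hid and g1-g100 (assumption: only g1-g5 here)
clear
input hid g1 g2 g3 g4 g5
1 0 3 0 7 2
2 5 0 0 0 9
end

local ng = 5    // number of g variables (100 in the original question)
local nx = 2    // number of x variables wanted (20 in the original question)

* Start with empty x variables and a per-observation counter
forvalues j = 1/`nx' {
    generate x`j' = .
}
generate int nfound = 0

* Walk across g1, g2, ...; each nonzero value bumps the counter
* and is copied into the x variable with the matching index
forvalues i = 1/`ng' {
    quietly replace nfound = nfound + 1 if g`i' != 0 & !missing(g`i')
    forvalues j = 1/`nx' {
        quietly replace x`j' = g`i' if nfound == `j' & g`i' != 0 & !missing(g`i')
    }
}
drop nfound
list
```

This keeps the number of observations unchanged, at the cost of one -replace- for every pairing of a g variable with an x variable, which is one concrete sense in which working rowwise is not trivial.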


Mirriam Gee

Thank you very much, Dimitriy, for your suggestion. It worked perfectly
well, but my main worry is that I have many hid (30000) and many g
variables (eventually I will work with over 2000 variables), so I will
end up with memory limitation problems if I use the reshape command,
unless of course I also divide my dataset into smaller groups.

On Wed, Oct 6, 2010 at 10:55 PM, Dimitriy V. Masterov wrote:

> Mirriam Gee wants to:
>> generate new variable(s) X1-X20 which contain the first 20
>> numbers (excluding the zeros) from g1-g100. For example:
> There's probably a more elegant way of doing this, but this can be
> accomplished with the -reshape- command to make your data easier to
> work with, and then reshaping it again to get it like you want it for
> your analysis. First, preserve the data and then reshape long to get
> the X variable. Then, reshape wide and save the X variables. Restore
> the G variables data, and merge the Xs back in with the Gs:
> #delimit;
> /* Preserve your data */
> preserve;
> /* Create the x variables with 2 reshapes */
> keep hid g*;
> reshape long g, i(hid) j(which_g);
> drop if g==0;
> rename g x;
> bys hid (which_g): gen t=_n;
> drop which_g;
> reshape wide x, i(hid) j(t);
> tempfile temp;
> save "`temp'";
> /* Restore data */
> restore;
> /* Merge the x variables with the g variables */
> merge 1:1 hid using "`temp'";
> drop x21-_merge;
