



RE: st: generate


From   Nick Cox <n.j.cox@durham.ac.uk>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: generate
Date   Thu, 7 Oct 2010 12:17:45 +0100

It's the same data, wide or long. Which limit, observations or variables, do you imagine will bite first? Look at -help limits- for your version of Stata (not stated here). 
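
To put rough numbers on that question, here is a minimal sketch using the figures quoted further down in this thread (about 30000 hid and, eventually, over 2000 g variables). The local macro names are made up for illustration, and the count of 3 variables in long form assumes the layout produced by the -reshape long- code quoted below (hid, which_g, g).

```stata
* Figures quoted in the thread: about 30000 hid, eventually over 2000 g variables
local nhid 30000
local ng 2000

* Wide: one observation per hid; hid plus one variable per g
local wide_vars = `ng' + 1

* Long: one observation per hid-by-g pair; variables hid, which_g and g
local long_obs = `nhid' * `ng'

display "wide: `nhid' observations, `wide_vars' variables"
display "long: `long_obs' observations, 3 variables"
```

Compare the two pairs of counts with the observation and variable limits that -help limits- reports for your flavor and version of Stata.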

Before you replied, I was going to reinforce Dimitriy's advice. I would reach for -reshape- in this instance and I would keep the data in long form, at least on the information you have given. 

In a concurrent thread, I have commented:

Some things are easier with a wide structure but most things are easier otherwise. 

There is much more discussion in 

SJ-9-1  pr0046  . . . . . . . . . . . . . . . . . . .  Speaking Stata: Rowwise
        (help rowsort, rowranks if installed) . . . . . . . . . . .  N. J. Cox
        Q1/09   SJ 9(1):137--157
        shows how to exploit functions, egen functions, and Mata
        for working rowwise; rowsort and rowranks are introduced

Although that column shows that you can do many things rowwise, the underlying theme is that it isn't usually trivial. 

Nick 
n.j.cox@durham.ac.uk 

Mirriam Gee

Thank you very much, Dimitriy, for your suggestion. It worked perfectly
well, but my main worry is that I have many hid (30000) and many g
variables (eventually I will work with over 2000 variables), so I will
end up with memory limitation problems if I use the -reshape- command,
unless of course I also divide my dataset into smaller groups.

On Wed, Oct 6, 2010 at 10:55 PM, Dimitriy V. Masterov

> Mirriam Gee wants to:
>> generate new variable(s) X1-X20 which contain the first 20
>> numbers (excluding the zeros) from g1-g100. For example:
>
> There's probably a more elegant way of doing this, but this can be
> accomplished with the -reshape- command to make your data easier to
> work with, and then reshaping it again to get it like you want it for
> your analysis. First, preserve the data and then reshape long to get
> the X variable. Then, reshape wide and save the X variables. Restore
> the G variables data, and merge the Xs back in with the Gs:
>
> #delimit;
> /* Preserve your data */
> preserve;
>
> /* Create the x variables with 2 reshapes */
> keep hid g*;
> reshape long g, i(hid) j(which_g);
>
> drop if g==0;
> rename g x;
> /* Number the remaining values within hid, in the original g order */
> bys hid (which_g): gen t=_n;
> drop which_g;
>
> reshape wide x, i(hid) j(t);
>
> tempfile temp;
> save "`temp'";
>
> /* Restore data */
> restore;
>
> /* Merge the x variables with the g variables */
> merge 1:1 hid using "`temp'";
> drop x21-_merge;
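
If the long form really is too large to hold in memory, one alternative is to work rowwise and skip -reshape- altogether. What follows is only a minimal sketch, not anything posted in this thread: it uses a toy dataset with 5 g variables and 3 x variables (set the two locals to 100 and 20 for the problem as stated), the helper variables k and ok are hypothetical names, and it assumes that missing values should be skipped along with zeros, which differs from the -reshape- code above.

```stata
* Toy data (hypothetical): 3 households, 5 g variables
clear
input hid g1 g2 g3 g4 g5
1 0 4 0 7 9
2 3 0 0 0 5
3 0 0 0 0 2
end

* Number of g variables (100 in the thread) and of x variables wanted (20)
local ng 5
local nx 3

forvalues j = 1/`nx' {
    generate x`j' = .
}

* k counts the nonzero values seen so far in each row
generate int k = 0

forvalues i = 1/`ng' {
    * Assumption: missing values are skipped as well as zeros
    generate byte ok = g`i' != 0 & !missing(g`i')
    replace k = k + ok
    * The k-th nonzero value in a row goes into x`k', as long as k <= nx
    forvalues j = 1/`nx' {
        replace x`j' = g`i' if ok & k == `j'
    }
    drop ok
}
drop k
list
```

This loops over variables rather than observations, so it issues many -replace- commands (the number of g variables times the number of x variables), but the data stay in wide form throughout and no temporary file or merge is needed.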
