



Re: st: generate


From   Mirriam Gee <mirrgee@googlemail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: generate
Date   Fri, 8 Oct 2010 09:52:51 +0200

I am referring to the limit on the number of observations. I am using
Stata version 9.2.
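
A rough size check may help decide which constraint matters here. The sketch below uses the counts mentioned in this thread (30000 hid, about 2000 g variables); the storage types (hid as long, the reshape index as int, g as float) are assumptions, so the results are only back-of-envelope figures. The long form would have about 60 million observations, so memory is likely to be the tighter constraint than the observation count, but -help limits- for the installed flavor of Stata is the place to confirm.

```stata
* Back-of-envelope sizes; all counts and storage types are assumptions
local nhid = 30000      // number of hid values (from the thread)
local ng   = 2000       // number of g variables (from the thread)

* Long form: one observation per hid-by-g pair,
* holding hid (long, 4 bytes), index (int, 2 bytes), g (float, 4 bytes)
display "long-form observations: " `nhid' * `ng'
display "long-form MB (approx):  " `nhid' * `ng' * (4 + 2 + 4) / 1024^2

* Wide form: one observation per hid, holding hid plus all g variables
display "wide-form MB (approx):  " `nhid' * (4 + `ng' * 4) / 1024^2
```

In Stata 9.2 the memory allocation is fixed by -set memory-, so the long-form figure would need to fit within that setting with room to spare for the reshape itself.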



On Thu, Oct 7, 2010 at 1:17 PM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> It's the same data, wide or long. Which limit, observations or variables, do you imagine will bite first? Look at -help limits- for your version of Stata (not stated here).
>
> Before you replied, I was going to reinforce Dimitriy's advice. I would reach for -reshape- in this instance and I would keep the data in long form, at least on the information you have given.
>
> In a concurrent thread, I have commented:
>
> Some things are easier with a wide structure but most things are easier otherwise.
>
> There is much more discussion in
>
> SJ-9-1  pr0046  . . . . . . . . . . . . . . . . . . .  Speaking Stata: Rowwise
>        (help rowsort, rowranks if installed) . . . . . . . . . . .  N. J. Cox
>        Q1/09   SJ 9(1):137--157
>        shows how to exploit functions, egen functions, and Mata
>        for working rowwise; rowsort and rowranks are introduced
>
> Although that column shows that you can do many things rowwise, the underlying theme is that it isn't usually trivial.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Mirriam Gee
>
> Thank you very much, Dimitriy, for your suggestion. It worked perfectly
> well, but my main worry is that I have many hid (30000) and many g
> variables (eventually I will work with over 2000 variables), so I will
> run into memory limits if I use the reshape command,
> unless of course I also divide my dataset into smaller groups.
>
> On Wed, Oct 6, 2010 at 10:55 PM, Dimitriy V. Masterov
>
>> Mirriam Gee wants to:
>>> generate new variable(s) X1-X20 which contain the first 20
>>> numbers (excluding the zeros) from g1-g100. For example:
>>
>> There's probably a more elegant way of doing this, but this can be
>> accomplished with the -reshape- command to make your data easier to
>> work with, and then reshaping it again to get it like you want it for
>> your analysis. First, preserve the data and then reshape long to get
>> the X variable. Then, reshape wide and save the X variables. Restore
>> the G variables data, and merge the Xs back in with the Gs:
>>
>> #delimit;
>> /* Preserve your data */
>> preserve;
>>
>> /* Create the x variables with 2 reshapes */
>> keep hid g*;
>> reshape long g, i(hid) j(which_g);
>>
>> drop if g==0;
>> rename g x;
>> bys hid: gen t=_n;
>> drop which_g;
>>
>> reshape wide x, i(hid) j(t);
>>
>> tempfile temp;
>> save "`temp'";
>>
>> /* Restore data */
>> restore;
>>
>> /* Merge the x variables with the g variables */
>> merge 1:1 hid using "`temp'";
>> drop x21-_merge;
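
If the long form is too large to hold in memory, the same X variables can in principle be built rowwise, leaving the data wide. The following is only a sketch of that alternative, not the method posted above: it assumes the variables are named g1-g100 and are numeric, and the counter name filled is hypothetical. Like -drop if g==0-, it does not skip missing values.

```stata
* Sketch: copy the first 20 nonzero values of g1-g100 into x1-x20,
* row by row, without reshaping. Assumes g1-g100 are numeric.

* Create empty targets (double, so no precision is lost on copying)
forvalues j = 1/20 {
    generate double x`j' = .
}

* filled counts how many x variables are already used in each row
generate int filled = 0

forvalues i = 1/100 {
    * Put g`i' into the next free x slot, if it is nonzero
    forvalues j = 1/20 {
        quietly replace x`j' = g`i' if filled == `j' - 1 & g`i' != 0
    }
    * Advance the counter only after all slots have been checked,
    * so each g value is copied at most once per row
    quietly replace filled = filled + 1 if g`i' != 0 & filled < 20
}

drop filled
```

This runs 2000 -replace- statements over the wide data, so it trades some speed for not needing the reshaped copy or the merge.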
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


