Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: generate

From	Mirriam Gee <[email protected]>
To	[email protected]
Subject	Re: st: generate
Date	Fri, 8 Oct 2010 09:52:51 +0200

I am referring to the limit on the number of observations. I am using
Stata version 9.2.
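
For a rough sense of scale, the counts quoted below (30000 hid, 100 g variables now, over 2000 eventually) can be turned into observation counts and approximate data sizes. This is only a back-of-envelope sketch. It assumes every g variable is stored as a float (4 bytes), hid as a long (4 bytes), and the long-form index as an int (2 bytes); the real storage types may differ, and -help limits- remains the authority for a given version and flavor.

```stata
* Back-of-envelope sizes; the storage types are assumptions, not taken from the data
local nhid 30000
foreach nvars in 100 2000 {
    * long form: one observation per hid-by-g pair
    local nlong = `nhid' * `nvars'
    * wide form: one row per hid, holding hid (4 bytes) plus nvars floats (4 bytes each)
    local widemb = `nhid' * (4 + 4 * `nvars') / 1e6
    * long form: each row holds hid (4 bytes), the index (2 bytes) and g (4 bytes)
    local longmb = `nlong' * (4 + 2 + 4) / 1e6
    display "g variables: `nvars'   long-form obs: `nlong'" ///
        "   wide approx MB: `widemb'   long approx MB: `longmb'"
}
```

With these inputs the long form has 3,000,000 observations for 100 g variables and 60,000,000 for 2000. Under the assumed storage types the long layout also takes more memory than the wide one, because hid and the index are repeated on every row. A wide layout with over 2000 variables is in turn worth checking against the variable limit of the Stata flavor in use.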



On Thu, Oct 7, 2010 at 1:17 PM, Nick Cox <[email protected]> wrote:
> It's the same data, wide or long. Which limit, observations or variables, do you imagine will bite first? Look at -help limits- for your version of Stata (not stated here).
>
> Before you replied, I was going to reinforce Dimitriy's advice. I would reach for -reshape- in this instance and I would keep the data in long form, at least on the information you have given.
>
> In a concurrent thread, I have commented:
>
> Some things are easier with a wide structure but most things are easier otherwise.
>
> There is much more discussion in
>
> SJ-9-1  pr0046  . . . . . . . . . . . . . . . . . . .  Speaking Stata: Rowwise
>        (help rowsort, rowranks if installed) . . . . . . . . . . .  N. J. Cox
>        Q1/09   SJ 9(1):137--157
>        shows how to exploit functions, egen functions, and Mata
>        for working rowwise; rowsort and rowranks are introduced
>
> Although that column shows that you can do many things rowwise, the underlying theme is that it isn't usually trivial.
>
> Nick
> [email protected]
>
> Mirriam Gee
>
> Thank you very much, Dimitriy, for your suggestion. It worked perfectly
> well, but my main worry is that I have many hid (30000) and many g
> variables (eventually I will work with over 2000 variables), so I will
> end up with memory problems if I use the -reshape- command,
> unless of course I also divide my dataset into smaller groups.
>
> On Wed, Oct 6, 2010 at 10:55 PM, Dimitriy V. Masterov
>
>> Mirriam Gee wants to:
>>> generate new variable(s) X1-X20 which contain the first 20
>>> numbers (excluding the zeros) from g1-g100. For example:
>>
>> There's probably a more elegant way of doing this, but this can be
>> accomplished with the -reshape- command to make your data easier to
>> work with, and then reshaping it again to get it like you want it for
>> your analysis. First, preserve the data and then reshape long to get
>> the X variable. Then, reshape wide and save the X variables. Restore
>> the G variables data, and merge the Xs back in with the Gs:
>>
>> #delimit;
>> /* Preserve your data */
>> preserve;
>>
>> /* Create the x variables with 2 reshapes */
>> keep hid g*;
>> reshape long g, i(hid) j(which_g);
>>
>> drop if g==0;
>> rename g x;
>> bys hid (which_g): gen t=_n;
>> drop which_g;
>>
>> reshape wide x, i(hid) j(t);
>>
>> tempfile temp;
>> save "`temp'";
>>
>> /* Restore data */
>> restore;
>>
>> /* Merge the x variables with the g variables */
>> merge 1:1 hid using "`temp'";
>> drop x21-_merge;
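
If two reshapes of 30000 hid by some 2000 variables turn out to be too heavy for the available memory, one alternative is to stay in wide form and loop over the g variables, copying each nonzero value into the next free x slot. What follows is a minimal sketch of that idea, not the method proposed in this thread. The counter nfound, the toy data, and the choice to skip missing values as well as zeros are all assumptions made for illustration.

```stata
* Toy data standing in for the real file (hypothetical values)
clear
input hid g1 g2 g3 g4 g5
1 0 5 0 7 2
2 3 0 0 0 9
3 0 0 0 0 0
end

* ng: number of g variables (100 in the original question)
local ng 5
* nx: number of x variables wanted (20 in the original question)
local nx 3

* nfound counts the nonzero values seen so far in each row
generate int nfound = 0
forvalues j = 1/`nx' {
    generate x`j' = .
}

forvalues i = 1/`ng' {
    * count this g only if it is nonzero and nonmissing
    replace nfound = nfound + 1 if g`i' != 0 & !missing(g`i')
    forvalues j = 1/`nx' {
        * the j-th nonzero value found in a row goes into x`j'
        replace x`j' = g`i' if nfound == `j' & g`i' != 0 & !missing(g`i')
    }
}
drop nfound

* Checks on the toy data
assert x1 == 5 & x2 == 7 & x3 == 2 if hid == 1
assert x1 == 3 & x2 == 9 & missing(x3) if hid == 2
assert missing(x1) & missing(x2) & missing(x3) if hid == 3
list
```

This trades memory for passes over the data: with 100 g variables and 20 x variables the inner -replace- runs 2000 times, and the dataset never grows beyond the wide layout plus the new x variables and one counter.
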
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
