



Re: st: Normalize Variables by s.d. (programmatically)


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Normalize Variables by s.d. (programmatically)
Date   Thu, 22 Dec 2011 13:58:33 +0000

I haven't read this through, but in order not to get bitten by any 244
character limit,

0. Don't evaluate using

local mymacro = <stuff>

Copy using

local mymacro <stuff>

1. Don't use -length()- to assess length: use the extended macro
function -: length-. See help for -extended_fcn-.

2. Again, don't use -subinstr()-: use the equivalent extended macro
function, -: subinstr local-.
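
The three points above can be sketched as follows. This is a minimal illustration, not code from the thread: the macro names (reg_list, copy, len, commas) and the variable names in the list are made up for the example.

```stata
* Illustrative list only; in practice this would be the expanded varlist
local reg_list price mpg weight length

* 0. Copy, don't evaluate: with no = sign the contents are copied as is
*    rather than being passed through a string expression
local copy `reg_list'

* 1. Length from the extended macro function, not length()
local len : length local reg_list

* 2. Replace blanks with commas using the extended macro function,
*    not subinstr()
local commas : subinstr local reg_list " " ",", all

display `len'
display "`commas'"
```

Because none of these lines evaluates a string expression, none of them should run into a limit on the length of string expressions.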

Nick

On Thu, Dec 22, 2011 at 1:36 PM, Ryan Turner <rjturner@cmu.edu> wrote:
> On Dec 21, 2011, at 11:01 PM, Richard Williams wrote:
>
>> At 04:49 PM 12/21/2011, Austin Nichols wrote:
>>> Ryan Turner <rjturner@cmu.edu>:
>>> help regress
>>> (see option beta) or
>>> http://repec.org/bocode/e/estout/esttab.html#esttab003
>>
>> That was my first impulse too, but if you wade through his code you see there is an -xtreg- command at the end of it, and xtreg doesn't have a beta option. (Ryan, regress can mean a lot of things, so it would be good to be explicit about what you mean at the beginning).
>>
>> I haven't worked my way through Ryan's code, but I wonder if Ben Jann's -center- command, available from SSC, could simplify the process.
>
> Thanks Austin, Richard,
>
> Yes I found regress option beta in my searches; however, as noted I am using xtreg.  Further I would like only to divide by std but not subtract the mean, and I would like to leave dummies unnormalized.
>
> The real problem with my program below is that I am trying to dynamically allocate backup variables for original values, and operate on the original variable names.  The reasoning for this is that I didn't want my regression output cluttered with temp variable tags, e.g. std_`varname'.  However, this made my program much more complicated because I couldn't rely on Stata's native handling of varlists, wildcards, and difference operators, and I would always have to test the existence of variables.  A big mess.
>
> I got it working quite nicely (as below) by simply generating temporary variables and using some sed magic to clean up my regression output.  However, the program given quickly reaches the string limit of 244 characters since I am fully expanding all wildcards.  By shortening my varnames (and relying on additional sed magic) I stay within the limit and am able to run my current regressions of interest, but the program is not robust.
>
> I looked at -center-.  I don't think that is what I want unless there is something special in byable that I don't yet understand.  I think my real question is, how do I efficiently operate on the observations that would be included in a regression (e.g. that are not missing)?  I test if !missing(comma separated varlist).  The only other way I can think of is to run the regression twice; discard the first and use e(sample) for the second.  Is there a better way?
>
> Thanks for your feedback.
>
> Ryan Turner
>
> // divide regressors by std
> capture program drop doreg
> program define doreg
>    syntax varlist [, *]
>    local reg_list
>
>    // expand wildcards and remove spaces
>    foreach item of var `varlist' {
>        local reg_list "`reg_list' `item'"
>    }
>    //assert length("`reg_list'") <= 244
>
>    // generate test of what observations are included in regression
>    local include_list = "!missing(" + subinstr("`reg_list'"," ",",",.) + ")"
>    assert length("`include_list'") <= 244
>
>
>    local drop_list
>    local dummies
>    foreach item of var `reg_list' {
>        if strpos("`item'","d_") == 1 {
>            // add to dummies list
>            local dummies "`dummies' `item'"
>            assert length("`dummies'") <= 244
>            //display "`dummies'"
>
>            // don't normalize dummies
>            continue
>        }
>
>        quietly: summ `item' if `include_list'
>        gen std_`item' = `item' / r(sd)
>        //gen std_`item' = `item'
>        local drop_list "`drop_list' std_`item'"
>    }
>
>    // do the actual regression
>    xtreg `drop_list' `dummies', `options'
>
>    // drop so that we don't accidentally reuse this one-time varlist
>    drop `drop_list'
>    quietly: summ
> end
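
On the question of how to work only with the observations that would enter the regression: one possible approach, offered as a sketch rather than as anything proposed in the thread, is to let -marksample- build the indicator instead of assembling a long !missing() expression. The program name doreg2 and the added [if] [in] qualifiers are assumptions for illustration; the std_ prefix and the d_ convention for dummies follow the program above.

```stata
capture program drop doreg2
program define doreg2
    // same calling convention as doreg, plus optional if/in
    syntax varlist(numeric) [if] [in] [, *]

    // touse is 1 for observations that satisfy if/in and are
    // non-missing on every variable in varlist, and 0 otherwise
    marksample touse

    local std_list
    local dummies
    foreach item of local varlist {
        // variables named d_* are treated as dummies and left alone
        if strpos("`item'", "d_") == 1 {
            local dummies `dummies' `item'
            continue
        }
        // divide by the s.d. computed over the marked sample only
        quietly summarize `item' if `touse'
        quietly generate double std_`item' = `item' / r(sd) if `touse'
        local std_list `std_list' std_`item'
    }

    // run the regression on the marked sample, then clean up
    xtreg `std_list' `dummies' if `touse', `options'
    drop `std_list'
end
```

With data that have been -xtset-, a call such as -doreg2 y x d_a, fe- would then scale y and x but not d_a. Note that -marksample- only looks at the variables in varlist and the if/in conditions, so anything else that -xtreg- uses to exclude observations is not reflected in touse.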


