Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Martin Weiss" <martin.weiss1@gmx.de> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | AW: AW: AW: st: sort of standardization |
Date | Wed, 12 May 2010 17:06:18 +0200 |
<>

" if the variable is truly continuous (as in your examples), then there
is no reason, on a practical basis, to add anything"

Official Stata is committed to my version, though:

*************
clear*
set seed 1001
set obs 10000
gen x=rnormal()
gen int y=_n
tabstat x y, stat(range)
su x, mean
di in r r(max)-r(min)
su y, mean
di in r r(max)-r(min)
*************

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Richard Goldstein
Sent: Wednesday, 12 May 2010 17:01
To: statalist@hsphsun2.harvard.edu
Cc: Martin Weiss
Subject: Re: AW: AW: st: sort of standardization

good point -- the "1" should have been "unit of measure" to encompass
everything -- if the variable is truly continuous (as in your examples),
then there is no reason, on a practical basis, to add anything

On 5/12/10 10:56 AM, Martin Weiss wrote:
>
> <>
>
> " I think of the range as the min
> to the max *inclusive* of each endpoint;"
>
> Gotcha! But what do non-integer values do to your conviction?
>
> *************
> clear*
> set obs 1000
> set seed 1001
> gen x=rnormal()
> su x
> di in r "Range " %3.2fc r(max)-r(min) " or " %3.2fc r(max)-r(min) +1 " ?"
> *************
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Richard
> Goldstein
> Sent: Wednesday, 12 May 2010 16:51
> To: statalist@hsphsun2.harvard.edu
> Cc: Martin Weiss
> Subject: Re: AW: st: sort of standardization
>
> Martin,
>
> look at it this way -- if my min is 1 and my max is 10, then the range
> is 10 (it seems to me), not 9 -- i.e., I think of the range as the min
> to the max *inclusive* of each endpoint; StataCorp apparently disagrees ;-)
>
> Rich
>
> On 5/12/10 10:46 AM, Martin Weiss wrote:
>>
>> <>
>>
>> " local range=r(max)-r(min)+1"
>>
>> Rich, what does the "+1" term do for the "range"? I took the definition in
>> my code from [R], page 204. Am I missing anything?
>>
>> HTH
>> Martin
>>
>>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu
>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Richard
>> Goldstein
>> Sent: Wednesday, 12 May 2010 16:40
>> To: statalist@hsphsun2.harvard.edu
>> Cc: Ginevra Biino
>> Subject: Re: st: sort of standardization
>>
>> if I understand correctly what you want, I would do the following within
>> a -foreach- loop:
>>
>> summarize variable
>> calculate the range from r(min) and r(max)
>> divide the old variable by this calculated range inside a -gen-
>>
>> e.g.,
>>
>> foreach var of varlist .... {
>> qui su `var'
>> local range=r(max)-r(min)+1
>> gen `var'3=`var'/`range'
>> }
>>
>> Rich
>>
>> On 5/12/10 10:29 AM, Ginevra Biino wrote:
>>> Dear Statalist,
>>> I have to standardize many variables (in order to run PCA).
>>> Besides generating the n corresponding std(varname) vars, which I have
>>> already done, I also want to generate n new variables obtained dividing
>>> each variable by its range. Can anybody help me?
>>> Ginevra

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
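
For reference, a minimal sketch of the range scaling discussed in this thread, assuming the range is taken as max minus min with no "+1" (the definition Martin cites from [R] and the one -tabstat- reports). The variable names x1 and x2 and the suffix _rs are made up for illustration and are not from the original posts; substitute your own varlist.

```stata
* Hypothetical example data: x1 and x2 are made-up variable names
clear
set seed 1001
set obs 100
generate x1 = rnormal()
generate int x2 = _n

foreach var of varlist x1 x2 {
    * -summarize, meanonly- leaves r(min) and r(max) behind
    quietly summarize `var', meanonly
    * assumption: range = max - min, matching -tabstat, stat(range)-
    local range = r(max) - r(min)
    generate `var'_rs = `var' / `range'
}

* each scaled variable should now span a range of about 1
tabstat x1_rs x2_rs, stat(range)
```

Note that a constant variable has a range of zero, so the division would produce missing values; whether to skip such variables is a choice left to the user.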