
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



AW: AW: AW: st: sort of standardization


From   "Martin Weiss" <[email protected]>
To   <[email protected]>
Subject   AW: AW: AW: st: sort of standardization
Date   Wed, 12 May 2010 17:06:18 +0200


" if the variable is truly continuous (as in your examples),
then there is no reason, on a practical basis, to add anything"




Official Stata is committed to my version, though:



*************
clear*
set seed 1001
set obs 10000
gen x=rnormal()
gen int y=_n
tabstat x y, stat(range)
su x, mean
di in r r(max)-r(min)
su y, mean
di in r r(max)-r(min)
*************
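
Applied to the original question (each variable divided by its range), a minimal sketch using this max-minus-min definition might look as follows. The auto dataset, the three variable names, and the _r suffix are illustrative assumptions, not anything from the thread; substitute your own varlist.

```stata
* Sketch only: dataset, varlist and the _r suffix are illustrative assumptions
sysuse auto, clear
foreach var of varlist price mpg weight {
    * meanonly skips the variance calculation but still returns r(min) and r(max)
    quietly summarize `var', meanonly
    * range taken as max - min, with no "+1"; a constant variable yields missing
    generate double `var'_r = `var' / (r(max) - r(min))
}
```

Storing the result as double avoids losing precision for variables with a large range.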



HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Richard
Goldstein
Sent: Wednesday, 12 May 2010 17:01
To: [email protected]
Cc: Martin Weiss
Subject: Re: AW: AW: st: sort of standardization

good point -- the "1" should have been "unit of measure" to encompass
everything -- if the variable is truly continuous (as in your examples),
then there is no reason, on a practical basis, to add anything

On 5/12/10 10:56 AM, Martin Weiss wrote:
> 
> <> 
> 
> 
> " I think of the range as the min
> to the max *inclusive* of each endpoint;"
> 
> Gotcha! But what do non-integer values do to your conviction?
> 
> *************
> clear*
> set obs 1000
> set seed 1001
> gen x=rnormal()
> su x
> di in r "Range " %3.2fc r(max)-r(min) " or "  %3.2fc r(max)-r(min) +1 " ?"
> *************
> 
> HTH
> Martin
> 
> 
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Richard
> Goldstein
> Sent: Wednesday, 12 May 2010 16:51
> To: [email protected]
> Cc: Martin Weiss
> Subject: Re: AW: st: sort of standardization
> 
> Martin,
> 
> look at it this way -- if my min is 1 and my max is 10, then the range
> is 10 (it seems to me), not 9 -- i.e., I think of the range as the min
> to the max *inclusive* of each endpoint; StataCorp apparently disagrees
> ;-)
> 
> Rich
> 
> On 5/12/10 10:46 AM, Martin Weiss wrote:
>>
>> <> 
>>
>> " local range=r(max)-r(min)+1"
>>
>> Rich, what does the "+1" term do for the "range"? I took the definition
>> in my code from [R], page 204. Am I missing anything?
>>
>> HTH
>> Martin
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Richard
>> Goldstein
>> Sent: Wednesday, 12 May 2010 16:40
>> To: [email protected]
>> Cc: Ginevra Biino
>> Subject: Re: st: sort of standardization
>>
>> if I understand correctly what you want, I would do the following within
>> a -foreach- loop:
>>
>> summarize variable
>> calculate the range from r(min) and r(max)
>> divide the old variable by this calculated range inside a -gen-
>>
>> e.g.,
>>
>> foreach var of varlist .... {
>> qui su `var'
>> local range=r(max)-r(min)+1
>> gen `var'3=`var'/`range'
>> }
>>
>> Rich
>>
>> On 5/12/10 10:29 AM, Ginevra Biino wrote:
>>> Dear Statalist,
>>> I have to standardize many variables (in order to run PCA).
>>> Besides generating the n corresponding std(varname) vars, which I have
>>> already done, I also want to generate n new  variables obtained dividing
>>> each variable by its range. Can anybody help me?
>>> Ginevra
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




© Copyright 1996–2018 StataCorp LLC