Statalist The Stata Listserver



st: RE: bysort problem


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: bysort problem
Date   Mon, 26 Feb 2007 12:56:44 -0000

I don't understand all of this, but there is a bundle
of small difficulties that may be behind your problem. 

I really don't know what you mean by saying that -bysort- 
fails. It does what it is told. The fact that you have some 
duplicates on `E' and `SE' may not be what you desire,
or what you think should be true, but it is not the fault of 
-bysort-. This isn't a programming issue; it is something
to do with your data. 

Existing Stata commands -isid- and -duplicates- are 
also available in this territory. 
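
As a minimal sketch of how those two commands might be used for this check, assuming made-up data and ordinary variables named E and SE (hypothetical stand-ins for whatever the macros E and SE hold in the program):

```stata
* Hypothetical toy data; E and SE are illustrative names, not from the program
clear
input E SE
.145 .014
.145 .014
.167 .015
1.89 .015
end

* isid exits with an error if E and SE do not uniquely identify observations;
* capture traps that error so the return code can be inspected
capture isid E SE
display "isid return code: " _rc

* duplicates report tabulates how many observations occur once, twice, ...
duplicates report E SE

* duplicates tag records, for each observation, how many other observations
* share its E-SE pair (0 means the pair is unique)
duplicates tag E SE, generate(ndup)
list E SE ndup
```

A nonzero return code from -isid-, or any nonzero value of ndup, would signal that some E-SE pairs are repeated.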

However, I would recommend 

bysort `E' `SE' `touse' : gen `_id' = _n 

as being more robust code in general. 
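
To illustrate, here is a small sketch with made-up data; E, SE and touse are hypothetical ordinary variables rather than the macros used above. Putting the sample marker in the by-list means that excluded observations form groups of their own, so no -if- qualifier is needed and the new variable is never left at a stale or missing value.

```stata
* Hypothetical toy data; touse == 0 marks an observation outside the sample
clear
input E SE touse
.145 .014 1
.145 .014 1
.145 .014 0
.167 .015 1
1.89 .015 1
end

* Number the observations within each E-SE-touse group
bysort E SE touse : gen id = _n

* Optionally also record the size of each group
bysort E SE touse : gen ncopies = _N

list, sepby(E SE touse)

* Observations in the sample that repeat an earlier E-SE pair
count if id > 1 & touse
```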

Moreover, adding 0.001 is dangerous because you may run into
precision problems, as often documented on this list -- and
in the manual and FAQ. 
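
A short sketch of the kind of precision problem meant here, using the standard 0.1 example rather than your data (the variable name x is just for illustration). A value held in a float variable is generally not equal to the same decimal literal evaluated in double precision, so tests of exact equality can fail unexpectedly.

```stata
* One observation holding 0.1 in a float variable
clear
set obs 1
gen float x = 0.1

* The literal 0.1 is evaluated in double precision, so this finds no match
count if x == 0.1

* Rounding the literal to float precision first makes the comparison succeed
count if x == float(0.1)
```

The same issue can arise when a float variable is incremented by 0.001 and the result is then expected to match another stored value exactly, which is what grouping by that variable relies on.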

You can label temporary variables just like any other variable. 

tempvar foo
gen `foo' = 42
label var `foo' "The answer" 

Nick 
n.j.cox@durham.ac.uk 

Nikolaos A. Patsopoulos
 
> I'm currently writing a program that at some point checks whether 
> more than one observation has the same values of two vars (E and SE). 
> If more than one exists, then SE is increased by 0.001:
> 
>     tempvar _id
>     qui gen `_id'=0
>     local _idN=2
>     while `_idN'!=1 {
>           
>             bysort `E' `SE': replace `_id'=_n if `touse'
>             count if `_id'>1 & `touse'
>             replace `SE'=`SE'+(`_id'-1)*0.001 if `_id'>1
>             sum `_id' if `touse', detail
>       
>             local _idN=r(max)
>             list `E' `SE' `_id' if `touse'
>     
>     }
> 
> When I run the above piece of code, bysort fails in the second pass:
> 
>     2
> (2 real changes made)
> 
>                           __000002
> -------------------------------------------------------------
>       Percentiles      Smallest
>  1%            1              1
>  5%            1              1
> 10%            1              1       Obs                   7
> 25%            1              1       Sum of Wgt.           7
> 
> 50%            1                      Mean           1.285714
>                         Largest       Std. Dev.        .48795
> 75%            2              1
> 90%            2              1       Variance       .2380952
> 95%            2              2       Skewness       .9486833
> 99%            2              2       Kurtosis            1.9
> 
>      +------------------------+
>      | var1   var2   __000002 |
>      |------------------------|
>   1. | .145   .014          1 |
>   2. | .145   .015          2 |
>   3. | .145   .015          1 |
>   4. | .167   .015          1 |
>   5. | 1.89   .015          1 |
>      |------------------------|
>   6. | 1.89   .016          2 |
>   7. | 1.89   .016          1 |
>      +------------------------+
>     0
> (0 real changes made)
> 
>                           __000002
> -------------------------------------------------------------
>       Percentiles      Smallest
>  1%            1              1
>  5%            1              1
> 10%            1              1       Obs                   7
> 25%            1              1       Sum of Wgt.           7
> 
> 50%            1                      Mean                  1
>                         Largest       Std. Dev.             0
> 75%            1              1
> 90%            1              1       Variance              0
> 95%            1              1       Skewness              .
> 99%            1              1       Kurtosis              .
> 
>      +------------------------+
>      | var1   var2   __000002 |
>      |------------------------|
>   1. | .145   .014          1 |
>   2. | .145   .015          1 |
>   3. | .145   .015          1 |
>   4. | .167   .015          1 |
>   5. | 1.89   .015          1 |
>      |------------------------|
>   6. | 1.89   .016          1 |
>   7. | 1.89   .016          1 |
>      +------------------------+
> 
> The obs in 3 and 6 should have _id (__000002) 0.016 and 0.017, 
> respectively.
> 
> What am I missing?
> 
> Another sort question:
> 
> How can I label tempvars and locals?



