Statalist The Stata Listserver



st: Re: bysort problem


From   "Sergiy Radyakin" <Radyakin@aoek.uni-hannover.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Re: bysort problem
Date   Mon, 26 Feb 2007 16:53:02 +0100

Hi Nikolaos!

I guess this code works:
***--------------------------------------------------------------------------------------
clear
set more off

input var1 var2
.145 .14
.145 .15
.145 .15
.167 .15
1.89 .15
1.89 .16
1.89 .16
end


list

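* Repeat until no (var1, var2) pair occurs more than once
tempvar _id
local _idN=2
while `_idN'!=1 {
    di "------------------------------"
    qui gen `_id'=0
    list
    * number the observations within each (var1, var2) group
    bysort var1 var2: replace `_id'=_n
    * shift each duplicate by 0.001 per position within its group
    replace var1=var1+(`_id'-1)*0.001 if `_id'>1
    * the largest within-group index is 1 once all pairs are unique
    sum `_id', detail
    local _idN=r(max)
    list
    drop `_id'
    di "=============================="
}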

***--------------------------------------------------------------------------------------



However, I do not understand why you write "The obs in 3 and 6 should have _id (__000002) 0.016 and 0.017,
respectively." You change var1, not the temporary id variable. Why do you expect _id==0.016?

Have you considered the possibility that adding 0.001 might move an observation into a different group (as defined by the pair var1, var2)? Or is that exactly the desired behaviour? Imagine that var2 is constant across all observations, that var1 runs from 0.001 to 1 in steps of 0.001 over observations 1 to 1000, and that there is one more observation with var1=0.001. This procedure will then add 0.001 a total of 1000 times, pushing the duplicate all the way up to 1.001.
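
The cascade described above can be reproduced on a small scale. The following is only a sketch under stated assumptions: the step is 0.125 rather than 0.001 (chosen here because 0.125 is exactly representable in binary, so rounding cannot interfere), there are 8 distinct values instead of 1000, and the names passes and maxid are made up for the illustration.

```stata
clear
set obs 9
gen double var1 = _n*0.125 in 1/8     // 0.125, 0.25, ..., 1 (assumed grid)
replace var1 = 0.125 in 9             // one extra observation duplicating the smallest value
gen byte var2 = 1                     // var2 constant for all observations

local passes = 0                      // hypothetical counter, not in the original code
local maxid = 2
while `maxid' != 1 {
    tempvar id
    bysort var1 var2: gen long `id' = _n
    replace var1 = var1 + (`id'-1)*0.125 if `id' > 1
    summarize `id', meanonly
    local maxid = r(max)
    drop `id'
    local ++passes
}

display "passes: `passes'"            // 9: eight passes that move a value, one that finds no ties
summarize var1, meanonly
display "largest var1: " r(max)       // 1.125, one step beyond the original maximum
```

Each pass resolves one tie only to create the next one, so the number of passes grows with the length of the chain of neighbouring values.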


Regards,
Sergiy

----- Original Message -----
From: "Nikolaos A. Patsopoulos" <npatsop@cc.uoi.gr>
To: <statalist@hsphsun2.harvard.edu>
Sent: Monday, February 26, 2007 8:50 AM
Subject: st: bysort problem



I'm currently writing a program that at some point checks whether more than one observation has the same values of two variables (E and SE). If more than one exists, then SE is increased by 0.001:

tempvar _id
qui gen `_id'=0
local _idN=2
while `_idN'!=1 {
bysort `E' `SE': replace `_id'=_n if `touse'
count if `_id'>1 & `touse'
replace `SE'=`SE'+(`_id'-1)*0.001 if `_id'>1
sum `_id' if `touse', detail
local _idN=r(max)
list `E' `SE' `_id' if `touse'
}

When I run the above piece of code, bysort fails on the second pass:

2
(2 real changes made)

                          __000002
-------------------------------------------------------------
      Percentiles      Smallest
 1%            1              1
 5%            1              1
10%            1              1       Obs                   7
25%            1              1       Sum of Wgt.           7

50%            1                      Mean           1.285714
                        Largest       Std. Dev.        .48795
75%            2              1
90%            2              1       Variance       .2380952
95%            2              2       Skewness       .9486833
99%            2              2       Kurtosis            1.9

     +------------------------+
     | var1   var2   __000002 |
     |------------------------|
  1. | .145   .014          1 |
  2. | .145   .015          2 |
  3. | .145   .015          1 |
  4. | .167   .015          1 |
  5. | 1.89   .015          1 |
     |------------------------|
  6. | 1.89   .016          2 |
  7. | 1.89   .016          1 |
     +------------------------+
0
(0 real changes made)

                          __000002
-------------------------------------------------------------
      Percentiles      Smallest
 1%            1              1
 5%            1              1
10%            1              1       Obs                   7
25%            1              1       Sum of Wgt.           7

50%            1                      Mean                  1
                        Largest       Std. Dev.             0
75%            1              1
90%            1              1       Variance              0
95%            1              1       Skewness              .
99%            1              1       Kurtosis              .

     +------------------------+
     | var1   var2   __000002 |
     |------------------------|
  1. | .145   .014          1 |
  2. | .145   .015          1 |
  3. | .145   .015          1 |
  4. | .167   .015          1 |
  5. | 1.89   .015          1 |
     |------------------------|
  6. | 1.89   .016          1 |
  7. | 1.89   .016          1 |
     +------------------------+

The obs in 3 and 6 should have _id (__000002) 0.016 and 0.017, respectively.

What do I miss?
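
One possible explanation, offered here as an assumption and not something stated in the thread: if SE is stored as a float, a value produced by adding 0.001 need not be bit-for-bit equal to the same number typed in directly, even when the two look identical in a listing. bysort would then place the two observations in different groups. A minimal sketch, with made-up variable names se and se_key:

```stata
clear
input float se
.014
.015
end

replace se = se + 0.001 in 1          // the sum is rounded back to float when stored
list                                  // compare how the two rows are displayed
display se[1] == se[2]                // 0: the stored values differ

* One possible workaround (an assumption, not the thread's solution):
* group on an integer key in thousandths instead of on the raw float.
gen long se_key = round(se*1000)
display se_key[1] == se_key[2]        // 1: both map to 15
```

Whether this is what happened in the run above depends on the storage type and the actual values of SE, which the message does not show.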

Another short question:

How can I label tempvars and locals?
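
For what it is worth, a temporary variable can be labeled like any other variable with label variable. A local macro is just a string and has no label of its own, so any descriptive text would have to live in a second macro. A small sketch, where the label text and the macro names lbl, cutoff and cutoff_desc are invented for the example:

```stata
clear
set obs 3
tempvar id
gen long `id' = _n
label variable `id' "Position within E/SE group"      // label text is made up

local lbl : variable label `id'                        // read the label back
display "`lbl'"

local cutoff 0.001
local cutoff_desc "Increment added to tied SE values"  // companion macro standing in for a label
display "`cutoff_desc': `cutoff'"
```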

Thanks in advance,

Nikos

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*



© Copyright 1996–2014 StataCorp LP