Statalist The Stata Listserver



st: RE: Issue with rounding


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Issue with rounding
Date   Fri, 6 Oct 2006 12:42:24 +0100

The way to solve these problems is to avoid them. 
Your problem is to check something, not to
calculate something. 

The most direct way to see whether values of 
-some- variable remain constant in groups of 
-block- is 

bysort block (some) : assert some[1] == some[_N] 

If there is any difference between values, this 
will show up as a difference between the first (smallest) 
and the last (largest). Conversely, if all values are
the same, the first and last will also be the same. 
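
If stopping at the first failure is inconvenient, the same comparison
can be stored in a variable instead of asserted. This is a minimal
sketch on toy data: -varies- is a made-up variable name and the four
observations are invented purely for illustration.

```stata
* Toy data: block A is constant, block B is not
clear
input str1 block some
"A" 4.2
"A" 4.2
"B" 40.5
"B" 30.1
end

* Same first-versus-last comparison, stored as a 0/1 indicator per block
bysort block (some) : gen byte varies = some[1] != some[_N]

* Show which blocks contain more than one distinct value
* (only B in this toy data)
levelsof block if varies
```

Unlike -assert-, this does not stop with an error, so every
offending block can be inspected in one pass.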

Nick 
n.j.cox@durham.ac.uk 

Newbie
 
> I have the following data:
> 
> Id	date		var
> A	1.1.90	10.1
> A	1.2.90	11.2
> A	1.3.90	12.3
> ...
> A	1.11.04	3.1
> A	1.12.04	4.2
> A	1.1.05	4.2
> A	1.2.05	4.2
> A	1.3.05	4.2
> ... (only -date- changes, with var fixed at 4.2)
> B	1.1.92	100
> B	1.2.92	110
> B	1.3.92	120
> ...
> B	1.11.03	30.1
> B	1.12.03	40.5
> B	1.1.04	40.5
> B	1.2.04	40.5
> ... (only -date- changes, with var fixed at 40.5)
> 
> When -var- becomes fixed, it means that id stopped being updated. Given
> that I have thousands of -id-, checking them one by one is cumbersome.
> One way of determining this is to calculate, for each observation within
> each Id, the average of the remaining values and check whether this
> average is the same as the value in var. I did the following:
> 
> . gsort id -date
> 
> . by id: gen n=_n
> 
> . by id: gen sum=sum(var)
> 
> . by id: gen avg=sum/n
> 
> . sort id date
> 
> . by id: gen ddate=1 if avg==var
> 
> 
> Given that ddate returned all the values as missing values, I took the
> difference between avg and var:
> 
> . drop ddate
> 
> . gen diff=avg-var
> 
> When checking the results in diff I realized that diff yielded values
> close to 0 but not 0 (something like 8.179e-07). Even for the last value,
> where avg is actually equal to var, the result was along the lines of
> 8.179e-07 (for instance: var=111.2499, sum=111.2499, avg=111.2499, n=1,
> and diff=8.179e-07). I understand that 8.179e-07 is close to 0, and I
> could do something like:
> 
> . replace diff=0 if abs(diff)<0.00001
> 
> But I'm afraid I could lose some observations. Any ideas about the
> reasons for this and how to solve it? The values for var are truncated
> to 4 decimal places by the database download.
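
On the reasons asked about above: one plausible explanation, assuming
-var- is stored as double, is that -generate- creates float variables
by default, while Stata evaluates expressions in double precision. A
number such as 111.2499 has no exact binary representation, so its
float and double versions differ slightly. The sketch below uses a
hypothetical variable -x- in place of -var-; -avg- and -avg2- are
illustrative names.

```stata
* float() rounds its argument to float precision
display float(111.2499) == 111.2499
display float(111.2499) - 111.2499

* One invented observation, stored as double
clear
input double x
111.2499
end

* -generate- defaults to float unless a storage type is given
gen avg = x
gen double avg2 = x

* The float copy no longer matches; the double copy does
count if avg == x
count if avg2 == x
```

Under that assumption, creating the intermediate variables with
-gen double- would keep them comparable with the original, although
checking first against last within sorted groups avoids the
arithmetic altogether.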


