st: mysterious inaccuracy when adding big numbers


From   Trang Nguyen <nqtrang.hanoi@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: mysterious inaccuracy when adding big numbers
Date   Fri, 8 Apr 2011 11:15:50 -0400

Thank you so much, Richard and Nick. This is so helpful! I spent
several hours and didn't figure it out.

I wish I had known what the problem was called to search for it before
posting a question.

Trang

On Fri, Apr 8, 2011 at 11:00 AM, Richard Goldstein
<richgold@ix.netcom.com> wrote:
> 1. if you type -search precision- you will learn about this issue
>
> 2. id's are generally best as strings, however, so I would do something
> like the following:
>
> gen str9 id = string(province, "%03.0f") + string(district, "%02.0f") ///
>     + string(commune, "%02.0f") + string(household, "%02.0f")
>
> 3. if you need to make it numeric insert "double" between your "gen" and
> your "ID"
>
> Rich
>
> On 4/8/11 10:49 AM, Trang Nguyen wrote:
>> Hi.
>>
>> I am working on a dataset with households as observations that are
>> nested in communes, districts and provinces. I have variables
>> - province: province number (3 digits)
>> - district: district number within each province (max 2 digits)
>> - commune: commune number within each district (max 2 digits)
>> - household: household number within each commune (max 2 digits)
>>
>> I wanted to make a unique ID for each household that doesn't repeat
>> across communes, districts and provinces, that also shows me all the
>> province/district/commune information. So I did this:
>> gen ID = province*1000000 + district*10000 + commune*100 + household
>>
>> I got a variable ID that is correct for the province, district and
>> commune components, but the last two digits do not match the value of
>> the household variable. Instead they are 04 or 12 or 20.
>>
>> Could someone please help me figure out why this is so? My output is
>> below. Thanks much!
>>
>> . gen ID = province*1000000 + district*10000 + commune*100 + household
>>
>> . count if ID != province*1000000 + district*10000 + commune*100 + household
>>  7932
>>
>> . format ID %15.0g
>>
>> . list province district commune household ID in 1/20
>>
>>     +------------------------------------------------------+
>>     | province   district   commune   househ~d          ID |
>>     |------------------------------------------------------|
>>  1. |      101          1         3          1   101010304 |
>>  2. |      101          1         3          2   101010304 |
>>  3. |      101          1         3          4   101010304 |
>>  4. |      101          1         3          5   101010304 |
>>  5. |      101          1        17          4   101011704 |
>>     |------------------------------------------------------|
>>  6. |      101          1        17          5   101011704 |
>>  7. |      101          1        17          6   101011704 |
>>  8. |      101          1        17          8   101011712 |
>>  9. |      101          1        17          9   101011712 |
>> 10. |      101          1        17         10   101011712 |
>>     |------------------------------------------------------|
>> 11. |      101          1        17         11   101011712 |
>> 12. |      101          3         3          3   101030304 |
>> 13. |      101          3         3          4   101030304 |
>> 14. |      101          3         3          5   101030304 |
>> 15. |      101          3         3          6   101030304 |
>>     |------------------------------------------------------|
>> 16. |      101          3         3          7   101030304 |
>> 17. |      101          5        11          3   101051104 |
>> 18. |      101          5        11          6   101051104 |
>> 19. |      101          5        11          9   101051112 |
>> 20. |      101          5        11         10   101051112 |
>>     +------------------------------------------------------+
>>
>> . codebook ID
>>
>> -----------------------------------------------------------------------------------
>> ID                                                             (unlabeled)
>> -----------------------------------------------------------------------------------
>>
>>                  type:  numeric (float)
>>
>>                 range:  [1.010e+08,8.231e+08]        units:  1
>>         unique values:  1308                     missing .:  0/8341
>>
>>                  mean:   4.6e+08
>>              std. dev:   2.6e+08
>>
>>           percentiles:        10%       25%       50%       75%       90%
>>                           1.1e+08   2.1e+08   4.1e+08   7.1e+08   8.1e+08
>>
>> Thanks much!
>>
>> Trang
>>
>> ------------------------
>> Trang Nguyen
>> Doctoral student
>> Johns Hopkins School of Public Health
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
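
The behaviour described in this thread follows from how Stata's float storage type holds numbers. A float carries a 24-bit significand, so between 2^26 (67,108,864) and 2^27 (134,217,728) it can represent only multiples of 8. An ID such as 101010301 is therefore stored as the nearest multiple, 101010304, which is why the last two digits come out as 04, 12 or 20. Near the top of the range reported by codebook (about 8.2e+08) the spacing grows to 64. What follows is a minimal sketch rather than the posters' own code: the three rows are copied from the listing above, and the variable names ID_float, ID_double and ID_str are made up for illustration.

```stata
* Three example rows copied from the listing in the thread
clear
input province district commune household
101 1  3 1
101 1 17 8
101 5 11 9
end

* float keeps only about 7 significant digits, so a 9-digit ID gets rounded
gen float ID_float = province*1000000 + district*10000 + commune*100 + household

* double represents every 9-digit integer exactly
gen double ID_double = province*1000000 + district*10000 + commune*100 + household

* Show all digits instead of scientific notation
format ID_float ID_double %12.0f

* Zero-padded string ID: 3 + 2 + 2 + 2 = 9 characters, so components cannot run together
gen str9 ID_str = string(province, "%03.0f") + string(district, "%02.0f") ///
    + string(commune, "%02.0f") + string(household, "%02.0f")

list province district commune household ID_float ID_double ID_str

* The double version matches the formula exactly; the float version does not
assert ID_double == province*1000000 + district*10000 + commune*100 + household
count if ID_float != province*1000000 + district*10000 + commune*100 + household

* Both corrected IDs identify the rows uniquely
isid ID_double
isid ID_str
```

Run it from a do-file so that the /// line continuation is accepted. The storage type long would also hold these 9-digit integers exactly. The string version is usually the safer choice for an identifier, since nobody is tempted to do arithmetic on it.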
