
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: RE: RE: RE: generating a new variable with the egen command
Date: Mon, 28 Nov 2005 17:51:44 -0000

I agree with the main point, very strongly. Indeed Aristotle and Gauss said the same, so Maarten stands in a good line.

But these discussions can, secondarily, be muddied, and muddled, by terminology. Maarten starts out by talking about accuracy, but then he changes terminology to precision, so it is not clear whether he means the same thing, or something else.

I'd distinguish four concepts, and others might want to go further.

1. A first question is how data are recorded or reported, which does not necessarily involve any claim on accuracy or precision. Thus a common convention in meteorology is to record temperatures to 0.1 deg C or F. A common convention in demography is to report a census population with no rounding, so that up to 10 digits may be given. (I know that a census in practice is usually another kind of estimate, not the main point here.) In the first case, the tacit view is that we could use better technology to get more digits, but that would usually be not only too expensive but rather silly, as any intervention changes the temperature anyway and the temperature 1 metre away is different. In the second case I presume that no demographer knowing about the conduct of censuses expects the census figure to be accurate to anything like the number of digits reported, but for all sorts of other reasons rounding of census results appears taboo. I like to think of this as the "resolution" of the data, but I doubt that is a standard term.

2. Accuracy, at least I think in the physical sciences, implies closeness to some 'true' or 'real' or 'correct' value, which is variously (a) philosophically problematic to many, (b) practically difficult in the (usual) absence of any notion of what that value is, or even of a "gold standard" measurement, that is a measurement produced by the best method available (a term less common in economics these days than in medicine?). It seems that when many social scientists talk about validity, they mean this, or something close to it.

3. Precision in the statistical sense implies uncertainty as indicated, ideally, by variability of repetitions (unless you are a Bayesian). It seems that when many social scientists talk about reliability, they mean this, or something close to it.

4. Precision in the computing sense refers to how a number is held internally, and to how results depend on the details of calculations. Thus binary-based machines, and I don't use any other, have to struggle with holding 0.1. In a strict sense, they can't do it!

These seem four different senses, but are nevertheless often confused.

I was brought up on the analogy of repeatedly aiming at a bull's eye target (e.g. by firing several arrows or bullets), accuracy being how close you are to the target on average and precision being how tightly your hits cluster, but every year when I "remind" students of what I hope they already know I get many blank stares. Be that as it may, in most sciences we don't know where the bull's eye is.

Nick
n.j.cox@durham.ac.uk

P.S. You probably know the story of someone firing at a wall and then painting a target around their hits. This is a good little joke, except that quite a lot of science is like this, in my own field too.

Maarten Buis

> Since precision came up again, I would like to add a comment:
>
> While it is good practice to try to minimize rounding errors
> during computation (e.g. during computing new variables that
> are sums), you should keep in mind how precise your
> measurement on that variable actually is. For instance, I
> teach introductory statistics to first year social science
> students. In the Netherlands grades run from 0 (didn't even
> spell their own name right) to 10 (brilliant). Each year at
> least one of them asks whether I would want to give them
> grades with two decimal points accuracy. I think I make good
> exams, but they cannot distinguish between a student with a
> statistics capability worth a 6.01 and worth a 6.02.
>
> Eight digits accurate (float) should be more than enough for
> most measurements; in most real data I would consider sixteen
> digits (double) overkill.
>

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
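
The point about binary machines struggling to hold 0.1 (precision in the computing sense) can be seen directly in a small sketch. It is in Python rather than Stata, purely for illustration, and it assumes that float and double storage behave like IEEE 754 4-byte and 8-byte binary formats. The names `exact_double` and `as_single` are made up for this example.

```python
from decimal import Decimal
import struct

# The exact decimal expansion of the 8-byte binary number closest to 0.1.
# It is not 0.1, because 0.1 has no finite binary representation.
exact_double = Decimal(0.1)

# Round-trip 0.1 through a 4-byte binary float (assumed comparable to a
# single-precision storage type) and read it back as a double.
as_single = struct.unpack("f", struct.pack("f", 0.1))[0]

print("double nearest 0.1:", exact_double)
print("single nearest 0.1:", repr(as_single))

# Small representation errors surface in ordinary comparisons.
print("0.1 + 0.2 == 0.3 ?", 0.1 + 0.2 == 0.3)
print("single copy == 0.1 ?", as_single == 0.1)
```

Both comparisons come out false. The single-precision copy matches 0.1 to roughly 7 significant decimal digits and the double to roughly 16, which is the scale of the digit counts mentioned in the quoted message. None of this says anything about accuracy in sense 2: a value stored to 16 digits can still be a poor measurement.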

