From
Nick Cox <njcoxstata@gmail.com>

To
statalist@hsphsun2.harvard.edu

Subject
Re: st: sum issue (wrong values)

Date
Wed, 24 Oct 2012 12:13:20 +0100

You already have been given the answer to your problem, so there is no
remaining puzzle. -summarize- is summarizing the values underneath the
value labels. Which version do you wish to -summarize-? If the value
labels are the real values, you need to -decode- that variable and
then convert it to numeric with -destring- or -real()-.

Nick

On Wed, Oct 24, 2012 at 11:45 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Sorry; belay that. You do have value labels.
>
> Nick
>
> On Wed, Oct 24, 2012 at 11:43 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>> That does look puzzling. Three of us suggested that value labels might
>> be getting in the way, but there are none.
>> I don't have any further suggestions, beyond wondering whether your
>> executable is corrupted. Can you get the same results from
>>
>> . list in 1/10
>>
>> immediately before or after the -summarize-?
>>
>> Nick
>>
>> On Wed, Oct 24, 2012 at 11:33 AM, Christian Bärtsch
>> <christian.baertsch@student.unisg.ch> wrote:
>>> Thanks Nick - sorry I am only getting used to the correct terms.
>>>
>>> Yes it is correct, that I am looking at the -summarize- command in stata.
>>>
>>>
>>> . sum latency_int
>>>
>>>     Variable |       Obs        Mean    Std. Dev.       Min        Max
>>> -------------+--------------------------------------------------------
>>>  latency_int |      5760    1102.242    700.1589          2       1999
>>>
>>> . describe latency_int
>>>
>>>               storage  display     value
>>> variable name   type   format      label      variable label
>>> -------------------------------------------------------------------------------
>>> latency_int     int    %8.0g       latency_int
>>>
>>>
>>> .
>>> . count
>>>  5760
>>>
>>> .
>>> . summarize latency_int, detail
>>>
>>>                          latency_int
>>> -------------------------------------------------------------
>>>       Percentiles      Smallest
>>>  1%           14              2
>>>  5%           57              2
>>> 10%        122.5              2       Obs                5760
>>> 25%        377.5              2       Sum of Wgt.        5760
>>>
>>> 50%       1221.5                      Mean           1102.242
>>>                         Largest       Std. Dev.      700.1589
>>> 75%         1779           1998
>>> 90%         1902           1998       Variance       490222.5
>>> 95%         1948           1998       Skewness      -.2200922
>>> 99%         1988           1999       Kurtosis       1.417508
>>>
>>> And here also an extract from list
>>>
>>> . list latency_int
>>>
>>>      +----------+
>>>      | latenc~t |
>>>      |----------|
>>>   1. |     4720 |
>>>   2. |     3923 |
>>>   3. |     1844 |
>>>   4. |     1435 |
>>>   5. |     2955 |
>>>      |----------|
>>>   6. |     1483 |
>>>   7. |     3459 |
>>>   8. |     1004 |
>>>   9. |     1716 |
>>>  10. |     1372 |
>>>      |----------|
>>>
>>> Thanks.
>>>
>>>
>>> 2012/10/24 Nick Cox <njcoxstata@gmail.com>:
>>>> This is ambiguous as between the -summarize- command (which can be
>>>> abbreviated -sum-) and the -sum()- function, which gives cumulative or
>>>> running sums, although it seems you mean the first. In Stata (not
>>>> "STATA") commands and functions are quite different families.
>>>>
>>>> Even then, you must show us exactly what you typed and exactly what
>>>> Stata did by copying output. Otherwise it is difficult to guess what
>>>> is going on. Does -latency_int- have value labels, which are what you
>>>> see when you -list-, but not what are -summarize-d? You should show us
>>>> the results of
>>>>
>>>> describe latency_int
>>>> count
>>>> summarize latency_int, detail
>>>>
>>>> Nick
>>>>
>>>> On Wed, Oct 24, 2012 at 11:13 AM, Christian Bärtsch
>>>> <christian.baertsch@student.unisg.ch> wrote:
>>>>
>>>>> I have a issue using the sum function of STATA. I have a data set,
>>>>> where I have a variable called latency_int (type: int; and something
>>>>> over 5700 values). I use the command sum(latency_int). There I get the
>>>>> minimum of 2 and the maximum of 1999 even though the data set contains
>>>>> values from 44 to 81000 (those values are shown when I use
>>>>> list(latency_int). It must be a pretty basic mistake, however I have
>>>>> not been able to come up with a solution for days.
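
The -decode- then -destring- route suggested at the top of this message could look roughly like the sketch below. It assumes that the text of the value labels attached to -latency_int- holds the real latencies; the new variable names latency_str and latency_num are hypothetical and chosen only for illustration.

```stata
* Compare what -list- displays (labels) with what is stored underneath
list latency_int in 1/10
list latency_int in 1/10, nolabel

* Show the mapping from stored integer codes to label text
label list latency_int

* Copy the label text into a new string variable (hypothetical name)
decode latency_int, generate(latency_str)

* Convert that string to a numeric variable (hypothetical name)
destring latency_str, generate(latency_num)

* Alternative to -destring-: the real() function
* generate latency_num = real(latency_str)

* Summarize the values that -list- was displaying
summarize latency_num, detail
```

If any label text contains nonnumeric characters, -destring- will decline to create the new variable, whereas -real()- returns missing for those observations, so it is worth checking -label list- first.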

