Re: st: sum issue (wrong values)


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: sum issue (wrong values)
Date   Wed, 24 Oct 2012 12:13:20 +0100

You already have been given the answer to your problem, so there is no
remaining puzzle. -summarize- is summarizing the values underneath the
value labels. Which version do you wish to -summarize-? If the value
labels are the real values, you need to -decode- that variable and
then convert it to numeric with -destring- or -real()-.
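
A minimal sketch of that route, written as a do-file and assuming every value label is a plain numeric string such as "4720". The names latency_str and latency_num are hypothetical, chosen only for illustration:

```stata
* Sketch only: latency_str and latency_num are hypothetical variable names.

* Show the mapping from underlying codes to labels
* (the value label is itself named latency_int, per -describe-).
label list latency_int

* Compare what -list- displays (labels) with what is stored (codes).
list latency_int in 1/10
list latency_int in 1/10, nolabel

* Copy the labels into a string variable, then convert to numeric.
decode latency_int, generate(latency_str)
destring latency_str, generate(latency_num)

* Summarize the values that the labels represent.
summarize latency_num, detail
```

If some labels turn out not to be numeric, -destring- will leave the variable as a string and say so; in that case -generate long latency_num = real(latency_str)- would return missing for those observations instead.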

Nick

On Wed, Oct 24, 2012 at 11:45 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Sorry; belay that. You do have value labels.
>
> Nick
>
> On Wed, Oct 24, 2012 at 11:43 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>> That does look puzzling. Three of us suggested that value labels might
>> be getting in the way, but there are none.
>> I don't have any further suggestions, beyond wondering whether your
>> executable is corrupted. Can you get the same results from
>>
>> . list in 1/10
>>
>> immediately before or after the -summarize-?
>>
>> Nick
>>
>> On Wed, Oct 24, 2012 at 11:33 AM, Christian Bärtsch
>> <christian.baertsch@student.unisg.ch> wrote:
>>> Thanks Nick - sorry I am only getting used to the correct terms.
>>>
>>> Yes, it is correct that I am looking at the -summarize- command in Stata.
>>>
>>>
>>> . sum latency_int
>>>
>>>     Variable |       Obs        Mean    Std. Dev.       Min        Max
>>> -------------+--------------------------------------------------------
>>>  latency_int |      5760    1102.242    700.1589          2       1999
>>>
>>> . describe latency_int
>>>
>>>               storage  display     value
>>> variable name   type   format      label      variable label
>>> -------------------------------------------------------------------------------
>>> latency_int     int    %8.0g       latency_int
>>>
>>>
>>> .
>>> . count
>>>  5760
>>>
>>> .
>>> . summarize latency_int, detail
>>>
>>>                          latency_int
>>> -------------------------------------------------------------
>>>       Percentiles      Smallest
>>>  1%           14              2
>>>  5%           57              2
>>> 10%        122.5              2       Obs                5760
>>> 25%        377.5              2       Sum of Wgt.        5760
>>>
>>> 50%       1221.5                      Mean           1102.242
>>>                         Largest       Std. Dev.      700.1589
>>> 75%         1779           1998
>>> 90%         1902           1998       Variance       490222.5
>>> 95%         1948           1998       Skewness      -.2200922
>>> 99%         1988           1999       Kurtosis       1.417508
>>>
>>> And here is an extract from -list-:
>>>
>>> . list latency_int
>>>
>>>       +----------+
>>>       | latenc~t |
>>>       |----------|
>>>    1. |     4720 |
>>>    2. |     3923 |
>>>    3. |     1844 |
>>>    4. |     1435 |
>>>    5. |     2955 |
>>>       |----------|
>>>    6. |     1483 |
>>>    7. |     3459 |
>>>    8. |     1004 |
>>>    9. |     1716 |
>>>   10. |     1372 |
>>>       |----------|
>>>
>>> Thanks.
>>>
>>>
>>> 2012/10/24 Nick Cox <njcoxstata@gmail.com>:
>>>> This is ambiguous as between the -summarize- command (which can be
>>>> abbreviated -sum-) and the -sum()- function, which gives cumulative or
>>>> running sums, although it seems you mean the first. In Stata (not
>>>> "STATA") commands and functions are quite different families.
>>>>
>>>> Even then, you must show us exactly what you typed and exactly what
>>>> Stata did by copying output. Otherwise it is difficult to guess what
>>>> is going on. Does -latency_int- have value labels, which are what you
>>>> see when you -list-, but not what are -summarize-d? You should show us
>>>> the results of
>>>>
>>>> describe latency_int
>>>> count
>>>> summarize latency_int, detail
>>>>
>>>> Nick
>>>>
>>>> On Wed, Oct 24, 2012 at 11:13 AM, Christian Bärtsch
>>>> <christian.baertsch@student.unisg.ch> wrote:
>>>>
>>>>> I have an issue using the sum function of STATA. I have a data set,
>>>>> where I have a variable called latency_int (type: int; and something
>>>>> over 5700 values). I use the command sum(latency_int). There I get the
>>>>> minimum of 2 and the maximum of 1999 even though the data set contains
>>>>> values from 44 to 81000 (those values are shown when I use
>>>>> list(latency_int)). It must be a pretty basic mistake; however, I have
>>>>> not been able to come up with a solution for days.
