Re: st: getting more precise numbers from -summarize-
From
Nick Cox <[email protected]>
To
[email protected]
Subject
Re: st: getting more precise numbers from -summarize-
Date
Thu, 19 May 2011 08:09:55 +0100
You will get only so far in this direction.
Use -return list- to see the results from -summarize-. Then pick up
the results from r(mean), etc. In trying to get the same results from
MS Excel, bear in mind its documented limitations for statistical
purposes. There have been various papers in
Computational Statistics and Data Analysis on this topic.
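
A minimal sketch of that approach, using the dataset from the question. The %20.1fc format is just the one used in the question; any display format would do. Nothing here changes what -summarize- prints; it only shows the saved results with -display-.

```stata
* Recreate the variables from the question, held as doubles
webuse abdata, clear
generate double var2 = indoutpt^2
generate double var4 = indoutpt^4

* -summarize- leaves its results in r(); -return list- shows them
quietly summarize var4
return list

* Show individual saved results with an explicit display format
display %20.1fc r(mean)
display %20.1fc r(sd)

* The same idea for several variables, one line per variable
foreach v of varlist var2 var4 {
    quietly summarize `v'
    display "`v'" _col(12) %20.1fc r(mean) %20.1fc r(sd) %20.1fc r(min) %20.1fc r(max)
}
```

The -display- lines can be copied into a spreadsheet in place of the -summarize- table.
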
As you are concerned about precision, use -double-s to hold squares
and higher powers.
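
A minimal sketch of why the storage type matters here. The names var4f and var4d are made up for this illustration. -generate- creates floats by default (unless -set type- has been changed), and a float holds only about 7 significant digits, so values of order 10^8 are rounded. That is consistent with the rounded-looking var4 values in the listing quoted below.

```stata
webuse abdata, clear

* Default storage type: float, unless changed with -set type-
generate var4f = indoutpt^4

* Same calculation held in a double
generate double var4d = indoutpt^4

* Compare the two side by side
format var4f var4d %20.4fc
list var4f var4d in 1/5

* Count the observations where the two storage types disagree
count if var4f != var4d
```
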
Nick
On Wed, May 18, 2011 at 9:18 PM, Hewan Belay <[email protected]> wrote:
> I am trying to make -sum- respect the display formats of the variables it is summarising by using -sum varlist, format-, but I can't make it do that in certain cases. I think, but am not sure, that this is because the columns of the output from -sum- cannot expand beyond a narrow limit.
>
> Let me use an example accessible on the web to illustrate my problem. Consider two variables, one (var2) with smaller values than the other (var4). In the example below, I show a sample of values of each variable, then show that -sum- and -sum varlist, format- produce the same output when the variables have not been given a display format:
>
> webuse abdata
> g var2 = indoutpt ^ 2
> g var4 = indoutpt ^ 4
> g random=uniform()
> list var* if random<0.01, sep(0)
> sum var*
> sum var*, format
>
> The output for the last two commands is:
>
> . sum var*
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>         var2 |      1031    10873.36    2121.907    7551.61   16477.65
>         var4 |      1031    1.23e+08    4.97e+07   5.70e+07   2.72e+08
>
> . sum var*, format
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>         var2 |      1031    10873.36    2121.907    7551.61   16477.65
>         var4 |      1031    1.23e+08    4.97e+07   5.70e+07   2.72e+08
>
> However, when I format the variables and then run -sum varlist, format-, the summary of the smaller variable (var2) looks as I expect, but that is not the case for the larger variable (var4):
>
> format var* %20.1fc
> list var* if random<0.01, sep(0)
> sum var*
> sum var*, format
>
> The output for the last three command lines is below (note that -list- would show different results on every run, since I'm randomly pulling values from the variables):
>
>
> . list var* if random<0.01, sep(0)
>
>       +--------------------------+
>       |     var2            var4 |
>       |--------------------------|
>  236. |  9,399.3    88,346,880.0 |
>  295. | 16,170.7   261,490,992.0 |
>  299. |  9,871.9    97,455,048.0 |
>  358. | 13,549.0   183,574,320.0 |
>  392. |  9,347.0    87,365,752.0 |
>  401. | 13,386.5   179,198,096.0 |
>  650. |  9,020.3    81,364,920.0 |
>  653. | 13,427.0   180,284,752.0 |
>  882. | 12,105.5   146,543,152.0 |
> 1014. | 10,512.5   110,511,848.0 |
>
> . sum var*
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>         var2 |      1031    10873.36    2121.907    7551.61   16477.65
>         var4 |      1031    1.23e+08    4.97e+07   5.70e+07   2.72e+08
>
> . sum var*, format
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>         var2 |      1031    10,873.4     2,121.9    7,551.6   16,477.6
>         var4 |      1031     1.2e+08     5.0e+07    5.7e+07    2.7e+08
>
>
> Is this an issue with the width available to the columns? If so, how can I expand the column width? The documentation for -sum- doesn't show any way. Or is the problem a different one?
>
> This is important to me because I copy and paste the summarised results into Excel to do computations with them, and if the numbers in the summary columns are imprecise (e.g. a mean of 1.2e+08 for var4, where I wanted the mean to appear in non-scientific notation), then my Excel calculations will be imprecise as well.
>