st: getting more precise numbers from -summarize-
From: Hewan Belay <[email protected]>
To: Stata List <[email protected]>
Subject: st: getting more precise numbers from -summarize-
Date: Wed, 18 May 2011 13:18:14 -0700 (PDT)
Dear Statalist,
I am trying to make -summarize- respect the display formats of the variables it is summarising, by using -sum varlist, format-, but I cannot make it do that in certain cases. I suspect, but am not sure, that this is because the columns of the -sum- output cannot expand beyond a narrow fixed width.
Let me use an example accessible on the web to illustrate my problem. Consider two variables, one (var2) with smaller values than the other (var4). In the example below, I list a sample of values of each variable, then show that -sum- and -sum varlist, format- produce the same output when the variables still have their default format:
webuse abdata
g var2 = indoutpt ^ 2
g var4 = indoutpt ^ 4
g random=uniform()
list var* if random<0.01, sep(0)
sum var*
sum var*, format
The output for the last two commands is:
. sum var*
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        var2 |      1031    10873.36    2121.907    7551.61   16477.65
        var4 |      1031    1.23e+08    4.97e+07   5.70e+07   2.72e+08
. sum var*, format
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        var2 |      1031    10873.36    2121.907    7551.61   16477.65
        var4 |      1031    1.23e+08    4.97e+07   5.70e+07   2.72e+08
However, when I assign a display format to the variables and then run -sum varlist, format-, the summary of the smaller variable (var2) looks the way I expect, but the summary of the larger variable (var4) does not:
format var* %20.1fc
list var* if random<0.01, sep(0)
sum var*
sum var*, format
The output of the last three commands is shown below. (Note that -list- will produce different results on every run, since I am pulling values from the variables at random.)
. list var* if random<0.01, sep(0)
      +--------------------------+
      |     var2            var4 |
      |--------------------------|
 236. |  9,399.3    88,346,880.0 |
 295. | 16,170.7   261,490,992.0 |
 299. |  9,871.9    97,455,048.0 |
 358. | 13,549.0   183,574,320.0 |
 392. |  9,347.0    87,365,752.0 |
 401. | 13,386.5   179,198,096.0 |
 650. |  9,020.3    81,364,920.0 |
 653. | 13,427.0   180,284,752.0 |
 882. | 12,105.5   146,543,152.0 |
1014. | 10,512.5   110,511,848.0 |
      +--------------------------+
. sum var*
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        var2 |      1031    10873.36    2121.907    7551.61   16477.65
        var4 |      1031    1.23e+08    4.97e+07   5.70e+07   2.72e+08
. sum var*, format
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        var2 |      1031    10,873.4     2,121.9    7,551.6   16,477.6
        var4 |      1031     1.2e+08     5.0e+07    5.7e+07    2.7e+08
Is this a limit on the width available to each column? If so, how can I expand the column width? The help file for -sum- does not show any way to do that. Or is the problem a different one?
This is important to me because I copy and paste the summarised results into Excel to do further computations with them. If the numbers in the summary columns are imprecise (e.g. a mean of 1.2e+08 for var4, where I wanted the mean displayed in full rather than in scientific notation), then my Excel calculations will be very imprecise as well.
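For illustration only, here is a minimal sketch of one possible workaround, assuming the goal is simply to see the full values rather than to widen the -sum- table itself. It is not taken from any reply in this thread. It relies on the results that -summarize- leaves behind in r(mean), r(sd), r(min) and r(max), and prints them with -display- using the same %20.1fc format; the loop and the column position are arbitrary choices.

```stata
* Sketch only: prints the stored results of -summarize- in a chosen format
webuse abdata, clear
generate var2 = indoutpt^2
generate var4 = indoutpt^4

foreach v of varlist var2 var4 {
    * -summarize- stores its results in r(); -quietly- suppresses the table
    quietly summarize `v'
    * show mean, SD, min and max without scientific notation
    display "`v'" _col(10) %20.1fc r(mean) %20.1fc r(sd) %20.1fc r(min) %20.1fc r(max)
}
```

The r() results hold the values at full precision regardless of how the -sum- table displays them, so the format given to -display- only controls how many digits are printed. -tabstat- also accepts a format() option, but whether its columns are wide enough for a format such as %20.1fc has not been checked here.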
Thank you,
Hewan