Re: st: Adding an observation to a dataset
The most efficient and most direct way to get a
sum is to use -summarize-. After -summarize-
the sum is accessible in r(sum). This is
documented at [R] summarize.
To be really efficient, you should use -summarize,
meanonly-, which still leaves the sum in r(sum).
So, for example,
set obs `= _N + 1'
su nitem, meanonly
replace nitem = r(sum) in l
Notice in passing how bumping up the number
of observations can be telescoped to one line.
Other items, such as percents, can be obtained
just as directly, using the results of -count- as well.
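
As a minimal sketch of how percents might be obtained from saved
results alone: the toy data below are invented for illustration
(only the variable names species, nitem and pcbyn come from the thread),
and the -count- step is just one reading of what a percent based on
-count- could look like.

```stata
* Hypothetical toy data, standing in for the collapsed dataset
clear
input str10 species nitem
"A" 10
"B" 30
"C" 60
end

* Percent of the column total, using r(sum) left behind by -summarize-
summarize nitem, meanonly
generate double pcbyn = 100 * nitem / r(sum)

* Percent of observations meeting a condition, using r(N) from -count-
count if nitem >= 30
display 100 * r(N) / _N

* Add the total row; the new observation starts out missing,
* so -summarize- ignores it when computing the sums
set obs `= _N + 1'
replace species = "Total" in l
summarize nitem, meanonly
replace nitem = r(sum) in l
summarize pcbyn, meanonly
replace pcbyn = r(sum) in l
list species nitem pcbyn
```

No extra total variables are created here; each total is read
from r(sum) immediately after the -summarize- call that produced it.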
This may not seem especially direct, as no
sum is displayed and you have to invoke it
explicitly. However, the point is that there
is no need to create a new variable and even
less need to fire up -egen-, which as an
interpreted command adds overhead as compared
with -summarize-. (Even using the -sum()- function
to get a cumulative sum and then reading off
the last value would be more efficient than
-egen-.) Often the inefficiency will be trivial,
but Allan's posting raises the question of the
best way to do it.
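
For comparison, the cumulative-sum route mentioned in parentheses
might look like the sketch below. The data and the variable name
runsum are made up for illustration; note that this route still
creates a variable, which -summarize- avoids.

```stata
* Hypothetical toy data
clear
input nitem
10
30
60
end

* Running total; its last value is the overall sum
generate double runsum = sum(nitem)
display runsum[_N]

* Copy the total into a new last observation, then discard the helper.
* After -set obs- the old last observation is number _N - 1.
set obs `= _N + 1'
replace nitem = runsum[_N - 1] in l
drop runsum
list nitem
```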
What Allan is doing by adding an extra observation
is a one-way street. He makes it clear that in
his case what he is doing follows a -collapse-
and so is all a means to an end. But other readers
should note that adding an observation would
almost always need to be followed by one or more of
* deleting the extra observation afterwards
* excluding it from subsequent commands by using
-if- or -in-
* returning to the original dataset.
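
The three options above could be sketched as follows. The toy data
and the local macro n0 are assumptions for illustration; -preserve-
and -restore- are one way of returning to the original dataset,
not necessarily the only one.

```stata
* Hypothetical toy data
clear
input str10 species nitem
"A" 10
"B" 30
"C" 60
end

* Remember the original number of observations (n0 is a made-up name)
local n0 = _N

* Returning to the original dataset: take a snapshot first
preserve

set obs `= _N + 1'
replace species = "Total" in l
summarize nitem, meanonly
replace nitem = r(sum) in l
list species nitem

* Excluding the total row from later commands with -in- or -if-
summarize nitem in 1/`n0'
summarize nitem if species != "Total"

* Deleting the extra observation afterwards
drop in l

* Back to the data as they were at -preserve-
restore
```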
To construct a table of various derived statistics, one approach is to
collapse the data. I then wanted an extra line for totals. Searching
help found a function within Mata but otherwise suggested creating a
temporary file with one case then appending it to the original.
However, a direct approach worked in a DO file and seems worth
documenting on this list:
set obs `nn'
The other part of the task, for completeness, includes forming totals of
numbers, weights, percentages:
egen pcbyn = pc(nitem)
egen totn = sum(nitem)
egen totnpc = sum(pcbyn) /* should be 100 */
* Copy column totals down to pseudo-observation (now 'last').
set obs `nn'
replace species = "Total ... " in l
replace nitem=totn in l
replace pcbyn=totnpc in l
list species nitem pcbyn
Searching the archive for advice on copying output to Word found a
suggestion to use Ctrl/Shift/C which would copy a table with tabs.
Didn't work for me. Still limited to copying text and formatting in
Word as Courier/9pt to get fixed spacing.