Statalist The Stata Listserver



Re: st: Adding an observation to a dataset


From   n j cox <n.j.cox@durham.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Adding an observation to a dataset
Date   Wed, 07 Jun 2006 15:15:58 +0100

The most efficient and most direct way to get a
sum is to use -summarize-. After -summarize-
the sum is accessible in r(sum). This is
documented at [R] summarize.

To be really efficient, you should use -summarize,
meanonly-.

So, for example,

set obs `= _N + 1'
su nitem, meanonly
replace nitem = r(sum) in l

Notice in passing how bumping up the number
of observations can be telescoped to one line.

Other items, such as percents, can be obtained
just as directly, also using the results of -count-.
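
As a minimal sketch of what that might look like: the toy data
below are made up to stand in for a collapsed dataset, and the
threshold of 30 is arbitrary. The names species, nitem and pcbyn
follow Allan's posting. After -summarize, meanonly- the total is
in r(sum), and after -count- the number of observations counted
is in r(N).

```stata
* Hypothetical toy data standing in for a collapsed dataset
clear
input str10 species nitem
"oak"   30
"ash"   50
"beech" 20
end

* Percent of total: divide by r(sum) left behind by -summarize-
summarize nitem, meanonly
generate double pcbyn = 100 * nitem / r(sum)

* Percent of observations meeting a condition: -count- leaves r(N)
count if nitem >= 30
display "percent of rows with nitem >= 30: " 100 * r(N) / _N

list species nitem pcbyn
```

No -egen- call and no variable holding a constant total is needed
for either percent.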

The -summarize- route may not seem especially direct,
as no sum is displayed and you have to pick it up
from r(sum) explicitly. However, the point is that there
is no need to create a new variable and even
less need to fire up -egen-, which as an
interpreted command adds overhead as compared
with -summarize-. (Using the -sum()- function
to get a cumulative sum and then reading off
the last value would be more efficient than
that.) Often the inefficiency will be trivial,
but Allan's posting raises the question of the
best way to do it.
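
For comparison, the -sum()- route mentioned in parentheses might
be sketched as follows. The data are made up and the variable
name cumn is just a placeholder. Note that this does create a
new variable, which the -summarize- route avoids.

```stata
* Hypothetical toy data
clear
input nitem
30
50
20
end

* sum() in -generate- gives a running total;
* its value in the last observation is the grand total
generate double cumn = sum(nitem)
display "total of nitem = " cumn[_N]
```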

What Allan is doing by adding an extra observation
is a one-way street. He makes it clear that in
his case what he is doing follows a -collapse-
and so is all a means to an end. But other readers
should note that adding an observation would almost
always need to be followed by one or more of

* deleting the extra observation afterwards

* excluding it from subsequent commands by using
-if- or -in-

* returning to the original dataset.
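
The three options above might be sketched like this, as it would
run in a do-file. The toy data are made up, and -preserve- and
-restore- are just one way of returning to the original dataset;
re-reading the saved file with -use- would be another.

```stata
* Hypothetical toy data standing in for a collapsed dataset
clear
input str10 species nitem
"oak"   30
"ash"   50
"beech" 20
end

* Returning to the original dataset: take a copy first
preserve

* Add the total row
set obs `= _N + 1'
summarize nitem, meanonly
replace nitem = r(sum) in l
replace species = "Total" in l
list species nitem

* Excluding the total row from a later command with -in-
summarize nitem in 1/`= _N - 1'

* Deleting the total row
drop in l

* Back to the data as they were at -preserve-
restore
```

In practice one of the three would usually be enough; they are
combined here only to show each in turn.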

Nick
n.j.cox@durham.ac.uk

Allan Reese wrote:

To construct a table of various derived statistics, one approach is to collapse the data. I then wanted an extra line for totals. Searching help found a function within Mata but otherwise suggested creating a temporary file with one case then appending it to the original. However, a direct approach worked in a DO file and seems worth documenting on this list:

local nn=_N+1
set obs `nn'

The other part of the task, for completeness, includes forming totals of numbers, weights, percentages:

egen pcbyn = pc(nitem)
egen totn = sum(nitem)
egen totnpc = sum(pcbyn) /* should be 100 */
* Copy column totals down to pseudo-observation (now 'last').
quietly {
local nn=_N+1
set obs `nn'
replace species = "Total ... " in l
replace nitem=totn[1] in l
replace pcbyn=totnpc[1] in l
}
list species nitem pcbyn

Searching the archive for advice on copying output to Word found a suggestion to use Ctrl/Shift/C which would copy a table with tabs. Didn't work for me. Still limited to copying text and formatting in Word as Courier/9pt to get fixed spacing.

