Statalist The Stata Listserver



st: first|string option for collapse


From   Caleb Southworth <caleb@uoregon.edu>
To   statalist@hsphsun2.harvard.edu
Subject   st: first|string option for collapse
Date   Thu, 5 Oct 2006 16:06:21 -0700 (PDT)

A problem that comes up occasionally when collapsing data on organizations
is that collapse cannot take the first instance of a variable in the set
being collapsed. This would be most useful when that variable is a string,
but I believe it is also of general use. Below is my workaround, but perhaps
someone has a better one?

Imagine data on organizations:

id   org_name   _1935   _1936
1    foo           10       .
1    foo_blah       .      12
2    noo           54      55

I have duplicate ids and I want to collapse the data, but I don't want to
lose the name. The non-existent command would be

-collapse (first) org_name (sum) _*, by(id)-

Instead, I do the following:

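* Keep a copy of the full data
save orig, replace
* Keep the first observation per id (in current sort order) and its name
duplicates drop id, force
keep id org_name
save temp_name, replace
* Sum the yearly variables within id, then attach the saved names
use orig, clear
collapse (sum) _*, by(id)
merge id using temp_name, sort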

(first) in collapse could also be used to privilege the data from one type
of case, e.g. the start date of an organization: -sort id start-, then
collapse, keeping the earliest start date.