From
"Austin Nichols" <austinnichols@gmail.com>

To
statalist@hsphsun2.harvard.edu

Subject
Re: st: how to generate sum of distinct id1, by id2, in the last n years

Date
Tue, 18 Sep 2007 11:06:42 -0400

Pierre Azoulay <pierre.azoulay@gmail.com>:
The language setting up the problem seems perversely unclear: "create a
variable that records the sum of distinct [id values] in the last 3
years" does not seem to be what you want at all, though a sum can help
you get there, if what you want is the number of distinct values of id
across years t, t-1, and t-2, saved in a new variable at t, like so:

clear
input star_id id year nbpapers
1 2 1972 1
1 2 1973 0
1 2 1974 2
1 2 1975 3
1 2 1976 0
1 2 1977 4
1 3 1970 1
1 3 1971 0
1 3 1972 0
1 3 1973 2
1 4 1978 2
1 4 1979 1
1 5 1977 4
1 5 1978 1
1 5 1979 0
1 5 1980 1
1 5 1981 1
end
* number the original dyad-year rows
g obs=_n
* three copies of each row: one for each window ending at year, year+1, year+2
expand 3
bys obs: gen n=_n
gen yr=year+n-1
* flag one row per id within each star/window-end-year cell
bys star yr id: g d=_n==1
* number of distinct ids per star and window-end year
egen ndistinct=sum(d), by(star yr)
* keep only the original rows (where yr==year), then one row per star/year
drop if n>1
collapse ndist, by(star year)
* add rows for any star/year combinations that are absent
fillin star y
li, noo clean

On 9/17/07, Pierre Azoulay <pierre.azoulay@gmail.com> wrote:
> Dear Statalisters,
>
> I have what I believe is a simple programming question that I can't
> quite solve.
> I have a panel of dyads, where each member of the dyad is a coauthor.
> Each dyad is composed of a "superstar" and a "simple joe/jane."
>
> For instance:
>
> star_id   id   year   nbpapers
> ---------------------------------------------------------
>    1       2   1972      1
>    1       2   1973      0
>    1       2   1974      2
>    1       2   1975      3
>    1       2   1976      0
>    1       2   1977      4
>    1       3   1970      1
>    1       3   1971      0
>    1       3   1972      0
>    1       3   1973      2
>    1       4   1978      2
>    1       4   1979      1
>    1       5   1977      4
>    1       5   1978      1
>    1       5   1979      0
>    1       5   1980      1
>    1       5   1981      1
>
> So superstar #1 has 4 "simple joe collaborators" numbered 2, 3, 4, and 5.
> In each year, the data records how many publications exist for
> superstar i and simple joe/jane j.
>
>
> I would like to collapse this data at the superstar/year level and
> create a variable that records the sum of distinct "simple joes" in
> the last 3 years.
> In other words, I'd like to create the variable stk_nbcoauth_it that is:
>
> star_id   year   stk_nbcoauth_it
> ---------------------------------
>    1      1970         1
>    1      1971         1
>    1      1972         2
>    1      1973         2
>    1      1974         2
>    1      1975         2
>    1      1976         1
>    1      1977         2
>    1      1978         3
>    1      1979         3
>    1      1980         2
>    1      1981         2
>
> I have fiddled with bysort star_id id (year), but without clear
> success. Could anyone help?
>
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
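
For anyone who wants to check the windowed count outside Stata, here is a minimal Python sketch of the same idea: for each star_id and each year t in which that star has a row, count the distinct id values that appear in years t-2 through t. It assumes, as the example table does, that a dyad-year row counts even when nbpapers is 0. The names rows, distinct_in_window, and width are made up for this illustration and appear nowhere in the thread.

```python
# Dyad-year rows from the example above: (star_id, id, year, nbpapers)
rows = [
    (1, 2, 1972, 1), (1, 2, 1973, 0), (1, 2, 1974, 2), (1, 2, 1975, 3),
    (1, 2, 1976, 0), (1, 2, 1977, 4),
    (1, 3, 1970, 1), (1, 3, 1971, 0), (1, 3, 1972, 0), (1, 3, 1973, 2),
    (1, 4, 1978, 2), (1, 4, 1979, 1),
    (1, 5, 1977, 4), (1, 5, 1978, 1), (1, 5, 1979, 0), (1, 5, 1980, 1),
    (1, 5, 1981, 1),
]


def distinct_in_window(rows, width=3):
    """Map (star_id, year) to the number of distinct id values seen in
    years year-width+1 through year, for each year the star has a row.
    Assumption: a row counts regardless of nbpapers (even 0)."""
    pairs_by_star = {}
    for star, coauthor, year, _nbpapers in rows:
        pairs_by_star.setdefault(star, set()).add((coauthor, year))

    counts = {}
    for star, pairs in pairs_by_star.items():
        for t in sorted({year for _, year in pairs}):
            in_window = {c for c, year in pairs if t - width + 1 <= year <= t}
            counts[(star, t)] = len(in_window)
    return counts


counts = distinct_in_window(rows)
for (star, year), n in sorted(counts.items()):
    print(star, year, n)
```

On the example data this should reproduce the stk_nbcoauth_it column in the quoted message. Unlike the expand approach, the sketch makes no attempt to fill in years in which a star has no rows at all.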

