The Stata listserver

st: RE: RE: RE: RE: Using -collapse- extensively to find historical, irregular matches: Better way?


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: RE: RE: Using -collapse- extensively to find historical, irregular matches: Better way?
Date   Tue, 30 Sep 2003 16:58:59 +0100

Chih-Mao Hsieh
> 	
> 	I had been shying away from converting "cited" to 
> strings because the numbers are in the millions, i.e. 
> strings would be length 7.  Many of the "citing" patents 
> have more than 35-40 "cited" patents, and so the 
> concatenation might surpass the string's length limit.
> 	
> 	Of course, the chances are not high that two patents 
> would match each other over the first 35 patents, so your 
> way does appear to be better.

Another way is to -reshape-, something 
like this: 

bysort citing (cited) : gen j = _n 
reshape wide cited, i(citing) j(j) 
bysort cited* (citing) : gen counter = _N - 1 

At this moment, I think that's a lot better
than my earlier suggestions. 
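
To see what the three lines do, here is a minimal sketch on toy data.
The patent numbers are made up for illustration, and it assumes one
observation per citing-cited pair (drop duplicates first if that
may not hold).

```stata
* Toy data with hypothetical patent numbers: one row per citing-cited pair
clear
input long citing long cited
1000001 5000001
1000001 5000002
1000002 5000002
1000002 5000001
1000003 5000001
1000004 5000003
end

* Number the cited patents within each citing patent, in sorted order,
* so that identical sets end up in identical column positions
bysort citing (cited) : gen j = _n

* One row per citing patent; cited1, cited2, ... hold its sorted citations
* (patents with fewer citations get missing values in the later variables)
reshape wide cited, i(citing) j(j)

* Rows with identical cited1, cited2, ... form one group;
* counter = number of OTHER citing patents with exactly the same set
bysort cited* (citing) : gen counter = _N - 1

list
```

Patents 1000001 and 1000002 cite the same two patents, so each should
get counter 1; the other two match nobody and should get counter 0.
The numeric variables stay numeric throughout, so the string length
limit never comes into it.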

Nick 
n.j.cox@durham.ac.uk 

