
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: create a variable based on a recurring value in a varlist
Date: Thu, 13 Oct 2005 18:42:52 +0100

My code solved your problem as stated! But I appreciate that missings
should be ignored. Try this. Here -1 will be returned in -same- if and
only if all PIDs are missing.

gen long id = _n
reshape long PID, i(id)
* within each id, sort by PID so that equal values are adjacent;
* missings sort last and are flagged -1
bysort id (PID) : gen same = cond(mi(PID), -1, PID == PID[_n-1])
* spread the largest flag within each id to all its observations
bysort id (same) : replace same = same[_N]
reshape wide

Nick
n.j.cox@durham.ac.uk

Derek Darves

> Thanks all for the comments.
>
> I had to rewrite Nick's suggestion (see original message below) to
> get this to work. In Nick's original formulation every case was
> rendered a "1". I think the problem is that some of the PID
> variables were missing for nearly every case. So, I added a little
> bit of code. I did some error checking and, for the cases that it did
> mark greater than 1, the data are correct. This does not mean, of
> course, that I did not miss cases. Since my goal is to find a
> repeated (non-missing) value in a varlist, will this code do the
> trick? In other words, does anyone see a way that the code below
> could have missed a repeated value in varlist? This is the code:
>
> *Start
> clear
> set mem 1000m
> use data, clear
> keep pid* index
> save safecopy, replace
> // Preparations for easy reshape
> local i 1
> foreach var of varlist pid* {
>     ren `var' pid`i++'
> }
>
> // Solution for Problem
> reshape long pid, i(index) j(var)
> by index (pid), sort: gen same = sum(pid==pid[_n-1]) if pid!=.
> replace same = 0 if same ==.
> gen same1=0
> bysort index (same) : replace same1 = same[_N]
> drop same
> reshape wide
>
> save shareddirector, replace
> *end
>
>
> On Oct 13, 2005, at 4:43 AM, Nick Cox wrote:
>
> > This is easier done long.
> >
> > save safecopy
> >
> > gen long id = _n
> > reshape long PID, i(id)
> > bysort id (PID) : gen same = PID == PID[_n-1]
> > bysort id (same) : replace same = same[_N]
> > reshape wide
> >
> > Nick
> > n.j.cox@durham.ac.uk
> >
> > Seb Buechte
> >
> >> you could take a "brute force" approach by comparing each var with
> >> all the other vars using two loops:
> >>
> >> gen interlock=0
> >> foreach var1 of varlist PID1 PID2 .... {
> >>     foreach var2 of varlist PID2 PID3.... {
> >>         if "`var1'"!="`var2'" { // making sure you do not compare the var with itself
> >>             replace interlock=1 if `var1' == `var2'
> >>         }
> >>     }
> >> }
> >>
> >> I am not too sure how long it will take to run through these loops.
> >
> > Derek Darves
> >
> >>> I have a group of variables:
> >>>
> >>> PID1 - PID15
> >>>
> >>> PID* takes on values from 1 to 8000, and many are missing.
> >>>
> >>> Basically, I would like to make a new variable, called interlock,
> >>> that is equal to 1 if any of the variables in the list are equal to
> >>> any other variable in the list (not including itself, of course).
> >>> For example, if PID5==705 and PID14==705 I would like
> >>> interlock==1
> >>>
> >>> Likewise, if none of the variables in PID* take on the value of
> >>> any of the other variables in PID*, I would like interlock==0
> >>>
> >>> I tried this:
> >>> egen interlock = group(pid1_a pid1_b pid2_a pid2_b pid3_a
> >>> pid3_b pid4_a pid4_b pid5_a pid5_b pid6_a pid6_b pid7_a
> >>> pid7_b pid8_a pid8_b pid9_a pid9_b pid10_a pid10_b pid11_a
> >>> pid11_b pid12_a pid12_b pid13_a pid13_b pid14_a
> >>> pid14_b pid15_a)
> >>>
> >>> , but it returned all missing values when I know that some share a
> >>> common value in two of the PID* fields.
> >>>
> >>> Lastly, not that it should matter, but the above is a simplifying
> >>> example. In my actual dataset I have about 130 PID*
> >>> variables. I just
> >>> mention this in case I am hitting some kind of memory limitation (I
> >>> am not receiving any errors when I run the command, though, it just
> >>> doesn't work).
> >>>

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
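For anyone wanting to try the long-format approach from this thread before running it on real data, here is a minimal sketch on made-up data. The four toy rows, the j() name -which-, and the helper variable -dup- are assumptions for illustration only, not part of anyone's dataset. It is also a variation on the code above: rows with all PIDs missing get 0 here rather than -1.

```stata
* Toy data (hypothetical): 3 PID variables, some missing
clear
input PID1 PID2 PID3
705  12 705
  3   .   9
  .   .   .
  4   .   4
end

gen long id = _n

* Go long: one observation per id and PID slot
* (-which- is an illustrative name for the j() variable)
reshape long PID, i(id) j(which)

* Sort PIDs within id so that equal values sit next to each other.
* Missings sort last and are excluded, so two missings never count as a match.
bysort id (PID) : gen byte dup = PID == PID[_n-1] & !missing(PID)

* Within id, the last value after sorting on dup is the maximum:
* 1 if any non-missing value is repeated, 0 otherwise
bysort id (dup) : gen byte interlock = dup[_N]
drop dup

* Back to one observation per id
reshape wide PID, i(id) j(which)
list
```

On these four rows -interlock- should come out as 1, 0, 0, 1: the first and last rows repeat a non-missing value, the second has no repeat, and the third is all missing. If the all-missing case needs to be distinguished, the cond(mi(PID), -1, ...) form earlier in this message does that.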
