[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

From |
"Nick Cox" <n.j.cox@durham.ac.uk> |

To |
<statalist@hsphsun2.harvard.edu> |

Subject |
RE: st: AW: Merging database |

Date |
Wed, 29 Apr 2009 18:03:38 +0100 |

I have various comments on this code. 1. -foreach x in V*- won't work. Eric is probably thinking of -foreach x of var V*- but in this case -foreach v in V1 V2 V3- takes no more thought. 2. Eric wants to -recode- missings and also concatenate the identifiers. If so, it is easier to go egen V_combined = concat(V1 V2 V3), p(_) replace V_combined = subinstr(V_combined, ".", "x", .) That does what Eric's lines foreach x in V* { recode `x' (.=99) // <-- So that -regexr- isn't tripped up later tostring `x', replace } gen str10 v_combined = V1+"_"+V2+"_"+V3 gen v_combined2 = regexr(v_combined, "99", "x") seem intended to do. Note that using 99, even temporarily, is dangerous unless one can be sure that 99 is not a legitimate identifier. In any case, why recode? A variable with values like "1 . ." is a satisfactory composite -- if that is what is needed. Nick n.j.cox@durham.ac.uk Eric A. Booth To add to Jochen's comment: If you were hoping to have a new 'ID' variable that keeps the information from all the ID variables V1, V2, and V3, you could create a string variable...here are some examples: ****************** clear input V1 V2 V3 1 . 1 2 . 2 3 3 3 4 4 . . 5 5 6 . 6 end // foreach x in V* { recode `x' (.=99) // <-- So that -regexr- isn't tripped up later tostring `x', replace } gen str10 v_combined = V1+"_"+V2+"_"+V3 gen v_combined2 = regexr(v_combined, "99", "x") sencode v_combined2, gene(uniqueID) gsort(+v_combined2) label(id) list On Apr 29, 2009, at 11:36 AM, Jochen Späth wrote: > Hello Sergio, > > I'm not quite sure of what your problem is, maybe it would help if > you were a little more precise. > > Below, I assumed that the example you gave is AFTER your three data > sets have been merged, with v1 coming from the first, v2 from the > second and v3 from the third and with v1, v2 and v3 denoting all the > same ID. If this is the case you could > > -replace v1 = v2 if v1 == . & v2 != .- > -replace v1 = v3 if v1 == . & v2 ==. & v3 != .- > -count if v1 == .- /* should return 0, otherwise there are > observations in your data that are not uniquely determined by either > of your three ID variables.*/ > -drop v2 v3- /* of course, only if you got all IDs caught in v1 */ > > HTH, > Jochen > > -----Ursprüngliche Nachricht----- > Von: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu > ] Im Auftrag von "SERGIO M. AFCHA CHÁVEZ" > Gesendet: Mittwoch, 29. April 2009 17:55 > An: statalist@hsphsun2.harvard.edu > Betreff: st: Merging database > > Dear Statlisters, > > I have a little problem merging a data base. I have variables for 3 > years showing an ID: > > > V1 V2 V3 > 1 . 1 > 2 . 2 > 3 3 3 > 4 4 . > . 5 5 > 6 . 6 > > > I need only one ID variable. How can I obtain one column with all the > ID numbers? > * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**Re: st: AW: Merging database***From:*"Eric A. Booth" <ebooth@ppri.tamu.edu>

**References**:**st: Merging database***From:*"SERGIO M. AFCHA CHÁVEZ" <s.afcha@ub.edu>

**st: AW: Merging database***From:*Jochen Späth <jochen.spaeth@iaw.edu>

**Re: st: AW: Merging database***From:*"Eric A. Booth" <ebooth@ppri.tamu.edu>

- Prev by Date:
**Re: st: AW: Merging database** - Next by Date:
**st: svy:tabulate: what to report - design-based test or the Uncorrected chi-square** - Previous by thread:
**Re: st: AW: Merging database** - Next by thread:
**Re: st: AW: Merging database** - Index(es):

© Copyright 1996–2014 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |