
From: "Eric A. Booth" <ebooth@ppri.tamu.edu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: AW: Merging database
Date: Wed, 29 Apr 2009 13:14:00 -0500

On Apr 29, 2009, at 12:03 PM, Nick Cox wrote:

I have various comments on this code.

1. -foreach x in V*- won't work. Eric is probably thinking of -foreach x of var V*- but in this case -foreach v in V1 V2 V3- takes no more thought.

2. Eric wants to -recode- missings and also concatenate the identifiers. If so, it is easier to go

    egen V_combined = concat(V1 V2 V3), p(_)
    replace V_combined = subinstr(V_combined, ".", "x", .)

...

Note that using 99, even temporarily, is dangerous unless one can be sure that 99 is not a legitimate identifier. In any case, why recode? A variable with values like "1 . ." is a satisfactory composite -- if that is what is needed.
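A minimal sketch of the two suggestions above, assuming the six-observation example data from the original question. The loop body (-count-) is only a placeholder to show the -foreach ... of varlist- syntax; it is not part of anyone's posted solution.

```stata
clear
input V1 V2 V3
1 . 1
2 . 2
3 3 3
4 4 .
. 5 5
6 . 6
end

* Point 1: loop over a variable list with -of varlist-, not -in V*-
foreach v of varlist V* {
    count if missing(`v')   // placeholder body: count missings per variable
}

* Point 2: concatenate the identifiers, then mark missings with "x"
egen V_combined = concat(V1 V2 V3), punct(_)
replace V_combined = subinstr(V_combined, ".", "x", .)
list
```

Because -egen, concat()- turns a numeric missing into ".", no temporary code such as 99 is needed, which avoids the clash with a legitimate identifier.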

Thanks, Nick...this is very helpful.

EAB
__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
ebooth@ppri.tamu.edu
Office: +979.845.6754
Fax: +979.845.0249

Eric A. Booth

To add to Jochen's comment: If you were hoping to have a new 'ID' variable that keeps the information from all the ID variables V1, V2, and V3, you could create a string variable...here are some examples:

******************
clear
input V1 V2 V3
1 . 1
2 . 2
3 3 3
4 4 .
. 5 5
6 . 6
end
//
foreach x in V* {
    recode `x' (.=99)   // <-- So that -regexr- isn't tripped up later
    tostring `x', replace
}
gen str10 v_combined = V1+"_"+V2+"_"+V3
gen v_combined2 = regexr(v_combined, "99", "x")
sencode v_combined2, gene(uniqueID) gsort(+v_combined2) label(id)
list

On Apr 29, 2009, at 11:36 AM, Jochen Späth wrote:

Hello Sergio,

I'm not quite sure what your problem is; maybe it would help if you were a little more precise. Below, I assume that the example you gave is AFTER your three data sets have been merged, with v1 coming from the first, v2 from the second and v3 from the third, and with v1, v2 and v3 all denoting the same ID. If this is the case you could

-replace v1 = v2 if v1 == . & v2 != .-
-replace v1 = v3 if v1 == . & v2 ==. & v3 != .-
-count if v1 == .- /* should return 0, otherwise there are observations in your data that are not uniquely determined by any of your three ID variables. */
-drop v2 v3- /* of course, only if you got all IDs caught in v1 */

HTH,
Jochen

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of "SERGIO M. AFCHA CHÁVEZ"
Sent: Wednesday, April 29, 2009 17:55
To: statalist@hsphsun2.harvard.edu
Subject: st: Merging database

Dear Statalisters,

I have a little problem merging a database. I have variables for 3 years showing an ID:

V1 V2 V3
1  .  1
2  .  2
3  3  3
4  4  .
.  5  5
6  .  6

I need only one ID variable. How can I obtain one column with all the ID numbers?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
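For the original question (one numeric ID column), here is a hedged sketch of Jochen's fill-in approach run on the example data. Two things are assumptions of this sketch and not from the thread: the variable names `id` and `id2` are illustrative, and the -egen, rowfirst()- line is a swapped-in alternative. The sketch writes to a new variable instead of overwriting V1, and uses the capitalized names V1-V3 because Stata variable names are case-sensitive.

```stata
clear
input V1 V2 V3
1 . 1
2 . 2
3 3 3
4 4 .
. 5 5
6 . 6
end

* Fill-in approach: start from V1, then take V2, then V3 where still missing
gen id = V1
replace id = V2 if missing(id) & !missing(V2)
replace id = V3 if missing(id) & !missing(V3)

* Every observation should now carry an ID
count if missing(id)

* Alternative (assumption of this sketch): first nonmissing value across the row
egen id2 = rowfirst(V1 V2 V3)

list V1 V2 V3 id id2
```

Both routes assume, as Jochen does, that V1, V2 and V3 agree whenever more than one of them is nonmissing; if they could disagree, that would need checking before collapsing them into one column.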


Follow-Ups:
- RE: st: AW: Merging database. From: "Nick Cox" <n.j.cox@durham.ac.uk>

References:
- st: Merging database. From: "SERGIO M. AFCHA CHÁVEZ" <s.afcha@ub.edu>
- st: AW: Merging database. From: Jochen Späth <jochen.spaeth@iaw.edu>
- Re: st: AW: Merging database. From: "Eric A. Booth" <ebooth@ppri.tamu.edu>
- RE: st: AW: Merging database. From: "Nick Cox" <n.j.cox@durham.ac.uk>

