Statalist The Stata Listserver


st: RE: Re: Updating master dataset with a transaction dataset

From   Alice W Muehlhof <muehlhof@Princeton.EDU>
To   <>
Subject   st: RE: Re: Updating master dataset with a transaction dataset
Date   Tue, 14 Mar 2006 10:05:36 -0500

Thank you so much. That worked. It's simple and efficient and reduces the
possibility of error. 

-----Original Message-----
[] On Behalf Of Michael Blasnik
Sent: Tuesday, March 14, 2006 9:40 AM
Subject: st: Re: Updating master dataset with a transaction dataset

There are a couple of ways you could do this.  My preference would probably
be to use reshape and merge:

use transaction
reshape wide var4, i(var1 var2) j(var3) string
sort var1 var2
save transactionwide
use master
sort var1 var2
merge var1 var2 using transactionwide, update
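
One detail worth checking when running these commands: -reshape wide- builds each new variable name by attaching the value of var3 to the stub, so a transaction row whose var3 is "income" (a hypothetical name) produces var4income rather than income. For -merge, update- to fill in the master variables the names have to match, so a rename step is probably needed before saving. A minimal sketch, assuming every value of var3 is the name of an existing master variable, that none of them is var1 or var2, and that var4 has the same type (numeric or string) as the variables it feeds:

```stata
* Sketch only; dataset and variable names follow the thread.
use transaction, clear
reshape wide var4, i(var1 var2) j(var3) string
* Strip the var4 prefix so the names match the master,
* e.g. var4income becomes income (hypothetical name).
renpfix var4
sort var1 var2
save transactionwide, replace
```

The -clear- and -replace- options are only there so the sketch can be rerun; the merge step itself is unchanged.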

This approach will update the values in the master dataset with those from
the transaction dataset only where they are currently missing.  If you want
to replace existing values, you would need to systematically rename the
variables in transactionwide (e.g., you could -replace var3=var3+"X"- before
the reshape), then perform the merge, giving you two versions of each
variable, and then use a replace command for each variable, probably best
done using a -foreach- loop.
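
The overwrite variant might be sketched as follows. This is one possible reading of the suggestion, not a tested recipe. It relies on -reshape wide- naming the new variables var4<name> (e.g. var4income for a hypothetical master variable income), so the var4 prefix already marks the transaction version and no "X" suffix is added. Assumptions: every value of var3 names an existing master variable, no master variable name begins with var4, and var4 has the same type (numeric or string) as the variables it updates.

```stata
* Sketch only; dataset and variable names follow the thread.
use transaction, clear
reshape wide var4, i(var1 var2) j(var3) string
sort var1 var2
save transactionwide, replace

use master, clear
sort var1 var2
merge var1 var2 using transactionwide
* Drop transaction rows that matched no master record
drop if _merge == 2
drop _merge

* Overwrite each master variable wherever a transaction value is present
foreach v of varlist var4* {
    * Master variable name = transaction name without the var4 prefix
    local target = substr("`v'", 5, .)
    * Stop with an error if var3 named a variable the master lacks
    confirm variable `target'
    replace `target' = `v' if !missing(`v')
    drop `v'
}
```

The -confirm variable- line makes a misspelled name in var3 fail loudly instead of being skipped. If memory serves, -merge- also accepts -update replace-, which overwrites nonmissing master values directly once the names match; that would make the loop unnecessary, but check -help merge- for your Stata version before relying on it.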

Michael Blasnik

----- Original Message -----
From: "Alice W Muehlhof" <muehlhof@Princeton.EDU>
To: <>
Sent: Tuesday, March 14, 2006 7:57 AM
Subject: st: Updating master dataset with a transaction dataset

> Hi,
> I am relatively new to Stata, although I have programming experience in 
> and C.
> This is what I would like to do, but I cannot figure out how to do it in
> Stata:
>    1. My master dataset has hundreds of variables, two of which, var1 and
>       var2, I combine to use as a unique identifier of each record.
>    2. My second dataset, the transaction dataset, has 4 variables: two of
>       them are the same two as in the master dataset, var1 and var2. They
>       are not unique identifiers on this dataset.
>        a. The third variable in the transaction dataset (var3) contains
>           the name of a variable in the master dataset which is to be
>           updated. This variable name is different from record to record
>           on the transaction dataset.
>        b. The fourth variable, var4, contains the data that is to update
>           the variable named in var3 on the master dataset.
> Ex: I would match a record from the transaction dataset with a master
> dataset record on var1 and var2. Then I would look at the contents of
> var3 on the transaction dataset, which would tell me the name of the
> variable that needed to be updated on the master dataset. Var4 on the
> transaction dataset would tell me what the contents of this updated
> variable are to be.
> It is possible to have several records in the transaction dataset all
> with the same unique identifier, instructing the system to update
> different variables on the same record of the master dataset.
> Now I know that I can just create a series of "replace varname with the
> contents of var4" statements, copy and paste them into a do-file, and
> run it that way. But that is not very efficient, and I need to do this
> over and over again, so if there is a way I could read the transaction
> dataset, then retrieve the appropriate record from the master dataset
> and update the specified variable with its contents from var4, that
> would be better.
> Is there any way I can do this in Stata?
> Thank you so much for your help.
> Alice
> Alice Muehlhof
> Research Assistant
> Woodrow Wilson School
> Princeton University

*   For searches and help try:


© Copyright 1996–2021 StataCorp LLC