Statalist The Stata Listserver



st: Re: Updating master dataset with a transation dataset


From   "Michael Blasnik" <michael.blasnik@verizon.net>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Re: Updating master dataset with a transation dataset
Date   Tue, 14 Mar 2006 09:40:16 -0500

There are a couple of ways you could do this. My preference would probably be to use reshape and merge:

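* reshape the transactions so that each name in var3 becomes its own variable
use transaction
reshape wide var4, i(var1 var2) j(var3) string
* reshape names the new variables var4<name>; strip the var4 prefix
* so that they match the variable names in the master dataset
renpfix var4
sort var1 var2
save transactionwide
* merge onto the master; -update- fills in only values missing in the master
use master
sort var1 var2
merge var1 var2 using transactionwide, update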

This approach updates values in the master dataset from the transaction dataset only where the master value is currently missing. If you want to replace existing values, you would need to systematically rename the variables in transactionwide (e.g., you could -replace var3=var3+"X"- before the reshape), then perform the merge, giving you two versions of each variable, and then use a -replace- command for each variable, probably best done in a -foreach- loop.
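
The replace-existing-values variant described above could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested solution: the "X" suffix, the local macro names newvars and orig, and the decision to drop unmatched transactions are all choices made here for the example. It also assumes that var4 has the same storage kind (string or numeric) as every master variable it updates, that every name in var3 exists in the master dataset, and that the names stay within Stata's 32-character limit once the prefix and suffix are attached.

```stata
* Sketch only; dataset and variable names follow the thread.
use transaction, clear
replace var3 = var3 + "X"              // tag incoming names, e.g. income -> incomeX
reshape wide var4, i(var1 var2) j(var3) string
renpfix var4                           // var4incomeX -> incomeX
ds var1 var2, not                      // list every variable except the identifiers
local newvars `r(varlist)'             // remember the suffixed names for the loop below
sort var1 var2
save transactionwide, replace

use master, clear
sort var1 var2
merge var1 var2 using transactionwide
drop if _merge == 2                    // assumption: ignore transactions with no master record

foreach v of local newvars {
    * recover the master variable name by removing the trailing "X"
    local orig = substr("`v'", 1, length("`v'") - 1)
    replace `orig' = `v' if !missing(`v')
    drop `v'
}
drop _merge
```

If overwriting is all that is needed, -merge- also accepts -replace- together with -update-, which lets nonmissing values from the using dataset overwrite nonmissing master values; that may be simpler than the loop when it fits the data.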

Michael Blasnik
michael.blasnik@verizon.net

----- Original Message -----
From: "Alice W Muehlhof" <muehlhof@Princeton.EDU>
To: <statalist@hsphsun2.harvard.edu>
Sent: Tuesday, March 14, 2006 7:57 AM
Subject: st: Updating master dataset with a transation dataset



Hi,
I am relatively new to Stata, although I have programming experience in SAS and C.
This is what I would like to do, but I cannot figure out how to do it in Stata:

1. My master dataset has hundreds of variables, two of which, var1 and var2, I combine to use as a unique identifier of each record.

2. My second dataset, the transaction dataset, has 4 variables: two of them are the same two as in the master dataset, var1 and var2. They are not unique identifiers on this dataset.
   a. The third variable in the transaction dataset (var3) contains the name of a variable in the master dataset which is to be updated. This variable name is different from record to record on the transaction dataset.
   b. The fourth variable, var4, contains the data that is to update the variable named in var3 on the master dataset.


Ex: I would match a record from the transaction dataset with a master dataset record on var1 and var2. Then I would look at the contents of var3 on the transaction dataset, which would tell me the name of the variable that needed to be updated on the master dataset. Var4 on the transaction dataset would tell me what the contents of this updated variable are to be.

It is possible to have several records in the transaction dataset all with the same unique identifier, instructing the system to update different variables on the same record of the master dataset.

Now I know that I can just create a series of "replace varname with the contents of var4" statements, copy and paste them into a do-file, and run it that way.

But that is not very efficient, and I need to do this over and over again, so if there is a way I could read the transaction dataset, then retrieve the appropriate record from the master dataset and update the specified variable with its contents from var4, that would be better.

Is there any way I can do this in Stata?

Thank you so much for your help.

Alice

Alice Muehlhof
Research Assistant
Woodrow Wilson School
Princeton University