Stata The Stata listserver
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

Re: st: Transfer of value labels: SQL data to Stata


From   Joseph Coveney <[email protected]>
To   Statalist <[email protected]>
Subject   Re: st: Transfer of value labels: SQL data to Stata
Date   Wed, 26 Jan 2005 08:03:44 +0900

Benjamin A Lopman wrote:

I've got a question about getting data from our database into Stata for
analysis.  Perhaps one of you has gone through a similar process.

Transferring the data itself is of course no problem through an ODBC
connection.  The problem is handling the value labels.  (The variable
labels are also tricky but not such a major issue).  The problem is
mainly with the categorical variables and is as follows:

In Access (or any other database, as far as I am aware) the labels are
stored in a separate table.  For example, 1 and 2 are stored in a data
table and the  corresponding Male and Female labels are stored in a
second "lookup" table.  That is not how Stata or SPSS stores labels, so
when the data is brought back into one of those packages, the value
labels are lost.  I want to retain the numerical data and retain the
labels.  I am concerned about this because the dataset is so large  that
I don't want to have difficult to interpret  numerical data when the
data is shared with other analysts.

Stat Transfer does not seem to handle value labels converting to/from
database files.  

Has anyone overcome this problem?  Is there a script to properly carry
the value labels into the stats package?

--------------------------------------------------------------------------------

With Microsoft Access you can handle value labels in two ways.  One is via a so-
called lookup wizard that creates a lookup list at the table level.  The other 
way  is the conventional way that relationial database management systems use, 
which via use of a lookup table whose primary key is the foreign key for the 
column in the table referencing it for value labels.

I'm not sure how to get the value labels if your particular Access database 
uses the first method.  The use of Microsoft's lookup wizard to create a lookup 
list at the table level is highly discouraged, because of the all problems that 
it creates down the road.  (See numerous postings on the topic on 
comp.databases.ms-access .)

If your particular database uses the conventional method, the one that you 
describe in your post, then Phil Schumm is right in that the value labels need 
to be retrieved with script that specifically uses the column names in each 
case, that is, there is no practical universal script that automatically knows 
that table A is a lookup table and table B is the table referencing it for 
value labels.

However, the SQL statement to get the value labels is a simple join, which you 
can execute from within Stata's -odbc load-, if you wanted to get the values at 
once.  Then it's a simple -encode- to generate value labels.

If you want to reconstruct the value labels using the same numerical 
correspondence that the database uses, then you will need two trips to the 
database:  first to get the lookup table with its values and labels (which you 
can save from Stata for use to manually reconstruct the value labels later) and 
then to get the table referencing the lookup table (the data table).  Then you 
can construct the value labels within Stata to exactly match the number list 
used in the database.  Note that will be worthwhile only if the database uses 
numbers and not string (characters) as the primary key-foreign key columns.

Joseph Coveney


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2024 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index