



Re: st: managing changing variable names, types over multiple files


From   Eric Booth <ebooth@ppri.tamu.edu>
To   "<statalist@hsphsun2.harvard.edu>" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: managing changing variable names, types over multiple files
Date   Fri, 10 Jun 2011 19:26:48 +0000

Paul:
-descsave- from SSC would be useful for storing the variable names and other attributes (see the help file).  
As far as looping over files, take a look at -help extended_fcn-.
Here's an example of what I think you're describing:
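*********!
sysuse auto, clear
* create nine identical example files: testdata2001.dta ... testdata2009.dta
forval n = 1/9 {
	sa testdata200`n', replace
}
clear
sa masterlist.dta, emptyok replace
* collect the example file names from the working directory
global files: dir "`c(pwd)'" files "testdata*.dta", nofail respectcase
foreach f of global files {
	u `f', clear
	descsave, sa(autodesc.dta, replace)  //could save a do() here too
	* descsave leaves the original data in memory, so load the description file
	u autodesc.dta, clear
	g filename = "`f'"
	order filename
	sa desc_`f', replace  // `f' already ends in .dta
	u masterlist.dta, clear
	append using desc_`f'
	sa masterlist.dta, replace
}
u masterlist.dta, clear
desc
*********!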
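If installing -descsave- is not an option, a similar catalog can be sketched with official Stata only. This is a minimal sketch, not a tested solution: it assumes the example files testdata2001.dta through testdata2009.dta created above, relies on -describe, replace- (which replaces the data in memory with one row per variable, including the variables name and type), and uses the illustrative names catalog, typechange, and year, which are my own choices. It then reshapes the catalog to one row per variable name with one type column per year, roughly the layout described in the original question.

```stata
* assumption: files are named testdata2001.dta ... testdata2009.dta
tempfile catalog
local first = 1
local files : dir "`c(pwd)'" files "testdata*.dta", respectcase
foreach f of local files {
	use "`f'", clear
	describe, replace clear            // now one row per variable
	generate filename = "`f'"
	if !`first' append using "`catalog'"
	save "`catalog'", replace
	local first = 0
}

use "`catalog'", clear
* flag variable names that carry more than one storage type across files
bysort name (type): generate byte typechange = type[1] != type[_N]
list filename name type if typechange, sepby(name)

* one row per variable name, one type column per year (type2001, type2002, ...)
keep name type filename
generate year = real(substr(filename, 9, 4))   // the year starts at character 9
drop filename
reshape wide type, i(name) j(year)
list, clean
```

With the identical example files nothing is flagged by typechange. With real yearly files, a missing value in one of the type columns would indicate a variable name that does not appear in that year's file. The block should be run as a single do-file so that the temporary file and local macros stay in scope.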
- Eric
__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
ebooth@ppri.tamu.edu


On Jun 10, 2011, at 1:54 PM, Paul Burkander wrote:

> Hi all,
> 
> I'm working with data that cover several years, with a separate file
> for each year.  Unfortunately, the names and types of variables
> sometimes change from year to year, making it difficult to append all
> the files.  There are a large number of variables, so it's difficult
> to check for changes by hand.  Also, we'll be getting more years in
> the future, so I'd like to, as much as possible, automate a system
> that catalogs variable names and types.
> 
> I'm envisioning a system where we have a macro with the names of all
> the files, then loop over each file, capture all the variable names
> and types, and dump it into a master variable attributes file.  I'm
> imagining a different variable for each row/attribute, so there'd be a
> 2007varname and a 2008vartype, for instance.  There would also be a
> mastervarname for what we want the variable to be.  Each row would
> correspond to the variable whose name may or may not change over time.
> 
> Does this seem like a reasonable way to automate this?  Do any of you
> have any other ideas?  Are there any user-written programs that can
> aid in this process?
> 
> I'd greatly appreciate any suggestions!
> 
> Paul
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/






