



Re: st: managing changing variable names, types over multiple files


From   Paul Burkander <paul@burkander.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: managing changing variable names, types over multiple files
Date   Fri, 10 Jun 2011 16:39:11 -0400

This seems to have worked very well; thank you!

I only kept the name and type, renamed "type" so it reflected the file
name, then merged all the variables.  Your general structure was very
useful.
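
A minimal sketch of that keep/rename/merge step, in case it helps anyone reading the archive. It assumes the yearly files match "testdata*.dta" as in Eric's example, uses official -describe, replace- in place of -descsave- so nothing needs installing from SSC, and the names "catalog" and "type_<filename>" are only illustrative:

```stata
* Build one row per variable name, with one type column per yearly file.
* Assumption: the files sit in the working directory and match testdata*.dta.
local files: dir "`c(pwd)'" files "testdata*.dta", respectcase

tempfile catalog
local first 1
foreach f of local files {
    * strip the extension so it can be used in a variable name
    local stub = subinstr("`f'", ".dta", "", 1)

    use "`f'", clear
    describe, replace clear      // memory now holds the description dataset
    keep name type
    rename type type_`stub'      // e.g. type_testdata2001

    if `first' {
        save `catalog'
        local first 0
    }
    else {
        * unmatched rows are kept, so a variable missing from one file
        * simply has an empty type for that file
        merge 1:1 name using `catalog', nogenerate
        save `catalog', replace
    }
}

use `catalog', clear
sort name
list, clean
```

Rows with an empty type_* column, or with different types across columns, are the variables whose name or type changed between files.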

Paul

On Fri, Jun 10, 2011 at 3:26 PM, Eric Booth <ebooth@ppri.tamu.edu> wrote:
> Paul:
> -descsave- from SSC would be useful for storing the variable names and other attributes (see the help file).
> As far as looping over files, take a look at -help extended_fcn-.
> Here's an example of what I think you're describing:
> *********!
> sysuse auto, clear
> forval n = 1/9 {
>        sa testdata200`n', replace
>        }
> clear
> sa masterlist.dta, emptyok replace
> global files: dir "`c(pwd)'" files "testdata*.dta", nofail respectcase
> foreach f of global files {
>        u `f', clear
>        // norestore leaves the descsave output (one row per variable) in memory
>        descsave, norestore sa(autodesc.dta, replace)  //could save a do() here too
>        g filename = "`f'"
>        order filename
>        sa desc_`f'.dta, replace
>        u masterlist.dta, clear
>        append using desc_`f'.dta
>        sa masterlist.dta, replace
>        }
> u masterlist.dta, clear
> desc
> *********!
> - Eric
> __
> Eric A. Booth
> Public Policy Research Institute
> Texas A&M University
> ebooth@ppri.tamu.edu
>
>
> On Jun 10, 2011, at 1:54 PM, Paul Burkander wrote:
>
>> Hi all,
>>
>> I'm working with data that cover several years, with a separate file
>> for each year.  Unfortunately, the names and types of variables
>> sometimes change from year to year, making it difficult to append all
>> the files.  There are a large number of variables, so it's difficult
>> to check for changes by hand.  Also, we'll be getting more years in
>> the future, so I'd like to, as much as possible, automate a system
>> that catalogs variable names and types.
>>
>> I'm envisioning a system where we have a macro with the names of all
>> the files, then loop over each file, capture all the variable names
>> and types, and dump it into a master variable attributes file.  I'm
>> imagining a different variable for each row/attribute, so there'd be a
>> 2007varname and a 2008vartype, for instance.  There would also be a
>> mastervarname for what we want the variable to be.  Each row would
>> correspond to the variable whose name may or may not change over time.
>>
>> Does this seem like a reasonable way to automate this?  Do any of you
>> have any other ideas?  Are there any user-written programs that can
>> aid in this process?
>>
>> I'd greatly appreciate any suggestions!
>>
>> Paul
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/


