


Re: st: Singeling out datasets containing variable X in folder with many stata files

From   Eric Booth <>
To   "<>" <>
Subject   Re: st: Singeling out datasets containing variable X in folder with many stata files
Date   Tue, 14 Jun 2011 15:17:43 +0000


On Jun 14, 2011, at 6:14 AM, Lukas Maximilian Rudolph wrote:

> Dear Statalisters,
> I have a folder with many Stata files that I am about to merge. Some of these contain information at the household level, some at the individual level. Of these, some are in wide form and some are in long form.
> I now want to identify all file names that contain a certain variable: 
> I want to separate all files with the variable "pidlink", the individual identifier.
> Within these, I want to identify all files that contain a variable ending with "*type" as just these are in long form. 
> Then I would be able to construct one loop that reshapes all datasets in long form and then another loop that merges all files on individual and household level automatically without going through every single file. 

There are some tips on using a global macro extended function and -descsave- (from SSC) to do something similar in a thread from a few days ago:

Also see:

These threads should help point you to ways to loop over files in a directory and identify variable names/types.  You'll notice that several of these examples include -append-, -merge- and/or -reshape- in the course of looping over the selected files in a directory (or sub-directory).
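
As a rough illustration of that kind of loop, here is a minimal sketch. It is not taken from the threads above. It assumes all the .dta files sit in the current working directory and uses -describe using- with the varlist option rather than -descsave-; the macro names (idfiles, longfiles) are made up for the example.

```stata
* Sketch only: assumes every .dta file is in the current working directory
local files : dir "." files "*.dta"
local idfiles      // will hold files that contain pidlink
local longfiles    // will hold pidlink files with a variable ending in "type"

foreach f of local files {
    * read the variable names without loading the data
    quietly describe using "`f'", varlist
    local vars `r(varlist)'

    * posof returns 0 when pidlink is not among the variable names
    local hasid : list posof "pidlink" in vars
    if `hasid' > 0 {
        local idfiles `"`idfiles' "`f'""'
        foreach v of local vars {
            if strmatch("`v'", "*type") {
                local longfiles `"`longfiles' "`f'""'
                continue, break
            }
        }
    }
}

display `"Files with pidlink: `idfiles'"'
display `"Long-form files:    `longfiles'"'
```

The two macros can then feed separate -reshape- and -merge- loops with -foreach f of local longfiles- and so on.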

> My thought would have been to try to save the files in different folders conditional on whether the respective variable is contained - but save is not combinable with "if".
> Is there another way to sort these files out?

I don't think it's necessary to move these to different sub-folders conditional on their variable contents -- I would just change the dataset name to flag it.  So, change dataset1.dta to dataset1_hasmyvar.dta so that later you could still select all "dataset1_*" files with one macro and all "*_hasmyvar" files with another, without having to tell the code to navigate over sub-folders.  However, that is just my preference; it's possible to work with the files either way.
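
A minimal sketch of the flagging idea, under the same assumption that the files are in the current working directory. The "_hasmyvar" suffix is just the placeholder from the example above, and the sketch copies each file under the new name instead of renaming it, so the originals stay in place until you choose to -erase- them.

```stata
* Sketch only: copy each dataset that contains pidlink to a flagged name
local files : dir "." files "*.dta"

foreach f of local files {
    * skip files that were already flagged on an earlier run
    if strmatch("`f'", "*_hasmyvar.dta") {
        continue
    }

    quietly describe using "`f'", varlist
    local vars `r(varlist)'
    local hasid : list posof "pidlink" in vars

    if `hasid' > 0 {
        * strip the ".dta" extension (4 characters) to get the stub
        local stub = substr("`f'", 1, length("`f'") - 4)
        copy "`f'" "`stub'_hasmyvar.dta", replace
        * erase "`f'"    // uncomment to drop the unflagged original
    }
}

* later: pick up all flagged files with one macro
local flagged : dir "." files "*_hasmyvar.dta"
display `"`flagged'"'
```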

- Eric

Eric A. Booth
Public Policy Research Institute
Texas A&M University
