
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Append multiple files from .txt file with "file read"


From   Matt Vivier <[email protected]>
To   [email protected]
Subject   Re: st: Append multiple files from .txt file with "file read"
Date   Thu, 5 Dec 2013 20:12:57 -0500

Hi Nicole,

If -merge- is what you're trying to do, then you were on the right track
with your initial attempt to use loops. This is something I find myself
doing more often than I'd like, typically with a structure like this:

drop _all
local filelist : dir . files "*.dta"
foreach file of local filelist {
    if _N == 0 {
        use "`file'"
    }
    else {
        merge 1:1 ID using "`file'"
        drop _merge
    }
}

Three things to look out for:
1. Make sure you -drop- _merge each time; otherwise the next -merge- stops
with an error because _merge already exists. I'm guilty of forgetting this a
little too often.
2. After 25 of these, your screen will become a mess. Once you're
comfortable with it working correctly you might think about using -qui- to
suppress the output, and maybe just show a count of rows that didn't match.
3. If you have variables with the same name (but different values) in the
datasets you may find yourself with some unexpected results: by default
-merge- keeps the values already in memory and ignores the same-named
variables in the using file. You would want to go through and rename the
variables in each file if they matter to your end result.
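
A minimal sketch of points 1 and 2, not a definitive version. It assumes every .dta file in the current directory has a variable named ID that uniquely identifies rows, and that no previously saved master file sits in the same folder. The local macro `first' is just an illustrative name:

```stata
* Sketch only: assumes ID uniquely identifies rows in every file
clear
local filelist : dir . files "*.dta"
local first = 1
foreach file of local filelist {
    if `first' {
        * first file becomes the master dataset
        use "`file'", clear
        local first = 0
    }
    else {
        * suppress the merge table, then report unmatched rows only
        quietly merge 1:1 ID using "`file'"
        quietly count if _merge != 3
        display as text "`file': " as result r(N) as text " rows did not match"
        * drop _merge so the next -merge- can create it again
        drop _merge
    }
}
```

If you don't need the match counts at all, -merge- also takes a -nogenerate- option, which skips creating _merge in the first place.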

Best,
Matt Vivier
Data Analyst
(203) 541-4665
Remedy Partners, Inc


On Thu, Dec 5, 2013 at 7:49 PM, David Kantor <[email protected]> wrote:
> Hello Nicole,
>
> You may want to display `line' to see what you are getting.
> Put in...
>         disp "`line'"
> just before
>         use `line'
>
> How many words does it comprise?
> You could be failing because there is nothing there, or because there are
> multiple words.
> If there are multiple words, and the file name is all of `line' (there are
> embedded spaces), then you need quotation marks:
>         use "`line'"
>
> If there are embedded quotation marks, then use compound quotation marks
>         use `"`line'"'
> -- and that is the safest way, in general.
>
> But if only the first word is the desired filename, then you need to select
> that:
>         use "`=word("`line'",1)'"
>
> (Compound quotes may be safer:
>         use `"`=word(`"`line'"',1)'"'
> )
>
> Possibly this is an important consideration: you construct the file using
> -! ls-. Does that write information other than the names?
> (You are presumably on Unix; I don't recall exactly what you get from -ls-.)
>
>
> If there are blank lines in the file, you may want a filter to skip them:
>
>         file open myfile using "filelist.txt", read
>         file read myfile line
>         while ~r(eof) &  `"`line'"' == "" {
>                 file read myfile line
>         }
>         if `"`line'"' ~= "" {
>                 disp `"`line'"'
>                 use  `"`line'"'
>                 file read myfile line
>         }
>         while ~r(eof) {
>                 append using `"`line'"'
>                 file read myfile line
>         }
>         file close myfile
>
> I might write it a bit differently; this may be simpler:
>
>         local jj = 0
>
>         file open myfile using "filelist.txt", read
>         file read myfile line
>         while ~r(eof) {
>                 if `"`line'"' ~= "" {
>                         disp `"`line'"'
>                         if ~`jj++'  {
>                                 use  `"`line'"'
>                         }
>                         else {
>                                 append using `"`line'"'
>                         }
>                 }
>                 file read myfile line
>         }
>         file close myfile
>
> That is, the -use- or -append- both appear inside the loop; -use- occurs on
> the first pass, -append- on all subsequent passes.
>
> Again, pay attention to what is in `line'; you may want only part of it. The
> code above presumes you want all of `line' as the filename; you will need to
> modify it if you need only part.
>
> As for why your test loop displays the second but not the first line, I
> cannot say. (I've heard of failing to get the final line, but you don't seem
> to have that problem.)
>
> Note that your first -save master_data- is unnecessary.
> HTH
> --David
>
>
>
> At 06:30 PM 12/5/2013, you wrote:
>>
>> Hello all,
>>
>> First and foremost, I have yet to fully understand how to use macros,
>> so please forgive me if the solution to this problem is painfully
>> obvious. I actually hope it's painfully obvious.
>>
>> I'm trying to combine multiple .dta files (1:1 horizontally appended)
>> by calling several .dta filenames stored in a .txt file. However, in
>> the process of doing this, whenever I try to run:
>>
>> .    use `line'
>>
>> Stata returns the error:
>>
>> .    invalid file specification
>>
>>
>> Here's the code I'm trying to execute (sourced from here*). To start,
>> I'm trying to execute this code on a .txt file containing just two
>> lines (aka: two .dta filenames), but the final file will have 25
>> lines:
>>
>>    pwd
>>    cd ~/Desktop/merge
>>    ! ls *.dta >filelist.txt
>>    file open myfile using "filelist.txt", read
>>    file read myfile line
>>    use `line'  /* ERROR HERE */
>>    save master_data, replace
>>    file read myfile line
>>    while r(eof)==0 {
>>    append using `line'
>>    file read myfile line
>>    }
>>    file close myfile
>>    save master_data, replace
>>
>>
>> I first thought the problem was that "filelist.txt" wasn't being read.
>> However, I believe it IS being read, since running the following:
>>
>>    ! ls *.dta >filelist.txt
>>    file open myfile using "filelist.txt", read
>>    file read myfile line
>>    while r(eof)==0 {
>>    display "`=word("`line'",1)'"
>>     file read myfile line
>>     }
>>
>> only displays the second (but not the first) line of the two-line .txt
>> file.
>>
>> Perhaps my issue has something to do with Stata overlooking the first
>> line of the .txt file? Or perhaps my general macro-incompetence (more
>> likely)?
>>
>> Any help will be greatly appreciated. Thanks so much for your
>> consideration.
>>
>> Nicole
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/


