
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Append multiple files from .txt file with "file read"


From   Nicole Boyle <nicboyle@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Append multiple files from .txt file with "file read"
Date   Mon, 9 Dec 2013 12:06:53 -0800

Thanks so much for the thorough explanation, Sergiy! That actually
helps tremendously. Many thanks.

Best,
Nicole

On Fri, Dec 6, 2013 at 3:18 PM, Sergiy Radyakin <serjradyakin@gmail.com> wrote:
> Nicole,
>
> the 'enigma' you mentioned is really very simple. Locals are only
> visible within the same 'context' - e.g. within the same program or
> do-file. What happens is that when you execute your example from the
> do-editor line-by-line, Stata creates temporary do-files each time,
> and hence a new context. You can probably see these files' names in
> the output window:
> . do "C:\Users\nboyle\AppData\Local\Temp\STD1c000000.tmp"
> (or something similar)
> Hence the second line can't see the results of the first line. When
> you execute two lines together, they are in the same context, and
> hence the result of the first line is available to the second.
>
> Hope this helps,
> Best Sergiy Radyakin
>
> PS: (there is however an undocumented command c_local that allows one
> to jump across this boundary,
> http://www.stata.com/statalist/archive/2003-12/msg00385.html
> use of this command is discouraged)
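
A minimal sketch of the scoping behaviour described above, assuming the lines are executed from the Do-file Editor. The macro name fname and the file name mydata.dta are hypothetical. Run together as one block, the local is visible to the -display- line; a global macro is one way to carry a value across separately executed chunks.

```stata
* Executed together (one do-file or one selected block), this works:
local fname "mydata.dta"        // hypothetical file name
display `"`fname'"'

* If -display- were executed on its own from the Do-file Editor,
* `fname' would be empty there. A global survives across chunks:
global fname "mydata.dta"
display `"$fname"'
macro drop fname                // drops the global once it is no longer needed
```

Globals stay defined for the rest of the session unless dropped, so for testing it is usually simpler to select and run the whole block at once.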
>
> On Fri, Dec 6, 2013 at 5:47 PM, Nicole Boyle <nicboyle@gmail.com> wrote:
>> Thanks to David Radwin, Sergiy Radyakin, David Kantor, and Matt Vivier
>> for your very helpful replies!
>> I think I've identified one area where I went wrong. When I was
>> initially attempting to run my original code yesterday, I was trying
>> to run the first few lines "line-by-line" (since I'm not yet confident
>> in programming, I wanted to make sure what I wanted to happen was
>> _actually_ happening). However, it seems that the error that I
>> originally noted below:
>> ...
>>     ! ls *.dta >filelist.txt
>>     file open myfile using "filelist.txt", read
>>     file read myfile line
>>     use `line'  /* ERROR HERE */
>> ...
>>
>> didn't occur today when executing the code as a single block.
>>
>> The lesson I'd LIKE to take away from this is that local macros can only
>> be used within the same block of code in which they're created.
>> However, I'm not sure this is truly the case, since something as simple
>> as this:
>>
>>     local x="whatever"
>>     display "`x'"
>>
>> CAN, in fact, be run successfully line-by-line.
>>
>>
>> Apart from this enigma, I played around with the codes each of you
>> kindly posted and it was extremely helpful. It seems that there are
>> multiple ways of accomplishing the same goal, which is great to know.
>> I ended up using David Kantor's code and replaced -append- with
>> -merge- along with options -nogenerate- and -update-.
>>
>> ...
>>     ! ls *.dta >filelist.txt
>>     local jj = 0
>>         file open myfile using "filelist.txt", read
>>         file read myfile line
>>         while ~r(eof) {
>>                 if `"`line'"' ~= "" {
>>                         disp `"`line'"'
>>                         if ~`jj++'  {
>>                                 use  `"`line'"'
>>                         }
>>                         else {
>>                                 merge 1:1 id using `"`line'"', nogenerate update
>>                         }
>>                 }
>>                  file read myfile line
>>         }
>>
>>
>> Thanks so much for your help and patience!
>>
>> Best,
>> Nicole
>>
>> On Thu, Dec 5, 2013 at 5:12 PM, Matt Vivier <mvivier@remedypartners.com> wrote:
>>> Hi Nicole,
>>>
>>> If -merge- is what you're trying to do, then you were on the right track
>>> with your initial attempt to use loops. This is something I find myself
>>> doing more often than I'd like, and typically using a structure like this:
>>>
>>> drop _all
>>> local filelist : dir . files "*.dta"
>>> foreach file of local filelist {
>>>     if _N==0{
>>>         use `file'
>>>     }
>>>     else{
>>>         merge 1:1 ID using `file'
>>>         drop _merge
>>>     }
>>> }
>>>
>>> Three things to look out for:
>>> 1. Make sure you -drop- _merge each time, or Stata gets very upset very
>>> quickly. I'm guilty of this a little too often.
>>> 2. After 25 of these, your screen will become a mess. Once you're
>>> comfortable with it working correctly you might think about using -qui- to
>>> suppress the output, and maybe just show a count of rows that didn't match.
>>> 3. If you have variables with the same name (but different values) in the
>>> datasets you may find yourself with some unexpected results. You would want
>>> to go through and rename the variables in each file if they matter to your
>>> end result.
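
Points 1 and 2 above could be sketched roughly as follows. This assumes every .dta file in the current directory has a key variable named ID that uniquely identifies rows, and that no previously saved combined file sits in the same directory (the *.dta pattern would pick it up).

```stata
clear
local filelist : dir . files "*.dta"
foreach file of local filelist {
    if _N == 0 {
        * Nothing in memory yet, so this is the first file
        use `"`file'"'
    }
    else {
        * Suppress the merge table and report only the non-matches
        quietly merge 1:1 ID using `"`file'"'
        quietly count if _merge != 3
        display as text `"`file': "' as result r(N) as text " unmatched rows"
        drop _merge
    }
}
```

Compound quotes around `file' keep the loop working when a file name contains spaces.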
>>>
>>> Best,
>>> Matt Vivier
>>> Data Analyst
>>> (203) 541-4665
>>> Remedy Partners, Inc
>>>
>>>
>>> On Thu, Dec 5, 2013 at 7:49 PM, David Kantor <kantor.d@att.net> wrote:
>>>> Hello Nicole,
>>>>
>>>> You may want to display `line' to see what you are getting.
>>>> Put in...
>>>>         disp "`line'"
>>>> just before
>>>>         use `line'
>>>>
>>>> How many words does it comprise?
>>>> You could be failing because there is nothing there, or because there are
>>>> multiple words.
>>>> If there are multiple words, and the file name is all of `line' (there are
>>>> embedded spaces), then you need quotation marks:
>>>>         use "`line'"
>>>>
>>>> If there are embedded quotation marks, then use compound quotation marks:
>>>>         use `"`line'"'
>>>> -- and that is the safest way, in general.
>>>>
>>>> But if only the first word is the desired filename, then you need to select
>>>> that:
>>>>         use "`=word("`line'",1)'"
>>>>
>>>> (Compound quotes may be safer:
>>>>         use `"`=word(`"`line'"',1)'"'
>>>> )
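
As a rough illustration of the "first word only" case, the extended macro functions -word count- and -word # of- can be used to inspect a line. The contents of line here are hypothetical, standing in for whatever -file read- returned.

```stata
local line "first.dta some other text"   // hypothetical line read from the file
local nwords : word count `line'         // number of words in the line
local fname  : word 1 of `line'          // first word only
display "`nwords' word(s); first word: `fname'"
```

If nwords is larger than 1, either the listing contains more than file names or the file name itself has embedded spaces, and only the latter calls for using the whole line in quotes.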
>>>>
>>>> Possibly this is an important consideration: you construct the file using
>>>> -! ls-. Does that write information other than the names?
>>>> (You are presumably on Unix; I don't recall exactly what you get from -ls-.)
>>>>
>>>>
>>>> If there are blank lines in the file, you may want a filter to skip them:
>>>>
>>>>         file open myfile using "filelist.txt", read
>>>>         file read myfile line
>>>>         while ~r(eof) &  `"`line'"' == "" {
>>>>                 file read myfile line
>>>>         }
>>>>         if `"`line'"' ~= "" {
>>>>                 disp `"`line'"'
>>>>                 use  `"`line'"'
>>>>                 file read myfile line
>>>>         }
>>>>         while ~r(eof) {
>>>>
>>>>                 append using `"`line'"'
>>>>                 file read myfile line
>>>>         }
>>>>
>>>> I might write it a bit differently; this may be simpler:
>>>>
>>>>         local jj = 0
>>>>
>>>>         file open myfile using "filelist.txt", read
>>>>         file read myfile line
>>>>         while ~r(eof) {
>>>>                 if `"`line'"' ~= "" {
>>>>                         disp `"`line'"'
>>>>                         if ~`jj++'  {
>>>>                                 use  `"`line'"'
>>>>                         }
>>>>                         else {
>>>>
>>>>                                 append using `"`line'"'
>>>>                         }
>>>>                 }
>>>>                 file read myfile line
>>>>         }
>>>>
>>>> That is, the -use- or -append- both appear inside the loop; -use- occurs on
>>>> the first pass, -append- on all subsequent passes.
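
A self-contained variant of the loop above, offered as a sketch rather than the definitive form. It assumes filelist.txt holds one file name per line, uses a temporary handle name instead of myfile, trims stray whitespace from each line, and closes the file at the end. The flag first replaces the jj counter. Note that if master_data.dta is saved in the same directory, a later -ls *.dta- would list it as well.

```stata
tempname fh                       // temporary file handle name
local first 1                     // 1 until the first dataset has been loaded
file open `fh' using "filelist.txt", read text
file read `fh' line
while r(eof) == 0 {
    local line = trim(`"`line'"') // strip leading and trailing blanks
    if `"`line'"' != "" {         // skip blank lines
        if `first' {
            use `"`line'"', clear
            local first 0
        }
        else {
            append using `"`line'"'
        }
    }
    file read `fh' line
}
file close `fh'
save master_data, replace
```

Because -file read- is the last command before the -while- condition is re-evaluated, r(eof) always reflects the most recent read.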
>>>>
>>>> Again, pay attention to what is in `line'; you may want only part of it. The
>>>> code above presumes you want all of `line' as the filename; you will need to
>>>> modify it if you need only part.
>>>>
>>>> As for why your test loop displays the second but not the first line, I
>>>> cannot say. (I've heard of failing to get the final line, but you don't seem
>>>> to have that problem.)
>>>>
>>>> Note that your first -save master_data- is unnecessary.
>>>> HTH
>>>> --David
>>>>
>>>>
>>>>
>>>> At 06:30 PM 12/5/2013, you wrote:
>>>>>
>>>>> Hello all,
>>>>>
>>>>> First and foremost, I have yet to fully understand how to use macros,
>>>>> so please forgive me if the solution to this problem is painfully
>>>>> obvious. I actually hope it's painfully obvious.
>>>>>
>>>>> I'm trying to combine multiple .dta files (1:1 horizontally appended)
>>>>> by calling several .dta filenames stored in a .txt file. However, in
>>>>> the process of doing this, whenever I try to run:
>>>>>
>>>>> .    use `line'
>>>>>
>>>>> Stata returns the error:
>>>>>
>>>>> .    invalid file specification
>>>>>
>>>>>
>>>>> Here's the code I'm trying to execute (sourced from here*). To start,
>>>>> I'm trying to execute this code on a .txt file containing just two
>>>>> lines (aka: two .dta filenames), but the final file will have 25
>>>>> lines:
>>>>>
>>>>>    pwd
>>>>>    cd ~/Desktop/merge
>>>>>    ! ls *.dta >filelist.txt
>>>>>    file open myfile using "filelist.txt", read
>>>>>    file read myfile line
>>>>>    use `line'  /* ERROR HERE */
>>>>>    save master_data, replace
>>>>>    file read myfile line
>>>>>    while r(eof)==0 {
>>>>>    append using `line'
>>>>>    file read myfile line
>>>>>    }
>>>>>    file close myfile
>>>>>    save master_data, replace
>>>>>
>>>>>
>>>>> I first thought the problem was that "filelist.txt" wasn't being read.
>>>>> However, I believe it IS being read, since running the following:
>>>>>
>>>>>    ! ls *.dta >filelist.txt
>>>>>    file open myfile using "filelist.txt", read
>>>>>    file read myfile line
>>>>>    while r(eof)==0 {
>>>>>    display "`=word("`line'",1)'"
>>>>>     file read myfile line
>>>>>     }
>>>>>
>>>>> only displays the second (but not the first) line of the two-line .txt
>>>>> file.
>>>>>
>>>>> Perhaps my issue has something to do with Stata overlooking the first
>>>>> line of the .txt file? Or perhaps my general macro-incompetence (more
>>>>> likely)?
>>>>>
>>>>> Any help will be greatly appreciated. Thanks so much for your
>>>>> consideration.
>>>>>
>>>>> Nicole
>>>>
>>>>
>>>> *
>>>> *   For searches and help try:
>>>> *   http://www.stata.com/help.cgi?search
>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> *   http://www.ats.ucla.edu/stat/stata/
>>>

