The Stata listserver

st: RE: Creating a variable from comments/header in a .txt file


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: Creating a variable from comments/header in a .txt file
Date   Thu, 14 Apr 2005 13:05:42 +0100

I'll try the easier part of this. 

Once your filename is in 

pl[1] 

you can refer to that on the fly by 

`=pl[1]' 

so your -merge- command would then be

merge obsno using `=pl[1]', keep(pl) 
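
As a minimal sketch of that substitution, here is a toy version that runs on its own. The names obsno and pl and the string "10u5uc" follow the do file quoted below; the 1-row and 8-row datasets are stand-ins assumed for illustration, not the real files. One caveat worth showing: `=pl[1]' is evaluated against the data in memory when the command runs, so if the header data have been cleared before -merge-, a local macro set beforehand is the safer handle.

```stata
* Sketch with toy data; not the poster's actual files
clear
set obs 1
gen pl = "10u5uc"             // stand-in for the string read from line 1
gen obsno = _n
sort obsno
local file = pl[1]            // copy made while pl is still in memory
save "`=pl[1]'", replace      // `=pl[1]' expands to 10u5uc here

clear                         // pl is gone now, but `file' survives
set obs 8                     // stand-in for the 8-row table
gen obsno = _n
sort obsno
* Syntax as used in this thread; in Stata 11 and later this would
* usually be: merge 1:1 obsno using "`file'", keepusing(pl)
merge obsno using "`file'", keep(pl)
```

After this runs, pl is filled for the first record only, which is the situation the next command addresses.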

The command 

replace pl ="10u5uc" if _merge==1

would more generally be 

replace pl = pl[1] if _merge == 1 

if I understand you correctly. 
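
A small sketch of that fill-down on toy data, assuming the merged result looks as described in the question (8 records, pl present only in the first). The second half is an alternative that is not from the original do file: it keeps the identifier in a local macro and writes it straight into a new variable, which would avoid the merge altogether. The names id and pl2 are hypothetical.

```stata
* Sketch with toy data mimicking the merged result
clear
set obs 8
gen obsno = _n
gen pl = "10u5uc" in 1           // only record 1 carries the identifier
replace pl = pl[1] if pl == ""   // same idea as -if _merge == 1-, no literal typed

* Alternative (an assumption, not from the thread): no merge needed.
* In a real do file the local would be set before the second -infile-.
local id = pl[1]
gen pl2 = "`id'"                 // every record gets the identifier
```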

Nick 
[email protected] 

Juan Solon
 
> I would like advice on using a data dictionary to extract comments
> from a .txt file and put them into a variable.
> 
> I want to read data from a series of text files, with each file
> containing unique file identifiers/descriptors in a header in lines
> 1-10.  All files are named data.txt and I will have to do this
> repeatedly, so I'm looking for a solution that is replicable across
> files. The data itself is in a tabular format, starts on line 11, and
> looks like this:
> Spot counts:
>       1    2    3    4    5    6    7    8    9   10   11   12
> A     -    -    -    -    -    -    -    -    -    -    -    - 
> B     -    -    -    -    -    -    -    -    -    -    -    - 
> C     -    -    -    -    -    -    -    -    -    -    -    - 
> D     -    -    -    -    -    -    -    -    -    -    -    - 
> E     -    -    -    -    -    -    -    -    -    -  197    9 
> F     -    -    -    -    -    -    -    -    -    -  188    7 
> G     -    -    -    -    -    -    3    2    1  204  189   78 
> H     -    -    -    -    -    -    1    2    0  254  195   63 
> 
> I have made a do file that can read the tabular data beginning in line
> 11 of data.txt using infile and a data dictionary.  My problem
> is that I want to tag each observation with a unique identifier that
> would be taken from line 1.  This is important because I would
> eventually merge all the data from the text files and I would need to
> know the source.
> 
> The solution I tried seems very clumsy, and it doesn't actually work as
> I wanted it to.  I created a separate file with a variable (pl)
> containing the unique file identifier (the string from line 1).  I then
> generated an observation number using generate obsno=_n and saved the
> file with the string from line 1 as its filename.  This file can then
> be merged with the tabular data.
> 
> 
> - start of do file - 
> set more off
> infile using eli2.dct, clear
> drop in 2/l
> gen pl=substr(plate,18,6) /* data from the first line, e.g. the string "10u5uc" */
> gen obsno=_n
> local file =pl
> sort obsno
> save `file', replace
> infile using eli.dct, clear
> gen obsno=_n
> sort obsno
> merge obsno using 10u5uc, keep(pl) /* How do I do this using a macro instead of typing "10u5uc"? */
> 
> - end of do file -
> 
> 
> The resulting merged file then contains 8 observations and a string
> variable pl.  As expected, variable pl is missing for records 2-8.  I
> would like variable pl for records 2-8 to have the same value as record
> 1.  I can do this manually:
> 
> replace pl ="10u5uc" if _merge==1
>  
> but would rather that it were a macro. At this point, I have hit a
> brick wall!
> 
> 1.  How do I extract the data in line 1 into a variable and repeat this
>     for all the records?
> 2.  Is it possible to merge files and refer to the using dataset using a
>     macro?
