The Stata listserver

  st: RE: Creating a variable from comments/header in a .txt file


From   Juan Solon <juan.solon@lshtm.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject     st: RE: Creating a variable from comments/header in a .txt file
Date   Thu, 14 Apr 2005 17:00:17 +0000

Dear Nick,
Thank you for your reply.
I now understand that pl[1], or more generally varname[1], refers to the value of that variable in the first observation. I was able to replace the redundant macros in the first half of the do file and save the file using the contents of the variable pl:
infile using eli2.dct, clear
. .
gen pl=substr(plate,18,6) /* this is data from the first line : in this case "10u5uc"*/
. .
save `=pl[1]', replace
*this then saves the data as 10u5uc.dta


However, I then need to read in another set of data, and at that point the reference to pl[1] no longer works:
merge obsno using "`=pl[1]'"
pl not found
invalid file specification
r(198);

Is there a way to save the value as a global macro and then delete it after I am done with it?
Juan
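
One possible way to do what Juan asks, offered as a sketch rather than a tested solution: the error most likely arises because `=pl[1]' is evaluated when the -merge- line runs, and by then the new data have replaced pl in memory. Copying the value into a macro beforehand avoids this. The macro name plate_id is hypothetical; eli.dct, obsno, and pl come from the post, and obsno is assumed to exist already in the first dataset.

```stata
* Sketch only: plate_id is a hypothetical macro name.
* Run this while the data containing pl (and obsno) are still in memory.
global plate_id = pl[1]
sort obsno
save "$plate_id", replace

* Reading the next dataset clears pl from memory, but not the global macro.
infile using eli.dct, clear
gen obsno = _n
sort obsno
merge obsno using "$plate_id", keep(pl)
replace pl = "$plate_id" if _merge == 1

* Delete the global macro once it is no longer needed.
macro drop plate_id
```

A local macro (local plate_id = pl[1], referenced as `plate_id') should work just as well when everything runs within a single do file, and it disappears on its own when the do file finishes. The -merge- line uses the syntax from the post; Stata 11 and later would write it as merge 1:1 obsno using "$plate_id", keepusing(pl).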





---
I'll try the easier part of this. Once your filename is in pl[1]
you can refer to that on the fly by `=pl[1]'
so your -merge- command would then be

merge obsno using `=pl[1]', keep(pl)

The command
replace pl ="10u5uc" if _merge==1

would more generally be
replace pl = pl[1] if _merge == 1

if I understand you correctly.

Nick
n.j.cox@durham.ac.uk
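
To illustrate the explicit subscripting used in the reply above, here is a small self-contained sketch. The data are made up for the illustration and are not from the post: eight observations in which only the first holds the identifier.

```stata
* Toy data, not from the post: identifier known only in observation 1.
clear
set obs 8
gen str6 pl = ""
replace pl = "10u5uc" in 1

* pl[1] is the value of pl in observation 1; copy it to every empty observation.
replace pl = pl[1] if pl == ""
list pl
```

After the second -replace-, all eight observations hold the same identifier, which is the effect of replace pl = pl[1] if _merge == 1 in the merged dataset.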

--- original post from juan.solon@lshtm.ac.uk
I wanted to read data from a series of text files, each containing unique file identifiers/descriptors in lines 1-10. All files are named data.txt, and I will have to do this repeatedly, so I'm looking for a solution that is replicable across files. The data themselves are in a tabular format starting on line 11 and look like this:
Spot counts:
    1   2   3   4   5   6   7   8   9  10  11  12
A   -   -   -   -   -   -   -   -   -   -   -   -
B   -   -   -   -   -   -   -   -   -   -   -   -
C   -   -   -   -   -   -   -   -   -   -   -   -
D   -   -   -   -   -   -   -   -   -   -   -   -
E   -   -   -   -   -   -   -   -   -   - 197   9
F   -   -   -   -   -   -   -   -   -   - 188   7
G   -   -   -   -   -   -   3   2   1 204 189  78
H   -   -   -   -   -   -   1   2   0 254 195  63

I have made a do file that reads the tabular data in data.txt, beginning at line 11, using infile and a data dictionary. My problem is that I want to tag each observation with a unique identifier taken from line 1.

The solution I tried seems very clumsy, and it doesn't actually work. I created a separate file with a variable (pl) containing the unique file identifier (the string from line 1). I then generated an observation number using generate obsno=_n and saved the file, using the string from line 1 as the filename. This file can then be merged with the tabular data.


- start of do file -
set more off
infile using eli2.dct, clear
drop in 2/l
gen pl=substr(plate,18,6) /* this is data from the first line for example a string "10u5uc"*/
gen obsno=_n
local file =pl
sort obsno
save `file', replace
infile using eli.dct, clear
gen obsno=_n
sort obsno
merge obsno using 10u5uc, keep(pl) /* How do I do this using a macro instead of typing "10u5uc"? */

- end of do file -


The resulting merged file then contains 8 observations and a string variable pl. As expected, variable pl is missing for records 2-8. I would like variable pl for records 2-8 to have the same value as record 1. I can do this manually:

replace pl ="10u5uc" if _merge==1


but would rather that it were a macro.

1. How do I extract the data in Line 1 into a variable and repeat this for all the records?
2. Is it possible to merge files and refer to the using dataset using a macro?

I look forward to your suggestions!
Juan
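
A simpler route to question 1 may be to skip the merge altogether: keep the identifier in a local macro and generate pl directly once the table has been read. The following is only a sketch under the assumptions of the post, namely that eli2.dct reads line 1 into a string variable plate and that eli.dct reads the table. The macro name plate_id is hypothetical.

```stata
* Sketch only: plate_id is a hypothetical macro name; eli2.dct, eli.dct, and
* the string variable plate are taken from the post.
infile using eli2.dct, clear
local plate_id = substr(plate[1], 18, 6)

* -clear- drops the data, not the macro, so the identifier is still available.
infile using eli.dct, clear
gen str6 pl = "`plate_id'"
save "`plate_id'", replace
```

Because plate_id is a local macro, these lines need to run together in one do file; a local macro defined in one run is not visible in a later one.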






