
RE: st: insheet delimiter problem

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	RE: st: insheet delimiter problem
Date	Mon, 10 Nov 2008 15:01:42 -0000

I would never recommend alteration of the original data file. I would
always recommend that you work on a copy. With that proviso, this route
is more complicated than using -filefilter-. 
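
For concreteness, the -filefilter- route might look something like the
sketch below. It is an illustration under stated assumptions rather than
a tested recipe for the real data: demo.prn and demo_clean.prn are
hypothetical file names, the example rows are retyped from the quoted
illustration, and deleting the lone quotation marks (so that those
fields come in empty) is only one guess at what they should become.
-filefilter- writes its result to a second file, so the original file
is never altered.

```stata
* Sketch only: demo.prn and demo_clean.prn are hypothetical file names.
clear
* Recreate the example rows as a small pipe-delimited text file.
file open fh using demo.prn, write text replace
file write fh "epikey|hrg|code1|code2|code3" _n
file write fh "1|A0123|D100|V123|K166" _n
file write fh `"2|A0125|D200|"|G122"' _n
file write fh `"3|B0101|D300|"|C333"' _n
file write fh "4|B0122|D400|E002|V777" _n
file close fh
* Write a filtered copy; demo.prn itself is left untouched.
* \Q is -filefilter-'s code for a double quotation mark.
* Assumption: the lone quotation marks can simply be deleted.
filefilter demo.prn demo_clean.prn, from("\Q") to("") replace
* Read the filtered copy, treating | as the field delimiter.
insheet using demo_clean.prn, delimiter("|") names clear
list, noobs
```

If the quotation marks carry meaning (for example, as ditto marks),
something other than an empty string would be needed in to().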

Nick 
[email protected] 

Joseph Coveney

Ada Ma wrote:

Thanks a lot to both for the solutions you have suggested.  I think
the -filefilter- command will be the easiest to implement given that
I'm on a Windoze system!

--------------------------------------------------------------------------------

Your best bet might be something along the lines of the approach
illustrated below.  It imports the pipe-delimited text file that you
show entirely within Stata's main dataset area using only -infix- and
-split-, that is, without first altering the external text file with a
Unix-like command such as -filefilter-.

For the illustration, I've created a text file containing the
pipe-delimited, quotation-mark-containing dataset that you showed.  I've
named it 1.prn, and it's in Stata's working directory on my machine.
The do-file is shown first, and the Results window play-by-play is
shown beneath it.

Joseph Coveney

clear *
set more off
type 1.prn
infix str a 1-244 using 1.prn
list, noobs
split a, generate(a_) parse("|")
drop a
list, noobs
foreach var of varlist _all {
    local newname = `var'[1]
    rename `var' `newname'
}
drop in 1
list, noobs
exit

-------------------

. clear *

. set more off

. type 1.prn
epikey|hrg|code1|code2|code3
1|A0123|D100|V123|K166
2|A0125|D200|"|G122
3|B0101|D300|"|C333
4|B0122|D400|E002|V777

. infix str a 1-244 using 1.prn
(5 observations read)

. list, noobs

  +------------------------------+
  |                            a |
  |------------------------------|
  | epikey|hrg|code1|code2|code3 |
  |       1|A0123|D100|V123|K166 |
  |          2|A0125|D200|"|G122 |
  |          3|B0101|D300|"|C333 |
  |       4|B0122|D400|E002|V777 |
  +------------------------------+

. split a, generate(a_) parse("|")
variables created as string:
a_1  a_2  a_3  a_4  a_5

. drop a

. list, noobs

  +----------------------------------------+
  |    a_1     a_2     a_3     a_4     a_5 |
  |----------------------------------------|
  | epikey     hrg   code1   code2   code3 |
  |      1   A0123    D100    V123    K166 |
  |      2   A0125    D200       "    G122 |
  |      3   B0101    D300       "    C333 |
  |      4   B0122    D400    E002    V777 |
  +----------------------------------------+

. foreach var of varlist _all {
  2.     local newname = `var'[1]
  3.     rename `var' `newname'
  4. }

. drop in 1
(1 observation deleted)

. list, noobs

  +----------------------------------------+
  | epikey     hrg   code1   code2   code3 |
  |----------------------------------------|
  |      1   A0123    D100    V123    K166 |
  |      2   A0125    D200       "    G122 |
  |      3   B0101    D300       "    C333 |
  |      4   B0122    D400    E002    V777 |
  +----------------------------------------+

. exit

end of do-file


References:
- Re: st: insheet delimiter problem
  - From: "Joseph Coveney" <[email protected]>
