Statalist



Re: st: infile and dictionaries and the small data mindset


From   "E. Paul Wileyto" <[email protected]>
To   [email protected]
Subject   Re: st: infile and dictionaries and the small data mindset
Date   Wed, 14 May 2008 14:27:08 -0400

I'll try this and see what happens.  Thanks

P

Friedrich Huebler wrote:
You can -infix- the data. Keep in mind that strings are limited to a
length of 244 characters.

. infix str data 1-244 using data.txt
. infix str var1 1-244 str var2 245-488 using data.txt
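Once every line sits in one string variable, the block-by-block parsing asked about in the quoted message below can be sketched roughly as follows. This is only an illustration under stated assumptions: the variable names (line, isheader, block, header) are made up, no line is longer than 244 characters, and a block header is taken to be any line whose first non-blank character is a letter, while data rows start with a row number. The real files may need a different rule.

```stata
* Sketch only: data.txt, all variable names, and the header rule are assumptions.
clear
infix str line 1-244 using data.txt

* Assumed rule: a block header starts with a letter; data rows start with a digit.
generate byte isheader = regexm(trim(line), "^[A-Za-z]")

* Running count of headers seen so far = block number for every line.
generate long block = sum(isheader)

* Copy each header's text down to the rows that follow it.
generate str244 header = line if isheader
replace header = header[_n-1] if !isheader & _n > 1

* Keep only the data rows, then split them on blanks and convert to numbers.
drop if isheader | trim(line) == ""
split line, generate(v) destring
```

Because block and header travel with each row, later steps can apply different rules per block, for example with -if header == "..."- qualifiers.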

Friedrich


On Wed, May 14, 2008 at 11:35 AM, E. Paul Wileyto
<[email protected]> wrote:

Thanks. Both are somewhat useful.
I think that what I actually need to do is import each line of text as a
single string, and then parse it line-by-line according to whatever rules I
can glean from the original data files. Is that possible?

If it is, I can parse by a set of rules that change according to which block
headers I have hit up to that point. Sounds like a pain, but if I can code
it once, then I can hand it to someone else to do the import.

P

Friedrich Huebler wrote:

Paul,

Perhaps you can do this with -insheet-, as described in this thread:

http://www.stata.com/statalist/archive/2008-02/msg00875.html
http://www.stata.com/statalist/archive/2008-02/msg00940.html

Friedrich

On Wed, May 14, 2008 at 9:23 AM, E. Paul Wileyto <[email protected]>
wrote:


One of our worst fears is that someone will come to us with data scattered all over a spreadsheet file in little summary tables. If they have lots of those files, I can usually find a way to script the import efficiently using ODBC.

What if you have those same tables in a text file? Is there any efficient way to import and parse data in such a format? I have the far end of this process scripted so the researcher can generate his own summary statistics, but getting the data into Stata involves a program making an excel file, followed by cutting and pasting into Stata. I'd like to cut out some of the import steps, so that all we would need to do is give a list of filenames to a Stata script, and watch the screen roll by as the data get extracted.

The files are generated by a program that is monitoring mouse behavior. Each file may contain behavior from one mouse on one day, or several mice on one day. The general format is always the same. For each mouse-run, there is a small block of ancillary information as a header. I cannot guarantee that all of these blocks have the same number of words, but some of that info will be needed as data. These are followed by blocks of numbers in columns. Each block has an alphanumeric header before it (on its own line), and there are row numbers.

I would have a fairly good idea how to script this in Matlab, but I don't want to be the one doing the import on a daily basis, and it's hard for the researcher to justify buying into some pricey software just to script that one task.

Any clues about scripting this type of import in Stata would be appreciated.
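
The "list of filenames" step described above might look something like the loop below. It is a sketch, not a tested import: the file names mouse1.txt and mouse2.txt and the variable names (line, source, lineno) are hypothetical, and each file is read with -infix- as one string per line, which assumes lines of at most 244 characters.

```stata
* Sketch only: file names and variable names are hypothetical.
local files "mouse1.txt mouse2.txt"
tempfile all
local first 1

foreach f of local files {
    clear
    * One observation per line of text, held in a single string variable.
    infix str line 1-244 using "`f'"
    * Record where each line came from so the files can be told apart later.
    generate str80 source = "`f'"
    generate long lineno = _n
    if `first' {
        save `all'
        local first 0
    }
    else {
        append using `all'
        save `all', replace
    }
}

* Stacked lines from every file, restored to their original order within file.
use `all', clear
sort source lineno
```

The parsing rules would then run once on the stacked data, using source and lineno to keep each file's lines together and in order.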

Thanks

Paul
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/

--
E. Paul Wileyto, Ph.D.
Assistant Professor of Biostatistics
Tobacco Use Research Center
School of Medicine, U. of Pennsylvania
3535 Market Street, Suite 4100
Philadelphia, PA 19104-3309

215-746-7147
Fax: 215-746-7140
[email protected]
http://mail.med.upenn.edu/~epw/