Re: st: infile and dictionaries and the small data mindset

From   "Friedrich Huebler" <>
Subject   Re: st: infile and dictionaries and the small data mindset
Date   Wed, 14 May 2008 13:56:20 -0400

You can -infix- the data. Keep in mind that strings are limited to a
length of 244 characters.

. infix str data 1-244 using data.txt
. infix str var1 1-244 str var2 245-488 using data.txt
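Once each line is in a string variable, the rule-based parsing you describe could be sketched roughly as follows. This is only a sketch under assumptions that are not confirmed by your files: every line fits in 244 characters, a block header is any nonblank line that does not start with a digit, and data rows start with a row number and are separated by blanks or tabs. The variable names (line, lineno, is_header, block, header, v1, v2, ...) are made up for illustration.

```stata
* Sketch only: the header/data rules below are assumptions, not facts
* about the actual files.
clear
infix str line 1-244 using data.txt
generate long lineno = _n

* Turn tabs into blanks, then collapse repeated blanks.
replace line = subinstr(line, char(9), " ", .)
replace line = itrim(trim(line))

* Assumed rule: a nonblank line that does not start with a digit is a header.
generate byte is_header = line != "" & !regexm(line, "^[0-9]")

* Running block counter: goes up by one at every header line.
generate long block = sum(is_header)

* Carry the most recent header text down to the rows that follow it.
generate str244 header = line if is_header
replace header = header[_n-1] if !is_header

* Keep only the data rows, split them on blanks, and convert to numbers.
drop if is_header | line == ""
split line, generate(v)
destring v*, replace
```

Rules that change from block to block can then be written as conditions on block or header, for example -if regexm(header, "...")-, with the pattern filled in from whatever the real headers look like.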


On Wed, May 14, 2008 at 11:35 AM, E. Paul Wileyto
<> wrote:
> Thanks.  Both are somewhat useful.
> I think that what I actually need to do is import each line of text as a
> single string, and then parse it line-by-line according to whatever rules I
> can glean from the original data files.  Is that possible?
> If it is, I can parse by a set of rules that change according to which block
> headers I have hit up to that point.  Sounds like a pain, but if I can code
> it once, then I can hand it to someone else to do the import.
> P
> Friedrich Huebler wrote:
>> Paul,
>> Perhaps you can do this with -insheet-, as described in this thread:
>> Friedrich
>> On Wed, May 14, 2008 at 9:23 AM, E. Paul Wileyto <>
>> wrote:
>>> One of our worst fears is that someone will come to us with data
>>> scattered all over a spreadsheet file in little summary tables.  If
>>> they have lots of those files, I can usually find a way to script the
>>> import efficiently using ODBC.
>>> What if you have those same tables in a text file?  Is there any
>>> efficient way to import and parse data in such a format?  I have the
>>> far end of this process scripted so the researcher can generate his
>>> own summary statistics, but getting the data into Stata involves a
>>> program making an excel file, followed by cutting and pasting into
>>> Stata.  I'd like to cut out some of the import steps, so that all we
>>> would need to do is give a list of filenames to a Stata script, and
>>> watch the screen roll by as the data get extracted.
>>> The files are generated by a program that is monitoring mouse
>>> behavior.  Each file may contain behavior from one mouse on one day,
>>> or several mice on one day.  The general format is always the same.
>>> For each mouse-run, there is a small block of ancillary information
>>> as a header.  I cannot guarantee that all of these blocks have the
>>> same number of words, but some of that info will be needed as data.
>>> These are followed by blocks of numbers in columns.  Each block has
>>> an alphanumeric header before it (on its own line), and there are row
>>> numbers.
>>> I would have a fairly good idea how to script this in Matlab, but I
>>> don't want to be the one doing the import on a daily basis, and it's
>>> hard for the researcher to justify buying into some pricey software
>>> just to script that one task.
>>> Any clues about scripting this type of import in Stata would be
>>> appreciated.
>>> Thanks
>>> Paul
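
For the "give a list of filenames to a Stata script" step in the quoted message, a loop along the following lines might be a starting point. It is a sketch, not a tested import routine: the file names are hypothetical placeholders, and it assumes every line fits in 244 characters. It only stacks the raw lines from all files into one dataset, tagged by source file and line number, so that any parsing rules can be applied afterwards.

```stata
* Sketch only: mouse01.txt etc. are hypothetical file names.
local files "mouse01.txt mouse02.txt mouse03.txt"

* Start with an empty dataset to append to.
clear
tempfile combined
save `combined', emptyok

foreach f of local files {
    * Read each line of the current file into one string variable.
    infix str line 1-244 using "`f'", clear
    generate str80 source = "`f'"
    generate long lineno = _n

    * Add the lines read so far and store the result.
    append using `combined'
    save `combined', replace
}

* Restore the original order of lines within each file.
use `combined', clear
sort source lineno
```

Keeping source and lineno makes it possible to trace any parsed value back to the file and line it came from.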