Statalist


RE: st:How to input a portion of a file


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st:How to input a portion of a file
Date   Thu, 21 Feb 2008 12:49:47 -0000

That's an instructive example. 

As I understand it, -insheet- peeks at the early bit of the file, makes
a guess at the number and type of variables, and assigns accordingly.
Whether guessing will also reliably give a workable answer with Joseph
Wagner's files, I can't say. 
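
One way to find out whether the guess came out as expected is to check the variable count right after the import. This is only a sketch: the count of 30 columns is an assumption read off the -drop- statement quoted further down (v1 through v30), and "filename" is the same placeholder used there.

```stata
* import one file and let -insheet- guess the layout
insheet using "filename", nonames clear

* assumed layout: exactly 30 columns, named v1 through v30
confirm variable v30        // error if the guess produced fewer than 30 variables
confirm new variable v31    // error if the guess produced more than 30 variables
```

If either -confirm- fails, the do-file stops at that file, which flags any file where the header lines led -insheet- to a different layout.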

Nick 
[email protected] 

Friedrich Huebler wrote:

Assume we have a file "test.txt" that contains the following text
(without the Start and End lines). We are only interested in the
numbers.

=== Start of file ===
I am not clear how that this will help, as the header text and
the remainder of the file will give -insheet- quite different
ideas about what variables there are.
mpg trunk turn
22 11 40
17 11 40
22 12 35
20 16 40
=== End of file ===

Let's import the data with -insheet-.

. insheet using test.txt, nonames delimiter(" ")
(14 vars, 8 obs)
. drop if _n < 5
(4 observations deleted)
. drop v4 - v14
. list

     +--------------+
     | v1   v2   v3 |
     |--------------|
  1. | 22   11   40 |
  2. | 17   11   40 |
  3. | 22   12   35 |
  4. | 20   16   40 |
     +--------------+
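
One step the listing does not show: because the header text shared columns v1-v3 with the numbers, -insheet- has most likely stored those variables as strings. A minimal follow-up, assuming that is the case, converts them with -destring- and gives them the names from the header line of test.txt:

```stata
* convert the surviving columns from string to numeric
destring v1 v2 v3, replace

* names taken from the "mpg trunk turn" line in test.txt
rename v1 mpg
rename v2 trunk
rename v3 turn

* confirm the storage types
describe
```

If the variables were already numeric, -destring- just reports that and leaves them unchanged.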

Friedrich

On Wed, Feb 20, 2008 at 6:35 AM, Nick Cox <[email protected]> wrote:
> I am not clear how that this will help, as the header text and the
>  remainder of the file will give -insheet- quite different ideas about
>  what variables there are.
>
>
>  Nick
>  [email protected]
>
>  Friedrich Huebler
>
>
>  You wrote that -insheet- with subsequent deletion of unwanted data is
>  "sloppy". That approach might still be the easiest if all files have
>  the same structure and your data always appear in the same columns.
>
>  . insheet using filename, nonames
>  . drop if _n < 30 | _n > 129
>  . drop v1 - v20 v25 - v30
>
>
>
> On Feb 18, 2008 9:26 AM, Joseph Wagner <[email protected]> wrote:
>  > I have data I wish to input a portion of into STATA.  Data is collected
>  > on patients by a machine that measures their gait as they walk.  A text
>  > file is output for each patient with columns representing variables
>  > (each about 130 lines long) but the multiple observation data doesn't
>  > start until line 29.  The first 28 lines are taken up with short lines
>  > of data describing the patient.  Unfortunately, I also need a couple of
>  > those lines in 'header' area.  The 29th line has the variables names but
>  > they do not line up directly with the columns of data so I figured I
>  > could just label the data later.  The data I need starts 30 lines down
>  > at column 115 and includes the next 4 columns and goes down 100 lines.
>  >
>  > I realize there are easier ways to do this but I have data on about 300
>  > patients (and so one file for each person) and wanted to automate this
>  > input (followed by successive merging of files to get my final dataset).
>  >
>  > I wanted to use the -infix- command but have never used this command
>  > before and my attempts so far have failed.  I also tried using -infile-
>  > with the _first(30) option and the _line(30) option but those didn't
>  > seem to work either.
>  >
>  > Here is a dictionary I attempted with just one of the variables:
>  >
>  > dictionary using "c:\data\gait\SBS00001_20050607_1.nrm" {
>  >        _line(30)
>  >        _column(115) r_grf_vrt_frc %5f
>  > }
>  >
>  > infile using SBS00001_20050607_1.dct
>  >
>  > unexpected end of file
>  > (5 observations read)
>  >
>  > The other problem is that it didn't seem to pull the data corresponding
>  > to that column.  I thought perhaps there was a problem with the data not
>  > being in a fixed format but if I try -insheet- all the data imports and
>  > the correct data lines up in the individual columns.  Of course I could
>  > write some programming whereby I delete the unneeded variables and line
>  > but that's kind of sloppy.
>  >
>  > I am using STATA ver. 8.2
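
A note on the dictionary quoted above. As far as I can tell, _line(30) tells -infile- to go to line 30 within each observation, so every observation is treated as spanning 30 lines. A file of about 130 lines would then give four complete observations and run out partway through the fifth, which is consistent with the "unexpected end of file" message. The directive for skipping header lines is _first(). The following is a sketch only: it assumes the file really is fixed-column (doubtful, given that -insheet- parses it cleanly, which suggests a delimited file), and the float storage type is my assumption.

```stata
dictionary using "c:\data\gait\SBS00001_20050607_1.nrm" {
* data begin on line 30 of the file; one line per observation
    _first(30)
* assumed: the value occupies 5 characters starting at character column 115
    _column(115) float r_grf_vrt_frc %5f
}
```

With that saved as SBS00001_20050607_1.dct, the import would be

```stata
infile using SBS00001_20050607_1.dct, clear
* keep the 100 data lines described in the question
keep in 1/100
```

To reuse one dictionary across many files, the -using- part can be left out of the dictionary and the data file named in the using() option of -infile- instead.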

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


