Re: st: Using infile with varying lines() per observation

From	Sergiy Radyakin <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: Using infile with varying lines() per observation
Date	Fri, 26 Jul 2013 17:10:23 -0400

Kevin, I really want to help, but I still don't understand your
description of the data.

Your file declares variables C1, C2, C3 and C4, so where is C5 coming
from (see your output)? I assume that C1, C2 etc. are really meaningful
names, like age, education, gender. The program would not be able to
'invent' C5 even if it looks like an account number or a set of GPS
coordinates. Don't use the same letters for variables and values. If
you use C in column names C1, C2,..., then use x, y, z for values.

Values C, D and E were all in the C3 column, so how would they end up
in different columns in the output?
Where is C in your final output? Is it part of the ID, or a separate
variable? Do you want to discard it?
And can I be sure that every subject will have at least two lines in
the dataset? Do they always follow the same A,B,C \ A,B,D pattern, or
can that vary?

What are A, B, C and D? Numbers? Strings? If strings, is each one a
letter, a word, a sentence?
I am also confused by the blank lines following each subject. Are they
part of the file? (This is important.)
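
To illustrate why the blank lines matter, here is a minimal sketch under
one possible reading of the layout: it assumes that each subject's block
really is followed by a blank line in the file, and that the raw file has
already been read with one observation per line. The sample lines are made
up, and the names line, lineno, subject and nlines are hypothetical, not
taken from the actual data or from readflex.do.

```stata
* Sketch only: made-up lines, one observation per line of the raw file.
* The names line, lineno, subject and nlines are hypothetical.
clear
input str20 line
"A   B   C"
"A   B   D"
"    B   E  F"
"    B"
""
"A   B   C"
"A   B   D"
"    B   E  F"
end

gen long lineno = _n                           // remember the original file order
gen long subject = 1 + sum(trim(line) == "")   // counter goes up at every blank line
drop if trim(line) == ""                       // the separators are no longer needed
bysort subject (lineno): gen long nlines = _N  // number of lines each subject has
list subject lineno nlines line, sepby(subject) noobs
```

In this made-up sample the first subject keeps four lines and the second
keeps three. If the blank lines are not actually in the file, this rule has
nothing to key on and the subjects have to be delimited some other way.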

It would be better if you could take 5-10 representative subjects
from your actual dataset, change the names to Cameron Diaz, Angelina
Jolie, Bill Gould and other legendary people, divide their incomes by
3.14, and then post the resulting file somewhere from where it can be
web-read into Stata. That would take care of many of the questions.

Sergiy

On Fri, Jul 26, 2013 at 4:18 PM, Kevin McConeghy
<[email protected]> wrote:
> Thank you for your help Sergiy, however I did a bad job describing my
> data. I am having trouble adapting your code. The columns are
> fixed-format.
>
>
> The id var=D is on the second line.
>
> C1 C2 C3 C4
>
> A   B   C
> A   B   D
>      B   E  F
>      B
>      B
>      B
>
> A   B   C
> A   B   D
>      B   E  F
>
> A   B   C
> A   B   D
>      B   E  F
>      B
>      B
>
> I need to convert so it is:
>
> C1  C2            id   C4  C5
>
> AA  BBBBBB   D   E    F
>
> AA  BBB          D   E    F
>
> AA  BBBBB      D   E    F
>
>
> I apologize for being vague before.
>
> Kevin
>
> ------------------------------
>
> Date: Thu, 25 Jul 2013 20:28:15 -0400
> From: Sergiy Radyakin <[email protected]>
> Subject: Re: st: Using infile with varying lines() per observation
>
> Kevin, considering your described setup the following should work:
>
> type http://radyakin.org/statalist/2013072501/testdata.txt
> do http://radyakin.org/statalist/2013072501/readflex.do
>
> Here is the output:
>
> id col1 col2 col3 col4
> 1 A     B    C   7
>         B
> 2 A     B    C   1
> 3 A     B    C   90
>         B
>         B
>
>
>     id   col1   col2   col3   col4
>      1      A     BB      C      7
>      2      A      B      C      1
>      3      A    BBB      C     90
>
>
> It's up to you to make sure that 244 chars is enough for the whole BBB
> value and that the numbers are completely located in the first line of
> each subject. Id is assumed to be a string.
>
> Hope this helps, Sergiy Radyakin
>
> --
> Kevin McConeghy, PharmD, BCPS
> 833 S Wood St, Chicago, IL 60612
> College of Pharmacy, Dept. of Pharmacy Practice
> University of Illinois at Chicago
> (312)-413-1422, [email protected]
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
