Statalist The Stata Listserver



Re: st: how does insheet determine datatypes?


From   David Kantor <[email protected]>
To   [email protected]
Subject   Re: st: how does insheet determine datatypes?
Date   Tue, 09 Jan 2007 21:56:50 -0500

At 02:46 PM 1/9/2007, Phil Schumm wrote:
> On Jan 9, 2007, at 1:21 PM, David Kantor wrote:
>> Insheet, at least as it has been up to Stata 8, behaves
>> ungracefully if the second line contains var labels (and var names
>> are in the first line), which is how some raw datasets are
>> composed.  In this case, you get everything as string -- usually
>> very long ones.  And the var names in the raw data are ignored; you
>> get default names v1, v2, etc.  And what were supposed to be the
>> var names and labels end up as data in the first and second
>> observations.
>
> In Stata 9, the -names- option causes -insheet- to handle the
> variable names properly:
>
> [...]
>
> Of course, the variable labels still need to be dealt with.
>
>> If you encounter this situation, you may want to use -convert_top_lines-.
>
> I just took a look at the code, and -convert_top_lines- could also
> benefit from a function to generate Stata names from strings (i.e.,
> if a variable name in the first row of the data file is not a valid
> Stata name, -convert_top_lines- currently throws an error).  You can
> of course trap this error as Nick pointed out, but you're then still
> left with the question of what to name the corresponding variable.
> It would be nice to do this in a way that was guaranteed to be
> consistent with the way -insheet- does it (in case you were making
> use of both on the same project).

Thank you for that information and the suggestion. I would put on my wishlist for -insheet- to have an option to handle the var-labels-in-second-line situation.
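
In the absence of such an option, the labels can be handled by hand after reading the file. What follows is only a rough sketch, not what -insheet- or -convert_top_lines- does internally. It assumes a comma-separated file (the name mydata.csv is hypothetical) with legal variable names in line 1 and labels in line 2, and it assumes Stata 9 or later so that the -names- option is available:

```stata
* Sketch under stated assumptions: line 1 = variable names, line 2 = labels.
* "mydata.csv" is a hypothetical file name.
insheet using "mydata.csv", comma names clear

* With -names-, the label line arrives as observation 1.
* Copy each variable's first value into its variable label.
foreach v of varlist _all {
    label variable `v' `"`=`v'[1]'"'
}

* Remove the label row, then convert columns that are really numeric.
drop in 1
destring _all, replace
```

Because the label row is read as data, columns that should be numeric will typically come in as strings; the final -destring- converts those that contain only numbers and leaves the rest alone. Note that Stata limits variable labels to 80 characters.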

I have never had the problem you mentioned -- having an illegal variable name. So the issue never occurred to me. But it is worth considering. If I ever get going on this path, I would want to know what -insheet- does in these situations -- to be consistent with it, if possible. Meanwhile, if any users have used -convert_top_lines- and had this particular problem, I'd like to know.

--David




