
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: insheet and dropping cases


From   "Ben Hoen" <[email protected]>
To   <[email protected]>
Subject   RE: st: insheet and dropping cases
Date   Thu, 20 Feb 2014 13:40:19 -0500

Thanks Sergiy!  LOL.  I hear you re big files.  I have a few million records
I am eventually going to try to read into Stata (hence the pre-planning).

I will see if I can find them as you suggest.

And thanks to David for the improved workaround using -inputst-.

I will try to report back what I find as this might be an issue for another
user sometime.

Best,

Ben

Ben Hoen
LBNL
Office: 845-758-1896
Cell: 718-812-7589


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Sergiy Radyakin
Sent: Thursday, February 20, 2014 1:28 PM
To: [email protected]
Subject: Re: st: insheet and dropping cases

Ben,

-- the problem is likely caused by the presence of unprintable characters
in the file that are tolerated by Stat/Transfer but not by Stata;

-- the character with code 255 (byte 0xFF) is a usual suspect;

-- pasting raw data to Statalist is unlikely to reveal the problem,
since the special characters might not survive passing through email;

-- isolating the problem in a text editor into a new file could help
(keep the last record read in correctly and the one immediately after);
then make the file available through a link to retain its binary
structure. Note that not all text editors retain special characters on save;

-- use -hexdump "file", analyze tabulate- to see unprintable
characters, then search for them in the file or use -filefilter-;

-- see "zap gremlins" for a relevant tactic.
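
For readers without Stata at hand, the checks in the list above can be roughly sketched in Python. This is only an illustration under stated assumptions, not what -hexdump- or -filefilter- do internally: the function names, the choice of which bytes count as "printable", and the sample data are all hypothetical. Note that a file with legitimate non-ASCII text (accented letters, UTF-8) would also be flagged by this simple rule.

```python
from collections import Counter

# Assumption: "ordinary" bytes are tab, LF, CR, and printable ASCII (32-126).
PRINTABLE = {9, 10, 13} | set(range(32, 127))


def tabulate_suspects(data: bytes) -> Counter:
    """Count every byte value that falls outside PRINTABLE."""
    return Counter(b for b in data if b not in PRINTABLE)


def first_suspect(data: bytes):
    """Return (line_number, byte_value) of the first suspect byte, or None."""
    line = 1
    for b in data:
        if b not in PRINTABLE:
            return line, b
        if b == 10:  # LF ends the current line
            line += 1
    return None


def isolate(data: bytes, first: int, last: int) -> bytes:
    """Return lines first..last (1-based, inclusive) with their bytes untouched."""
    lines = data.splitlines(keepends=True)
    return b"".join(lines[first - 1:last])


def zap(data: bytes) -> bytes:
    """Drop suspect bytes and keep all ordinary text unchanged."""
    return bytes(b for b in data if b in PRINTABLE)


# Hypothetical sample: the third line carries a byte with code 255.
sample = b"id,price\n1,100\n2,\xff200\n3,300\n"

print(tabulate_suspects(sample))  # counts per suspect byte value
print(first_suspect(sample))      # line number and code of the first suspect
print(isolate(sample, 2, 3))      # last good record plus the one after it
print(zap(sample))                # cleaned copy of the data
```

For a real file, read it in binary mode (for example `open("myfile.csv", "rb").read()`, where the filename is a placeholder) so that the special characters are not altered on the way in, and write any isolated or cleaned bytes back out in binary mode as well.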

On the bright side: you are lucky you have 363 cases. Last time I had
this problem, only 16 GB out of 40 GB were read in. Try opening that
file in Notepad :)

Hope this helps.

Best, Sergiy Radyakin


On Thu, Feb 20, 2014 at 12:34 PM, Radwin, David <[email protected]> wrote:
> One other possibility is to use -inputst-, a Stata program that calls
> Stat/Transfer (part of -stcmd- by Roger Newson and available at SSC).
>
> This workaround is probably less computationally efficient than the
> suggestions from others, but since you already know that Stat/Transfer
> works, this approach might be faster and easier than trying to figure out
> the problem with your text files and -insheet- or -import delimited-.
>
> David
> --
> David Radwin, Senior Research Associate
> Education and Workforce Development
> RTI International
> 2150 Shattuck Ave. Suite 800, Berkeley, CA 94704
> Phone: 510-665-8274
>
> www.rti.org/education
>
>
>> -----Original Message-----
>> From: [email protected] [mailto:owner-
>> [email protected]] On Behalf Of Phil Schumm
>> Sent: Thursday, February 20, 2014 6:38 AM
>> To: Statalist Statalist
>> Subject: Re: st: insheet and dropping cases
>>
>> On Feb 20, 2014, at 8:28 AM, Ben Hoen <[email protected]> wrote:
>> > Hexdump I had never used.  This is what it returned:
>>
>> <snip>
>>
>> > Do you see anything suspicious here?  (I replaced all the commas with
>> "_", using filefilter - another great suggestion -  wondering if that was
>> causing any issues and insheet still returned 184 observations.)
>>
>>
>> I don't see anything obvious -- you'll need to look at the file directly.
>> Is Stata reading the first 184 observations, or are the 184 observations
>> from different places in the file?  Check that first, and if you are
>> getting the first 184 observations, then look at lines 184-6 (depending
>> on whether the file has a header line).  Something has to be going on there.
>>
>>
>> -- Phil
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/

