st: Re: Stata appears to be eating some string IDs when saving a file


From   "Dimitriy V. Masterov" <dvmaster@gmail.com>
To   Statalist <statalist@hsphsun2.harvard.edu>
Subject   st: Re: Stata appears to be eating some string IDs when saving a file
Date   Tue, 2 Apr 2013 16:34:45 -0700

STS has confirmed that I am not a crazy person, at least not in this
instance. This is a real bug.

The problem is that Stata does not return an error when the file
system fills up. The developers are now aware of this and they would
like to have Stata detect this problem in the future and report the
error correctly. They also plan to add some more error checking to the
-use- command so that it catches files that have been corrupted.

For now, the best way to detect this type of issue is to use the
-datasignature- command to verify that the data set was not
modified or corrupted when it was saved.
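
A minimal sketch of such a check, reusing the file name from the original
post. It assumes the default line delimiter (the log below ran under
-#delimit ;-), and the local macro name sig_before is only illustrative.
The idea is to record the signature of the data in memory, save, reload,
and compare:

```stata
* Sketch only: compare the data signature before and after a save.
* sig_before is a hypothetical macro name chosen for this example.
datasignature
local sig_before "`r(datasignature)'"

save "pt_m2m_cat.dta", replace
use "pt_m2m_cat.dta", clear

* Recompute the signature on the reloaded file; -assert- stops with
* r(9) if it no longer matches the one recorded before saving.
datasignature
assert "`r(datasignature)'" == "`sig_before'"
```

If the assertion fails, the file on disk does not match what was in
memory, and the save should be repeated once the cause (for example, a
full file system) is fixed.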

DVM

On Sun, Mar 31, 2013 at 10:32 PM, Dimitriy V. Masterov
<dvmaster@gmail.com> wrote:
> I believe I diagnosed the issue. This seems to happen when I am
> running low on space in my home directory on the server. When I freed
> up some space, the problem went away. I wish there was some sort of
> warning to alert users that this is happening. This has been a very
> frustrating and terrifying experience.
>
> DVM
>
> On Sat, Mar 30, 2013 at 2:25 PM, Dimitriy V. Masterov
> <dvmaster@gmail.com> wrote:
>> I am having a strange problem with Stata deleting the values for about 80%
>> of my data when I save a file. It only does it for string variables,
>> and this only happens some of the time that I run this code.
>>
>> Here's the relevant part:
>>
>> . des ;
>>
>> Contains data
>>   obs:    10,766,127
>>  vars:             4
>>  size:   387,580,572
>> ------------------------------
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>               storage  display     value
>> variable name   type   format      label      variable label
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> slr_id          str10  %10s
>> byr_id          str10  %10s
>> item_id         str12  %12s
>> pt_m2m_cat      float  %21.0g      pt_m2m_cat
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> Sorted by:
>>      Note:  dataset has changed since last saved
>>
>> . assert !missing(slr_id) & !missing(byr_id) & !missing(item_id) &
>> !missing(pt_m2m_cat);
>>
>> . count;
>> 10766127
>>
>> . save "pt_m2m_cat.dta", replace;
>> file pt_m2m_cat.dta saved
>>
>> . use "pt_m2m_cat.dta", clear;
>>
>> . assert !missing(slr_id) & !missing(byr_id) & !missing(item_id) &
>> !missing(pt_m2m_cat);
>> 3407873 contradictions in 10766127 observations
>> assertion is false
>> r(9);
>>
>>
>> My Stata MP is 12.1 (March 20, 2013), on an Ubuntu box. Any ideas how
>> to diagnose this?
>>
>> DVM

