Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Joe Canner <jcanner1@jhmi.edu>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Cannot get insheet to work, data do not load properly
Date: Thu, 29 Aug 2013 17:35:40 +0000

Laura,

I second Nick's suggestion about corrupted files. Text files are particularly susceptible to corruption, depending on what you use to generate and/or edit them. I noticed some other things that are awry and might help focus your search:

1. You ran -insheet- several times in a row, and the next-to-last time it read a different number of observations. Did you make any changes to the file before or after that step? If not, that in itself is a cause for concern. If you did make changes, perhaps you accidentally made some other fatal changes.

2. Your variables have names v1, v2, etc. You should add the -names- option to -insheet- to fix this, although on my Stata (also 12.1) it figured out automatically that I had names in the first row. This suggests that the problem may be near the beginning of the file. Are the variable names separated by tabs as well?

Regards,
Joe Canner
Johns Hopkins University School of Medicine

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Laura Grant
Sent: Thursday, August 29, 2013 12:36 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Cannot get insheet to work, data do not load properly

Thanks Nick, I will try that!

On Thu, Aug 29, 2013 at 11:31 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Could be a corruption issue. Somewhere deep in the file there is
> complete garbage. I am guessing wildly because in that case I would
> expect Stata to read in some, then stop. But using -hexdump- to look
> at the file sometimes reveals a problem.
> Nick
> njcoxstata@gmail.com
>
>
> On 29 August 2013 17:22, Laura Grant <contributary@gmail.com> wrote:
>> Don't think it's a length or size issue -- I have Stata SE and as I
>> mentioned, it loads but with blank entries.
>>
>> The data look like this, tab delimited, 9 variables, about 1.7 million observations:
>> ACCOUNTNUMBER ConcatenatedAddress DEVICE BillType PREVIOUSREADDATE PREVIOUSREADING PRESENTREADDATE PRESENTREADING USE
>> 44444444 5555 N GENERAL AV 99999999 Res 9/11/07 0:00 1106 12/11/07 0:00 1131 25
>> 44444443 5553 N GENERAL AV 99999996 Res 12/11/07 0:00 1131 3/11/08 0:00 1158 27
>>
>> I can view them in Excel (but the length is too long) or in a text editor.
>> They look fine.
>> I can delete the top lines, save as different types, and the load
>> still looks like the screen capture.
>>
>> Would appreciate any help!
>>
>> On Thu, Aug 29, 2013 at 10:32 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>>> I don't know what the limits are for STATA, but Stata can take 2
>>> billion observations.
>>>
>>> Main answer is that Laura's screen capture isn't extra evidence.
>>> Something is not in order for the file. Perhaps you could show us
>>> the first few lines of the file, or get in touch with tech-support.
>>>
>>> Nick
>>> njcoxstata@gmail.com
>>>
>>>
>>> On 29 August 2013 16:22, O'Neill, Sinead <sinead.oneill@ucc.ie> wrote:
>>>> Maybe your dataset is too large for STATA.
>>>> SAS can handle extremely large data.
>>>>
>>>> Sinéad O Neill
>>>> PhD Scholar
>>>> NPEC, ANU Research Centre
>>>> Dept of Obstetrics & Gynaecology
>>>> 5th Floor CUMH
>>>> Wilton, Cork.
>>>> (+353-21-492-0656)
>>>> (+353-86-3586895)
>>>>
>>>> On 29 Aug 2013, at 16:21, "Laura Grant" <contributary@gmail.com> wrote:
>>>>
>>>>> I have a long dataset (1.7m observations) that I can view
>>>>> partially in excel and fully in text editors.
>>>>>
>>>>> However when I go to insheet it in Stata the data load as all
>>>>> blank entries except the first cell, which always loads as " ˇ˛X "
>>>>> where X is the first character of the first line of the data.
>>>>>
>>>>> See screen capture at
>>>>> goo.gl/zaesv7
>>>>>
>>>>> The number of variables and observations are correct but they are ALL MISSING.
>>>>>
>>>>> The code I am using, as seen in pic link above, are variations on
>>>>>
>>>>> insheet using "Res Usage 2008 to 2010.txt", names tab clear
>>>>>
>>>>> Thoughts? Thanks!

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
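
The checks suggested in this thread (Nick's -hexdump- and Joe's -names- option) could be run along the following lines. This is only a sketch: the filename is the one from Laura's original post, and the 64-byte window is an arbitrary choice for illustration. If the first two bytes turn out to be ff fe or fe ff, that would point to a UTF-16 file rather than plain text, but that is an assumption and nothing in the thread confirms it.

```stata
* Summarize the byte content of the file
* (line-end characters, control characters, extended characters)
hexdump "Res Usage 2008 to 2010.txt", analyze

* Dump only the first 64 bytes, where the variable names should be
* (64 is an arbitrary window chosen for illustration)
hexdump "Res Usage 2008 to 2010.txt", from(0) to(63)

* Re-read the file as tab-delimited with names in the first row,
* then check what was actually loaded
insheet using "Res Usage 2008 to 2010.txt", tab names clear
describe
count
```

Comparing the counts from -hexdump, analyze- with what a clean tab-delimited file of 9 variables should contain is one way to narrow down whether the trouble sits at the start of the file or deeper in.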