
**From**: "Sergiy Radyakin" <[email protected]>

**To**: [email protected]

**Subject**: Re: st: RE: Use a few observations from a tab-delimited or csv file

**Date**: Wed, 20 Aug 2008 11:19:42 -0400

```
Hello Martin,
unless you have Excel 2007 (or newer :) ), the worksheet limit is 65,536 rows x 256 columns,
while in Todd's case the file is roughly 80,000 x 2,200, so Excel would have a
hard time.
Regards, Sergiy
On 8/20/08, Martin Weiss <[email protected]> wrote:
> Well, to create a dataset of summary statistics, see -h collapse-. If you had
> access to Stat/Transfer, that would help with the size of
> the file. Excel could probably take care of the conversion as well, but is
> usually frowned upon on the list...
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Todd D. Kendall
> Sent: Wednesday, August 20, 2008 4:41 PM
> To: [email protected]
> Subject: st: Use a few observations from a tab-delimited or csv file
>
> Dear Statlisters,
>
> I have a file that is currently in csv format (or I could easily
> convert it to tab-delimited). It is fairly large: roughly 80,000
> observations and 2,200 variables.
>
> In fact, it is too large to fit into Stata (I am running Stata 9.2 on
> a Windows XP machine with 1 GB of RAM). The maximum memory I can
> allocate to Stata is -set mem 636m-. When I try to simply insheet the
> file at this setting, I get only 16,276 observations read in -- not
> anywhere close to the whole file, so I don't think there are any easy
> tweaks to make this work.
>
> However, it turns out that, for roughly the last 2,000 variables, I
> really don't need every single variable; instead, I just need a few
> summary statistics calculated over these 2,000 variables (e.g., the
> mean or standard deviation). My idea is to write a simple do file
> that loads in, say, the first 15,000 observations, computes the mean
> and standard deviation of the 2,000 variables, then drops these
> variables and saves as a .dta file. I would then repeat on the next
> 15,000 observations, and so on. Then I could just append all the
> little files together, and I would assume I could fit this into Stata,
> as it would only have around 200 variables instead of 2,200.
>
> My problem is that insheet doesn't work with "in" -- i.e., I can't
> write -insheet filename.csv in 1/15000-. Alternatively, if I could
> convert the file from csv into a fixed format, I could write a
> dictionary and use infix, but my Google search for how to convert a
> csv file into a fixed-column file has come up pretty dry.
>
> Am I barking up the wrong tree completely here, or am I missing
> something obvious? I greatly appreciate any suggestions.
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
```
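One way around both the Stata memory limit and the Excel row limit that Todd and Sergiy describe is to shrink the file before Stata ever sees it. The following is only a minimal sketch in Python, not something proposed in the thread. It assumes the csv file has a header row, that the columns to be summarized are numeric and come last, and that the summaries wanted are row-wise (one mean and one standard deviation per observation). The names `reduce_rows`, `n_keep`, `row_mean`, and `row_sd` are hypothetical.

```python
import csv
import io
import statistics


def reduce_rows(src, dst, n_keep):
    """Stream a csv file row by row, keep the first n_keep columns, and
    append the row-wise mean and sample standard deviation of the rest."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader)  # assumption: the first line holds variable names
    writer.writerow(header[:n_keep] + ["row_mean", "row_sd"])
    for row in reader:
        # Assumption: empty cells are missing values and are skipped.
        values = [float(cell) for cell in row[n_keep:] if cell.strip() != ""]
        mean = statistics.fmean(values) if values else ""
        sd = statistics.stdev(values) if len(values) > 1 else ""
        writer.writerow(row[:n_keep] + [mean, sd])


# Tiny in-memory demonstration; a real run would pass open file handles.
sample = io.StringIO("id,x1,x2,x3\n1,2,4,6\n2,1,,3\n")
out = io.StringIO()
reduce_rows(sample, out, n_keep=1)
result = out.getvalue()
```

Because only one row is held in memory at a time, the size of the input file should not matter. For a real file, both handles would be opened with `open(path, newline="")`, and `n_keep` would be set to the number of leading variables to retain (around 200 in Todd's description). The reduced csv could then be read with -insheet- as usual.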

**Follow-Ups**: **RE: st: RE: Use a few observations from a tab-delimited or csv file** *From:* "Martin Weiss" <[email protected]>

**References**: **st: Use a few observations from a tab-delimited or csv file** *From:* "Todd D. Kendall" <[email protected]>

