



Re: st: Machine spec for 70GB data


From   William Buchanan <william@williambuchanan.net>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Machine spec for 70GB data
Date   Sat, 22 Oct 2011 05:41:32 -0700

Gindo,

Contrary to the earlier responses to your request, the set memory command is unnecessary in Stata 12. If your dataset is 70GB, you would need at least that much RAM, plus the RAM your computer itself needs to run.
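
As a rough illustration of the arithmetic behind that figure, here is a minimal sketch. The observation count, the bytes per observation, and the file name are all made-up placeholders, and the second command assumes the data are already saved as a .dta file, which the thread does not confirm.

```stata
* Back-of-the-envelope RAM estimate: observations x bytes per observation.
* Both figures below are hypothetical placeholders, not the actual data.
local nobs  = 5000000     // assumed number of observations
local width = 14000       // assumed bytes per observation (sum of storage widths)
display "Approx. size in memory (GB): " %6.1f (`nobs' * `width' / 1024^3)

* If the data are already saved as a .dta file (an assumption; the file
* name is made up), -describe using- reports the number of observations
* and variables without loading the data into memory.
describe using "hospital_stays.dta", short
```

The estimate covers only the data held in memory; the operating system and Stata's own working space come on top of it.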

- Billy


On Oct 22, 2011, at 4:52, Yuval Arbel <yuval.arbel@gmail.com> wrote:

> Gindo,
> 
> Are you sure the data file is 70GB? I'm using the Windows operating system
> and I recently managed to run a file of 1.29 GB that includes more than
> 4 million observations. Here are a few lines from the do-file. Just
> make sure to use the "set memory" command:
> 
> . do "D:\kingston\public_housing\public_housing_full_20110630.do"
> 
> . clear
> 
> . clear matrix
> 
> . set memory 12500m
> (12800000k)
> 
> . insheet using "D:\kingston\public_housing\survivalindexed27-Jun-2011.csv"
> (56 vars, 4086490 obs)
> 
> . sort time_index
> 
> . stset time_index, id(appt) failure(fail==1)
> 
>                id:  appt
>     failure event:  fail == 1
> obs. time interval:  (time_index[_n-1], time_index]
> exit on or before:  failure
> 
> ------------------------------------------------------------------------------
>  4086490  total obs.
>    32731  obs. end on or before enter()
> ------------------------------------------------------------------------------
>  4053759  obs. remaining, representing
>    49650  subjects
>     8582  failures in single failure-per-subject data
>  5084887  total analysis time at risk, at risk from t =         0
>                             earliest observed entry t =         0
>                                  last observed exit t =       114
> 
> 
> 
> On Sat, Oct 22, 2011 at 1:00 PM, Gindo Tampubolon
> <Gindo.Tampubolon@manchester.ac.uk> wrote:
>> Dear all,
>> 
>> I need to process a large data file [70GB; a few million obs] with Stata 12 MP8, mainly to fit crossed random effects for individuals and hospitals, where the outcome is length of stay [controlling for no more than a handful of covariates to begin with]. As an approximation, the outcome is treated as continuous, i.e. a linear mixed model.
>> 
>> What kind of machine spec would be needed? Any ideas, information, experience? Would the operating system make any difference? I'm open to considering Windows, Linux, or OS X.
>> 
>> Many thanks,
>> Gindo
>> University of Manchester
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>> 
> 
> 
> 
> -- 
> Dr. Yuval Arbel
> School of Business
> Carmel Academic Center
> 4 Shaar Palmer Street, Haifa, Israel
> e-mail: yuval.arbel@gmail.com
> 


