
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Machine spec for 70GB data


From   William Buchanan <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: Machine spec for 70GB data
Date   Sat, 22 Oct 2011 05:41:32 -0700

Gindo,

Contrary to the earlier replies to your request, the set memory command is unnecessary in Stata 12, which allocates memory automatically. If your dataset is 70GB, you would need at least that much RAM, in addition to the RAM your operating system and other programs need to run.
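
As a rough illustration of the point about memory, the sketch below shows one way to check what a dataset will need before loading it, and to keep the footprint down once it is loaded. The file name and variable names (hospital_stays.dta, los, patient_id, hospital_id, age) are hypothetical and not taken from this thread. It also assumes the data are already in Stata format, and that only a handful of variables are needed for the model.

```stata
* Sketch only: the file name and variable names are hypothetical.

* Report the number of observations and variables, and the size,
* of the file on disk without loading it into memory.
describe using "hospital_stays.dta", short

* Stata 12 manages memory automatically; these commands only report
* the current memory settings and usage.
query memory
memory

* Load just the variables the model needs rather than the whole file.
use los patient_id hospital_id age using "hospital_stays.dta", clear

* Store each variable in the smallest type that holds its values
* without loss of information.
compress
```

Loading a subset of variables and then running compress can bring the in-memory size well below the size of the file on disk. How much it helps depends on the data.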

- Billy


On Oct 22, 2011, at 4:52, Yuval Arbel <[email protected]> wrote:

> Gindo,
> 
> Are you sure the data file is 70GB? I'm using a Windows operating system
> and I recently succeeded in running a 1.29 GB file that includes more than
> 4 million observations. Here are a few lines from the do-file log. Just
> make sure to use the "set memory" command:
> 
> . do "D:\kingston\public_housing\public_housing_full_20110630.do"
> 
> . clear
> 
> . clear matrix
> 
> . set memory 12500m
> (12800000k)
> 
> . insheet using "D:\kingston\public_housing\survivalindexed27-Jun-2011.csv"
> (56 vars, 4086490 obs)
> 
> . sort time_index
> 
> . stset time_index, id(appt) failure(fail==1)
> 
>                id:  appt
>     failure event:  fail == 1
> obs. time interval:  (time_index[_n-1], time_index]
> exit on or before:  failure
> 
> ------------------------------------------------------------------------------
>  4086490  total obs.
>    32731  obs. end on or before enter()
> ------------------------------------------------------------------------------
>  4053759  obs. remaining, representing
>    49650  subjects
>     8582  failures in single failure-per-subject data
>  5084887  total analysis time at risk, at risk from t =         0
>                             earliest observed entry t =         0
>                                  last observed exit t =       114
> 
> 
> 
> On Sat, Oct 22, 2011 at 1:00 PM, Gindo Tampubolon
> <[email protected]> wrote:
>> Dear all,
>> 
>> I need to process a large data file [70GB; a few million obs] with Stata 12 MP8, mainly to fit crossed random effects for individuals and hospitals, where the outcome is length of stay [controlling for no more than a handful of covariates to begin with]. As an approximation, the outcome is treated as continuous, i.e. using linear mixed models.
>> 
>> What kind of machine spec would be needed? Any ideas, information, experience? Would the operating system make any difference? I'm open to considering Windows, Linux, or OS X.
>> 
>> Many thanks,
>> Gindo
>> University of Manchester
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>> 
> 
> 
> 
> -- 
> Dr. Yuval Arbel
> School of Business
> Carmel Academic Center
> 4 Shaar Palmer Street, Haifa, Israel
> e-mail: [email protected]
> 


