
Re: st: Machine spec for 70GB data


From   Yuval Arbel <yuval.arbel@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Machine spec for 70GB data
Date   Sat, 22 Oct 2011 13:52:28 +0200

Gindo,

Are you sure the data file is 70GB? I'm running Windows, and I recently
managed to process a 1.29 GB file with more than 4 million observations.
Here are the first few lines of the do-file output. Just make sure to
use the "set memory" command:

. do "D:\kingston\public_housing\public_housing_full_20110630.do"

. clear

. clear matrix

. set memory 12500m
(12800000k)

. insheet using "D:\kingston\public_housing\survivalindexed27-Jun-2011.csv"
(56 vars, 4086490 obs)

. sort time_index

. stset time_index, id(appt) failure(fail==1)

                id:  appt
     failure event:  fail == 1
obs. time interval:  (time_index[_n-1], time_index]
 exit on or before:  failure

------------------------------------------------------------------------------
  4086490  total obs.
    32731  obs. end on or before enter()
------------------------------------------------------------------------------
  4053759  obs. remaining, representing
    49650  subjects
     8582  failures in single failure-per-subject data
  5084887  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =       114
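
If it helps, here is a minimal sketch for checking how much memory the data occupy once loaded. It assumes the data from the insheet step above are already in memory. Stata holds the whole dataset in RAM, so the size reported here, rather than the size of the file on disk, is what the machine has to accommodate.

```stata
* Sketch only: run in a do-file after the data have been loaded.
describe, short      // reports obs, vars, and the size of the data in memory
compress             // stores each variable in the smallest type that holds its values
describe, short      // reports the size again, after compress
```

Comparing the two sizes shows how much compress saves, which gives a rough basis for scaling up to a larger file. As far as I know, Stata 12 allocates memory automatically, so "set memory" should not be needed there.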



On Sat, Oct 22, 2011 at 1:00 PM, Gindo Tampubolon
<Gindo.Tampubolon@manchester.ac.uk> wrote:
> Dear all,
>
> I need to process a large data file [70GB; a few million obs] with Stata 12 MP8, mainly to fit crossed random effects (individuals and hospitals), where the outcome is length of stay [controlling for no more than a handful of covariates to begin with]. As an approximation, the outcome is treated as continuous, i.e., with linear mixed models.
>
> What kind of machine spec would be needed? Any ideas, information, experience? Would the operating system make any difference? I'm open to considering Windows, Linux, or OS X.
>
> Many thanks,
> Gindo
> University of Manchester
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



-- 
Dr. Yuval Arbel
School of Business
Carmel Academic Center
4 Shaar Palmer Street, Haifa, Israel
e-mail: yuval.arbel@gmail.com

