Statalist The Stata Listserver



Re: st: Re: How to merge individual records to groups in a large dataset, w/o using collapse


From   Allon Crazy <allon_crazy@yahoo.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Re: How to merge individual records to groups in a large dataset, w/o using collapse
Date   Mon, 26 Feb 2007 16:08:16 -0800 (PST)

Hi Michael,

Thanks for your advice. Separating the dataset into
smaller ones might be better, but I did try that, and the
pieces are still too big for my 1 GB of RAM when using "collapse".

An example of what I want to accomplish is:

collapse (mean) var1 var2 (median) var3 (sum) var4 [fw=weight5], by(region6)

I am wondering whether there are any alternative ways
to do the same task that require less RAM.
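
A minimal sketch of one lower-memory route for the same statistics, built from by-group running sums instead of collapse. It is an illustration under stated assumptions, not a tested replacement: bigfile.dta is a hypothetical file name, weight5 is assumed to hold positive integer frequency weights, and none of the variables is assumed to contain missing values. Each pass loads only the variables it needs, which is where any memory saving would come from.

```stata
* Pass 1: weighted means (var1, var2) and weighted sum (var4).
* bigfile.dta is a hypothetical file name.
use var1 var2 var4 weight5 region6 using bigfile, clear
recast double var1 var2 var4            // avoid precision loss in running sums
sort region6
by region6: gen double totw = sum(weight5)
by region6: replace var1 = sum(weight5 * var1)
by region6: replace var2 = sum(weight5 * var2)
by region6: replace var4 = sum(weight5 * var4)
by region6: keep if _n == _N            // last row per region holds the totals
replace var1 = var1 / totw              // weighted mean = sum(w*x) / sum(w)
replace var2 = var2 / totw
keep region6 var1 var2 var4
tempfile means
save `means'

* Pass 2: weighted median of var3.
use var3 weight5 region6 using bigfile, clear
recast double var3
sort region6 var3
by region6: gen double cumw = sum(weight5)
* Flag the first observation whose cumulative weight exceeds half the total.
by region6: gen byte pick = cumw > cumw[_N]/2 & (_n == 1 | cumw[_n-1] <= cumw[_N]/2)
* If the previous cumulative weight is exactly half, average the two values,
* as would happen for the median of the fully expanded data.
by region6: replace var3 = (var3 + var3[_n-1]) / 2 ///
    if pick & _n > 1 & cumw[_n-1] == cumw[_N]/2
keep if pick
keep region6 var3

* Combine the two sets of results (Stata 11+ syntax; older versions
* would use -merge region6 using- on sorted data instead).
merge 1:1 region6 using `means', nogenerate
order region6 var1 var2 var3 var4
```

Whether this fits in 1 GB depends on the storage types involved. If variables do contain missing values, the weights of those observations would have to be excluded from the matching totals.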

many thanks

--- Michael Blasnik <michael.blasnik@verizon.net> wrote:

> If you are looking at -collapse-, then -merge- is the wrong term for
> Stata users (in Stata, merge means joining tables).
>
> You would have better luck getting answers if you showed us a sample
> command and better described the end result you want. Depending on
> which group summary statistics you want, you may be better off using
> just pieces of the data set and collapsing each one within a loop.
> With very large datasets, I find it usually makes more sense to work
> from first principles in Stata and avoid commands like -collapse- and
> most -egen- commands as well. You can usually accomplish what you
> want more efficiently, with less memory overhead and greater speed,
> by doing it this way.
> 
> Michael Blasnik
> 
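
The "pieces of the data set ... within a loop" idea quoted above might look like the following sketch. It is only an illustration: bigfile.dta and region_stats.dta are hypothetical file names, and region6 is assumed to be numeric. It reads one region at a time, and only the variables the command needs, so that -collapse- never sees the full 20 million observations at once.

```stata
* Collect the distinct region codes while holding only one variable in memory.
use region6 using bigfile, clear
levelsof region6, local(regions)

tempfile results
local first 1
foreach r of local regions {
    * Load just the needed variables for a single region.
    use var1 var2 var3 var4 weight5 region6 if region6 == `r' using bigfile, clear
    collapse (mean) var1 var2 (median) var3 (sum) var4 [fw=weight5], by(region6)
    * Stack this region's one-row result onto the results so far.
    if !`first' {
        append using `results'
    }
    save `results', replace
    local first 0
}
sort region6
save region_stats, replace
```

The trade-off is that the file is scanned once per region, so this exchanges run time for memory. If a single region is still too large, the statistics would have to be built from running sums rather than by calling -collapse- on each piece.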
> ----- Original Message ----- 
> From: "Allon Crazy" <allon_crazy@yahoo.com>
> To: <statalist@hsphsun2.harvard.edu>
> Sent: Monday, February 26, 2007 6:04 PM
> Subject: st: How to merge individual records to groups in a large dataset, w/o using collapse
> 
> 
> > I am wondering how to merge individual records into groups for an
> > extremely large dataset (20 million observations) without using
> > collapse. I tried collapse, but my computer does not have enough
> > memory for it because the dataset is too large. I tried egen, but
> > egen does not take sampling weights into consideration. I am
> > wondering whether there is another way or other options.
> >
> > I would greatly appreciate any help.
> >
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 



 


