
From: "Steven Stillman (LMPG)" <Steven.Stillman@lmpg.dol.govt.nz>
To: "'statalist'" <statalist@hsphsun2.harvard.edu>
Subject: st: collapsing efficiently
Date: Thu, 18 Dec 2003 19:23:32 +1300

I am collapsing individual/quarter data down to yearly population counts for a number of variables (around 20) for various groups (this part isn't important). This is a large dataset of about 20,000 obs per quarter * 64 quarters. My full dataset is around 330m. Ideally, I would do this using the command:

    collapse (sum) varlist [pw=weight], by(group year) fast

Unfortunately, even when I drop all variables from my dataset besides the ones being collapsed and allocate my full system memory of 500m, I get an error message that not enough memory is available. I believe this occurs because of collapse's internal use of doubles and its creation of new temp variables before deleting the old ones.

I have gotten around this using the following sequence of commands (forgive my use of for, old habits die hard; the for line is all one command):

    for var varlist: egen float temp = sum(X*weight), by(group year) \ qui replace X = temp \ qui drop temp
    bys year group: keep if _n==1

This does exactly what I need but is tediously slow. My use of egen means I am storing lots of unnecessary information (i.e., duplicate records) that I have no need for. I am pretty sure that the same idea can be applied by looping over groups, calculating the sum, and storing it in only one observation per group. I haven't been able to figure out exactly how to do this myself and am hoping someone else will quickly see the light here. One problem is that summarize cannot be used to calculate the sum because it doesn't take non-integer pweights.
thanks for any help,
Steve

Steven Stillman - Senior Research Economist
Labour Market Policy Group - Department of Labour
PO Box 3705 - Wellington, New Zealand
Email: steven.stillman@lmpg.dol.govt.nz
Web: http://www.thestillmans.org/econ.html
Tel: (64)4-915-4076 - Fax: (64)4-915-4040
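
A minimal sketch of the "one observation per group" idea raised in the message, offered as one possible approach rather than the method the thread settled on. It swaps egen for a running sum under by:, so no temporary variable is created and each variable is overwritten in place. The toy data and the names x1 and x2 are hypothetical stand-ins for the roughly 20 real variables; group, year, and weight follow the names used in the message.

```stata
* Toy data; x1 and x2 are hypothetical stand-ins for the real variables.
clear
input group year x1 x2 weight
1 1990 1 0 1.5
1 1990 0 1 2.5
1 1991 1 1 1.0
2 1990 1 0 3.0
2 1990 1 1 0.5
end

sort group year
foreach v of varlist x1 x2 {
    * sum() is a running sum; under by: it restarts in each group/year cell,
    * and it treats missing values as zero
    quietly by group year: replace `v' = sum(`v' * weight)
}

* The last observation in each cell now holds the weighted total for that cell.
by group year: keep if _n == _N
drop weight
list
```

Because the totals are written back into the existing variables, memory use should stay close to that of the original dataset, although replace may promote an integer variable to a floating-point type. If the variables are stored as float, the totals carry only about seven significant digits; recasting to double trades memory for precision.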

**Follow-Ups**:

- **Re: st: collapsing efficiently** *From:* Ulrich Kohler <kohler@wz-berlin.de>
- **Re: st: collapsing efficiently** *From:* Jenkins S P <stephenj@essex.ac.uk>
