
From: Phil Schumm <pschumm@uchicago.edu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: SPSS to Stata issues
Date: Mon, 15 Aug 2005 15:15:10 -0500

At 01:52 PM 8/15/2005 -0400, Eric Uslaner wrote:

I have a huge data set (actually the General Social Survey early release for 2004 that includes the entire GSS). The GSS data set is in SPSS and has almost 5000 variables, most of which are only asked in a few years. I am working with regular (intercooled) Stata 9 and do not have SE. I have truncated the data set in SPSS but the data set still has the same number of variables, most of which will be entirely missing data. Can't use StatTransfer easily since this would require looking at each variable and dropping those all missing one by one. If I had Stata SE, I could use Nick Cox's dropmiss to get rid of variables with no valid cases. But my problem now is that I can't get the data into Stata at all (too many variables).

Something like the following should work:

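    * start with an empty dataset holding only the record-number key
    clear
    tempvar recno mergevar
    gen long `recno' = .
    tempfile mydata
    save "`mydata'"

    * get the names of all variables in foo.dta without loading the data
    qui des using foo, varlist
    loc varlist `r(varlist)'

    loc count 0
    qui foreach var of loc varlist {
        loc short_list `short_list' `var'
        loc ++count
        * process a batch once it holds 1000 variables or the list is exhausted
        if (`count' >= 1000) | ("`ferest()'" == "") {
            use `short_list' using foo, clear
            gen long `recno' = _n
            dropmiss
            merge `recno' using "`mydata'", sort _merge(`mergevar')
            drop `mergevar'
            save "`mydata'", replace
            * reset the counter and the local batch list for the next batch
            loc count 0
            loc short_list
        }
    }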

This will read variables from a file foo.dta (located in the working directory) 1000 at a time, and use -dropmiss- to drop those that are entirely missing. What you'll be left with is a file containing all of the original variables that have at least one non-missing value (assuming that such a file has fewer than 2,047 vars and fits in memory). If this doesn't do exactly what you need, you should be able to modify it so that it does.
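If -dropmiss- is not installed (-findit dropmiss- should locate it), the same idea can be sketched by hand. The following is only an illustration of the technique on a made-up toy dataset, not how -dropmiss- is actually implemented; the variable names x, y, and s are hypothetical.

```stata
* toy data: x has values, y is all missing, s is all empty strings
clear
set obs 5
gen x = _n
gen y = .
gen str1 s = ""

* drop every variable that is missing in all observations
foreach v of varlist _all {
    * assert succeeds (return code 0) only if `v' is missing everywhere
    capture assert missing(`v')
    if _rc == 0 {
        drop `v'
    }
}

* only x should remain
describe, short
```

Note that with zero observations the assertion holds trivially for every variable, so this loop would drop all of them; it should be run only on data that actually contain observations.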

-- Phil

P.S. A similar technique may be used to read in a dataset which will not fit within the available memory, but will after it is compressed (just replace the call to -dropmiss- with a call to -compress-).
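To see why -compress- can make the difference, here is a small sketch on made-up data (the variable names score and code are hypothetical). It stores small integers as doubles and a one-character string as str20, then lets -compress- demote both to the smallest storage types that hold the values without loss.

```stata
* toy data stored in wastefully large types
clear
set obs 1000
gen double score = mod(_n, 10)
gen str20 code = "a"

* storage types before: double and str20
describe

* demote each variable to the smallest type that holds its values exactly
compress

* storage types after: byte and str1
describe
```

In the batch loop, the same call would simply take the place of -dropmiss-, so that each batch of variables is shrunk before it is merged into the temporary file.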
