
st: eqany


From   Babigumira Ronnie <rutaremwa_rb@yahoo.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: eqany
Date   Mon, 8 Jul 2002 07:43:13 -0700 (PDT)

Hi Listers
I am cleaning data (that's all I seem to be doing) and I am a little
puzzled here. A while ago, I asked how to identify illegal entries when
a variable takes on values in batches (e.g. 11 to 19, 21 to 25, etc.). Nick
Cox pointed me to

. egen OK = eqany(cropcod2), values(110/120 220/227 330/334 440/446)
. list houscode cropcod2 if !OK

This has worked very well; however, today I tried

. egen OK = eqany(inpcode), values(500/505 599 601 1100/1111 /*
. */ 1200/1201 2100/2160 2200/2220 2299 2300/2302)
. list houscode inpcode if !OK 

I get an error message:

. egen OK = eqany(inpcode), values(500/505 599 601 1100/1111 /*
> */ 1200/1201 2100/2160 2200/2220 2299 2300/2302)
varlist not allowed
r(101);

Is anyone familiar with this, and is there a way around it?
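
In the meantime, the same rule can be written out by hand without -egen-.
This is only a minimal sketch: it assumes a Stata version that provides
inrange() and inlist(), the toy data and the name OK2 are made up for
illustration, and it does not explain the error above.

```stata
* Hypothetical toy data standing in for the real file
clear
input houscode inpcode
1 500
2 506
3 599
4 2160
5 2161
6 2302
end

* Hand-written version of the legal-code check (OK2 is an illustrative name)
generate byte OK2 = inrange(inpcode, 500, 505) | inlist(inpcode, 599, 601, 2299) /*
    */ | inrange(inpcode, 1100, 1111) | inrange(inpcode, 1200, 1201) /*
    */ | inrange(inpcode, 2100, 2160) | inrange(inpcode, 2200, 2220) /*
    */ | inrange(inpcode, 2300, 2302)

* Observations whose code falls outside every legal range
list houscode inpcode if !OK2
```

Unlike an integer list such as 500/505, inrange() also accepts non-integer
values between the bounds, so a further check like inpcode == int(inpcode)
would be needed if fractional codes can occur.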

Roni

--- William Gould <wgould@stata.com> wrote:
> "Salah Mahmud" <salah@eircom.net>, following up on a thread, asked,
>
> > Is the "observation pointer" the only overhead as far as data storage is
> > concerned?
> 
> in response to my posting that,
> 
> > The size reported by -describe- is obtained by
> >
> >
> >            1,692,789  * ( 4   +    4  )   =    13,542,312
> >               /           |         \
> >           # of obs        |          \
> >                           |           \
> >                     width of data      plus 4
> >                    1 float = 4 bytes
> >
> 
> No, the 4 bytes is not all, but it is the important amount and the answer
> to Salah's question really depends on how you define overhead.
> 
> First off, what I said about the number reported by -describe- is exactly
> accurate:  that is what -describe- reports.  There is, however, more to a
> dataset than the variables and observations, such as variable names,
> variable labels, value labels, display formats, characteristics, etc.
> 
> When -describe- reports the "size" of the data, it ignores all of that, but
> obviously all those things appear in the .dta dataset, so that will tend to
> make the .dta dataset size larger than the number reported by -describe-,
> while the extra 4 bytes per observation, which only gets added when the
> data is copied to memory, makes the .dta dataset smaller.
> 
> Then there is overhead as I tend to think of it:  the memory cost of
> maintaining the memory image of the data and all of its features.  The 4
> bytes per observation is an example of this, and almost every feature of
> the data -- each value label, each variable label (but not each variable
> name) -- also has the overhead of pointers that track each piece of
> information.  This amounts to about 16 bytes per piece of information, and
> sometimes more.
> 
> This overhead, however, does not usually add up to much because the number
> of pieces of information being tracked is on the order of the number of
> variables in the dataset, rather than the number of observations.  It was,
> however, dealing with overhead like this that was the largest issue in
> producing Stata/SE, which could allow lots more variables.
> 
> Anyway, the dataset label and each value label, variable label, and
> characteristic adds 16 bytes to the memory image in addition to the
> contents of the information piece itself.  The date-and-time stamp adds 16
> bytes (plus the date-and-time stamp).
> 
> Really, the 4 bytes per observation is the important number.
> 
> -- Bill
> wgould@stata.com
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
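
The arithmetic in the quoted -describe- example can be checked directly in
Stata. This is just a sketch of that one calculation; the local macro names
(nobs, width, ptr) are made up here, and the figures are the ones given in
the quoted message.

```stata
* Figures from the quoted example: 1,692,789 observations of one float
local nobs  = 1692789
* bytes of data per observation (one float)
local width = 4
* extra bytes per observation mentioned in the quoted message
local ptr   = 4

* Size in bytes, shown with thousands separators
display %15.0fc `nobs' * (`width' + `ptr')
```

This displays 13,542,312, which matches the total in the quoted diagram.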




