
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Overriding a loop if 0 observations using tabstat


From   Jeph Herrin <[email protected]>
To   [email protected]
Subject   Re: st: Overriding a loop if 0 observations using tabstat
Date   Wed, 28 Apr 2010 10:45:44 -0400

Agree, this is not intuitive. My first run was with
50mb allocated:

	t=48.90; t=60.45; t=72.30.

but when I allocate 1gb (on an 8gb machine)

	t=79.66; t=105.07; t=121.07

?



Robert Picard wrote:
I don't understand. Under both scenarios (-set memory 1g- or -set
memory 10m-), the dataset size and everything else is the same. On my
computer with 12GB of RAM, a 1g allocation should not make a
difference and none of it should be paged out to virtual memory. In
fact, Stata does not even allocate to itself 1GB of real or virtual
memory (when I look at the Activity Monitor) unless I actually create
or load a dataset which requires 1GB of RAM.

The reason why I ask is that the lesson appears to be that, when
running Stata, you should always aim for the smallest memory
allocation possible for maximum efficiency, at the price of finding
out hours later, when you encounter an insufficient memory error, that
you should have used a larger -set memory-.

Robert

On Tue, Apr 27, 2010 at 3:42 PM, Martin Weiss <[email protected]> wrote:
<>

The additional 990m for the 1g allocation decreases the amount available for
computations, so this is what I would expect to happen.


HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Robert Picard
Sent: Tuesday, 27 April 2010 21:39
To: [email protected]
Subject: Re: st: Overriding a loop if 0 observations using tabstat

Do you guys see a difference if you try under a different memory
allocation? I'm running Stata/MP 11 (4 cores) on a Mac Pro 2.93GHz
Quad-Core with 12GB of RAM and get:

with 1g allocation: t=17.49; t=64.09; t=71.18
with 10m allocation: t=10.93; t=43.35; t=47.68

Just curious,

Robert

On Tue, Apr 27, 2010 at 2:59 PM, Jeph Herrin <[email protected]> wrote:
This is 64bit MP 2 on Windows 7 with 8G ram.
The processor is an AMD Phenom II with 3.20GHz clock speed.

cheers,
J


Martin Weiss wrote:
<>

Jeph, out of curiosity, what kind of equipment is it that throws up these
numbers? Mine is 64 bit MP 4 on Windows 7 with 4G Ram.


HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Jeph Herrin
Sent: Tuesday, 27 April 2010 20:27
To: [email protected]
Subject: Re: st: Overriding a loop if 0 observations using tabstat

t=48.90; t=60.45; t=72.30. :>


Martin Weiss wrote:
<>

t=100.28; t=207.58; t=241.55. :-)


HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Tuesday, 27 April 2010 19:08
To: [email protected]
Subject: RE: st: Overriding a loop if 0 observations using tabstat

Good question. I decided to do some timings to support -- or rebut -- my
feeling that -count- which just counts should be faster than -summarize,
meanonly- which does other stuff too and in turn than -summarize- which
does other stuff too. But although that's the order the timings are closer
than I guessed. Still, doing anything the quickest way does no harm and may
give valuable speed-up for large problems.

Here is one test script. Compare your experiences:

clear
set obs 100000
set seed 2803
gen y = runiform()
set rmsg on

qui forval i = 1/10000 {
       count if y > 0.5
}

qui forval i = 1/10000 {
       su y if y > 0.5, meanonly
}

qui forval i = 1/10000 {
       su y if y > 0.5
}

My timings were t=187.49; 254.49; 313.38, which no doubt shows up the
Mesolithic age of my machine.
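
A minimal sketch of the same comparison using -timer- instead of -set rmsg on-, so that the three timings are listed together at the end. The timer numbers 1 to 3 and the local macro -reps- (set to 1000 here only to keep the run short) are choices made for this illustration and are not part of the script above.

```stata
* Sketch only: same data setup as the test script above
clear
set obs 100000
set seed 2803
gen y = runiform()

* -reps- is an illustrative name; raise it to 10000 to match the original
local reps 1000

timer clear

* timer 1: -count-
timer on 1
qui forval i = 1/`reps' {
    count if y > 0.5
}
timer off 1

* timer 2: -summarize, meanonly-
timer on 2
qui forval i = 1/`reps' {
    su y if y > 0.5, meanonly
}
timer off 2

* timer 3: plain -summarize-
timer on 3
qui forval i = 1/`reps' {
    su y if y > 0.5
}
timer off 3

* show elapsed seconds for each timer; also left behind in r(t1), r(t2), r(t3)
timer list
```

Timings will differ by machine, Stata flavour, and memory setting, as the numbers reported in this thread already suggest.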
Nick
[email protected]

Martin Weiss

"As a small detail of efficiency, I would always recommend -count- rather
than -summarize- for the purpose here."

My earlier code did use -count-... What makes this thing more efficient,
though? Both are built-in, so they probably enjoy a big advantage over
everybody else anyway. So I guess the reason for your preference is the
fact that -count- calculates fewer results than -su, mean-?

Nick Cox

A secondary theme here is that this kind of code gets very difficult to
read, which makes it difficult to maintain and debug.

I note that the condition

intab1 == 1 & admit_ic == 1 & bwtg < .

is common to all the -summarize- and -tabstat- commands. That being so,
you could get that out of the way like this

preserve
keep if intab1 == 1 & admit_ic == 1 & bwtg < .
<stuff>
restore

Your -tabstat- options that are constant can be put in a little bag:

local opts stat(n mean median p25 p75 min max) col(stat) f(%9.0g) notot nosep

Now <stuff> can be rewritten

forv i = 0/5 {
       foreach y in male singlet {
               forv s = 0/1 {
                       di "myga==`i' & `y'==`s'"
                       qui su bwtg if myga==`i' & `y'
                       if r(N) != 0 {
                               tabstat bwtg if myga==`i', `opts' by(`y')
                      }
               }
       }
}

Now it is easier to see what is going on. I added some cosmetic changes
too, which this horrible mailer may well reverse.

One puzzle: Did you mean to add the condition "& `y'" to the -summarize-?
It means the same as

& `y' != 0

-- which may or may not be what you want.

As a small detail of efficiency, I would always recommend -count- rather
than -summarize- for the purpose here.
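
A minimal sketch of that -count- guard, using the auto dataset shipped with Stata as a stand-in because the poster's data are not available. The variables mpg, rep78 and foreign are therefore assumptions of this illustration and take the place of bwtg, myga and `y'.

```stata
* Sketch only: auto.dta stands in for the poster's data
sysuse auto, clear

* constant -tabstat- options collected once, as suggested above
local opts stat(n mean median p25 p75 min max) col(stat) f(%9.0g) notot nosep

forvalues i = 0/5 {
    * -count- only counts, and leaves the number of matching observations in r(N)
    quietly count if rep78 == `i' & mpg < .
    if r(N) != 0 {
        tabstat mpg if rep78 == `i', `opts' by(foreign)
    }
}
```

In the auto data no car has rep78 equal to 0, so the first pass through the loop is skipped silently instead of stopping with an error.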
Nick
[email protected]

sara khan

Many thanks Maarten for your advice. I managed to resolve it with the
following code:

forv i=0/5 {
    foreach y in male singlet {
        forv s=0/1 {
            di "myga==`i' & `y'==`s'"
            qui su bwtg if myga==`i' & intab1==1 & admit_ic==1 & bwtg<. & `y'
            if r(N)!=0 {
                tabstat bwtg if myga==`i' & intab1==1 & admit_ic==1 & bwtg<., ///
                    stat(n mean median p25 p75 min max) by(`y') col(stat) f(%9.0g) notot nosep
            }
        }
    }
}


On Tue, Apr 27, 2010 at 12:56 PM, Maarten Buis <[email protected]> wrote:
--- On Tue, 27/4/10, sara khan wrote:
I just tried this but the output only shows the display
results and nothing from tabstat.
<snip>

-capture- works for me:

*----------------- begin example ---------------------
sysuse auto, clear
forvalues i = 0/5 {
      capture noisily tabstat mpg if rep78== `i', ///
              s(n mean) by(foreign)
}
*-------------------- end example -------------------

In order to debug your loop I would build it step by step:
step 1: no looping, no locals, no -if- just a single -tabstat- command
step 2: add -capture noisily-
step 3: add some -if- conditions
step 4: build a single loop (e.g. over i but not over y)
etc. etc.
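
The steps above can be sketched with the auto dataset from the example. The particular variables and the condition rep78 == 0 are choices made for this illustration; they are picked so that one call has no observations to work with.

```stata
sysuse auto, clear

* step 1: a single -tabstat-, no loop, no locals, no -if-
tabstat mpg, s(n mean) by(foreign)

* step 2: add -capture noisily-
capture noisily tabstat mpg, s(n mean) by(foreign)

* step 3: add an -if- condition; rep78 == 0 selects no observations,
* so the error message is displayed but execution continues
capture noisily tabstat mpg if rep78 == 0, s(n mean) by(foreign)
display "return code: " _rc

* step 4: a single loop over i, not yet over y
forvalues i = 0/5 {
    capture noisily tabstat mpg if rep78 == `i', s(n mean) by(foreign)
}
```

Checking _rc straight after each -capture- shows which call failed before the next layer of the loop is added.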
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



