Statalist



Re: st: Efficiently run three loops (or do without them)


From   Ronnie Babigumira <rb.glists@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Efficiently run three loops (or do without them)
Date   Mon, 13 Apr 2009 23:45:22 +0700

Dear Eva and Nick
Thanks for the quick response. Sorry for the unclear email; I was trying to keep it short.

Since sending my first email, I have made some progress but I still welcome help from you or anyone else.

By way of brief background, I am running this on a number of data sets. The aim is to list the suspect cases in a way that lets us easily go back to the questionnaires and check whether each listed value is a valid entry. I would therefore like to be able to see the actual values of fup_pdt and fup_unit.

The full code I am working with is:

use qtr_b_fup, clear
levelsof qtr, local(qtrs)
foreach qtr of local qtrs {
	use qtr_b_fup, clear
	keep if qtr == `qtr'
	levelsof fup_pdt, local(fupdts)
	foreach i of local fupdts {
		levelsof fup_unit if fup_pdt == `i', local(fupunits)
		foreach j of local fupunits {
			preserve
			keep if fup_pdt == `i' & fup_unit == `j'
			olindicator fup_qtycoll
			di "Potential errors in the quantity collected for `i' & unit `j'"
			list houscode qtr fup_pdt fup_unit fup_qtycoll if fup_qtycoll_ol==1
			restore
		}
	}
}

Here -olindicator- is a small program I have written to help me identify the outliers.

The output looks like this:
----------------------------------------------------------------------------------------
Potential errors in the quantity collected for 1 & unit 34

   +---------------------------------------------------+
   | houscode | qtr | fup_pdt | fup_unit | fup_qtycoll |
   |----------+-----+---------+----------+-------------|
   |      138 |   1 |       1 |       34 |          50 |
   +---------------------------------------------------+
----------------------------------------------------------------------------------------

I think this can be improved, and that I should not have to keep reloading the data, so I welcome any help.
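
A minimal sketch of one way to avoid the reloads and the -preserve-/-restore- pairs: number the qtr/fup_pdt/fup_unit combinations once with -egen, group()- and run a single loop that only uses -if- conditions. The 1.5*IQR rule below is an assumption standing in for -olindicator-, whose definition is not shown in this thread, and -cell- and -suspect- are hypothetical variable names. The data are the ten example rows from the original post.

```stata
* Sketch only. The 1.5*IQR fences are an assumption standing in for
* -olindicator-; -cell- and -suspect- are hypothetical names.
clear
input qtr houscode fup_pdt fup_unit fup_qtycoll
1 562  23   2  50
1 570 628   2   2
1 573 628 201  10
1 573 628   2   2
1 576 628 201   5
1 576 628 201  20
1 577 628   2   1
1 578 628   2   1
1 590  34  26  60
1 595  34  26 200
end

* One identifier per qtr/fup_pdt/fup_unit combination
egen long cell = group(qtr fup_pdt fup_unit)
summarize cell, meanonly
local ncells = r(max)

generate byte suspect = 0
forvalues g = 1/`ncells' {
	* Quartiles for this cell only; nothing is dropped from memory
	quietly summarize fup_qtycoll if cell == `g', detail
	local lo = r(p25) - 1.5*(r(p75) - r(p25))
	local hi = r(p75) + 1.5*(r(p75) - r(p25))
	quietly replace suspect = 1 if cell == `g' & !missing(fup_qtycoll) ///
		& (fup_qtycoll < `lo' | fup_qtycoll > `hi')

	* Print a block only when the cell has suspect values
	quietly count if cell == `g' & suspect == 1
	if r(N) > 0 {
		display "Potential errors in the quantity collected, cell `g'"
		list houscode qtr fup_pdt fup_unit fup_qtycoll if cell == `g' & suspect == 1
	}
}
```

Because each listed row still carries qtr, fup_pdt and fup_unit, the actual values remain visible for checking against the questionnaires. The `///` line continuation requires running this from a do-file.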

Hope this makes things clearer.

Ronnie

Eva Poen wrote:
Ronnie,

this is difficult to answer without knowing what you mean by "run my
checks". There are some tools out there to detect outliers; use
-findit outliers- to see what's around.

Whether or not you need nested loops will depend on whether or not
your checks need to know the actual value of fup_pdt and fup_unit. The
-preserve- and -keep- thing will slow things down, and you may be able
to do without it (by just using if conditions in your code).
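
A sketch of what doing without the loops might look like, assuming the check only needs group-level statistics: compute quartiles within each qtr/fup_pdt/fup_unit group with -bysort- and -egen-, then flag and list in one pass. The 1.5*IQR rule is an assumption (the thread does not say which check is used), and q1, q3, iqr and ol_flag are hypothetical variable names. The data are the example rows from the original post.

```stata
* Sketch only: no loops, no -preserve-, no -keep-.
* The 1.5*IQR rule and the names q1, q3, iqr, ol_flag are assumptions.
clear
input qtr houscode fup_pdt fup_unit fup_qtycoll
1 562  23   2  50
1 570 628   2   2
1 573 628 201  10
1 573 628   2   2
1 576 628 201   5
1 576 628 201  20
1 577 628   2   1
1 578 628   2   1
1 590  34  26  60
1 595  34  26 200
end

* Quartiles of fup_qtycoll within each group
bysort qtr fup_pdt fup_unit: egen double q1 = pctile(fup_qtycoll), p(25)
bysort qtr fup_pdt fup_unit: egen double q3 = pctile(fup_qtycoll), p(75)
generate double iqr = q3 - q1

* 1 if the value lies outside the fences, 0 otherwise, missing if not recorded
generate byte ol_flag = (fup_qtycoll < q1 - 1.5*iqr | fup_qtycoll > q3 + 1.5*iqr) ///
	if !missing(fup_qtycoll)

* List the suspect rows, separated by group
sort qtr fup_pdt fup_unit houscode
list qtr houscode fup_pdt fup_unit fup_qtycoll if ol_flag == 1, ///
	sepby(qtr fup_pdt fup_unit) noobs
```

Whether this applies depends on the point made above: it only works if the checks can be expressed as by-group calculations rather than needing the values of fup_pdt and fup_unit inside a loop body.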

But, really, we need some more information to be able to help.

Eva


2009/4/13 Ronnie Babigumira <rb.glists@gmail.com>:
Dear list
I have quarterly data that looks like this

     qtr   houscode   fup_pdt   fup_unit   fup_qtycoll
       1        562        23          2            50
       1        570       628          2             2
       1        573       628        201            10
       1        573       628          2             2
       1        576       628        201             5
       1        576       628        201            20
       1        577       628          2             1
       1        578       628          2             1
       1        590        34         26            60
       1        595        34         26           200

For each quarter, I would like to identify "strange" values (outliers) in
the variable fup_qtycoll (simply to rule out data entry errors).

This would be done for each combination of fup_pdt and fup_unit.

My initial idea is that I would have to do it in three -foreach- loops,
something along these lines (this does not work since I need to -preserve-
before -keep-ing, which I would like to avoid, and it is not by quarter yet):

levelsof fup_pdt, local(fupdts)
foreach i of local fupdts {
               keep if fup_pdt == `i'
               levelsof fup_unit, local(fupunits)
               foreach j of local fupunits {
                       keep if fup_unit == `j'
                       *** run my checks and other stuff
               }
}

I would appreciate some help on how I can do this efficiently.

Ronnie

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



