



st: Need to change omitted category based on *estimated results*


From   "Matthew Mercurio" <matthewmercurio@fscgroup.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: Need to change omitted category based on *estimated results*
Date   Sun, 30 Sep 2012 23:52:47 -0700

I am running a series of regressions on patient-level data, where the cost of treatment for a particular hospital stay is modeled as a linear function of the number of diagnostic procedures performed, the length of stay, gender, and a categorical variable representing the identity of the attending physician.  There are more than 700 DRGs (Diagnosis-Related Groups) and thus more than 700 separate regressions, one per DRG:

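* Collect the distinct DRG codes (the data must already be in memory here)
levelsof DRG, local(levels)
foreach l of local levels {
  * Reload the full file each time, because -keep- below drops the other DRGs
  use "C:\Users\Documents\Analysis Data.dta", clear
  di as text "`l'"
  keep if DRG == "`l'"
  * xi omits the lowest-numbered physician by default
  xi: reg DIR_VAR_COST NUMBER_OF_DIAGS LOS i.SEX i.ATTDPHYSCODE
}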

The focus of the analysis is to look at the estimated coefficients on ATTDPHYSCODE.  I've read up on the syntax for changing the omitted category, including the newer syntax in -help fvvarlist-, but it doesn't seem to address my particular problem:
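
For reference, the two fixed-base forms I have found look roughly like the sketch below. The value 1234 is a made-up physician code used only for illustration, and I am assuming that SEX and ATTDPHYSCODE are numeric with nonnegative integer values, which the factor-variable form requires.

```stata
* Older xi approach: set the omitted level through a variable characteristic.
* 1234 is a hypothetical physician code, not one taken from my data.
char ATTDPHYSCODE[omit] 1234
xi: reg DIR_VAR_COST NUMBER_OF_DIAGS LOS i.SEX i.ATTDPHYSCODE

* Factor-variable approach (Stata 11 or later): name the base level with ib#.
reg DIR_VAR_COST NUMBER_OF_DIAGS LOS i.SEX ib1234.ATTDPHYSCODE
```

In both forms the base level has to be typed in ahead of time, whereas I need it to be chosen from the estimates themselves.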
 
I want the omitted category in each model to be the physician with the lowest *estimated coefficient*. In other words, once that physician is made the base category and the model is rerun, every other estimated coefficient on ATTDPHYSCODE will be nonnegative, since each is now measured relative to the lowest one.  Why (you might ask) would I care about this, since the distance between any two coefficients remains the same?  The problem is that I need the standard errors and p-values to be computed relative to the "lowest-cost" physician: ultimately the results of all 700+ regressions are being compiled for further analysis, and any physician whose difference from the "least-cost" physician is not statistically significant will be ignored.

When I only had 10 DRGs to work with, I just eyeballed the results with the first-numbered physician omitted (the Stata default), changed the omitted category to the physician with the smallest (most negative) coefficient, reran the model, and went on my way.  Now, with 700+ regressions to run, I cannot use this manual procedure.  I tried working from the e(b) matrix, but I don't know how to extract the index of the lowest value among the ATTDPHYSCODE coefficients and then feed it into the char ATTDPHYSCODE[omit] syntax within the loop above.  I also tried using the minindex() function in Mata, but that took me beyond my programming capabilities.  Obviously the model will need to be run twice within each pass of the loop: once to find out which coefficient is the smallest, and then again with that physician omitted.  If anyone can offer me a tip as to how I might code this, I would be grateful.

Kind regards,



Matthew G. Mercurio, Ph.D.
Applied Statistical Consultant

http://www.MGMAppliedConsulting.com



