Statalist The Stata Listserver



st: programming assist, too many unique values for levels


From   "Andrew O'Connor DO" <[email protected]>
To   <[email protected]>
Subject   st: programming assist, too many unique values for levels
Date   Tue, 01 May 2007 14:54:47 -0400

I'm hoping someone can offer some help; I've been working on this for
some time now.
I'm running Stata 8.2 SE and have a large dataset (>90,000 rows of data
with about 12,000 unique record numbers, multiple observations for the
same individual).
I'm trying to calculate a "time out of range" for each patient (i.e. the
proportion of each patient's observation time that is predicted to be
greater than 140, assuming linear interpolation between actually
measured blood pressures, not simply the proportion of blood pressure
readings that are greater than my threshold).  I have 3 variables: mrn
(medical record number), visit_date, bp_systolic
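
To make the definition concrete, here is a small worked example with
made-up numbers (readings of 130 on day 0 and 150 on day 10; the values
are illustrative only and do not come from the dataset):

```stata
* slope between the two readings, in mmHg per day: 2
display (150 - 130)/(10 - 0)
* days from the first visit until the interpolated line reaches 140: 5
display (140 - 130)/2
* share of the 10-day interval spent at or above 140: .5
display (10 - 5)/10
```

So half of this interval would count as out of range, even though only
one of the two readings is above the threshold.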

I've run into a problem due to the size of my data set, specifically
that I have too many levels.  Here is my code:
encode mrn, gen(pt)
sort pt visit_date
drop if bp_systolic==.
by pt: gen obstime = visit_date[_n+1] - visit_date
by pt: gen sys_diff = bp_systolic[_n+1] - bp_systolic
by pt: gen slope = sys_diff/obstime
by pt: gen predict = (140 - bp_systolic)/slope ///
    if bp_systolic<140 & bp_systolic[_n+1]>=140
by pt: gen date140 = visit_date + predict
by pt: gen predict2 = floor((140 - bp_systolic)/slope) ///
    if bp_systolic>=140 & bp_systolic[_n+1]<140
by pt: gen date140down = visit_date[_n-1] - predict2
by pt: gen out_range = obstime if bp_systolic>=140 & bp_systolic[_n+1]>=140
by pt: replace out_range = visit_date[_n+1] - date140 ///
    if bp_systolic<140 & bp_systolic[_n+1]>=140
by pt: replace out_range = obstime - predict2 ///
    if bp_systolic>=140 & bp_systolic[_n+1]<140
gen time=.

levels pt, local(levels)
quietly foreach l of local levels {
    sum obstime if pt==`l'
    local total=r(sum)
    replace time=`total' if pt==`l'
}
gen time_out=.
quietly foreach l of local levels {
    sum out_range if pt==`l'
    local total=r(sum)
    replace time_out=`total' if pt==`l'
}
gen time_o_r=(time_out/time)
local threshold = 140
gen proportion=.
levels pt, local(levels)
quietly foreach l of local levels {
    count if pt == `l' & bp_systolic !=.
    local total=r(N)
    count if bp_systolic >= `threshold' & bp_systolic !=. & pt == `l'
    replace proportion = r(N)/`total' if pt == `l'
}
Any suggestions for using a different set of programming statements???
Thanks,
AO
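
A minimal sketch of one loop-free alternative, assuming the variables
created above (pt, obstime, out_range, bp_systolic) already exist and
that rows with missing bp_systolic have been dropped. It uses by-groups
and the running-sum function sum() instead of levels, so the list of
unique values is never stored in a macro. The names time2, time_out2,
n_high, time_o_r2 and proportion2 are made up for illustration, and the
sketch has not been run on the data described.

```stata
sort pt visit_date

* running sums within patient; sum() treats missing values as zero
by pt: gen double time2     = sum(obstime)
by pt: gen double time_out2 = sum(out_range)
by pt: gen double n_high    = sum(bp_systolic >= 140 & bp_systolic < .)

* the last running sum in each group is that patient's total
by pt: replace time2     = time2[_N]
by pt: replace time_out2 = time_out2[_N]

* proportion of observation time out of range
gen double time_o_r2 = time_out2/time2

* proportion of readings at or above the threshold (_N = rows per patient)
by pt: gen double proportion2 = n_high[_N]/_N
```

Patients with a single visit have a total observation time of zero, so
time_o_r2 comes out missing for them.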


