
Re: st: Increasing variance of dependent variable, logit, inter-rater agreement

From	Steven Samuels <[email protected]>
To	[email protected]
Subject	Re: st: Increasing variance of dependent variable, logit, inter-rater agreement
Date	Sat, 28 Feb 2009 19:45:56 -0500

--

Anupit,

All this detail is welcome and clear. I don't really know how to model all of this simultaneously, or even if there would be any benefit in doing so. I hope that others will read your description and chime in.

Some thoughts: I've read the abstracts of the Feinstein-Cicchetti articles, and I think that your original idea of predicting positive agreement from a regression model is good. Be sure to use a flexible model for age. I think that you need a model with more variability than logistic assumes. Consider -hetprob-, which fits a heteroskedastic probit model. If you have vehicles that were retested over time, also consider longitudinal data methods (-xt- prefix). If the remote-sensing device was not recalibrated between individual observations, you probably also have non-independent errors for observations taken on the same device at the same time. If you used different remote-sensing devices to retest the same vehicle at the same age, then you can add random- or fixed-device terms to a predictive model. If you know about environmental conditions that would have affected errors in the remote sensing, be sure to add those as predictors. With so many observations, you can afford to divide your data, develop your model on one piece, and test it on the other.
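To make the "more variability than logistic assumes" point concrete: the model behind -hetprob- takes Pr(y = 1) = Phi(xb / exp(zg)), so the standard deviation of the latent error, exp(zg), is allowed to change with covariates such as age. The Python sketch below only evaluates that probability; it does not fit anything. The function names and all coefficient values are made up for illustration and are not estimates from these data.

```python
import math


def std_normal_cdf(t):
    """Standard normal CDF, Phi(t), via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def hetprobit_prob(xb, zg):
    """Heteroskedastic probit: Pr(y = 1) = Phi(xb / exp(zg)).

    xb is the linear index for the mean; zg is the linear index for
    the log standard deviation of the latent error. With zg = 0 this
    is the ordinary probit.
    """
    return std_normal_cdf(xb / math.exp(zg))


# Hypothetical coefficients, chosen only to show the mechanics.
B0, B_AGE = 1.5, -0.05   # mean equation: xb = B0 + B_AGE * age
G_AGE = 0.10             # variance equation: zg = G_AGE * age

for age in (3, 6, 9):
    xb = B0 + B_AGE * age
    p_probit = hetprobit_prob(xb, 0.0)            # constant variance
    p_hetprobit = hetprobit_prob(xb, G_AGE * age)  # variance grows with age
    print(f"age {age}: probit {p_probit:.3f}, hetprobit {p_hetprobit:.3f}")
```

With a positive variance coefficient, the same linear index is divided by a larger standard deviation at older ages, so the predicted probability is pulled toward 0.5. That is how the model represents noisier outcomes for older vehicles.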


Best wishes,

Steve

On Feb 27, 2009, at 8:28 PM, Supnithadnaporn, Anupit wrote:


Dear Steven,
I appreciate your reply to my post. I am sorry if my explanation is too long.
Thank you,
Anupit
* Please give more detail about what is being assessed. Is there a gold
standard, measured or latent, for what these technologies are trying
to agree upon?
The subjects of my study are in-use vehicles. In some areas of the US, there is a regulation that requires a vehicle to be tested for its emissions. In the past, the test instrument measured the actual tailpipe emission. The test is typically performed at a commercial inspection station. If the amount of emission exceeds the threshold standard, the vehicle fails. The owner of a failing vehicle has to repair it until it meets the standard; otherwise he/she cannot renew the vehicle registration.
However, this tailpipe-test technology has been replaced by a new one called the OBD II test. This test no longer measures the tailpipe emission. Instead, it gives a fail result if there are error codes relating to the emission control part of the vehicle.
Despite the different technologies measuring different things, they share the common goal of the regulation, which is to identify the high-polluting vehicles.
* What is the first technology that measures characteristics and
arrives at a pass-fail?  How does it make this decision? Was age one
of these characteristics?
So, the first technology is the OBD II test, which detects the error codes and yields the pass-fail result, which is at the *nominal level*. Having certain error codes means that the vehicle is likely to emit pollution beyond the standards. As the vehicle becomes older, it is likely to pollute more. Moreover, the OBD II, which is the computer unit of the vehicle, is likely to malfunction. If the OBD II malfunctions, it can give either a false-pass or a false-fail result.
* How was the cut point y2b arrived at?
Fortunately, the regulator has also set up several unobtrusive monitoring stations on the road. This technology uses a remote-sensing device (RSD) to measure the actual tailpipe emissions of the many vehicles passing by. This is the second technology in my analysis. It measures the actual tailpipe emission, which is at the *interval level*. The threshold is based on the EPA regulation set for the particular vehicle make, model year, and weight - *the cut point of y2b*.
* You say that the variability of y2a increases with age.  Is the
level of y2a related to age?
Correct. As a vehicle gets older, its emission level is likely to be high due to deterioration. Moreover, its emission can vary greatly from one measurement (by RSD) to another. This is what I am trying to take into account in my analysis.


My data are a pooled cross-section time series covering 4 years.
My unit of analysis is a matched pair: a vehicle tested by OBD II and
measured by RSD on the road in the same year-testing cycle.
My hypothesis is that the OBD-RSD agreement is greater for the older vehicle
fleets. My sample size is ~80,000 observations.
Of the total, 72% are classified as 'agree'.
Of the 28% in the 'disagree' group, around 90% are Fail-RSD, Pass-OBD.
During the early analysis, I split the vehicles into different age groups from 3 to 9 years. I obtained Kappa for each group and compared them. However, I ran into a problem with Kappa when the prevalence (the disagree cases for each age group) is small.

Cicchetti DV, Feinstein AR. High agreement but low Kappa: II Resolving
the paradoxes. J Clin Epidemiol 1990; 43:551-8
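
The Kappa problem described above can be reproduced with a small 2x2 table. The Python sketch below computes overall agreement, Cohen's Kappa, and the positive and negative agreement indices proposed in the Cicchetti-Feinstein paper. The cell counts are hypothetical, chosen only to show the pattern, and the function name is an assumption of this sketch.

```python
def agreement_stats(a, b, c, d):
    """Agreement measures for a 2x2 table of OBD II vs. RSD results.

    a = both fail, b = OBD fail / RSD pass,
    c = OBD pass / RSD fail, d = both pass.
    Assumes chance agreement is below 1 and that a and d are not both
    zero together with b and c; otherwise a division by zero occurs.
    """
    n = a + b + c + d
    p_obs = (a + d) / n
    # Chance agreement from the marginal fail and pass proportions.
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    kappa = (p_obs - p_exp) / (1 - p_exp)
    p_pos = 2 * a / (2 * a + b + c)  # agreement on 'fail'
    p_neg = 2 * d / (2 * d + b + c)  # agreement on 'pass'
    return {"p_obs": p_obs, "kappa": kappa, "p_pos": p_pos, "p_neg": p_neg}


# Hypothetical counts: few failures, so the table is very unbalanced.
stats = agreement_stats(a=2, b=4, c=4, d=90)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, overall agreement is 0.92 but Kappa is only about 0.29, because chance agreement is already about 0.89 when almost every vehicle passes. Reporting agreement on 'fail' and on 'pass' separately shows where the two tests actually disagree.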

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



