Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down at the end of May, and its replacement, **statalist.org**, is already up and running.


From: Maarten Buis <maartenlbuis@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: Independent variable censoring

Date: Mon, 4 Mar 2013 10:25:44 +0100

> On Sun, Mar 3, 2013 at 9:48 PM, Nikos Kakouros wrote:
>> This may be a silly question (I should prefix all my stata posts with
>> that) but what is the best way to handle a heavily left-censored
>> predictor (independent) variable in a linear regression model?

On Sun, Mar 3, 2013 at 10:26 PM, Maarten Buis wrote:
> The reason for that is that the distribution of the independent
> variables is even less relevant than the distribution of the dependent
> variable. What you need to take care of is whether the effect of your
> independent variable is linear. It may make sense to add your variable
> linearly together with an indicator variable for whether that variable
> was censored or not. In essence this adds a "jump" in your regression
> line at the point of censoring.

Here is an example of how I would do that. Notice that this is not a
censored independent variable, but an independent variable that for
other reasons has one "special" value we want to take into account.
This reinforces the point that there is nothing special about a
censored independent variable.

*------------------ begin example ------------------
sysuse nlsw88, clear

gen byte black = race == 2 if race < 3
label variable black "race"
label define black 0 "white" ///
                   1 "black"
label value black black

// 40 hours per week might be a tiny bit special:
spikeplot hours

gen byte fulltime = hours == 40 if hours < .

glm wage black union grade hours fulltime , ///
    link(log) vce(robust) eform

// there is an upwards jump in wage of about 5% at 40 hours a week

// here is how I would graph the results:
preserve
keep if e(sample)
bys hours : keep if _n == 1
replace black = 0
replace union = 0
replace grade = 12
predict wagehat, mu
replace fulltime = 0
predict empty if hours == 40, mu

twoway line wagehat hours if hours != 40,          ///
           lcolor(black) lpattern(solid)        || ///
       rspike empty wagehat hours if hours == 40 , ///
           lcolor(black) lpattern(solid)        || ///
       scatter wagehat hours if hours == 40,       ///
           msymbol(O) mcolor(black)             || ///
       scatter empty hours if hours == 40,         ///
           msymbol(O) mfcolor(white)               ///
           mlcolor(black) legend(off)
restore
*------------------- end example -------------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )

Hope this helps,
Maarten

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
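
Editorial note on the "jump": the glm call in the example uses a log link, so the model it fits can be written out roughly as below. This is only a sketch of what the command implies; the coefficient symbols \beta_0 through \beta_5 are notation assumed here for illustration and do not appear in the original post.

```latex
% Mean model implied by: glm wage black union grade hours fulltime, link(log)
\[
E[\text{wage} \mid x] = \exp\bigl(\beta_0 + \beta_1\,\text{black} + \beta_2\,\text{union}
    + \beta_3\,\text{grade} + \beta_4\,\text{hours} + \beta_5\,\text{fulltime}\bigr)
\]

% Ratio of expected wages at hours = 40 with the indicator on versus off,
% holding the other covariates fixed:
\[
\frac{E[\text{wage} \mid \text{hours}=40,\ \text{fulltime}=1]}
     {E[\text{wage} \mid \text{hours}=40,\ \text{fulltime}=0]} = \exp(\beta_5)
\]
```

With the eform option the reported coefficient for fulltime is \exp(\beta_5), so the "jump of about 5%" mentioned in the example corresponds to a ratio of roughly 1.05. In the graph, the filled marker at 40 hours is the prediction with fulltime = 1 (wagehat) and the hollow marker is the prediction with fulltime = 0 (empty).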

**References**:

- **st: Independent variable censoring**, *From:* Nikos Kakouros <nkakouros@gmail.com>
- **Re: st: Independent variable censoring**, *From:* Maarten Buis <maartenlbuis@gmail.com>
