Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: Alex Olssen <alex.olssen@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: lpoly and nonmissing fitted values where the dependent variable is missing
Date: Tue, 10 Aug 2010 08:33:05 +1000

Thanks Austin and Yulia for your helpful responses.

Sorry Austin, I was actually aware of your work and intended to mention it, but forgot to when I sat down to write the email. It is clear and very helpful.

Kind regards,
Alex

On 10 August 2010 02:09, Yulia Marchenko, StataCorp LP <ymarchenko@stata.com> wrote:
> Alex Olssen <alex.olssen@gmail.com> asks why -lpoly- produces smoothed values
> outside the range of <x>-values (the variable -length- below) as defined by an
> -if- statement:
>
>> I am doing a regression discontinuity analysis and want to understand how
>> -lpoly- is working. I use the -lpoly- options -gen- and -at- to create
>> fitted values for my local linear regression. Due to the nature of
>> regression discontinuity I look at two subgroups separately. Fitted values
>> are generated even for observations outside the subgroup. I want to
>> understand how it chooses where to fit them.
>>
>> For example,
>>
>> sysuse auto, clear
>> lpoly price length if length<190, ker(rec) deg(1) bwidth(12) gen(L) at(length)
>> sort length
>> br L length
>>
>> Cars with lengths up to 212cm long have fitted values. Does anyone know why?
>>
>> Note the if statement causes no problems. If I gen lengthlt190=length if
>> length<190 and then lpoly price lengthlt190 the results are identical.
>
> -lpoly- uses two notions of a sample: an estimation sample and a grid sample.
> An estimation sample defines the set of observations to be used in the local
> weighted linear regression fits. A grid sample defines the set of grid points
> at which the smooth will be evaluated. To link this to the documentation
> (-[R] lpoly-, pp. 939-940), the estimation sample defines the set of x_i's
> used to compute regression coefficients in formula (2) in the documentation
> and the grid sample defines the set of grid points x_0.
>
> An -if- condition affects only the estimation sample, not the grid sample.
> To restrict the range of grid points, Alex should create a new variable in the
> desired range and use it in the -at()- option. Continuing Alex's example, we
> can use the -lengthlt190- variable in the -at()- option to restrict the range
> of 'at' values to those less than 190:
>
> . sysuse auto, clear
> . gen lengthlt190=length if length<190
> . lpoly price length if length<190, ///
>       ker(rec) deg(1) bwidth(12) gen(L) at(lengthlt190)
>
>
> -- Yulia
> ymarchenko@stata.com
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
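
The estimation-sample versus grid-sample distinction described in the message above can be illustrated with a small language-neutral sketch. This is not -lpoly-'s actual implementation: the function name `local_linear`, the strict `|x - x0| < bwidth` window for the rectangle kernel, the rule of returning `None` when fewer than two distinct x values fall in the window, and the data are all assumptions made for illustration. The point it shows is that a grid point outside the estimation range still receives a fitted value whenever enough estimation observations lie within the bandwidth.

```python
def local_linear(x_est, y_est, grid, bwidth):
    """Rectangle-kernel local linear fit evaluated at each grid point.

    x_est, y_est: estimation sample (observations used in the local fits).
    grid:         grid sample (points at which the smooth is evaluated).
    Returns None where the window holds fewer than two distinct x values
    (an assumption of this sketch, not documented -lpoly- behaviour).
    """
    fitted = []
    for x0 in grid:
        # Rectangle kernel: equal weight inside the window, zero outside.
        window = [(x, y) for x, y in zip(x_est, y_est) if abs(x - x0) < bwidth]
        xs = [x for x, _ in window]
        if len(set(xs)) < 2:
            fitted.append(None)
            continue
        n = len(window)
        xbar = sum(xs) / n
        ybar = sum(y for _, y in window) / n
        sxx = sum((x - xbar) ** 2 for x in xs)
        sxy = sum((x - xbar) * (y - ybar) for x, y in window)
        slope = sxy / sxx
        # Evaluate the local line at the grid point x0.
        fitted.append(ybar + slope * (x0 - xbar))
    return fitted


# Hypothetical data lying exactly on y = 2x.
x_all = [170, 175, 180, 185, 188, 195, 200, 210]
y_all = [2 * x for x in x_all]

# The "if" condition restricts only the estimation sample ...
x_est = [x for x in x_all if x < 190]
y_est = [y for x, y in zip(x_all, y_all) if x < 190]

# ... while the grid still covers every x, like at(length).
fit_all = local_linear(x_est, y_est, x_all, bwidth=12)

# Restricting the grid as well, like at(lengthlt190).
fit_restricted = local_linear(x_est, y_est, x_est, bwidth=12)

for x, f in zip(x_all, fit_all):
    print(x, f)
```

In this sketch the grid point 195 lies outside the estimation sample but still gets a fitted value, because two estimation observations (185 and 188) fall within the bandwidth of 12; the grid points 200 and 210 have no such neighbours and stay missing. Passing the restricted grid instead yields fitted values only at points below 190, which mirrors the -at(lengthlt190)- suggestion.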

**Follow-Ups**: **Re: st: lpoly and nonmissing fitted values where the dependent variable is missing**, *From:* Alex Olssen <alex.olssen@gmail.com>

**References**: **Re: st: lpoly and nonmissing fitted values where the dependent variable is missing**, *From:* ymarchenko@stata.com (Yulia Marchenko, StataCorp LP)
