


st: Re: st: Truncated sample or Heckman selection

From   Nick Cox
Subject   st: Re: st: Truncated sample or Heckman selection
Date   Thu, 4 Oct 2012 18:21:04 +0100

This was posted yesterday with no reply. As has already been pointed
out today on the list, posts usually get no reply because they are
unclear -- hence it is best to rewrite them.

I may be misunderstanding this, but

1. If you have no data for x variables for some firms, I don't see
that you can do anything at all with those firms, except type in
strings of missing values for them. Why they don't innovate is not
something on which you have evidence.

2. If a response is bounded by definition by 0 and 100, tobit (Tobit)
is not a good method. The situation is not that values could in
principle lie outside 0 and 100, with those merely being the minimum
and maximum observed in practice; rather, those are the minimum and
maximum values that could be observed at all. Also, a linear
functional form seems less plausible than a nonlinear one.  It is
likely that something more akin to a logit model will give better
results.  See

SJ-8-2  st0147  . . . . . . . . . . . . . . Stata tip 63: Modeling proportions
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C. F. Baum
        Q2/08   SJ 8(2):299--303                                 (no commands)
        tip on how to model a response variable that appears
        as a proportion or fraction

for a concise review of various possibilities. It is accessible to
all, regardless of SJ subscription, at

On Thu, Oct 4, 2012 at 5:28 PM, Ebru Ozturk wrote:

> I cannot decide whether I should use truncated regression or a Heckman sample selection model.
> In the dataset, firms that produce any type of innovation (process or product) give information about other 'x' variables. Firms that do not produce any innovation do not answer the other questions, as these questions are directly related to firms' innovation activities. So the 'x' variables that I am interested in are missing only for those firms that do not produce innovation. But I know the dependent (y) variable in both cases, whether or not firms produce innovation.
> I am planning to run a tobit regression, as the dependent variable is a percentage between 0 and 100, and a Heckman sample selection model to check for selection bias. But I cannot decide whether this is a truncated sample or a case for Heckman sample selection.
