
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


RE: st: Re: st: Truncated sample or Heckman selection

From   Ebru Ozturk <>
To   <>
Subject   RE: st: Re: st: Truncated sample or Heckman selection
Date   Thu, 4 Oct 2012 22:10:34 +0300

Thank you for your response. I didn't know why my post had received no reply, which is why I sent it twice.
Basically, my data include both firms that produce innovations and firms that produce none at all. Firms with no innovation activity at all do not provide information on the 'X' variables I am interested in. I therefore restrict my sample to firms that produce at least one type of innovation (product or process), so that I have information on the 'X' variables. The restricted sample still contains firms that do not produce a given type of innovation: a firm may not produce product innovation but may produce process innovation, in which case it still answers the 'X' questions. That is why I asked whether I should use truncated regression or a Heckman sample selection model.
The dependent variable I am interested in is the percentage of total sales from innovation activities, which by definition lies between 0 and 100. So I guess I should use Tobit regression.

I hope it is clear this time.

Kind regards

> Date: Thu, 4 Oct 2012 18:21:04 +0100
> Subject: st: Re: st: Truncated sample or Heckman selection
> From:
> To:
> This was posted yesterday with no reply. As has already been pointed
> out today on the list, posts usually get no reply because they are
> unclear -- hence it is best to rewrite them.
> I may be misunderstanding this, but
> 1. If you have no data for x variables for some firms, I don't see
> that you can do anything at all with those firms, except type in
> strings of missing values for them. Why they don't innovate is not
> something on which you have evidence.
> 2. If a response is bounded by definition by 0 and 100, tobit (Tobit)
> is not a good method. The situation is not that observed values could
> be outside 0 and 100, but those are the minimum and maximum values
> observed in practice; rather those are the minimum and maximum values
> that could be observed. Also, a linear functional form seems less
> plausible than a nonlinear one. It is likely that something more akin
> to a logit model will give better results. See
>   SJ-8-2 st0147: Stata tip 63: Modeling proportions, C. F. Baum,
>   Q2/08, SJ 8(2):299--303 (no commands; tip on how to model a response
>   variable that appears as a proportion or fraction)
> for a concise review of various possibilities. It is accessible to
> all, regardless of SJ subscription, at
> On Thu, Oct 4, 2012 at 5:28 PM, Ebru Ozturk <> wrote:
> > I have a question: I cannot decide whether I should use truncated regression or Heckman sample selection.
> > In the dataset, only firms that produce some type of innovation (process or product) give information on the other 'x' variables. In other words, firms that do not produce any innovation do not answer the other questions, as these questions relate directly to firms' innovation activities. So the 'x' variables that I am interested in are missing only for firms that do not produce innovation. But I know the dependent (y) variable in both cases, whether or not firms produce innovation.
> >
> > I am planning to run a tobit regression, as the dependent variable is a percentage between 0 and 100, and a Heckman sample selection model to check for selection bias. But I cannot decide whether this is a truncated sample or a case for Heckman sample selection.
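
Point 2 of the quoted reply suggests that, for a response bounded by definition at 0 and 100, something more akin to a logit model is likely to work better than tobit. One model of that kind is a fractional logit: rescale the percentage to a proportion in [0, 1] and fit a logit mean function by Bernoulli quasi-likelihood. The sketch below is only an illustration of that idea and not necessarily the model the reply had in mind. It is written in plain Python rather than Stata, uses Newton-Raphson on a single covariate, and the data and the variable names (pct_sales, rd_intensity) are made up for the example.

```python
import math

def sigmoid(z):
    """Inverse logit: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_fractional_logit(x, y, max_iter=50, tol=1e-10):
    """Fit E[y|x] = sigmoid(b0 + b1*x) for y in [0, 1].

    Maximizes the Bernoulli quasi-log-likelihood
        sum(y*log(p) + (1-y)*log(1-p))
    by Newton-Raphson. Unlike ordinary logit, y may be any
    proportion, including exact 0s and 1s.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            # Score (gradient) contributions
            g0 += yi - p
            g1 += (yi - p) * xi
            # Negative Hessian contributions
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Solve the 2x2 Newton system by hand
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Hypothetical data: percentage of sales from innovation (0-100)
# and a made-up covariate. Not taken from the thread.
pct_sales = [0, 0, 5, 10, 20, 35, 50, 80, 100]
rd_intensity = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]

# Rescale the percentage to a proportion before fitting
share = [p / 100.0 for p in pct_sales]

b0, b1 = fit_fractional_logit(rd_intensity, share)

# Fitted proportions always lie strictly between 0 and 1,
# so predicted percentages can never fall outside 0-100.
fitted = [sigmoid(b0 + b1 * xi) for xi in rd_intensity]
fitted_pct = [100.0 * p for p in fitted]
```

The practical contrast with tobit is in the last lines: the logit mean function keeps every prediction inside the feasible range and is nonlinear in x, which matches the two objections raised in the reply. Standard errors are not computed here; for a quasi-likelihood fit like this they would normally need to be robust ones.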
