Statalist



RE: st: When number of regressors greater than the number of clusters in OLS regression


From   Divya Balasubramaniam <divya@uga.edu>
To   statalist@hsphsun2.harvard.edu
Subject   RE: st: When number of regressors greater than the number of clusters in OLS regression
Date   Tue, 2 Sep 2008 08:32:00 -0400 (EDT)

Hello Dr. Cox,

Thanks a lot for pointing out the issue with the share variable. I will look into the reference.

Divya.

---- Original message ----
>Date: Tue, 2 Sep 2008 12:57:57 +0100
>From: "Nick Cox" <n.j.cox@durham.ac.uk>  
>Subject: RE: st: When number of regressors greater than the number of clusters in OLS regression  
>To: <statalist@hsphsun2.harvard.edu>
>
>In the back-and-forth with several penetrating comments from Mark
>Schaffer and Steve Samuels, one key question was raised by Steve but,
>as far as I can see, not really answered, and another key question was
>not raised at all.
>
>First off, at the risk of being obvious, the states for which data are
>available (your sampled population) seem most unlikely on the face of it
>to be an undistorted sample of the target population, presumably all India.
>My guess would be that various states with no data, say those in remote
>or mountainous areas or those that are politically or militarily sensitive, are also
>often states with low provision. (I'll bet Kashmir or Himachal Pradesh
>is not in the 17, for example.) As your research question seems likely
>to entail extra-statistical inference to all India, it would be vital to
>take account as far as you possibly can of the likely biases. For
>example, you could try to see where the 17 lie in the all-India
>frequency distributions for your predictors or for other
>standard-of-living measures or proxies. 
>
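As a rough illustration of the check suggested above, the sketch below computes where each sampled state falls in the distribution of an indicator over all states. The indicator values and the choice of which states count as sampled are made-up numbers for illustration only, and `percentile_rank` is a hypothetical helper, not anything from the thread.

```python
from bisect import bisect_right

def percentile_rank(value, population):
    """Percent of population values less than or equal to `value`."""
    ordered = sorted(population)
    return 100.0 * bisect_right(ordered, value) / len(ordered)

# Hypothetical standard-of-living indicator for 10 states (made-up numbers).
all_states = [35, 42, 48, 51, 55, 60, 63, 67, 72, 80]
# Hypothetical subset of states for which data are available.
sampled = [55, 63, 67, 72, 80]

# Position of each sampled state within the all-state distribution.
ranks = [percentile_rank(v, all_states) for v in sampled]
for value, rank in zip(sampled, ranks):
    print(f"indicator={value}  percentile rank={rank:.0f}")
```

If the sampled states cluster in one part of the distribution, as they do in this contrived example, that is a warning sign that inference to all states may be biased.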
>Second, share whether measured as proportion (0-1) or percent (0-100%)
>is bounded and that raises the question, often addressed on this list,
>of whether your modelling should pay direct attention to that. There is
>nothing in standard regression that guarantees predictions for such a
>response within feasible ranges, and worrying econometrics-style about
>how to handle the error term should surely take second place to thinking
>about the best handling of the response variable! At best this may not
>bite much in practice if values are near the middle of the range, 0.5 or
>50%, and vary little. However, a wild guess is that your likely range is
>much larger than that and that values near 0.1 or 0.9 may arise in some
>districts. The problem will be compounded if your project tempts you
>into making out-of-sample predictions for areas where share is expected
>to be low. 
>
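The point about bounded responses can be seen in a small sketch. It fits a straight line to a proportion and, as one possible alternative, fits the same line to the logit of the proportion and back-transforms. The data are made-up numbers, the predictor is hypothetical, and the logit transform is only one of several options (it is undefined when share is exactly 0 or 1).

```python
import math

def ols(x, y):
    """Intercept and slope of a simple least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

# Made-up data: a hypothetical predictor and a share measured as a proportion.
x = [1, 2, 3, 4, 5, 6]
share = [0.10, 0.18, 0.30, 0.45, 0.62, 0.75]

a_lin, b_lin = ols(x, share)                      # line fitted to share
a_log, b_log = ols(x, [logit(s) for s in share])  # line fitted to logit(share)

# Out-of-sample prediction for an area where share is expected to be low.
x_new = 0
linear_pred = a_lin + b_lin * x_new
logit_pred = inv_logit(a_log + b_log * x_new)

print("linear prediction:", round(linear_pred, 3))
print("back-transformed logit prediction:", round(logit_pred, 3))
```

With these numbers the straight-line prediction falls below 0, which is impossible for a share, while the back-transformed prediction necessarily stays between 0 and 1.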
>Kit Baum recently surveyed the leading options here in a concise and
>highly informative Stata Journal Tip: 
>
>SJ-8-2  st0147  . . . . . . . . . . Stata tip 63: Modeling proportions
>        . . . . . . . . . . . . . . . . . . . . . . . . . C. F. Baum
>        Q2/08   SJ 8(2):299--303                       (no commands)
>        tip on how to model a response variable that appears
>        as a proportion or fraction
>
>and, as said, there has been much discussion on the list on how to
>handle proportional responses.   
>
>Nick
>n.j.cox@durham.ac.uk 
>
>Divya Balasubramaniam
>
>Thank you all for your invaluable suggestions. I really appreciate it.
>
>
>*
>*   For searches and help try:
>*   http://www.stata.com/help.cgi?search
>*   http://www.stata.com/support/statalist/faq
>*   http://www.ats.ucla.edu/stat/stata/
=======================================
Divya Balasubramaniam
Economics PhD Student
Terry College of Business
University of Georgia
Athens -30602.


