Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: how to do subsampling in stata

From	Nick Cox <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: how to do subsampling in stata
Date	Fri, 16 Aug 2013 11:10:12 +0100

B wasn't well worded. Matching and subsampling are not equivalent or
parallel. The matching example is intended to show the kind of user
commitment that tends to change StataCorp's mind about what should be
supported officially.
Nick
[email protected]


On 16 August 2013 09:15, Nick Cox <[email protected]> wrote:
> All your points are valid to me, but
>
> A. At any users' meeting or Stata conference, people will say "Stata
> should support X, which is big in field Y, and that would be really
> popular and people would buy Stata just to use that!" Meanwhile, one
> is looking round the room and there are puzzled faces and people are
> muttering to their friends "What's that? Never heard of it." Mostly,
> everyone is right, but there is a long list of desires. (Often X is
> really big, or an entire approach.)
>
> B. A big difference with matching is the evident volume of real
> interest, shown as sustained activity over a period of years from the
> Stata user community: major user-written programs downloaded
> frequently, lots of papers and talks, numerous questions on Statalist.
> That is a level of commitment not matched by evident interest in
> subsampling. Whether everyone is looking in the wrong direction
> remains a good question.
>
> C. StataCorp is very cautious and slow to react on big statistical
> additions, arguably in the user community's best interests.
> Statistical science, like anything else, is full of five-year fads,
> things transiently popular but dropped abruptly when something else
> becomes hot, or people see that they have been oversold. StataCorp
> doesn't want to spend massive effort on implementing something that
> will be quickly superseded in users' affections. Academics tend to
> read papers and come to favourable views of something and come to
> think "This is great and should be implemented now", but StataCorp
> have a different time scale.
>
> Nick
> [email protected]
>
>
> On 16 August 2013 02:07, László Sándor <[email protected]> wrote:
>> Stas, I am not sure I'm with you on this one.
>>
>> 1. Subsampling looks much, much easier to implement than other novelties.
>> 2. Many if not most people use bootstrap not because they derived that
>> their estimator is smooth but exactly because they worry that
>> something is not exactly canonical in their problem or application,
>> but hey, they can just bootstrap it. My admittedly limited
>> understanding of the difference between the two methods suggests that
>> subsampling is the safer bet.
>> 3. The original (?) question on Statalist even mentioned that Abadie
>> and Imbens tried to warn people that matching is exactly a problem
>> where the bootstrap can be problematic, while they recommend
>> subsampling. With version 13, Stata became a matching powerhouse. Why
>> not support this simple thing, then?
>> http://www.stata.com/statalist/archive/2009-04/msg00920.html
>>
>> Best,
>>
>> Laszlo
>>
>> On Thu, Aug 15, 2013 at 7:13 PM, Stas Kolenikov <[email protected]> wrote:
>>> On Thu, Aug 15, 2013 at 12:12 PM, Phil Schumm <[email protected]> wrote:
>>>> On Aug 15, 2013, at 11:45 AM, László Sándor <[email protected]> wrote:
>>>>> Or of course, if StataCorp reading this is confident about how easy the transition from -bsample- to -sample- would be for a clone of -bootstrap-
>>>>
>>>> I'm not familiar with the literature on subsampling, so what I'm about to say may not entirely apply here.  However, it is worth noting that a lot of what StataCorp does is not simply implementing estimators and methods, but is making sure that the theory behind them is sound, and that the various things users might do once the method is implemented in Stata are reasonable.  Thus, even though it might be fairly simple for a user to patch an existing command to accommodate a specific situation (for which they are willing to take full responsibility), it might take StataCorp longer to verify for themselves that the enhancement is really something with which they feel comfortable.
>>>
>>> Of many other wonderful theoretical developments in statistics and
>>> econometrics, why not (a) empirical likelihood and exponential
>>> tilting? (b) block bootstrap for time series? (c) delete-k jackknife
>>> for complex survey data? (d) degrees of freedom corrections in mixed
>>> models? (e) tetrad analysis in latent variable models? and an endless
>>> wish list follows. Each of these is well established in its
>>> specific literature, but their use is required in a fairly limited
>>> range of situations. It took Stata Corp about 10 years from seeing the
>>> first user-written multiple imputation and generalized linear latent
>>> variable and mixed model packages (-ice/mim- and -gllamm-, of course)
>>> to the production versions of these (-mi-, -meglm- and -gsem-), and
>>> these have three orders of magnitude greater generalizability and
>>> potential user base than subsampling (which is really called for in
>>> weird situations with non-smooth estimators, so one needs to put a lot
>>> of work to even produce such an estimator) or empirical likelihood
>>> (which is asymptotically equivalent to the existing -gmm-, anyway).
>>>
>>> That's a long introduction to say that I would not expect to see Stata
>>> Corp working on this for the next three or so releases. If Laszlo's
>>> needs are more urgent, he should start working on his own
>>> implementation of subsampling. As I did with empirical likelihood :).
>>>
>>> -- Stas Kolenikov, PhD, PStat (ASA, SSC)
>>> -- Senior Survey Statistician, Abt SRBI
>>> -- Opinions stated in this email are mine only, and do not reflect the
>>> position of my employer
>>> -- http://stas.kolenikov.name
>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/
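
For readers following the -bsample- versus -sample- point raised in this
thread, here is a minimal sketch of the difference between the two
resampling schemes. It is written in Python rather than Stata so that it is
self-contained, and it is not anyone's official implementation. The function
names bootstrap_se and subsample_se, the simulated data, and the choice
m = 60 are illustrative assumptions. The sqrt(m/n) rescaling assumes a
root-n convergence rate, which may not hold for the non-smooth estimators
that motivate subsampling in the first place.

```python
import math
import random
import statistics


def bootstrap_se(data, stat, reps, rng):
    """Bootstrap: resample n of n WITH replacement (the analogue of -bsample-)."""
    n = len(data)
    draws = [stat(rng.choices(data, k=n)) for _ in range(reps)]
    return statistics.stdev(draws)


def subsample_se(data, stat, m, reps, rng):
    """Subsampling: draw m < n WITHOUT replacement (the analogue of -sample-).

    Assumption: the estimator converges at the root-n rate, so the spread
    of the m-sized estimates is rescaled by sqrt(m / n) to apply to the
    full-sample estimate.
    """
    n = len(data)
    if not 0 < m < n:
        raise ValueError("subsample size m must satisfy 0 < m < n")
    draws = [stat(rng.sample(data, m)) for _ in range(reps)]
    return statistics.stdev(draws) * math.sqrt(m / n)


rng = random.Random(12345)
data = [rng.gauss(0.0, 1.0) for _ in range(400)]  # simulated, illustrative only

se_boot = bootstrap_se(data, statistics.median, reps=500, rng=rng)
se_sub = subsample_se(data, statistics.median, m=60, reps=500, rng=rng)
print(f"bootstrap SE:   {se_boot:.4f}")
print(f"subsampling SE: {se_sub:.4f}")
```

The only structural change between the two functions is the sampling call
(with replacement at size n, versus without replacement at size m < n) plus
the rescaling step. Choosing m and the correct rate is the hard part and is
not addressed here.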
