Re: st: random sampling matching the characteristics of the sample

From	Maarten Buis <[email protected]>
To	[email protected]
Subject	Re: st: random sampling matching the characteristics of the sample
Date	Tue, 1 May 2012 11:01:15 +0200

You are not going to gain anything, except that pleasing reviewers is
good for your career. The 10% dummy will mean that there is not as
much information in your data as you would hope, but no amount of
statistical trickery will create information that is not present in
your data... Conceptually, what the reviewer asked you to do seems to
correspond with propensity score matching, and there are tools in
Stata available for such an analysis, see: -findit propensity score-.
There are good reasons for using propensity score matching (and equally
good reasons for not doing so, it all depends on the exact nature of
your research question, your data, etc.) but a sparse dummy is not one
of them.
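
As a rough illustration of both points (an editorial sketch, not code
from this thread): the do-file below simulates data with the variable
names used in the question (y, x, age, size). The coefficients, sample
size, and seed are made-up assumptions. It first compares the standard
error of x in the full sample with the one obtained after randomly
dropping most x==0 observations, and then runs one possible propensity
score matching analysis with -teffects psmatch-, which assumes Stata 13
or later (-tebalance- assumes Stata 14 or later).

```stata
* Hypothetical simulated data; only the variable names come from the thread
clear
set seed 12345
set obs 2000
generate age  = rnormal(40, 10)
generate size = rnormal(50, 15)
* x depends on age and size and equals 1 for roughly 10% of observations
generate x = runiform() < invlogit(-2.3 - 0.03*(age - 40) - 0.02*(size - 50))
generate y = 1 + 0.5*x + 0.02*age + 0.01*size + rnormal()

* Regression on the full sample, as in the original model
regress y x age size
scalar se_full = _se[x]

* Keep a random 11% of the x==0 observations (all x==1 are kept),
* so that both groups are of similar size, and re-estimate
preserve
sample 11 if x == 0
regress y x age size
scalar se_sub = _se[x]
restore
display "SE of x, full sample: " se_full "   subsample: " se_sub

* Propensity score matching: effect of x on y among the x==1 group
teffects psmatch (y) (x age size, logit), atet

* Covariate balance between x==1 and x==0, raw and matched
tebalance summarize
```

One would expect the standard error of x to be noticeably larger in the
subsample regression, which is the loss of precision mentioned further
down the thread. Note that -teffects psmatch- matches with replacement
on the estimated propensity score; it does not draw a random subsample
with identical means.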

Hope this helps,
Maarten

On Tue, May 1, 2012 at 10:22 AM, Andrea Rispoli <[email protected]> wrote:
> Dear Stan,
> Thank you. This is the request of a reviewer. Would you recommend that
> I simply choose a random sample?
>
> On Tue, May 1, 2012 at 3:13 AM, Stas Kolenikov <[email protected]> wrote:
>> So why exactly do you want to do this? You will only lose in
>> precision, provided your model is OK; if it is badly misspecified,
>> then God only knows how your coefficients could jump around, so you
>> probably should not trust either specification, anyway.
>>
>> On Mon, Apr 30, 2012 at 6:44 PM, Andrea Rispoli <[email protected]> wrote:
>>> Dear Statalisters,
>>> I am running a regression model: y=f(x, age, size) where x is a dummy
>>> variable that can take value 1 or 0.
>>> Since in my sample x=1 for 10% of the sample and x=0 for 90% of the
>>> sample, I would like to identify a random subsample among the group
>>> x=0 so that it is more "comparable" in terms of size with the
>>> subsample for which x=1.
>>>
>>> My problem is that I would like the selected subsample (in which
>>> x=0) to match the characteristics of the first subsample (x=1) on the
>>> other dimensions (e.g., age and size).
>>> For instance, in the subsample x=1, mean of age = 37 and mean of size = 45.
>>> I would like to randomly select the second subsample (x=0) so that it
>>> also has mean of age = 37 and mean of size = 45, as is the case in the
>>> first subsample (x=1).
>>>
>>> Do you have any suggestions on how I could achieve such a result in Stata?
>>>
>>> Thank you very much in advance for all your help!!!
>>> Kind Regards
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/
>>
>>
>>
>> --
>> Stas Kolenikov, also found at http://stas.kolenikov.name
>> Small print: I use this email account for mailing lists only.
>>



-- 
--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany


http://www.maartenbuis.nl
--------------------------


Follow-Ups:
- Re: st: random sampling matching the characteristics of the sample
  - From: [email protected] (Brendan Halpin)

References:
- Re: st: random sampling matching the characteristics of the sample
  - From: Andrea Rispoli <[email protected]>
