Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: GSAMPLE R3300

From	Duru <[email protected]>
To	[email protected]
Subject	Re: st: GSAMPLE R3300
Date	Fri, 7 Sep 2012 08:21:54 +0200

Dear Sam,
Thanks for the translations anyway. I just wanted to let you know that
there are social scientists on the list who do not have extensive
programming skills or advanced statistical training, but who are
still curious to understand things that are not relevant to their
immediate research.

I respect and rather enjoy Nick Cox's style; I know by now that it is
not mood but attitude. It is still good to hear what the most
embarrassing and honest reply to your question would be, but if
somebody else has the patience to explain further, I personally always
want to hear it. Even if it is not exactly correct, it gives me a
broader view.

Best regards,

Duru


On Fri, Sep 7, 2012 at 1:55 AM, Nick Cox <[email protected]> wrote:
> What I said is easily accessible to anyone who remains curious.
>
> I still think it was a fair comment, but see no point in repeating or
> rewriting what I said. Sam's welcome to his own interpretations and
> speculations.
>
> My amusement was entirely at Sam's wry summary of what I said, not at
> all at his own positive contributions. I am sorry that I evidently did
> not make that clear.
>
> Nick
>
> On Fri, Sep 7, 2012 at 12:20 AM, Lucas <[email protected]> wrote:
>> A fuller summary of Nick's response to the gsample query is:
>>
>> 1) If the poster is making an illogical request, the program should not
>> try to puzzle it out.
>> 2) So, give the program weights it can use.
>>
>> I agree with 1, but 2 offered no guidance in what characteristics such
>> weights might need to have, so I took a stab, and the tone I read in
>> the email was generally dismissive ("Are you saying that you expected
>> -gsample- . . .to cope with an illogical request?" . . . "If a user is
>> asking something crazy" . . . "Alternatively, you can always write
>> your own program that does what you want it to do.")
>>
>> I just assumed you, Nick, were in a bad mood, because immediately
>> nearby was your one-word response to another poster's question.  Your
>> answer was "No."  A later poster provided a bit more assistance.
>>
>> We all fall into bad moods occasionally.  When we do, it'd smooth
>> social interaction if we don't claim amusement when others step
>> forward to try to help others.
>>
>> Respectfully
>> Sam
>>
>> On Thu, Sep 6, 2012 at 3:23 PM, Nick Cox <[email protected]> wrote:
>>> I think Sam's last paragraph refers to my posting. For the record, I
>>> consider his summary amusing, but not to represent what I said or even
>>> meant. My main point was to underline that the program -gsample- was
>>> behaving defensibly and that the poster's surprise was thus misplaced.
>>> I did also suggest that they needed to recalculate the weights.
>>> Naturally any other contributions to the thread that explain
>>> specifically and correctly what the poster should do instead are more
>>> valuable than that one post.
>>>
>>> Nick
>>>
>>> On Thu, Sep 6, 2012 at 11:04 PM, Lucas <[email protected]> wrote:
>>>> I've never used gsample, but I just assumed after you removed the C
>>>> units you could adjust the remaining cases so that their weights sum
>>>> to 1.  Sorry I didn't say that.  Not sure this new information alters
>>>> Steve's comment.
>>>>
>>>> My understanding of Stas's comment was that one left the certainty
>>>> units in and let gsample select them while gsample also selected other
>>>> cases, too.  My approach was to remove the certainty units and use
>>>> gsample to select the remainder.  As I don't know what follows the
>>>> sample selection, nor do I know gsample, I can't tell whether
>>>> something is gained by letting gsample select the certainty units.
>>>>
>>>> At any rate, I took the ridiculous step of responding to a question
>>>> about a command I have never used because I thought the poster
>>>> deserved something more useful than an admonition not to be surprised
>>>> when a command fails to do something it cannot do.
>>>>
>>>> Sam
>>>>
>>>> On Thu, Sep 6, 2012 at 2:17 PM, Steve Samuels <[email protected]> wrote:
>>>>>
>>>>>
>>>>> Sam:
>>>>>
>>>>> You've missed the point of Stas's post: After removing the initial certainty
>>>>> units, the scaled size measures (probabilities) of the remaining units must be
>>>>> adjusted upward so that they sum to 1. Now additional units might violate the
>>>>> inequality quoted in the -gsample- error message. The process is repeated
>>>>> until the inequality is not violated for any of the remaining units.
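
The repeated adjustment described above can be sketched as follows. This is a minimal illustration in Python, not -gsample- code: the function name `split_certainty_units` and the example size measures are assumptions made here. It works with inclusion probabilities (remaining sample size times the rescaled size measure), which is equivalent to rescaling the single-draw probabilities to sum to 1.

```python
def split_certainty_units(sizes, n):
    """Iteratively separate certainty units from the remaining units.

    sizes: positive size measures, one per unit (hypothetical input).
    n:     total sample size, at most len(sizes).
    Returns (certainty, probs): a sorted list of indices taken with
    probability 1, and a dict mapping each remaining index to its
    inclusion probability for the remaining n - len(certainty) draws.
    """
    if n > len(sizes):
        raise ValueError("sample size exceeds population size")
    certainty = set()
    while True:
        remaining = [i for i in range(len(sizes)) if i not in certainty]
        m = n - len(certainty)                      # draws still to allocate
        total = sum(sizes[i] for i in remaining)    # rescaling denominator
        probs = {i: m * sizes[i] / total for i in remaining}
        new = [i for i, p in probs.items() if p >= 1]
        if not new:                                 # inequality holds for all
            return sorted(certainty), probs
        certainty.update(new)                       # take them with certainty


# Hypothetical example: unit 1 starts at 0.9 but becomes a certainty unit
# once unit 0 is removed and the remaining probabilities are pulled up.
certainty, probs = split_certainty_units([50, 30, 10, 5, 5], n=3)
print(certainty, probs)
```

In this example unit 0 is flagged in the first pass (3 * 50 / 100 = 1.5), and unit 1 only in the second pass (2 * 30 / 50 = 1.2), which is why a single removal step is not enough.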
>>>>>
>>>>> Some alternatives to -gsample- and -samplepps- (SSC):
>>>>> Sampford's method can be found in the SAS SURVEYSELECT procedure. SAS's default
>>>>> PPS method is the Hanurav-Vijayan method (Vijayan, 1968); see also Fox (1989)
>>>>> and Golmant (1990). Tillé's elimination method can be found in the R "sampling"
>>>>> package as the -UPtille- function.
>>>>>
>>>>> Tillé (2006) is the definitive text these days.
>>>>> See also the -help- for Ben Jann's -mf_mm_sample- for more information
>>>>> (-gsample- is a wrapper for this).
>>>>>
>>>>> References:
>>>>>
>>>>> Fox, D. R. (1989), "Computer Selection of Size-Biased Samples," The American
>>>>> Statistician, 43(3), 168–171.
>>>>>
>>>>> Golmant, J. (1990), "Correction: Computer Selection of Size-Biased Samples," The
>>>>> American Statistician, 44(2), 194.
>>>>>
>>>>> Tillé, Y. (2006), Sampling Algorithms, New York: Springer.
>>>>>
>>>>> Vijayan, K. (1968), "An Exact Sampling Scheme: Generalization of a Method of
>>>>> Hanurav," Journal of the Royal Statistical Society, Series B, 30, 556–566.
>>>>>
>>>>>
>>>>> Steve
>>>>>
>>>>>
>>>>> On Sep 6, 2012, at 1:04 PM, Lucas wrote:
>>>>>
>>>>> Why not simply remove the certainty units (C Units), draw the sample
>>>>> from the remainder units (R Units) to obtain the sampled units (S
>>>>> Units), then add the certainty and sampled sets (C & S) together to
>>>>> form the final sample (FS units)?
>>>>>
>>>>> Sam
>>>>>
>>>>> On Thu, Sep 6, 2012 at 8:44 AM, Stas Kolenikov <[email protected]> wrote:
>>>>> On Thu, Sep 6, 2012 at 9:28 AM, Lieke Boonen (SiRM)
>>>>> <[email protected]> wrote:
>>>>> We are trying to draw a sample from our population without replacement. We have several subgroups with a high sampling weight. However, the gsample command gives an error because for these cases w_i*n/sum(w) is larger than 1. We thought the program looked at the relation between the weights, so that this would not be a problem. Does anyone recognize this problem, and is there a solution?
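
To make the condition in the question concrete, here is a small worked check in Python. It is only a sketch: the weights and sample size are made-up numbers, and `inclusion_probabilities` is a hypothetical helper, not part of -gsample-.

```python
def inclusion_probabilities(weights, n):
    """Return n * w_i / sum(w) for every unit."""
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical size measures for five units and a sample of n = 3.
weights = [50, 20, 15, 10, 5]
n = 3

pi = inclusion_probabilities(weights, n)
# Units whose implied inclusion probability exceeds 1 cannot be handled
# by a without-replacement PPS draw; these are the "certainty units".
violators = [i for i, p in enumerate(pi) if p > 1]
print(pi, violators)
```

Here the first unit gets 3 * 50 / 100 = 1.5, so no without-replacement design can give every unit a selection probability proportional to its weight with these inputs.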
>>>>>
>>>>> As far as I can recall, -gsample- does a decent job of selecting one
>>>>> observation from the list, provided, as you found the hard way, that
>>>>> you don't have any certainty units. However, it is not appropriate for
>>>>> many real-world sampling problems, which usually require more
>>>>> complicated code. You also need to be aware that PPSWOR is a very
>>>>> non-trivial and counter-intuitive task. See
>>>>> http://www.citeulike.org/user/ctacmo/tag/unequal_prob_sampling for the
>>>>> appropriate references. All in all, you probably need to do this:
>>>>>
>>>>> 1. Identify the certainty units, set their probability of selection to 1.
>>>>> 2. Adjust the probability distribution, pulling up the probabilities
>>>>> for other units.
>>>>> 3. Check again for certainty units: repeat steps 1-2 until the
>>>>> probabilities of selection on a single draw have converged.
>>>>> 4. Implement your PPS procedure: systematic sampling is the poor man's,
>>>>> old-days shortcut for sampling from a physical list on sheet(s) of
>>>>> paper, and it leads to technical difficulties in variance
>>>>> estimation; Rao-Hartley-Cochran is the easiest-to-implement shortcut
>>>>> that leads to an approximate PPS; Rao-Sampford used to be the most
>>>>> rigorous choice until Tillé's elimination procedures appeared in the
>>>>> literature.
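
As an illustration of step 4 only, the systematic option named above can be sketched in Python. This assumes the certainty units have already been set aside, so every inclusion probability passed in is at most 1 and the probabilities sum to the number of units still to be drawn. The function name `systematic_pps` is hypothetical, and the caveat about variance estimation applies to this method.

```python
import random


def systematic_pps(probs, rng=random):
    """Systematic PPS selection without replacement.

    probs: inclusion probabilities, each in (0, 1], summing to an integer n.
    rng:   object with a random() method (the random module by default).
    Returns the list of selected indices, one per sampling point.
    """
    if any(p <= 0 or p > 1 for p in probs):
        raise ValueError("remove certainty units first: need 0 < p <= 1")
    n = round(sum(probs))
    start = rng.random()                    # single random start in [0, 1)
    points = [start + k for k in range(n)]  # sampling points one unit apart
    selected = []
    cum = 0.0
    j = 0
    last = len(probs) - 1
    for i, p in enumerate(probs):
        # Cumulate the probabilities; pin the final boundary to n so that
        # rounding error cannot leave the last sampling point unassigned.
        cum = float(n) if i == last else cum + p
        while j < n and points[j] < cum:
            selected.append(i)              # point falls in unit i's interval
            j += 1
    return selected


# Hypothetical inclusion probabilities summing to 2.
print(systematic_pps([0.9, 0.6, 0.3, 0.2], random.Random(1)))
```

Because each unit's interval is at most 1 wide and the sampling points are 1 apart, no unit can be picked twice, which is why the certainty units must be removed before this step.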
>>>>>
>>>>> --
>>>>> -- Stas Kolenikov, PhD, PStat (SSC)  ::  http://stas.kolenikov.name
>>>>> -- Senior Survey Statistician, Abt SRBI  ::  work email kolenikovs at
>>>>> srbi dot com
>>>>> -- Opinions stated in this email are mine only, and do not reflect the
>>>>> position of my employer
>>>>>
>>>>> *
>>>>> *   For searches and help try:
>>>>> *   http://www.stata.com/help.cgi?search
>>>>> *   http://www.stata.com/support/statalist/faq
>>>>> *   http://www.ats.ucla.edu/stat/stata/
