
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: sampling program help


From   Steven Samuels <sjsamuels@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: sampling program help
Date   Wed, 19 Jan 2011 23:24:05 -0500


George, Remember that this is an international list and that not everyone will know what a "GP" is (I don't). Technical issues aside for the moment, I'd like to know what the purpose of the study is and what the design is intended to do: what's being measured and on whom (people, households, soil samples? availability of programs?); what statistics you hope to calculate and in what subgroups? Do you intend to weight the data to reflect the probabilities of selection or the population represented?

I would point out that, in the current design, you will be unable to estimate any attribute of the entire population without bias, because some sansads with flood-prone codes of 1 & 2 will not be eligible for sampling. Also, estimates in single GPs will have no more than 1-2 degrees of freedom for error, and the unequal sampling probabilities will effectively reduce these somewhat.

Without knowing the constraints under which you are operating, I make this suggestion: as you are primarily interested in the code 3 & 4 sansads, define people living in these as the population of interest and eliminate sansads with codes 1 & 2 altogether.
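
A minimal Stata sketch of the restriction suggested above, assuming the variables gp, prone_flood, and samplesize already exist as in the quoted messages below. The names n_eligible, n_draw, rnd, and tosample, and the seed value, are introduced here for illustration only. The draw within each GP is a simple random sample, which is an assumption and not something the thread specifies.

```stata
* Define the population of interest: sansads coded 3 or 4 only.
* GPs with no code 3 or 4 sansads drop out of the frame entirely.
keep if inlist(prone_flood, 3, 4)

* Number of eligible sansads remaining in each GP
bysort gp: generate n_eligible = _N

* A GP cannot contribute more sansads than it has eligible
generate n_draw = min(samplesize, n_eligible)

* Simple random draw within GP; the seed is arbitrary but makes it repeatable
set seed 12345
generate double rnd = runiform()
bysort gp (rnd): generate byte tosample = (_n <= n_draw)
```

Capping the draw at n_eligible changes the planned sample size in small GPs, so the achieved total may fall short of the target and should be checked.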

Steve

Steven J. Samuels
sjsamuels@gmail.com
18 Cantine's Island
Saugerties NY 12477
USA
Voice: 845-246-0774
Fax:    206-202-4783






On Jan 5, 2011, at 11:34 AM, george joseph wrote:

Dear Stas,

Thank you for your help. But I still need to use the prone_flood
variable to allocate the sansads in a particular manner. Let me try
to explain more clearly, as I made a mistake in my explanation in the
previous mail. We now have the sample size to be drawn from each
category (1-4, given by the generated samplesize variable). Now the
issue of allocation based on the prone_flood variable comes in.

The prone_flood variable is coded 1-4, and more weight should be given
to 4, then 3, then 2, and then 1. We need to choose in such a way
that, from each GP, those with higher codes are selected first.


So, from each GP, the sansads should be selected as follows, based on
the prone_flood variable.
If the samplesize is 4, select 2 sansads from those which are coded 4
and one each from those which are coded 3 and 2.
If the samplesize is 3, select 2 from those which are coded 4 and 1
from those which are coded 3.
If the samplesize is 2, select 2 from those which are coded 4.
If the samplesize is 1, select 1 from those which are coded 4.

However, a problem arises when one particular category of sansads,
say those coded 4 (prone_flood variable), is absent. Ideally we should
go to the next category, coded 3, choose 2 sansads from it, and then
choose one each from those coded 2 and 1 respectively.

Any help on this problem will be really appreciated.

Thanks.
George



For each GP, the procedure is as follows: If the samplesize is 1, we
should  select 1 sansad from the GP from those sansads which are coded
4 based on prone_flood variable. If none is coded 4, we should select
one sansad which is coded 3 from among those which are coded 3. If
none is coded 3, we should select one sansad which is coded 2 from
among those which are coded 2. And finally, If none is coded 2, we
should select one sansad which is coded 1 from among those which are
coded 1.

To continue: if the samplesize is 4, we should select 2 sansads from
the GP from those sansads which are coded 4 based on the prone_flood
variable. If none is coded 4, we should select two sansads which are
coded 3 from among those which are coded 3. If none is coded 3, we
should select two sansads which are coded 2 from among those which are
coded 2. And finally, if none is coded 2, we should select all four
sansads from among those which are coded 1.
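
One possible reading of the allocation described in this message can be sketched in Stata as follows. It assumes the quotas apply to the codes actually present in a GP, taken from the highest downward (2, 1, 1 for samplesize 4; 2, 1 for 3; 2 for 2; 1 for 1), so that an absent code 4 shifts the whole pattern down one code. This is an interpretation, not a confirmed specification. The variables gp, prone_flood, and samplesize come from the thread; negcode, rnd, coderank, quota, tosample, nsel, and the seed value are names introduced here for illustration.

```stata
set seed 2011                           // arbitrary; fixes the random draw
generate double rnd = runiform()
generate byte negcode = -prone_flood    // so that code 4 sorts first

* Rank the codes actually present in each GP (1 = highest code present)
bysort gp (negcode rnd): generate byte coderank = sum(negcode != negcode[_n-1])

* Quota attached to each rank, by samplesize
generate byte quota = 0
replace quota = min(2, samplesize) if coderank == 1
replace quota = 1 if coderank == 2 & samplesize >= 3
replace quota = 1 if coderank == 3 & samplesize == 4

* Random draw within each GP-by-code cell, up to that cell's quota
bysort gp negcode (rnd): generate byte tosample = (_n <= quota)

* Count sansads in GPs where the cells were too small to fill the target
bysort gp: egen nsel = total(tosample)
count if nsel < samplesize
```

The sketch does not top up a GP when a cell holds fewer sansads than its quota, or when fewer distinct codes are present than the pattern needs. The final count shows how many sansads sit in such GPs, so the shortfall rule can be decided separately.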



On Tue, Jan 4, 2011 at 5:27 PM, Stas Kolenikov <skolenik@gmail.com> wrote:
This is a starting point (it ignores the prone_flood variable, as I
could not understand how you use it):

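* Number of sansads to draw per GP, from the GP's total
recode tot_sansad (8/12=1) (13/14=2) (15/16=3) (17/20=4), gen(samplesize)

* Random sort key; setting a seed (value arbitrary) makes the draw repeatable
set seed 12345
generate double rnd = runiform()

* Flag the first samplesize sansads per GP in random order
bysort gp (rnd) : gen byte tosample = (_n <= samplesize)

keep if tosample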
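
If the data are later weighted by the probabilities of selection, the weights for this equal-probability draw within GP could be computed along the following lines. This is a sketch that assumes tot_sansads equals the number of sansads listed for the GP and that samplesize was generated as above. The names p_select and pw are introduced here for illustration.

```stata
* Selection probability of a sansad within its GP under simple random sampling
generate double p_select = samplesize / tot_sansads

* Sampling weight: the number of sansads each sampled sansad represents
generate double pw = 1 / p_select

summarize p_select pw
```

These weights apply only to the simple random draw. Under an allocation that favours higher prone_flood codes, the selection probabilities differ by code and can be zero for some sansads, so they would need to be worked out from that design.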

On Tue, Jan 4, 2011 at 3:45 PM, george joseph <gjosephresearch@gmail.com > wrote:
Hi,
I would like your help with writing a program to resolve the
following sampling issue.
I have data on 223 sansads (sansad_id) in 19 GPs. The variable
prone_flood_cyclone (1-4) indicates
the strength of floods and cyclones in the sansads, and tot_sansads
indicates the total number of sansads in each GP.

I would like to draw a sample of 55 sansads in the following manner.
1) The sample should contain sansads from all 19 GPs.
2) If the total number of sansads (tot_sansads) in a GP is 12 or
fewer, we should draw one sansad; if tot_sansads is between 13 and 14,
we should draw 2 sansads; if tot_sansads is between 15 and 16, we
should draw 3 sansads; and if tot_sansads is between 17 and 20, we
should draw 4 sansads.

The following is the METHOD OF ALLOCATION of the number of SANSADs in
each of the GPs, giving OVER-WEIGHTAGE to SANSADs prone to floods,
cyclones etc.

Sl. 1
Total number of SANSADs in one GP  : 8-12
Number of SANSADs to be selected   : 1
Method of allocation (based on prone_flood_cyclone):
      1 from Code '4'. If Code '4' is NIL, then 1 from Code '3'.
      If Code '3' is also NIL, then 1 from Code '2'. If Code '2' is
      also NIL, then 1 from Code '1'.

Sl. 2
Total number of SANSADs in one GP  : 13-14
Number of SANSADs to be selected   : 2
Method of allocation (based on prone_flood_cyclone):
      1 from Code '4' + 1 from Code '3'; otherwise, follow the
      sequence as in sl. 1.

Sl. 3
Total number of SANSADs in one GP  : 15-16
Number of SANSADs to be selected   : 3
Method of allocation (based on prone_flood_cyclone):
      1 from Code '4' + 1 from Code '3' + 1 from Code '2'; otherwise,
      follow the sequence as in sl. 1.

Sl. 4
Total number of SANSADs in one GP  : 17-20
Number of SANSADs to be selected   : 4
Method of allocation (based on prone_flood_cyclone):
      2 from Code '4' + 1 from Code '3' + 1 from Code '2'; otherwise,
      follow the sequence as in sl. 1.

Could you let me know how I can do this in a way that can be replicated?
Thanks.
George

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.



© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index