Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: R: Estimating the number of workers in each industry in each district - flag: Stata 9/2 SE
From: Steve Samuels <[email protected]>
To: [email protected]
Date: Fri, 24 Sep 2010 11:55:48 -0400

My earlier advice about handling household counts of workers was wrong. Do not -expand- the data.

Say you have household-level counts of the number of workers in three industries:

n_agriculture
n_service
n_sales

Then you would run a separate command for each industry, for example:
*********************************************
* One estimate per district (district is assumed to be a numeric variable)
levelsof district, local(districts)
foreach x of local districts {
    svy: total n_agriculture if district == `x'
}
***********************************************
You would use this form rather than an -over()-  or -subpop()- option,
because districts are sampling strata.
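
The same pattern could be extended to all three count variables with a nested loop. This is only a sketch: it assumes the data have already been -svyset-, that district is numeric, and that the three variables are named as in the list above.

*********************************************
* Sketch: one -svy: total- per industry count variable and district
levelsof district, local(districts)
foreach v of varlist n_agriculture n_service n_sales {
    foreach x of local districts {
        * Label the output so each estimate can be identified
        display as text "Variable: `v'   District: `x'"
        svy: total `v' if district == `x'
    }
}
*********************************************

Each pass prints the estimated total number of workers and its standard error for one industry in one district.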
-Steve
On Fri, Sep 24, 2010 at 9:44 AM, Steve Samuels <[email protected]> wrote:
> Arka-
>
> Based on your description, you would -svyset- your data as follows:
>
> Define a variable (call it "psu" for "primary sampling unit") which is
> the village number (rural sector) or urban block (urban sector).
>
> Then:
> ********************************************************
> svyset psu [pw = your weight], strata(district)
> ***********************************************************
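>
> As a rough sketch of the step above, the psu variable might be built
> as follows. The names sector (1 = rural, 2 = urban), village, block,
> and hhweight are hypothetical placeholders, not variables from the
> actual data set, and hhweight is assumed to be the sampling weight
> (the inverse of the household's inclusion probability).
>
> ********************************************************
> * Hypothetical variable names throughout; adapt to your data
> * Take the village number in the rural sector, the block number otherwise
> gen long unit = cond(sector == 1, village, block)
> * Make the identifier unique across districts and sectors
> egen psu = group(district sector unit)
> svyset psu [pw = hhweight], strata(district)
> * Check the number of PSUs in each stratum
> svydes
> ********************************************************
>
> The -svydes- output is worth inspecting for strata that contain only
> one PSU, since standard errors cannot be computed for those.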
>
> If your data has one line per person, with "industry" as a categorical
> variable, then the command for totals might be:
>
> *****************************************************
> svy: tab district industry, count se format(%10.0fc)
> *****************************************************
>
> If your data has only counts of workers in each industry in each HH,
> then you should -expand- the data first so that it has one line for
> each worker in the HH, e.g.
>
> *************
> expand hhsize
> *************
>
> (but that might include children, so you will have to take some care)
>
> Now a word of advice. It is easy to go wrong in a survey analysis. As
> you are a student, I suggest that you seek guidance from a faculty
> member who is experienced in surveys, if not in Stata. (I know that
> the Department of Statistics at UBC has a survey sampling course). I
> also suggest that you obtain a text to learn about sampling, such as
> Sharon Lohr's "Sampling: Design and Analysis" (2009).  I also
> recommend "Applied Survey Data Analysis" by Heeringa, West, and
> Berglund (2010); it uses Stata almost exclusively for its examples.
>
> Best wishes,
>
> Steve
>
> Steven J. Samuels
> [email protected]
> 18 Cantine's Island
> Saugerties NY 12477
> USA
> Voice: 845-246-0774
> Fax:    206-202-4783
>
>
>
> On Thu, Sep 23, 2010 at 8:24 PM, Arka Roy Chaudhuri <[email protected]> wrote:
>> Hi,
>> Thanks for the help. In my dataset all the districts in the target
>> population are included. The sampling design is a stratified multi-stage
>> design with the first stage units being villages in the rural sector
>> and urban blocks in the urban sector. The ultimate stage units (USU)
>> are households in both the sectors.
>>
>>   I only have one set of weights that comes with the data. The
>> documentation states that the weights represent the probability that
>> the particular household was included in the sample.  Please let me
>> know if I should include any other information. I am really thankful
>> for all the help.
>>
>>
>>
>> Regards,
>>
>> Arka
>>
>> On Wed, Sep 15, 2010 at 7:16 AM, Steve Samuels <[email protected]> wrote:
>>>
>>> Arka-
>>>
>>> I can't answer without more information about the sampling design.
>>> Please describe the design in detail, including answers to the
>>> following questions:
>>>
>>> 1. Were all districts in the target population included in the sample?
>>> Or, were districts sampled?
>>>
>>> 2. Are the final sampling weights the probability sampling weights? Or
>>> was there adjustment to the probability weights (post-stratification,
>>> "raking")  so that the sample results will better reflect population
>>> census proportions? If the weights are so adjusted,  are the original
>>> sampling weights available to you?
>>>
>>>
>>> Steve
>>>
>>> Steven J. Samuels
>>> [email protected]
>>> 18 Cantine's Island
>>> Saugerties NY 12477
>>> USA
>>> Voice: 845-246-0774
>>> Fax:    206-202-4783
>>>
>>> On Wed, Sep 15, 2010 at 4:07 AM, Carlo Lazzaro <[email protected]> wrote:
>>> > Arka wrote:
>>> > "Now I want to estimate the number of workers
>>> > belonging to each industry in a particular district"
>>> >
>>> > A simple example for Arka's problem (setting aside survey
>>> > technicalities) might be the following:
>>> >
>>> > ---------------------code begins------------------------------------
>>> > * Build a toy data set: 100 workers, 2 districts, 2 industries
>>> > drop _all
>>> > set obs 100
>>> > g Workers=_n
>>> > g District="East" in 1/50
>>> > replace District="West" in 51/100
>>> > g Industry="Concrete" in 1/30
>>> > replace Industry="Steel" in 31/100
>>> > * Flag each district-industry combination that occurs in the data
>>> > g A= 1 if District=="East" & Industry=="Steel"
>>> > g B= 1 if District=="West" & Industry=="Steel"
>>> > g C= 1 if District=="East" & Industry=="Concrete"
>>> > ---------------------code ends------------------------------------
>>> >
>>> > HTH and Kind Regards,
>>> > Carlo
>>> > -----Original Message-----
>>> > From: [email protected]
>>> > [mailto:[email protected]] On behalf of Arka Roy
>>> > Chaudhuri
>>> > Sent: Wednesday, 15 September 2010 9:24
>>> > To: [email protected]
>>> > Subject: st: Estimating the number of workers in each industry in each
>>> > district
>>> >
>>> > Dear All,
>>> >        I have a data set which has information at the individual
>>> > level. I have variables which record the district of residence of the
>>> > individual, the industry of employment of the individual, and other
>>> > demographic characteristics. The data set also comes with weights which
>>> > represent the probability that a particular household is included in
>>> > the sample. Thus all individuals belonging to a particular household
>>> > get the same weight. Now I want to estimate the number of workers
>>> > belonging to each industry in a particular district. Could anyone
>>> > please advise on the correct Stata code that I should write to get my
>>> > desired estimates? Also, I would be grateful if somebody could advise me
>>> > on the possible biases that might affect my estimates at the
>>> > industry-district level. I would really appreciate any help in this
>>> > regard. Thanks.
>>> >
>>> > Regards,
>>> > Arka
>>> > --
>>> > Arka Roy Chaudhuri
>>> > PhD Student
>>> > University of British Columbia
>>> > 997-1873 East Mall
>>> > Vancouver
>>> > Canada
>>> > Ph: +1 (604) 349-8283
>>> > Email: [email protected]
>>> >
>>> > *
>>> > *   For searches and help try:
>>> > *   http://www.stata.com/help.cgi?search
>>> > *   http://www.stata.com/support/statalist/faq
>>> > *   http://www.ats.ucla.edu/stat/stata/