Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Is pweight the right weight for me and how to specify my weight vector

| Field | Value |
|---|---|
| From | Richard Williams |
| To | statalist@hsphsun2.harvard.edu |
| Subject | Re: st: Is pweight the right weight for me and how to specify my weight vector |
| Date | Fri, 27 Dec 2013 00:23:00 -0500 |

I agree that the question is unclear. Are zip code areas the intended unit of analysis? If so, aweights might be appropriate. See, for example,
```
http://www.cpc.unc.edu/research/tools/data_analysis/statatutorial/sample_surveys/weight_syntax

At 11:37 PM 12/26/2013, Steve Samuels wrote:

```
```
Your description so far says nothing about a sampling process of any
kind, so your designation of the weights as "sampling weights" or
"probability" weights (pweights) is premature and probably incorrect.

We would need more detail on the population, the sampling process if
any, the sample, and the purpose of your analysis. Have you only zip
code level data, data on individuals, or both?

Steve

Dear Members.
I have data with multiple observations per zip code.  I count the
number of observations per zip code and use that number as the
sampling weight. So I have a vector called weights, which is equal to
the number of observations per zip code. When I run a regression and
use the [pweight=weights] option, does Stata invert each element of
the vector, or am I supposed to take the inverse manually?

Secondly, can someone provide some intuition for what happens when I
use pweight as stated above?  Is the result a regression in which each
zip code is
weighted equally?  The worry is that without this weight command, a
zip code with 10,000 observations will drive regression results more
than a zip code with 1 observation.  I'm wondering if using pweight
will downweight the zip code with 10,000 observations and upweight
zip codes with fewer observations. Is there a better weighting scheme
to use in this situation? Thanks for any advice.
Jesse
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

```
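The arithmetic behind the worry in the original question can be sketched in a few lines. This is only an illustration, not a statement of how any particular Stata command treats its weights: it assumes each weight is applied as supplied, as a multiplier on its observation, and it uses a weighted mean (an intercept-only weighted regression) in place of a full regression. The toy data and the names `zip_codes`, `y`, `total_weight_by_group`, and `weighted_mean` are hypothetical and written in Python purely for demonstration.

```python
from collections import Counter


def total_weight_by_group(groups, weights):
    """Sum the per-observation weights within each group."""
    totals = {}
    for g, w in zip(groups, weights):
        totals[g] = totals.get(g, 0.0) + w
    return totals


def weighted_mean(values, weights):
    """Weighted mean, i.e. the fit of an intercept-only weighted regression."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical toy data: zip code "A" has 4 observations, "B" has 1.
zip_codes = ["A", "A", "A", "A", "B"]
y = [1.0, 1.0, 1.0, 1.0, 6.0]

counts = Counter(zip_codes)

# Weight = number of observations in the zip code (the scheme in the question).
count_weights = [float(counts[z]) for z in zip_codes]

# Weight = 1 / number of observations in the zip code.
inverse_weights = [1.0 / counts[z] for z in zip_codes]

unweighted = weighted_mean(y, [1.0] * len(y))
by_count = weighted_mean(y, count_weights)
by_inverse = weighted_mean(y, inverse_weights)

print(total_weight_by_group(zip_codes, count_weights))
print(total_weight_by_group(zip_codes, inverse_weights))
print(unweighted, by_count, by_inverse)
```

Under these assumptions, weighting by the count gives the larger zip code a total weight of 16 against 1 for the smaller one, so it dominates even more than in the unweighted case. Weighting by the inverse count gives every zip code a total weight of 1, and the weighted mean then equals the simple average of the zip code means. Whether equal weight per zip code is the right target still depends on the population and the purpose of the analysis, as discussed above.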
```
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
WWW:    http://www.nd.edu/~rwilliam

```