Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Jesse Burkhardt <jesse.burkhardt@yale.edu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Is pweight the right weight for me and how to specify my weight vector
Date: Fri, 27 Dec 2013 16:41:15 -0800

Thanks for the responses. I'll try to be clearer.

My data are as follows. The dependent variable is the cost of a solar panel installation in a given zip code. We assume the data cover 100% of the population of installed solar panels, which we believe is a fairly reasonable assumption. The independent variables are characteristics of the installed solar panels at the zip code level, zip code level census data, and a city-based department of energy "score". This "score" variable is our primary variable of interest.

The census data are obviously mean values for each zip code, but the characteristics and the cost data are not means, so I wasn't sure I wanted to use aweights, since aweights seem to be for mean-level data only. In addition, the score data at the city level cause problems because all zip codes within a given city are assigned the same score value, and there is probably selection into the department of energy scoring program at the city level. For now I am ignoring the selection problem. On the other hand, since we assume we have no sampling bias for the installations, in that we have 100% of the population, I'm not sure weights are really necessary.

Here is the troubling question: I have cities with only 1 or 2 installations and cities with over 10,000 installations. My worry is that the cities with 10,000 installations will drive the regression results for the coefficient on "score." I would like to add weight to cities with only a few installations and down-weight cities with thousands of observations. Which weighting scheme would work best for this, and is this appropriate to do given the structure of the data?

Thanks again.

Jesse

On Thu, Dec 26, 2013 at 9:23 PM, Richard Williams <richardwilliams.ndu@gmail.com> wrote:
> I agree that the question is unclear. I wonder if zip code areas are the
> intended unit of analysis? If so aweights might be appropriate. See, for
> example,
>
> http://www.cpc.unc.edu/research/tools/data_analysis/statatutorial/sample_surveys/weight_syntax
>
> At 11:37 PM 12/26/2013, Steve Samuels wrote:
>
>> Your description so far says nothing about a sampling process of any
>> kind, so your designation of the weights as "sampling weights" or
>> "probability" weights (pweights) is premature and probably incorrect.
>>
>> We would need more detail on the population, the sampling process if
>> any, the sample, and the purpose of your analysis. Have you only zip
>> code level data, data on individuals, or both?
>>
>> Steve
>>
>> Dear Members.
>> I have data with multiple observations per zip code. I count the
>> number of observations per zip code and use that number as the
>> sampling weight. So I have a vector called weights, which is equal to
>> the number of observations per zip code. When I run a regression and
>> use the [pweight=weights] option, does Stata invert each element of
>> the vector or am I supposed to take the inverse manually?
>>
>> Secondly, can someone provide some intuition for when I use pweight as
>> stated above? Is the result a regression in which each zip code is
>> weighted equally? The worry is that without this weight command, a
>> zip code with 10,000 observations will drive regression results more
>> than a zip code with 1 observation. I'm wondering if using pweight
>> will down-weight the zip code with 10,000 observations and up-weight
>> zip codes with fewer observations. Is there a better weighting scheme
>> to use in this situation? Thanks for any advice.
>> Jesse
>
> -------------------------------------------
> Richard Williams, Notre Dame Dept of Sociology
> OFFICE: (574)631-6668, (574)631-6463
> HOME: (574)289-5227
> EMAIL: Richard.A.Williams.5@ND.Edu
> WWW: http://www.nd.edu/~rwilliam

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
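
As a rough illustration of the concern raised in this thread (cities with thousands of installations dominating the coefficient on "score"), the sketch below compares an unweighted least-squares slope with one in which every installation is weighted by 1 / (number of installations in its city), so that each city carries the same total weight. It is plain Python rather than Stata, and it does not reproduce how Stata implements any of its weight types. The toy data, the city labels, and the helper name `weighted_slope` are all assumptions made up for the example. Whether such reweighting is appropriate for these data is the open question in the thread; the sketch only shows the arithmetic.

```python
from collections import Counter


def weighted_slope(x, y, w):
    """Slope from a weighted least-squares fit of y on x (with intercept)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    return s_xy / s_xx


# Hypothetical toy data: one row per installation as (city, score, cost).
# City C has 1,000 installations; cities A and B have 1 and 2.
rows = [("A", 1.0, 10.0)] + [("B", 2.0, 20.0)] * 2 + [("C", 3.0, 60.0)] * 1000

city = [r[0] for r in rows]
score = [r[1] for r in rows]
cost = [r[2] for r in rows]

# Unweighted fit: every installation counts once, so city C dominates.
unweighted = weighted_slope(score, cost, [1.0] * len(rows))

# Equal-city fit: weight each installation by 1 / (installations in its city),
# so the weights within each city sum to 1.
n_in_city = Counter(city)
inv_n = [1.0 / n_in_city[c] for c in city]
equal_city = weighted_slope(score, cost, inv_n)

print(f"unweighted slope: {unweighted:.2f}")
print(f"equal-city slope: {equal_city:.2f}")
```

With these made-up numbers the two slopes differ (about 30 unweighted versus 25 with equal city weight), which is the effect being asked about. Note that the weight here is the reciprocal of the count, computed explicitly; passing the raw count as the weight would push in the opposite direction and give large cities even more influence.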

**Follow-Ups**:
- **Re: st: Is pweight the right weight for me and how to specify my weight vector** *From:* Steve Samuels <sjsamuels@gmail.com>

**References**:
- **st: Is pweight the right weight for me and how to specify my weight vector** *From:* Jesse Burkhardt <jesse.burkhardt@yale.edu>
- **Re: st: Is pweight the right weight for me and how to specify my weight vector** *From:* Steve Samuels <sjsamuels@gmail.com>
- **Re: st: Is pweight the right weight for me and how to specify my weight vector** *From:* Richard Williams <richardwilliams.ndu@gmail.com>
