# Re: st: Comparing frequencies with pweight

Erick, I'd still like to know whether your data are from a sample, and, if so, what design.

There is nothing wrong with the first table. The chi-square statistic is proportional to sample size: take any table, double every cell count, and chi-square will double. You do have a large N, but that's reality; there is no need to "adjust" for it. What matters is the percentages being compared. The two waves definitely do differ with respect to the outcome (22.5% vs. 13.9%). You are asking very different questions depending on whether or not you weight by number of clients, so you must decide what your research question is.
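
To see the proportionality concretely, here is a minimal sketch using Stata's immediate command -tabi-, which takes cell counts directly. The first call uses the counts from the first table quoted below and should reproduce the reported Pearson chi2(1) of 14.7018. The second doubles every cell (hypothetical counts, for illustration only), which leaves the percentages unchanged and should give a statistic about twice as large.

```stata
* Unweighted table from this thread: rows are pfp = 0/1, columns are wave 4/6
tabi 527 430 \ 85 125, chi2 column

* Same table with every cell doubled (illustrative counts, not real data):
* column percentages are identical, but N and chi-square both double
tabi 1054 860 \ 170 250, chi2 column
```

The column percentages in both tables are the quantities to interpret; the size of the chi-square statistic mostly reflects how much data there is.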

Steve
On Nov 15, 2008, at 10:38 PM, Erick Guerrero wrote:

Thanks for the reply Steve, yes, x is an attribute of orgs - for instance - dummy of whether they are for profit or not.

So, tab forprofit wavenumber (pweight= number of clients)

my total sample for both waves is 1118 orgs

I basically want to run a chi-square or a t-test between an org attribute and wavenumber but weight by number of clients.

I tried svyset with pweight = number of clients and then svy: tab forprofit wavenumber, but I get an awkward result.

or the

*Normal Chi-square has a huge N*

. tab  pfp wavenum, row col chi

+-------------------+
| Key               |
|-------------------|
|     frequency     |
|  row percentage   |
| column percentage |
+-------------------+

           |        wavenum
 pfp dummy |         4          6 |     Total
-----------+----------------------+----------
         0 |       527        430 |       957
           |     55.07      44.93 |    100.00
           |     86.11      77.48 |     82.01
-----------+----------------------+----------
         1 |        85        125 |       210
           |     40.48      59.52 |    100.00
           |     13.89      22.52 |     17.99
-----------+----------------------+----------
     Total |       612        555 |     1,167
           |     52.44      47.56 |    100.00
           |    100.00     100.00 |    100.00

Pearson chi2(1) =  14.7018   Pr = 0.000

. tab  pfp wavenum [pweight= nuosatcl], row col chi
pweight not allowed

*Using survey option - reports low column percentages.*

. svyset

pweight: nuosatcl
VCE: linearized
Single unit: missing
Strata 1: <one>
SU 1: <observations>
FPC 1: <zero>

. svy:tab  pfp wavenum
(running tabulate on estimation sample)

Number of strata =       1          Number of obs    =    1121
Number of PSUs   =    1121          Population size  =  696235
Design df = 1120

-------------------------------
          |       wavenum
pfp dummy |     4      6  Total
----------+--------------------
        0 | .4576  .3995   .857
        1 |  .048   .095   .143
          |
    Total | .5056  .4944      1
-------------------------------
Key:  cell proportions

Pearson:
Uncorrected   chi2(1)         =   21.5824
Design-based  F(1, 1120)      =    5.1446     P = 0.0235

*For MEANS*

. svy: mean   latinclients, over(wavenum)
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =       1          Number of obs    =    1088
Number of PSUs   =    1088          Population size  =  673329
Design df = 1087

4: wavenum = 4
6: wavenum = 6

--------------------------------------------------------------
             |             Linearized
        Over |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
latinclients |
           4 |   11.36198   .9582807      9.481693    13.24227
           6 |   17.05415    1.98646      13.15642    20.95188
--------------------------------------------------------------

. test [latinclients]4 - [latinclients]6 = 0

( 1)  [latinclients]4 - [latinclients]6 = 0

F(  1,  1087) =    6.66
Prob > F =    0.0100

Erick

Steven Samuels wrote:
Martin's suggestion of -svy: tab-, with "organization" designated as the PSU, is probably the right one. However, I'm not sure. With your specification, "x" and "y" must be attributes of the organizations. One must be "wave". What is the other? -pweight- will weight the organizations by the number of clients. So, essentially, you are classifying clients by the attribute of their organization and ignoring organization, except to compute standard errors. For example, if you classify organizations as "big" and "small", -svy: tab- would compare the proportions of clients who are in big organizations in 1995 and 2005. Is this what you want?
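
A minimal sketch of that setup, assuming the variable names that appear in Erick's output (pfp, wavenum, nuosatcl) and that each row of the data is one organization. By default -svy: tabulate- reports cell proportions, which sum to 1 over the whole table; the -column- and -percent- options request client-weighted column percentages instead.

```stata
* Declare number of clients as the probability weight.
* With no PSU variable given, each observation (organization) is its own PSU.
svyset [pweight = nuosatcl]

* Two-way table with client-weighted column percentages
* (instead of the default cell proportions) and the design-based Pearson test
svy: tabulate pfp wavenum, column percent
```

Read this way, the cell proportions Erick posted correspond to roughly .048/.5056, or about 9.5%, of wave 4 clients and .095/.4944, or about 19.2%, of wave 6 clients being in pfp = 1 organizations. These figures are computed from the rounded values in the posted table, so they are approximate.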

Are the 618 organizations a sample of a population? If so, please describe the design, for you need to take into account other aspects of the design. Final question: what do you mean by "adjust for N"?

-Steve

On Nov 14, 2008, at 12:24 PM, Martin Weiss wrote:

Have you checked out -help svy: tabulate twoway-?

HTH
Martin
Hi, I need code for Stata to do descriptives - chi-squares with

I am trying to compare percentages between two waves (1995 and 2005) on
618 organizations.

The tab x y, col row chi with pweight (number of clients in each
organization) does not work, and when I use fweight it gives me an inflated N. What is the proper way to compare these frequencies across organizations,
weight by number of clients, and adjust for N in Stata?

Thanks,

Erick
