# st: subpops vs. over & lincom t vs. regress t in svyset data

 From: Michael Costello; To: statalist; Subject: st: subpops vs. over & lincom t vs. regress t in svyset data; Date: Tue, 3 Jan 2012 14:14:43 -0500

```
Happy New Year, Statalisters!

I'm working with many similar survey-weighted datasets of
international education data.  I am often tasked with creating tables
of statistics (means, variances, counts, t-statistics, effect sizes,
etc.) for many subpopulations and over several phases (baseline,
midterm, final).

We had been calculating our statistics using -svy: mean varname,
over(subpops)- rather than running many -svy, subpop(subpops): mean
varname- commands in quick succession, as the returned values were
equal.  In a more recent database, the values are not equal, and I'm
wondering why that is.  The subpopulation I was working with was
gender (female=1, male=0).  Could the discrepancies be due to the
handful of observations with gender = . (missing), or is there some
other difference in the calculations?  It appears that the -subpop-
option treats those observations as non-existent.  How does -over-
treat them?

I'm also trying to find out the difference between the t-statistic
that is reported when I run a -lincom- command and the t-statistic that
is reported when I run a -regress- command.  For example:

svy: regress score gender
vs.
svy: mean score, over(gender)
lincom [score]Male - [score]Female

I believe that the regression uses a pooled standard error (SE),
while -lincom- uses an unpooled calculation, but I was hoping
for some confirmation on that.

Thanks so much for all your help and advice!  You folks are always so

-Michael
```
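To make the pooled versus unpooled distinction raised in the post concrete, here is a minimal Python sketch for the plain, unweighted two-group case only. It does not reproduce Stata's -svy- variance estimators, so it says nothing definitive about what -svy: regress- or -lincom- actually compute on svyset data. The scores and the function names (`pooled_t`, `unpooled_t`, `ols_slope_t`) are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t-statistic using a pooled (equal-variance) standard error."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))


def unpooled_t(a, b):
    """Two-sample t-statistic using separate group variances (Welch-style SE)."""
    na, nb = len(a), len(b)
    return (mean(a) - mean(b)) / sqrt(variance(a) / na + variance(b) / nb)


def ols_slope_t(y, x):
    """t-statistic on the slope of y = b0 + b1*x with the classical OLS SE."""
    n = len(y)
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = sqrt(rss / (n - 2) / sxx)
    return b1 / se_b1


# Hypothetical, unweighted scores (not from any real dataset).
male = [52, 61, 58, 70, 66]
female = [55, 75, 49, 90, 68, 81, 60]

# Regression of score on a female dummy (female=1, male=0), as in the post.
score = male + female
gender = [0] * len(male) + [1] * len(female)

t_pooled = pooled_t(female, male)
t_unpooled = unpooled_t(female, male)
t_regress = ols_slope_t(score, gender)

print("pooled t:  ", t_pooled)
print("unpooled t:", t_unpooled)
print("OLS t:     ", t_regress)
```

In this unweighted setting the classical OLS t-statistic on the dummy coincides with the pooled two-sample t, while the unpooled version differs whenever group sizes and variances are unequal. Whether the same relationship holds under a survey design depends on the variance estimator in use, which this sketch does not model.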