Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: subpops vs. over & lincom t vs. regress t in svyset data

From: Nick Cox
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: subpops vs. over & lincom t vs. regress t in svyset data
Date: Mon, 23 Jan 2012 09:40:25 +0000

```
The original post was on 3 January

http://www.stata.com/statalist/archive/2012-01/msg00350.html

and Shaun Scholes replied on 10 January:

http://www.stata.com/statalist/archive/2012-01/msg00348.html
http://www.stata.com/statalist/archive/2012-01/msg00350.html

Nick

On Mon, Jan 23, 2012 at 3:13 AM, Michael Costello
<michaelavcostello@gmail.com> wrote:
> I originally sent this e-mail three weeks ago, but didn't receive a
> response.  I was very much hoping for one, so I thought I would
> repost.
> -M.
>
> I'm working with many similar survey-weighted datasets of
> international education data.  Often I am tasked with creating tables
> of statistics (means, variances, counts, t-statistics, effect size,
> etc.) for many subpopulations and over several phases (baseline,
> midterm, final).
>
> We had been calculating our statistics using -svy: mean varname,
> over(subpops)- rather than running many -svy, subpop(subpops): mean
> varname- commands in quick succession, as the returned values were
> equal.  In a more recent database, the values are not equal, and I'm
> wondering why that is.  The subpopulation I was working with was
> gender (female=1, male=0).  Could the discrepancies be due to the
> handful of observations with gender = . (missing), or is there some
> other difference in the calculations?  It appears that using the
> -subpop- option treats those observations as non-existent.  How does
> -over- treat them?
>
> I'm also trying to find out the difference between the t-statistic
> that is printed when I run -lincom- and the t-statistic that
> is printed when I run -regress-.  For example:
>
> svy: regress score gender
> vs.
> svy: mean score, over(gender)
> lincom [score]Male - [score]Female
>
> I believe that -regress- uses a pooled standard error,
> while -lincom- uses an unpooled calculation, but I was hoping
> for some confirmation on that.
>
> Thanks so much for all your help and advice!  You folks are always so