



Re: st: different samples summary table


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: different samples summary table
Date   Fri, 22 Feb 2013 12:42:27 +0000

Please use full real names on Statalist. This is explained in

http://www.stata.com/support/faqs/resources/statalist-faq/#tojoin

namely

"You are asked to post on Statalist using your full real name. This is
a long-standing practice on Statalist. Giving full names is one of the
ways in which we show respect for others. Your chances of eliciting a
good reply are greatly diminished if you write and conceal your
identity. Conversely, if you decide just to watch and read on the
list, your email identity remains entirely up to you."

One answer is that when you exclude, don't -drop-.

Consider this simple analogue of your problem:

sysuse auto, clear

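* copies of each variable that are nonmissing only in the subsample
* (missing counts as larger than any number, so guard rep78 > 3)
qui foreach v in mpg weight {
	gen `v'2 = `v' if foreign
	gen `v'3 = `v' if foreign & rep78 > 3 & !missing(rep78)
}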

tabstat mpg* weight*, s(n mean) c(s)
tabstat mpg* weight*, s(n mean)

This isn't everything you ask for, but there are plenty of
well-documented commands for table export. The one trick here is to
-generate- variables that include only the observations you want,
making use of -if-.
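
Applied to the question as posed, the same trick might look like the
sketch below. It is only an illustration: the variable names -age-,
-x1- and -x2-, the age bracket 25 to 60, and the output file name are
all assumptions, as the original post does not give them. The export
step assumes the user-written -estout- package (-ssc install estout-),
which is one option among several.

```stata
* Hypothetical names: age, x1, x2 (covariates); bracket 25-60 is assumed.
gen byte inage    = inrange(age, 25, 60)   // 0 if age is missing
gen byte complete = !missing(x1, x2)       // 1 only if no covariate is missing

quietly foreach v in income height {
	gen `v'_out  = `v' if !inage              // outside the age bracket
	gen `v'_miss = `v' if inage & !complete   // in bracket, covariates missing
	gen `v'_est  = `v' if inage & complete    // final estimation sample
}

* one row per variable, with count and mean as columns
tabstat income* height*, s(n mean) c(s)

* optional export, assuming -estout- is installed; file name is hypothetical
estpost tabstat income* height*, statistics(count mean) columns(statistics)
esttab using means.tex, cells("count mean") noobs nonumber replace
```

Nothing is dropped, so the full data remain in memory and each new
variable carries its own count of nonmissing observations.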

Nick

On Fri, Feb 22, 2013 at 12:16 PM,  <altruist81@gmx.de> wrote:
> Dear statalist users,
>
> I am looking for a convenient way to do the following:
>
> 1. I start off with a dataset that has about 40000 individual observations.
> 2. I exclude about 10000 individuals because they are outside the age bracket that I want to consider
> 3. I exclude another 10000 individuals because they have missing values in some of the key covariates that I want to control for in a regression
> 4. My final estimation sample thus has 20000 observations.
>
> There are some key variables, let's call them "income" and "height", and I want to see how the means of these variables change throughout my steps 1-4.
>
> Preferably, this should be displayed in a table that has 4 rows (with the first row showing the mean values of the initial data, the second row the mean values of individuals outside the age bracket, the third row the mean values of individuals with missing covariates, and the fourth row the mean values of the final estimation sample). Moreover, those rows should be in a wide format (i.e. from left to right) and provide the number of observations affected, and the table should be in either Excel or TeX format.
>
> What ways are there to produce such a table?


