Re: st: different samples summary table

From   Maarten Buis
Subject   Re: st: different samples summary table
Date   Fri, 22 Feb 2013 13:43:56 +0100

On Fri, Feb 22, 2013 at 1:16 PM, Dar or wrote:
> I am looking for a convenient way to do the following:
> 1. I start off with a dataset that has about 40000 individual observations.
> 2. I exclude about 10000 individuals because they are outside the age bracket that I want to consider
> 3. I exclude another 10000 individuals because they have missing values in some of the key covariates that I want to control for in a regression
> 4. My final estimation sample thus has 20000 observations.
> There are some key variables, let's call them "income" and "height", and I want to see how the means of these variables change throughout my steps 1-4.
> Preferably, this should be displayed in a table that has 4 rows (with the first row showing the mean values of the initial data, the second row the mean values of individuals outside the age bracket, the third row the mean values of individuals with missing covariates, and the fourth row the mean values of the final estimation sample). Moreover, those rows should be in a wide format (i.e. from left to right), provide the number of observations affected, and the table should be either in Excel or TeX format.

Notice that the Statalist FAQ requires you to post with your full
name. This is different on many other forums, but here this is
considered a very important part of showing respect to one another.
You should have known this by now.

The example below requires the -estout- package, which you can get by
typing in Stata -ssc install estout-. Similar solutions are possible
with -outreg-, which you can get by typing in Stata -ssc install
outreg-.

*------------------ begin example ------------------
sysuse nlsw88, clear

// prepare a matrix
matrix results = J(4,2,.)
matrix colnames results = wage grade
matrix rownames results = overall right_age not_missing final

// overall
sum wage, meanonly
matrix results[1,1] = r(mean)
sum grade, meanonly
matrix results[1,2] = r(mean)

// in age bracket 36 <= age <=43
sum wage if inrange(age,36,43), meanonly
matrix results[2,1] = r(mean)
sum grade if inrange(age,36,43), meanonly
matrix results[2,2] = r(mean)

// not missing
sum wage if !missing(wage,grade,race,union), meanonly
matrix results[3,1] = r(mean)
sum grade if !missing(wage,grade,race,union), meanonly
matrix results[3,2] = r(mean)

// final
sum wage if !missing(wage,grade,race,union) & inrange(age,36,43), meanonly
matrix results[4,1] = r(mean)
sum grade if !missing(wage,grade,race,union) & inrange(age,36,43), meanonly
matrix results[4,2] = r(mean)

// use -esttab- to display the results
// you can add options to store this as for example a LaTeX file
esttab matrix(results, fmt(2))
*------------------- end example -------------------
(For more on examples I sent to the Statalist see: )
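
A possible extension of the example above, as a minimal sketch only: the
request also asked for the number of observations and for Excel or TeX
output. The sketch below loops over the same four conditions, adds a third
column N, and writes the matrix to a file with -esttab using-. The file
names results.tex and results.csv are placeholders chosen for illustration,
and N here is assumed to mean the number of observations that satisfy each
condition (not the number of non-missing values of each variable).

```stata
*------------------ begin sketch ------------------
sysuse nlsw88, clear

// the two conditions used in the example above
local inage inrange(age,36,43)
local nomis !missing(wage,grade,race,union)

// one row per sample; columns: mean wage, mean grade, N
matrix results = J(4,3,.)
matrix colnames results = wage grade N
matrix rownames results = overall right_age not_missing final

// "1" is always true, so the first row uses all observations
local i = 1
foreach cond in "1" "`inage'" "`nomis'" "`inage' & `nomis'" {
    sum wage if `cond', meanonly
    matrix results[`i',1] = r(mean)
    sum grade if `cond', meanonly
    matrix results[`i',2] = r(mean)
    // observations satisfying the condition
    quietly count if `cond'
    matrix results[`i',3] = r(N)
    local ++i
}

// display, then store as LaTeX and as CSV (which Excel can open)
// fmt(2) applies to every column, so N is also shown with two decimals
esttab matrix(results, fmt(2))
esttab matrix(results, fmt(2)) using results.tex, replace
esttab matrix(results, fmt(2)) using results.csv, replace
*------------------- end sketch -------------------
```

The loop keeps the conditions in one place, so changing the age bracket or
the list of covariates only requires editing the two local macros.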

-- Maarten

Maarten L. Buis
Reichpietschufer 50
10785 Berlin
