
st: missing data, tab ..., matcell(...) and marksample


From   "Paul A. Jargowsky" <paul.jargowsky@utdallas.edu>
To   statalist@hsphsun2.harvard.edu
Subject   st: missing data, tab ..., matcell(...) and marksample
Date   Fri, 08 Aug 2003 14:41:01 -0500

I have written a program that replaces tabulate and adds the ability to format the output, whether frequencies or row/col/cell percents. (Oddly, there isn't a built-in way to do this in Stata: tabulate doesn't allow formatting of output, and table doesn't allow row, col, or cell percents.) The program also organizes the output a little differently, more to my liking.

The basic operation of the program is that "tab ..., matcell(X)" is invoked internally to produce a matrix of frequencies; these are then manipulated with matrix operations.
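As a minimal sketch of that approach (not the actual pjt code, which is not shown here), column percentages can be derived from the matcell() matrix with a couple of matrix operations. The auto data set and the matrix names X, R, C, T, and P are assumptions chosen only for illustration:

```stata
* Sketch: column percentages from the frequency matrix left by tabulate.
* Data set and matrix names are illustrative assumptions, not taken from pjt.
sysuse auto, clear
quietly tabulate rep78 foreign, matcell(X) matrow(R) matcol(C)

* Column totals: a row vector of ones times the frequency matrix
matrix T = J(1, rowsof(X), 1) * X

* Scale each column by 100/total by post-multiplying with inv(diag(T))
matrix P = 100 * X * inv(diag(T))

matrix list P, format(%6.1f)
```

Row percentages follow the same pattern with a pre-multiplication by the inverse of the diagonal matrix of row totals.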

The problem I am having concerns missing values in the row or column variable. I want to be able to retain them, but when I specify the "missing" option to my table command, the missing observations are ignored. I have discovered that the cause is the marksample command, which is needed for byableness and ifableness. In other words, in the following code:

marksample touse
....
.....
......
quietly tab `1' `2' [`weight' `exp'] if `touse', ///
    matcell(`X') matrow(`R') matcol(`C') `missing'
.......

the output matrix `X' will not have a row for the missing category of the row or column variable even if there should be one. But if I comment out "if `touse'", it works:

quietly tab `1' `2' [`weight' `exp'] /* if `touse'*/, ///
    matcell(`X') matrow(`R') matcol(`C') `missing'

Here is the output from the two versions of the do file, with a rigged data set:

With "if `touse'":
. pjt row col, col miss lines

-----------------------------------------------------------
            *** row by col (Column Percents) ***
-----------------------------------------------------------
                        col
       row         1         2         3     Total
-----------------------------------------------------------
         1      33.3      33.3      33.3      33.3
         2      33.3      33.3      33.3      33.3
         3      33.3      33.3      33.3      33.3
-----------------------------------------------------------
     Total     100.0     100.0     100.0     100.0
-----------------------------------------------------------

Without:
. pjt row col, col lines miss

-----------------------------------------------------------------------
                  *** row by col (Column Percents) ***
-----------------------------------------------------------------------
                            col
       row         1         2         3         .     Total
-----------------------------------------------------------------------
         1      25.0      25.0      25.0      33.3      26.7
         2      25.0      25.0      25.0      33.3      26.7
         3      25.0      25.0      25.0      33.3      26.7
         .      25.0      25.0      25.0       0.0      20.0
-----------------------------------------------------------------------
     Total     100.0     100.0     100.0     100.0     100.0
-----------------------------------------------------------------------

Is this a feature or a bug?
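
For reference, the behavior is consistent with how marksample is documented: by default it sets the touse variable to 0 for any observation with a missing value in a varlist variable, and its novarlist option suppresses that exclusion. Below is a minimal sketch of one possible workaround, assuming a hypothetical wrapper named demo_tab (not pjt itself), a returned matrix named r(freq), and a rigged data set in which every combination of 1, 2, 3, and missing occurs once except missing-by-missing:

```stata
* Hypothetical wrapper for illustration; demo_tab and r(freq) are assumed names.
capture program drop demo_tab
program define demo_tab, rclass byable(recall)
    version 8
    syntax varlist(min=2 max=2) [if] [in] [fweight] [, MISSing]
    if "`missing'" != "" {
        * novarlist: do not mark out observations with missing varlist values
        marksample touse, novarlist
    }
    else {
        * default: observations with missing row or column values are excluded
        marksample touse
    }
    tokenize `varlist'
    tempname X R C
    quietly tabulate `1' `2' [`weight'`exp'] if `touse', ///
        matcell(`X') matrow(`R') matcol(`C') `missing'
    matrix list `X'
    return matrix freq = `X'
end

* Rigged data (an assumption modeled on the output shown above)
clear
input row col
1 1
1 2
1 3
1 .
2 1
2 2
2 3
2 .
3 1
3 2
3 3
3 .
. 1
. 2
. 3
end

demo_tab row col, missing
```

With the missing option, touse still honors if, in, weights, and by, but the missing categories survive into the frequency matrix; without it, the default exclusion applies as before.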

Paul Jargowsky



=========================================================================
Paul A. Jargowsky, Ph.D., Assoc. Prof. of Political Economy
Director, The Bruton Center, School of Social Sciences (GR 31)
University of Texas at Dallas, 2601 North Floyd Road, Richardson TX 75080
=========================================================================
email: jargo@utdallas.edu or Paul.Jargowsky@utdallas.edu
Home page: http://www.utdallas.edu/~jargo
Voice: 972-883-2992; FAX: 972-883-2735
=========================================================================