Re: st: Questions on the rules guiding the -table- command


From   Clyde B Schechter <clyde.schechter@einstein.yu.edu>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Questions on the rules guiding the -table- command
Date   Wed, 29 Aug 2012 15:00:12 +0000

Sergiy Radyakin asks, among other things:

 First, consider the following example:

version 12.0
clear all
webuse nlsw88

// does not include the "mining" category
table industry union,c(mean wage) row, if union==1

// does include the "mining" category
generate unionwage=wage if union==1
table industry union, c(mean unionwage) row

// does not include the "mining" category
tabulate industry union if union==1

In the results you may see that some of the tables include the "mining" category
and some don't. I would like to learn more about the rules of inclusion and the
background if possible.

I think the behavior he observes has nothing to do with the -table- command, but reflects the behavior of -if- conditions.

The manual says "The if exp qualifier restricts the scope of a command to those observations for which the value of the expression is true." [U] 11.1.3.

Put another way, for a command that only reports results,

<command> if <condition>

gives the same output as

preserve
keep if <condition>
<command>
restore
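
The correspondence described above can be checked directly on the data set from the question. The following is only a sketch, using -tabulate- as the reporting command; both versions should print the same table, with no Mining row in either.

```stata
* Sketch: an -if- qualifier versus temporarily dropping observations
webuse nlsw88, clear

* Version 1: restrict the command with an -if- qualifier
tabulate industry union if union == 1

* Version 2: keep only the matching observations, run the bare command,
* then bring the full data set back
preserve
keep if union == 1
tabulate industry union
restore
```

Note that the two forms are interchangeable only for commands that report results; a command that changes the data (such as -generate-) would have its changes undone by -restore-.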

In the nlsw88 data set, there are _no_ observations where industry is Mining and union == 1:


. tab industry union

                      |     union worker
             industry |  nonunion      union |     Total
----------------------+----------------------+----------
Ag/Forestry/Fisheries |        10          2 |        12 
               Mining |         2          0 |         2 
         Construction |        17          3 |        20 
        Manufacturing |       234         84 |       318 
Transport/Comm/Utilit |        39         48 |        87 
Wholesale/Retail Trad |       239         21 |       260 
Finance/Ins/Real Esta |       144          9 |       153 
  Business/Repair Svc |        51          8 |        59 
    Personal Services |        58          5 |        63 
Entertainment/Rec Svc |        12          2 |        14 
Professional Services |       500        220 |       720 
Public Administration |       101         56 |       157 
----------------------+----------------------+----------
                Total |     1,407        458 |     1,865 


When a command is issued with -if union == 1-, it is carried out on a (virtual) data set in which there are no observations where industry is Mining. Mining therefore will not appear in the output of that command (nor be involved in its computations), any more than would, say, "Health Care", which doesn't appear anywhere in the data set at all.

By contrast, the command that produced output which did include the Mining category has no -if- condition, so -table- is applied to the entire data set, including those observations where industry is Mining, notwithstanding the absence of any values for the variable unionwage.
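
To see this concretely, one can count the Mining observations with and without a usable value of unionwage. This is a minimal sketch; the string variable indname is a hypothetical helper introduced here only so the category can be referred to by its label.

```stata
* Sketch: the Mining observations remain in the data, but carry no unionwage
webuse nlsw88, clear
generate unionwage = wage if union == 1

* indname is a hypothetical helper holding the industry label as text
decode industry, generate(indname)

* Mining observations present in the full data set
count if indname == "Mining"

* Mining observations that are union members (the tabulation above shows none)
count if indname == "Mining" & union == 1

* Mining observations with a nonmissing unionwage
count if indname == "Mining" & !missing(unionwage)
```

Given the cross-tabulation above, the last two counts should be zero while the first is not, which is why -table- without an -if- qualifier still lists the category.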

His second question, about the interpretation of the Total row, is, I believe, similarly explained by the way -if- operates.

His other question I cannot answer--I think only those who authored the -table- command would know.

Hope this helps.

Clyde Schechter
Department of Family & Social Medicine
Albert Einstein College of Medicine
Bronx, New York, USA