Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: Sergiy Radyakin <serjradyakin@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: Questions on the rules guiding the -table- command
Date: Tue, 28 Aug 2012 18:59:02 -0400

Dear All,

I am looking for the formal rules of inclusion of empties into the tables that Stata produces and I have a number of [boring] questions.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

First, consider the following example:

    version 12.0
    clear all
    webuse nlsw88

    // does not include the "mining" category
    table industry union,c(mean wage) row, if union==1

    // does include the "mining" category
    generate unionwage=wage if union==1
    table industry union, c(mean unionwage) row

    // does not include the "mining" category
    tabulate industry union if union==1

In the results you may see that some of the tables include the "mining" category and some don't. I would like to learn more about the rules of inclusion and the background, if possible.

Also, how common is it to use the second approach of generating a variable with missings for inapplicable cases, rather than restricting the sample with an if-condition?

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

The second question is regarding the interpretation of the "Total" line. Consider the following example:

    version 12.0
    clear
    webuse nlsw88
    table industry, c(mean wage) row
    table industry, c(mean wage) row, if industry!=1
    replace industry=. if industry==1
    table industry, c(mean wage) row

There are [at least] two possible interpretations of the "Total" line.

1) The total line reflects the mean among all the valid observations (for which the row and the outcome variables are not missing).

2) The total line reflects the mean among all the observations where the outcome variable is not missing, regardless of whether the row variable is missing or not.

Stata seems to be using the first definition. Was this choice conscious? Why?

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Why is the option -concise- asymmetric (it works only for rows, not for columns)? Compare the output:

    table industry union, c(mean unionwage) row concise

and

    table union industry, c(mean unionwage) row concise

(I know it works only for rows according to the documentation, but why?)

Also, if there is any convention in reporting results shown in any of the cases above, I would like to get some references as well. I fully realize that "in some cases A is preferable, in others B is preferable". But there might be some studies as to how people interpret these situations without help or guidance, etc.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Finally, I would like to restate the 2009 question regarding the labels in the -table- command (see link):
http://www.stata.com/statalist/archive/2009-08/msg00505.html
adding to the above question: why can't I force -table- to show missing categories of row- and column-variables?

Thank you,
Sergiy Radyakin

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
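A minimal sketch for checking the two readings of the "Total" line from the second question, reusing the post's own nlsw88 example. It assumes that -summarize- with and without an `if !missing(industry)` qualifier is a fair stand-in for definitions 1 and 2; it is not a claim about how -table- computes the row internally.

```stata
version 12.0
clear
webuse nlsw88

// recode one industry to missing, as in the post's example
replace industry = . if industry == 1

// Reading 1: mean wage where both industry and wage are nonmissing
summarize wage if !missing(industry)

// Reading 2: mean wage over all observations with nonmissing wage,
// whether or not industry is missing
summarize wage

// the "Total" row to compare against the two means above
table industry, c(mean wage) row
```

Whichever of the two -summarize- means matches the "Total" row indicates which definition applies to this dataset.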
