Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Tabout including all categories

From	Nick Cox <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: Tabout including all categories
Date	Thu, 29 Dec 2011 15:04:18 +0000

Expecting -tabout- to know about -fre- just won't work (both SSC).

One alternative is -groups- (SSC).

Nick
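
For cells with zero frequencies in a two-way table, one possibility that stays within official Stata is -contract- with its zero option, which keeps combinations that never occur and records them with a frequency of 0. This is only a sketch: it assumes agecat and year from the original post are in memory, the name n is arbitrary, and it will not show categories that are labeled but never observed anywhere in the data. Getting the result into the -tabout- layout is a separate step not shown here.

```stata
* sketch only: assumes agecat and year (from the original post) are in memory
preserve

* one row per agecat-by-year combination; zero keeps combinations that never occur
contract agecat year, zero freq(n)

* n is an arbitrary name for the frequency variable
list agecat year n, sepby(agecat) noobs

restore
```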


On 28 Dec 2011, at 17:35, Elizabeth Knaster <[email protected]> wrote:

Thanks for your reply. Yes, I meant to say "cells with zero frequencies." Any ideas?


Take care,

Liz

Elizabeth Knaster, MPH
Project Coordinator
Urban Indian Health Institute
Seattle Indian Health Board
Phone: 206-812-3032
Fax: 206-812-3044
Email: [email protected]

Sign up for the UIHI's Weekly Resource E-mail here or subscribe at http://www.uihi.org/. The Weekly Resource E-mail is UIHI's primary communication on opportunities for staff development, grant announcements and other relevant public health information.



------------------------------

Date: Thu, 22 Dec 2011 20:22:56 +0000
From: Nick Cox <[email protected]>
Subject: Re: st: Tabout including all categories

Showing zero values is not a problem with any tabulation command. Do
you mean cells with zero frequencies?

Nick

On Thu, Dec 22, 2011 at 7:31 PM, Elizabeth Knaster <[email protected]> wrote:

Hello! I could use some help with tabout:
I want to use tabout to produce tables with all categories of a variable, even if the value is equal to zero. I have installed fre from SSC and have successfully used includelabeled, for example, "fre agecat, includelabeled", but I am unable to use "includelabeled" with tabout. Is there a way to incorporate fre and includelabeled in the tabout syntax? Or is there some other way to have tabout display all categories of a variable, including zero?
This is the current code I am using, for reference:

foreach var0 in sex agecat durdmcat dmtype BMIcat   {
tabout `var0' year using "AllSitesTrends.xls", append mi c(freqcol) f(0 3p) clab(N %)
}

Thanks, and happy holidays!

Liz

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



------------------------------

Date: Thu, 22 Dec 2011 15:44:51 -0500
From: Eric N <[email protected]>
Subject: st: random effects models with weighted observations

My understanding is that there are some user-written commands such as
xtregre2, since -xtreg, re- does not permit weighting of observations.
Does anybody have experience with xtregre2 and some advice on using
it?

--Eric

------------------------------

Date: Thu, 22 Dec 2011 15:56:37 -0500
From: Austin Nichols <[email protected]>
Subject: Re: st: analysis of cluster of fungal infection in an ICU-unit


roland andersson <[email protected]>:

I meant that if you just want a test of whether a given type of
infection is more likely after the same type, which you have already
said you observed in a graph, you could run a simple mlogit.  No
infection could also be a category modeled, and you could include all
the negative results.

For example, here is a case where the null is true (no clustering
implied by the DGP):

clear
range id 1 1000 1000
g type=ceil(uniform()*6)
tsset id
g lasttype=l.type
mlogit type i.lasttype
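
A hedged variant of the simulation above sets a seed so the run is reproducible, uses runiform() in place of the older uniform(), and ends with a joint Wald test that the lagged type has no effect. The seed value and the use of -testparm- are assumptions added for illustration, not part of the original suggestion.

```stata
* sketch: the null is true by construction (type is drawn independently)
clear
set seed 12345                       // arbitrary seed, for reproducibility only
set obs 1000
generate id = _n                     // order of tests
generate type = ceil(runiform()*6)   // six infection types, equally likely
tsset id
generate lasttype = L.type           // type observed at the previous test
mlogit type i.lasttype

* joint test of all lasttype indicators
testparm i.lasttype
```

Because there is no clustering in these simulated data, a small p-value from the joint test should occur only at about the nominal rate over repeated runs.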

A more sensible analysis might use duration with exact times of tests
and entry into the ICU, as opposed to simple order of test for
infection, and try to isolate the actual mechanism causing the
observed clustering, perhaps using a competing risks analysis on time
to infection (with leaving the ICU being a censoring event). A good
model should incorporate a deep understanding of the science and
setting, which I do not have for fungal infections in an ICU.  I would
suspect ceiling tiles before staff, for example, but you clearly have
a reason for suspecting the staff of transmitting the infections.

On Wed, Dec 21, 2011 at 5:50 PM, roland andersson
<[email protected]> wrote:

Austin

<snip>

I do not understand what you mean by  "You could just run an -mlogit-
of type on last type"?



------------------------------

Date: Thu, 22 Dec 2011 16:40:55 -0500
From: "Data Analytics Corp." <[email protected]>
Subject: Re: st: RE: Hierarchical Bayes with MCMC

Hi,

This is good news.  I'll definitely look at the web site and pdf file.

Thanks for the help,

Walt

________________________

Walter R. Paczkowski, Ph.D.
Data Analytics Corp.
44 Hamilton Lane
Plainsboro, NJ 08536
________________________
(V) 609-936-8999
(F) 609-936-3733
[email protected]
www.dataanalyticscorp.com
_____________________________________________________

On 12/22/2011 6:23 AM, George Leckie wrote:

Following on from Nick Cox's comment.

Yes, you can fit multilevel logistic regression models by Bayesian
estimation (MCMC) in Stata by using the runmlwin command to call the MLwiN
statistical software package.
You can also fit a wide range of other multilevel models by both likelihood
and Bayesian methods.

We gave a talk on runmlwin at the recent UK Stata Users' Group, 17th
Meeting (16th September 2011)

http://www.bristol.ac.uk/cmm/media/runmlwin/London.pdf
We have also set up a runmlwin website for interested users which contains comprehensive documentation, worked examples and an active discussion forum
http://www.bristol.ac.uk/cmm/software/runmlwin/
In particular, see our examples page where there are sample datasets,
do-files and log files showing you how to fit multilevel logistic
regressions by MCMC as well as many other models.

http://www.bristol.ac.uk/cmm/software/runmlwin/examples/

The command can be downloaded from SSC in the usual way

. ssc install runmlwin, replace
While MLwiN is a commercial package, MLwiN is free to UK academics (thanks to ESRC funding body) and a fully functional 30-day free version of MLwiN
is available to all other users.

Best wishes

George



------------------------------

Date: Thu, 22 Dec 2011 18:08:05 -0400
From: Daniel Marcelino <[email protected]>
Subject: st: capture results from tabulate

Dear all,

I am looking for a way to capture the highest value shown in the "Freq."
column, as well as the label of v3, for each city table. Any ideas?

bysort city : tabulate v3 [iw=weight]


/*replication*/
clear
input str2 city weight byte(v1 v2 v3)
 "a" .5 1 2 3
 "a" .1 2 3 4
 "a" .9 3 2 5
  "a" .8 3 4 2
 "a" .2 4 5 1
 "a" .3 5 1 3
 "b" .4 1 4 3
 "b" .1 2 3 4
 "b" .6 3 2 5
"b" .8 4 1 2
"b" .5 4 5 1
"b" .7 1 5 4
"c" .4 2 1 3
"c" .2 2 4 1
"c" .7 3 4 5
"c" .3 4 1 2
"c" .8 4 5 1
"c" .1 5 4 3
end

------------------------------

Date: Thu, 22 Dec 2011 22:16:48 +0000
From: Nick Cox <[email protected]>
Subject: Re: st: capture results from tabulate

Use -contract- instead. Under -by:-, only the results for the last table are saved.

Nick
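
A minimal sketch of the reduce-then-pick idea, applied to the replication data from the question. One caveat: -contract- accepts only frequency weights, so with the fractional weights used here the sketch swaps in -collapse (sum)- to total the weights for each city and value of v3. The variable name freq is an arbitrary choice.

```stata
* replication data from the original question
clear
input str2 city weight byte(v1 v2 v3)
"a" .5 1 2 3
"a" .1 2 3 4
"a" .9 3 2 5
"a" .8 3 4 2
"a" .2 4 5 1
"a" .3 5 1 3
"b" .4 1 4 3
"b" .1 2 3 4
"b" .6 3 2 5
"b" .8 4 1 2
"b" .5 4 5 1
"b" .7 1 5 4
"c" .4 2 1 3
"c" .2 2 4 1
"c" .7 3 4 5
"c" .3 4 1 2
"c" .8 4 5 1
"c" .1 5 4 3
end

* weighted frequency of each v3 value within each city
collapse (sum) freq = weight, by(city v3)

* within city, sort by freq (then v3) and keep the row with the largest total
bysort city (freq v3): keep if _n == _N

list city v3 freq, noobs
```

Ties within a city are resolved by the secondary sort on v3, which is a choice made here for determinism rather than anything the question specified. If v3 carries a value label, the listing should show the label rather than the numeric code. With integer counts, -contract city v3 [fw=weight]- would serve the same purpose as the -collapse- line.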

On 22 Dec 2011, at 22:08, Daniel Marcelino <[email protected]> wrote:

Dear all,

I am looking for a way to capture the highest value shown in the "Freq."
column, as well as the label of v3, for each city table. Any ideas?

bysort city : tabulate v3 [iw=weight]


/*replication*/
clear
input str2 city weight byte(v1 v2 v3)
"a" .5 1 2 3
"a" .1 2 3 4
"a" .9 3 2 5
 "a" .8 3 4 2
"a" .2 4 5 1
"a" .3 5 1 3
"b" .4 1 4 3
"b" .1 2 3 4
"b" .6 3 2 5
"b" .8 4 1 2
"b" .5 4 5 1
"b" .7 1 5 4
"c" .4 2 1 3
"c" .2 2 4 1
"c" .7 3 4 5
"c" .3 4 1 2
"c" .8 4 5 1
"c" .1 5 4 3
end


------------------------------

Date: Thu, 22 Dec 2011 18:18:00 -0500
From: Steve Samuels <[email protected]>
Subject: Re: st: standard errors after xtmixed, predit.., fitted

Correction:
If q = 1 - p
se_logit =  se_p/(p*q)
se_logit^2 = (se_p/(p*q))^2
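
The corrected expression is what the delta method gives for the logit transform. A short sketch of the step, with the numeric values chosen purely for illustration:

```latex
\begin{align*}
\operatorname{logit}(p) &= \log\frac{p}{1-p}, &
\frac{d}{dp}\operatorname{logit}(p) &= \frac{1}{p} + \frac{1}{1-p} = \frac{1}{pq}, \quad q = 1-p \\
\mathrm{se}_{\mathrm{logit}} &\approx \frac{\mathrm{se}_p}{pq}, &
\mathrm{se}_{\mathrm{logit}}^2 &\approx \frac{\mathrm{se}_p^2}{(pq)^2} \\
\text{e.g. } p = 0.2,\ \mathrm{se}_p = 0.02: \quad
\mathrm{se}_{\mathrm{logit}} &\approx \frac{0.02}{0.16} = 0.125, &
\mathrm{se}_{\mathrm{logit}}^2 &\approx 0.015625
\end{align*}
```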

Steve

Jennyfer.

If there are not many different regions at your highest level, I doubt that you should be fitting each a random effect-in what sense are they random?; fixed effects for the highest levels would probably be better.

In addition to a covariance term (below), you will need to add a term for the survey standard errors. If se_p is the estimated survey standard error for a proportion p, then the squared standard error for the logit to add would be: se_logit^2 = se_p^2/(p*(1-p)).

And yes, compute interval endpoints on the logit scale and convert back with the invlogit() function. And no, back-transformed standard errors (or SDs) need not look like those for the original data.
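
A minimal sketch of that back-transformation, using made-up values: the variables fitted and se_fitted stand in for a logit-scale prediction and its standard error, and the 95% level is an assumption. Because invlogit() maps any real number into (0, 1), the resulting endpoints cannot fall outside that range.

```stata
* hypothetical logit-scale predictions and standard errors, for illustration only
clear
set obs 3
generate fitted    = _n - 2          // -1, 0, 1 on the logit scale
generate se_fitted = 0.5             // made-up standard error

* normal critical value for a 95% interval (assumed level)
scalar zcrit = invnormal(0.975)

* build the interval on the logit scale, then convert each endpoint back
generate p_hat = invlogit(fitted)
generate p_lb  = invlogit(fitted - zcrit*se_fitted)
generate p_ub  = invlogit(fitted + zcrit*se_fitted)

list, noobs
```

The interval is symmetric around fitted on the logit scale but generally asymmetric around p_hat once converted back.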


***********************
sysuse auto, clear
gen lprice = log(price)
mean price lprice
di exp(0.0455814)
******************

You have an additional problem if the yearly estimates for a single country are correlated by virtue of the survey design. I would have considered adding a correlation structure to the residuals.

Here is how to add the covariance contribution to the standard error for the Stata -productivity- example. Note that the "atr" term is the hyperbolic arctangent, not the log, of the correlation.


***********************************************
webuse productivity, clear

xtmixed gsp private emp hwy water other unemp ///
   || region: || state: unemp, cov(unstructured) reml
matrix list e(b)  //names of terms
scalar sd_err  = exp([lnsig_e]_cons )
scalar sd_region = exp([lns1_1_1]_cons)
scalar sd_state_u  = exp([lns2_1_1]_cons)
scalar sd_state   = exp([lns2_1_2]_cons)

scalar atrho = [atr2_1_1_2]_cons
scalar  rho =   (exp(2*atrho)-1)/(exp(2*atrho)+1)
scalar  cov = rho*sd_state*sd_state_u

scalar dir // check these quantities against results

predict fitted, fitted
predict se_fix,  stdp

gen se_fitted=  ///
sqrt(se_fix^2 +sd_region^2 +  sd_state^2 ///
+ (unemp*sd_state_u)^2 +unemp*2*cov + sd_err^2)
sum  se_fit*
*****************************************************

When you post in the future, please describe what you really did (Statalist FAQ Section 3.3). It will save a lot of time. Just to warn you: I'll have only infrequent looks at Statalist for the next 10 days.


Steve
[email protected]


On Dec 22, 2011, at 6:11 AM, Jennyfer Wolf wrote:

Dear Steve, thanks again so much!

I have data from different surveys (survey point estimates) and I use
a term for unstructured covariance in my model:

xtmixed wat_tot year_spline1*|| reg1: || reg2: || country2:year_cat,
cov(unstructured).

so I will add an error term for this in my calculation of the fitted
standard error:
scalar sd_cov = exp([atr3_1_1_2]_cons)

Actually wat_tot (the dependent variable) is transformed with logit(),
to restrict observations between 0 and 1 as I am modelling
proportions.

After "predict A, fitted" I use the invlogit() function to get the
back-transformed estimates. However, I had problems back-transforming
the standard errors because when I compared standard errors received
without any transformation of the dependent variable and
backtransformed standard errors received from a transformed dependent
variable, these values were very different. Would you know a solution
to this (is it correct to also backtransform the fitted standard
error) and also (of course) I would like to restrict my confidence
intervals to values between 0 and 1.

One concern remains regarding the confidence intervals I calculate
with the fitted standard errors:
The model fits the individual country data very well and the
predictions for the estimates and the fitted values seem very
sensible, however, the standard error and the CIs calculated with the
method you proposed for the fitted values are huge and actually take
any sense away from making a prediction.
Does that mean I need to make my model simpler?

Thank you very much for the great support!
Jennyfer


2011/12/22 Steve Samuels <[email protected]>:

Jennyfer,

I misunderstood your request: my solution was for an observation chosen at random and it incorrectly omitted the residual SD term, to boot. Try this.


*******************************************
webuse productivity, clear
xtmixed gsp private emp hwy water other unemp ///
  || region: || state: unemp

matrix list e(b)  //names of terms
scalar sd_res = exp([lnsig_e]_cons)

predict se_fix,  stdp
predict se_region se_state_u se_state, reses
des se* //check against variable labels
gen se_fitted =  ///
sqrt(se_fix^2 +se_region^2 +  se_state^2 ///
+ (unemp*se_state_u)^2 +sd_res^2)
*******************************************

I think that in your case the last three statements will be:
******************************************************************
predict se_region1 se_region2  se_country_year se_country, reses
des se*  //check against variable labels
gen se_fitted =  ///
sqrt(se_fix^2 +se_region1^2 +  se_region2^2 ///
+ se_country^2 + (year_cat*se_country_year)^2 +sd_res^2)
******************************************************************

Note that these statements assume that there is no correlation between the country and countryXyear random effects, which is what your model implies. If there is such correlation (and you can test for it), then a covariance term must be added to the estimated standard error.

If you happen to have sample survey data, then be sure to read the section on Survey Data in the manual entry for -xtmixed-.


Steve
[email protected]


On Dec 21, 2011, at 10:01 AM, Jennyfer Wolf wrote:

Thank you very much for your answer. I've tried it in many different
variations but I guess there are problems with this approach:

1. the squared standard deviations that we are adding up are
describing variation from the fixed effects but, if I understand
correctly, not the error of the model

2. the CIs I need describe the uncertainty for the estimates for each
country so countries with more datapoints have a narrower CI and also
for future predictions the CI should get wider (which does not happen
with the approach you suggested).

I tried gllamm and used the "ci_marg_mu" command after "gllapred x, mu
marg fsample" but this does not fit to my individual country data and
still gives me the same CIs no matter how many survey points I have
per country.

Any more ideas on how to get confidence intervals after "xtmixed" and
"predict x, fitted" for the predicted values in multilevel modeling?
(Alternatively with gllamm)
Thank you very, very much.

Jennyfer


2011/12/17 Steve Samuels <[email protected]>:

Correction: I should not have included the SD for the error term, as it is not part of the fitted value.
Here's an example more like yours, but with two levels, not three. I expect t


References:
- st: Tabout including all categories
  - From: Elizabeth Knaster <[email protected]>
