Statalist



Re: AW: st: Different Results for the same estimation


From   Johannes Schoder <johannes.schoder@soi.uzh.ch>
To   statalist@hsphsun2.harvard.edu
Subject   Re: AW: st: Different Results for the same estimation
Date   Wed, 16 Sep 2009 15:14:38 -0400

Hi Martin:
Nice, your suggestion works. You are a genius!!
Thanks a lot for your help, I really appreciate it!!!
Johannes


Martin Weiss wrote:

Not sure whether your data transferred well, but this is probably close to
what you want :-)


**************
clear*

inp  id  time_in_months county/*
*/ str10 cancer
1    13 2  breast
2    14 2  breast
3    1  2  breast
end

compress
list, noobs
bys county cancer: /*
*/egen N_survivorsOneYear  /*
*/ =total((time_in_months>12))

list, noobs

**************
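
One caveat worth keeping in mind with the snippet above: Stata treats missing numeric values as larger than any number, so `time_in_months>12` is also true when the survival time is missing. A minimal sketch of a variant that leaves such observations out of the count, assuming that is the desired behavior (the variable name `N_survivorsOneYear2` is made up for illustration):

```stata
* Hypothetical variant: count survivors, ignoring missing survival times.
* In Stata, missing (.) compares as greater than any number, so the extra
* !missing() condition keeps missing values out of the total.
bys county cancer: egen N_survivorsOneYear2 = total(time_in_months>12 & !missing(time_in_months))

list, noobs
```

On the three example observations, which have no missing values, this gives the same count as the original command.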


HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Johannes Schoder
Sent: Wednesday, 16 September 2009 00:39
To: statalist@hsphsun2.harvard.edu
Subject: Re: AW: st: Different Results for the same estimation

I found another bug in my calculations:
I have the number of diagnosed cancer cases per cancer type and county, together with the survival time in months, and I wanted to calculate the number of people surviving one year per county and cancer type. However, I did it wrong. How can I generate a variable that gives me the number of people who survived 12 months?

bysort CancerType COUNTY: egen N_SurvivorsOneYear = count(time_in_months) if time_in_months>12


When time_in_months<12, N_SurvivorsOneYear gets zero or "." (missing value),
but I want it to take the number of survivors per disease and per county.
I know my description sounds confusing, so here is an example:

id  time_in_months  county  N_survivorsOneYear  cancer type
1   13              01      2                   breast
2   14              02      2                   breast
3   1               03      .                   breast

Instead of the "." (missing value) in N_survivorsOneYear for id 3, I want to have "2".
Thanks a lot for your help!
Johannes




Martin Weiss wrote:

You can always -collapse- or make up a fake identifier as

-bys County disease: gen personid=_n-
-la var personid "Fake Identifier"-


To appreciate the meaning of this command, check Nick's
http://www.stata-journal.com/sjpdf.html?articlenum=pr0004
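
The -collapse- route mentioned above could look roughly like this. It is only a sketch under assumptions: `County` and `disease` are the grouping variables used in this thread, `time_in_months` is the survival-time variable from the later message, and `survived12`, `N_survivors` and `N_cases` are names invented for illustration.

```stata
* Hypothetical sketch: one row per County-disease cell.
* survived12 is 1 if the person survived more than 12 months, 0 otherwise,
* and missing when the survival time itself is missing.
gen byte survived12 = time_in_months > 12 if !missing(time_in_months)

* (sum) adds up the 0/1 indicator; (count) counts nonmissing survival times.
collapse (sum) N_survivors = survived12 (count) N_cases = time_in_months, by(County disease)
```

Note that -collapse- replaces the data in memory, so -preserve- first if the individual-level data are still needed.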



HTH
Martin


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Johannes
Schoder
Sent: Tuesday, 15 September 2009 22:16
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Different Results for the same estimation

Hi Martin:
Thanks a lot for your help.
Yes, you are right, I have nesting levels: within counties there are diseases that afflict individuals. Unfortunately, I (or the data provider) messed something up when importing the data. I just realized that many individuals share the same identifier value although they are not the same person, so I can't really use the id number. Is there an alternative way to aggregate the individual-level data to the county level?
Johannes


Martin Weiss wrote:


So there are three nesting levels? Within counties, there are diseases
afflicting individuals? If that is the case, you should amend your
command as

- bysort County disease (individual): keep if _n==1-

to make it stable for the -glm- analysis. "individual" should be replaced
by some identifier variable, like an id number.
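
The reason for the tie-breaker: -sort- orders tied observations randomly, so without it the first observation within each County-disease group can differ from run to run. A minimal sketch, assuming a hypothetical identifier variable called `personid` that is unique within each group:

```stata
* personid is a hypothetical identifier; replace it with your own variable.
* -isid- stops with an error unless the listed variables uniquely
* identify the observations, so ties cannot slip through unnoticed.
isid County disease personid

* With a unique tie-breaker in parentheses, the sort order, and hence the
* observation that is kept, is the same on every run.
bysort County disease (personid): keep if _n==1
```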

Also look at -egen, tag()- as -drop-ping is not generally the best
approach to conducting a restricted analysis ("How are you going to get
the dropped obs back when you need them quickly?").
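
As a rough illustration of the -egen, tag()- idea, assuming the grouping variables are named `County` and `disease` as in the command above (the variable name `pick` is invented here):

```stata
* Hypothetical sketch: flag exactly one observation per County-disease
* group instead of dropping the others.
egen pick = tag(County disease)

* Check how many observations are flagged, then restrict later commands
* with "if pick" rather than deleting data.
count if pick
```

All observations stay in memory, so the restriction can be lifted again simply by leaving out the `if pick` qualifier.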

Also look at -xtmixed- and its brothers, as your analysis sounds like a
good case for them...


HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Johannes
Schoder
Sent: Tuesday, 15 September 2009 20:17
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Different Results for the same estimation

I found the bug:

Since I am using the following command before the estimation:
bysort County disease: keep if _n==1
Stata probably kicks out different observations each time.
Does anyone know how to avoid that? A similar question was posed a couple of days ago (how to delete duplicate observations), and Martin recommended the following command, which I adapted (see above):

bysort ID: keep if _n==1



However, my problem is not exactly the same:
Since I would like to aggregate my individual-level data to the county level, I would like to keep just one observation per county and cancer type, i.e. 98 observations per county, since there are 98 different cancer types. The observations I would like to drop are therefore not the same individuals; they just live in the same county and suffer from the same disease.

Thanks for your help!!
Johannes






Johannes Schoder wrote:
Dear Statalist users:

When I estimate the same model several times in a row (on the same computer):

xi: glm [dep. var.] [indep. var.] i.county i.year, family (binomial weight) link(logit)

I get different results for exactly the same specification.
Does anyone know what's going on here? Is it because of the different number of iterations (sometimes 8, 9 or 10)? Which results are right? What can I do to get identical results for the same estimation?
Thanks a lot for any suggestion!
Johannes

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

© Copyright 1996–2014 StataCorp LP