# Re: AW: st: Different Results for the same estimation

From: Johannes Schoder
To: statalist@hsphsun2.harvard.edu
Subject: Re: AW: st: Different Results for the same estimation
Date: Wed, 16 Sep 2009 15:14:38 -0400

```
Hi Martin:
Nice, your suggestion works. You are a genius!!
Thanks a lot for your help, I really appreciate it!!!
Johannes

Martin Weiss wrote:
```
```
<>

Not sure whether your data transferred well, but this is probably close to
what you want :-)

**************
clear*

inp  id  time_in_months county/*
*/ str10 cancer
1    13 2  breast
2    14 2  breast
3     1 2  breast
end

compress

list, noobs

bys county cancer: /*
*/egen N_survivorsOneYear  /*
*/ =total((time_in_months>12))

list, noobs

**************

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Johannes Schoder
Sent: Wednesday, 16 September 2009 00:39
To: statalist@hsphsun2.harvard.edu
Subject: Re: AW: st: Different Results for the same estimation

I found another bug in my calculations:
```
Since I have the number of diagnosed cancer cases per cancer type and county, along with the survival time in months, I wanted to calculate the number of people surviving one year per county and cancer type. However, I did it wrong. How can I generate a variable that gives me the number of people who survived 12 months?

```
bysort CancerType COUNTY: egen N_SurvivorsOneYear = count(time_in_months) if time_in_months>12
```

When time_in_months<12, N_SurvivorsOneYear becomes zero or "." (missing value), but I want it to take the value of the number of survivors per disease and per county. I know my description sounds confusing, so here is an example:

```
id  time_in_months  county  N_survivorsOneYear  cancer type
1   13              01      2                   breast
2   14              02      2                   breast
3   1               03      .                   breast
```

Instead of the "." missing value of N_survivorsOneYear for id 3, I want to have "2".

```
Thanks a lot for your help!
Johannes
```
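A minimal sketch of why the -if- version leaves gaps, assuming the toy data from the example with all three cases placed in one county so that they form a single group. The names n_if and n_all are illustrative and not from the thread; the !missing() guard is an addition, included because Stata treats missing values as larger than any number.

```stata
* Sketch only: toy data, all three cases placed in one county (an assumption)
clear
input id time_in_months county str10 cancer
1 13 2 breast
2 14 2 breast
3  1 2 breast
end

* -if- restricts which observations receive a result, so id 3 is left missing
bysort cancer county: egen n_if = count(time_in_months) if time_in_months > 12

* summing a true/false expression counts the survivors and fills every row of the group;
* !missing() keeps cases with unknown survival time from being counted as survivors
bysort cancer county: egen n_all = total(time_in_months > 12 & !missing(time_in_months))

list, noobs
```

With these data, n_if should be 2 for ids 1 and 2 and missing for id 3, while n_all should be 2 in every row.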
```
Martin Weiss wrote:
```
```
<>

You can always -collapse- or make up a fake identifier as

-bys County disease: gen personid=_n-
-la var personid "Fake Identifier"-

To appreciate the meaning of this command, check Nick's
http://www.stata-journal.com/sjpdf.html?articlenum=pr0004

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Johannes
Schoder
Sent: Tuesday, 15 September 2009 22:16
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Different Results for the same estimation

Hi Martin:
Thanks a lot for your help.
```
Yes, you are right, I have nesting levels: within counties there are diseases that afflict individuals. Unfortunately, I (or the data provider) messed something up when importing the data. I just realized that I have a lot of individuals with the same identifier variable (although they are not the same people), so I can't really use the id number. Is there any alternative way of aggregating the individual-level data to the county level?
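One way to aggregate without relying on a person identifier is -collapse-. The following is only a sketch: the toy values are made up, and survived12, n_cases and n_survivors are hypothetical names, not variables from the thread.

```stata
* Sketch only: variable names follow the thread, the toy values are made up
clear
input county str10 disease time_in_months
1 breast 13
1 breast 14
1 breast  1
2 breast 20
end

* flag one-year survivors, guarding against missing survival times
generate byte survived12 = time_in_months > 12 & !missing(time_in_months)

* one row per county and disease: number of cases and number of survivors
collapse (count) n_cases = time_in_months (sum) n_survivors = survived12, by(county disease)

list, noobs
```

Because every individual contributes to the group totals, it no longer matters which single observation per group would have been kept.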
```
Johannes

Martin Weiss wrote:
```
```
<>

So there are three nesting levels? Within counties, there are diseases
afflicting individuals? If that is the case, you should amend your command
as

- bysort County disease (individual): keep if _n==1-

to make it stable for the -glm- analysis. "individual" should be replaced by
some identifier variable, like an id number.

Also look at -egen, tag()- as -drop-ping is not generally the best approach
to conducting a restricted analysis ("How are you going to get the dropped
obs back when you need them quickly?").

Also look at -xtmixed- and its brothers, as your analysis sounds like a good
case for them...

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Johannes Schoder
Sent: Tuesday, 15 September 2009 20:17
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Different Results for the same estimation

I found the bug:

Since I am using the following command before the estimation:
bysort County disease: keep if _n==1
Stata probably kicks out different observations each time.
```
Does anyone know how to avoid that? A similar question was posed a couple of days ago (how to delete duplicate observations), and Martin recommended the following command, which I used (see above):

```
bysort ID: keep if _n==1

However, my problem is not exactly the same:
```

Since I would like to aggregate my individual-level data to the county level, I would like to keep just one observation per county and cancer type, that is, 98 observations per county rather than one (there are 98 different cancer types). The observations I would like to drop are therefore not the same individuals; they just live in the same county and suffer from the same disease.

```
Johannes

Johannes Schoder wrote:
```
```
Dear Statalist users:
```

When I estimate the same model several times in a row (on the same computer) with

```
xi: glm [dep. var.] [indep. var.] i.county i.year, family (binomial weight) link(logit)
```

I get different results for exactly the same specification. Does anyone know what's going on here? Is it because of the different number of iterations (sometimes 8, 9 or 10)? Which results are right? What can I do to get identical results for the same estimation?

```
Thanks a lot for any suggestion!
Johannes

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```
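For the instability in the original question, a minimal sketch of making the "first observation per group" reproducible. It assumes the data are read from disk in the same order on every run; personid and first are hypothetical names, and the tie-breaker only helps if it is unique within each county and disease.

```stata
* Sketch only: toy data; "personid" is a hypothetical tie-breaker, not a variable from the thread
clear
input county str10 disease time_in_months
1 breast 13
1 breast 14
1 breast  1
2 breast 20
end

* record the order in which the data were read, once, right after loading
generate long personid = _n

* the variable in parentheses orders observations within each group,
* so _n==1 picks the same observation on every run
bysort county disease (personid): generate byte first = _n == 1

* restrict with -if- rather than dropping, so nothing has to be recovered later
list if first, noobs
```

Without the variable in parentheses, the order within each county and disease group is not guaranteed to be the same from one sort to the next, which is consistent with the changing estimates described in the thread.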