Statalist



RE: AW: st: Different Results for the same estimation


From   "Martin Weiss" <martin.weiss1@gmx.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: AW: st: Different Results for the same estimation
Date   Wed, 16 Sep 2009 00:52:37 +0200

<>

Not sure whether your data transferred well, but this is probably close to
what you want :-)


**************
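clear*

* Example data: one row per patient
input id time_in_months county str10 cancer
1    13 2  breast
2    14 2  breast
3    1  2  breast
end

compress
list, noobs

* Count survivors (more than 12 months) within each county/cancer group;
* total() sums the 0/1 indicator and writes the sum to every row of the group
bysort county cancer: egen N_survivorsOneYear = total(time_in_months > 12)

list, noobs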

**************
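
One caveat with the snippet above: Stata treats a missing value as larger than any number, so a patient with missing time_in_months would satisfy time_in_months>12 and be counted as a survivor. The following is only a sketch under that assumption; the extra patient (id 4) and the variable names n_if, n_naive and n_surv are made up for illustration. It also shows why the -if- qualifier in the original attempt leaves rows empty.

```stata
clear
input id time_in_months county str10 cancer
1 13 2 breast
2 14 2 breast
3  1 2 breast
4  . 2 breast
end

* Attempt from the question: the -if- qualifier restricts which rows get a
* value, so the row for id 3 (1 month) is left missing
bysort county cancer: egen n_if = count(time_in_months) if time_in_months > 12

* Unguarded indicator: the missing time for id 4 counts as "> 12"
bysort county cancer: egen n_naive = total(time_in_months > 12)

* Guarded indicator: only nonmissing times above 12 months are counted
bysort county cancer: egen n_surv = total(time_in_months > 12 & !missing(time_in_months))

list, noobs
```

Whether "survived 12 months" should be coded as >12 or >=12 depends on how the survival time was recorded; the sketch keeps the >12 used in the thread.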


HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Johannes Schoder
Sent: Wednesday, 16 September 2009 00:39
To: statalist@hsphsun2.harvard.edu
Subject: Re: AW: st: Different Results for the same estimation

I found another bug in my calculations:
I have the diagnosed cancer cases by cancer type and county, together
with the survival time in months, and I wanted to calculate the
number of people surviving one year per county and cancer type. However,
I did it wrong.
How can I generate a variable that gives me the number of people who
survived 12 months?

bysort CancerType COUNTY: egen N_SurvivorsOneYear = count(time_in_months) if time_in_months>12


When time_in_months<12, N_SurvivorsOneYear is zero or "." (missing value),
but I want it to take the value of the number of survivors per
disease and per county.
I know my description sounds confusing, so here is an example:

id  time_in_months  county  N_survivorsOneYear  cancer type
1   13              01      2                   breast
2   14              02      2                   breast
3   1               03      .                   breast

Instead of the "." (missing value) in N_survivorsOneYear for id 3, I want
to have "2".
Thanks a lot for your help!
Johannes
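
Since the underlying goal in this thread is one row per county and cancer type, the same count can also be obtained by aggregating with -collapse-, which is mentioned further down. This is only a rough sketch: the lung cancer row and the names survived12, one_per_group, n_survivors and n_cases are invented for illustration. -preserve- and -restore- are used so that the individual-level rows are not lost.

```stata
clear
input id time_in_months county str10 cancer
1 13 2 breast
2 14 2 breast
3  1 2 breast
4 20 2 lung
end

* 0/1 indicator per person; !missing() guards against missing survival times
generate byte survived12 = time_in_months > 12 & !missing(time_in_months)

* Non-destructive route: flag one row per county/cancer group
egen one_per_group = tag(county cancer)
list if one_per_group, noobs

* Aggregating route: keep a copy of the data, collapse, then restore it
preserve
collapse (sum) n_survivors = survived12 (count) n_cases = time_in_months, by(county cancer)
list, noobs
restore
```

With the tag variable, later commands can be restricted with "if one_per_group" instead of dropping observations.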
 





Martin Weiss wrote:
> <> 
>
>
> You can always -collapse- or make up a fake identifier as 
>
>
> -bys County disease: gen personid=_n-
> -la var personid "Fake Identifier"-
>
>
> To appreciate the meaning of this command, check Nick's
> http://www.stata-journal.com/sjpdf.html?articlenum=pr0004
>
>
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Johannes
> Schoder
> Sent: Tuesday, 15 September 2009 22:16
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Different Results for the same estimation
>
> Hi Martin:
> Thanks a lot for your help.
> Yes, you are right, I have nesting levels: within counties there are
> diseases that afflict individuals.
> Unfortunately I (or the data provider) messed something up when
> importing the data. I just realized that a lot of individuals
> share the same identifier value (although they are not the same person), so I
> can't really use the id number.
> Is there any alternative way of aggregating the individual-level data to the
> county level?
> Johannes
>
>
> Martin Weiss wrote:
>> <>
>>
>>
>> So there are three nesting levels? Within counties, there are diseases
>> afflicting individuals? If that is the case, you should amend your command
>> as
>>
>> - bysort County disease (individual): keep if _n==1-
>>
>> to make it stable for the -glm- analysis. "individual" should be replaced by
>> some identifier variable, like an id number.
>>
>>
>> Also look at -egen, tag()- as -drop-ping is not generally the best approach
>> to conducting a restricted analysis ("How are you going to get the dropped
>> obs back when you need them quickly?").
>>
>> Also look at -xtmixed- and its brothers, as your analysis sounds like a good
>> case for them...
>>
>>
>> HTH
>> Martin
>>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu
>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Johannes Schoder
>> Sent: Tuesday, 15 September 2009 20:17
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: Different Results for the same estimation
>>
>> I found the bug:
>>
>> I am using the following command before the estimation:
>> bysort County disease: keep if _n==1
>> Stata probably drops different observations each time.
>> Does anyone know how to avoid that? A similar question was posed a
>> couple of days ago
>> ("How to delete duplicate observations"); Martin recommended the following
>> command, which I used (see above):
>>
>> bysort ID: keep if _n==1
>>
>>
>>
>> However, my problem is not exactly the same:
>> Since I would like to aggregate my individual-level data to the county
>> level, I would like to keep just one observation per county and per
>> cancer type, i.e. 98 observations per county (there are 98 different
>> cancer types).
>> Therefore the observations I would like to drop are not the same
>> individuals; they just live in the same county and suffer from the same
>> disease.
>>
>> Thanks for your help!!
>> Johannes
>>
>>
>>
>>
>>
>>
>> Johannes Schoder wrote:
>>> Dear Statalist users:
>>>
>>> When I estimate the same model several times in a row (on the
>>> same computer):
>>> xi: glm [dep. var.] [indep. var.] i.county i.year, family(binomial weight) link(logit)
>>>
>>> I get different results for exactly the same specification.
>>> Does anyone know what's going on here? Is it because of the different
>>> number of iterations (sometimes 8, 9 or 10)?
>>> Which results are right? What can I do to get identical results for
>>> the same estimation?
>>> Thanks a lot for any suggestion!
>>> Johannes
>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/