
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: Losing Observations in Logit


From   Isobel Williams <[email protected]>
To   "[email protected]" <[email protected]>
Subject   RE: st: Losing Observations in Logit
Date   Sat, 1 Mar 2014 14:12:56 +0000

Fernando,

Thanks a lot for your help. Just by changing the condition to groupA==1 | groupB==1, I now have 15,988 observations. However, shouldn't I have 46,413 observations?
Group A has 32,187 observations and group B has 14,226, so a combined group A and group B sample should have 32,187 + 14,226 = 46,413.
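
For anyone following the thread, one way to see whether the two group sizes can simply be added is to cross-tabulate the indicators and count the overlap. The sketch below is only illustrative: it uses a tiny made-up dataset rather than the actual data, and the local macro names are hypothetical. If some observations have groupA==1 and groupB==1 at the same time, the A-or-B sample will be smaller than the sum of the two group sizes.

```stata
* Toy data standing in for the real file (values are invented for illustration)
clear
input groupA groupB
1 0
1 1
0 1
0 0
1 1
end

* Cross-tabulate the indicators: the (1,1) cell holds observations in both groups
tabulate groupA groupB, missing

* Size of each group on its own
count if groupA == 1
local nA = r(N)
count if groupB == 1
local nB = r(N)

* Observations in both groups, and in at least one group
count if groupA == 1 & groupB == 1
local both = r(N)
count if groupA == 1 | groupB == 1
local either = r(N)

* The union equals nA + nB only when the overlap is zero
display "A = `nA', B = `nB', both = `both', either = `either'"
```

In general, the size of the A-or-B sample is the size of A plus the size of B minus the number of observations that belong to both.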

I tried to make groups A, B, and C as exclusive as possible. Here is my code; let me know what you think:

//generate group A: in treatment area, eligible, and incorporated
generate groupA=1 if zona==1 
replace groupA=1 if cla_tam==1
replace groupA=1 if h_selec==1
replace groupA=0 if zona==0
replace groupA=0 if cla_tam==2
replace groupA=0 if cla_tam==3
replace groupA=0 if h_selec==0
replace groupA=0 if h_selec==9 

//generate group B: in treatment area, eligible, but not incorporated
generate groupB=1 if zona==1
replace groupB=1 if cla_tam==1
replace groupB=1 if h_selec==0
replace groupB=0 if zona==0
replace groupB=0 if cla_tam==2
replace groupB=0 if cla_tam==3
replace groupB=0 if h_selec==1
replace groupB=0 if h_selec==9

//generate group C: not in treatment area, eligible, not incorporated
generate groupC=1 if zona==0
replace groupC=1 if cla_tam==1
replace groupC=1 if h_selec==0
replace groupC=0 if zona==1
replace groupC=0 if cla_tam==2
replace groupC=0 if cla_tam==3
replace groupC=0 if h_selec==1
replace groupC=0 if h_selec==9
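
A minimal sketch of one way to make the groups mutually exclusive, assuming that zona==1 marks the treatment area, cla_tam==1 marks eligibility, and h_selec==1 marks incorporation (this reading is inferred from the comments in the code above and may not match the codebook). The toy data are invented. Because each replace overwrites whatever earlier statements assigned, a chain of single-condition statements does not require all of a group's conditions to hold together; joining the conditions with & in one expression does.

```stata
* Toy data using the three variable names from the post (values are invented)
clear
input zona cla_tam h_selec
1 1 1
1 1 0
0 1 0
1 2 1
0 3 9
1 1 9
end

* Each group requires ALL of its conditions at once, so join them with &
* (assumed meanings: zona = treatment area, cla_tam==1 = eligible,
*  h_selec==1 = incorporated; check these against the codebook)
generate byte groupA = (zona == 1 & cla_tam == 1 & h_selec == 1)
generate byte groupB = (zona == 1 & cla_tam == 1 & h_selec == 0)
generate byte groupC = (zona == 0 & cla_tam == 1 & h_selec == 0)

* No observation should fall into more than one group
assert groupA + groupB + groupC <= 1

* With exclusive groups, the A-or-B sample is the sum of the two group sizes
count if groupA == 1 | groupB == 1
```

Note that a logical expression like this evaluates to 0 when any of the variables is missing, so observations with missing values end up outside all three groups rather than with a missing indicator.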

Thanks,
Isobel Williams

----------------------------------------
> Date: Sat, 1 Mar 2014 08:57:10 -0500
> Subject: Re: st: Losing Observations in Logit
> From: [email protected]
> To: [email protected]
>
> Hi Isobel,
> I think there are two main problems with your coding.
> First, your sample condition says:
> keep if groupA==1 & groupB==1
> But should say:
> keep if groupA==1 | groupB==1
> since you want observations from either group (one or the other).
> The second problem I see, based on your previous mail, is that groups
> A, B and C should be mutually exclusive. Then again, perhaps the
> definition of treatment and groups are not connected.
> Fernando
>
>
>
> On Sat, Mar 1, 2014 at 8:48 AM, Isobel Williams <[email protected]> wrote:
>> Dear All,
>>
>> I am running a logistic regression and then pairing observations using propensity score matching. Within the dataset, I have divided the data into groups A, B, and C:
>>
>> generate groupA=1 if zona==1
>> replace groupA=1 if cla_tam==1
>> replace groupA=1 if h_selec==1
>> replace groupA=0 if zona==0
>> replace groupA=0 if cla_tam==2
>> replace groupA=0 if cla_tam==3
>> replace groupA=0 if h_selec==0
>> replace groupA=0 if h_selec==9
>>
>>
>> generate groupB=1 if zona==1
>> replace groupB=1 if cla_tam==1
>> replace groupB=1 if h_selec==0
>> replace groupB=0 if zona==0
>> replace groupB=0 if cla_tam==2
>> replace groupB=0 if cla_tam==3
>> replace groupB=0 if h_selec==1
>> replace groupB=0 if h_selec==9
>>
>> generate groupC=1 if zona==0
>> replace groupC=1 if cla_tam==1
>> replace groupC=1 if h_selec==0
>> replace groupC=0 if zona==1
>> replace groupC=0 if cla_tam==2
>> replace groupC=0 if cla_tam==3
>> replace groupC=0 if h_selec==1
>> replace groupC=0 if h_selec==9
>>
>> When I run the tab command for all groups, Stata tells me that:
>>
>> . tab groupA
>>
>>      groupA |      Freq.     Percent        Cum.
>> ------------+-----------------------------------
>>           0 |     45,316       58.47       58.47
>>           1 |     32,187       41.53      100.00
>> ------------+-----------------------------------
>>       Total |     77,503      100.00
>>
>> . tab groupB
>>
>>      groupB |      Freq.     Percent        Cum.
>> ------------+-----------------------------------
>>           0 |     63,277       81.64       81.64
>>           1 |     14,226       18.36      100.00
>> ------------+-----------------------------------
>>       Total |     77,503      100.00
>>
>>
>> . tab groupC
>>
>>      groupC |      Freq.     Percent        Cum.
>> ------------+-----------------------------------
>>           0 |     56,780       73.26       73.26
>>           1 |     20,723       26.74      100.00
>> ------------+-----------------------------------
>>       Total |     77,503      100.00
>>
>>
>> However, when I run the logistic regression for propensity score matching between groups A and B, Stata tells me that I only have 7,099 observations.
>>
>> Furthermore, when I run the keep if command, Stata reports "63227 observations deleted".
>>
>> Here is what I tried in order to estimate a logit propensity score and match (nearest neighbor) between groups A and B:
>>
>> preserve
>> keep if groupA==1 & groupB==1
>> logit treat floor fmiss wall hhinc2....
>> predict double ps1
>> psmatch2 treat, outcome (S06A20) pscore (ps1) caliper(0.2) common logit
>> restore
>>
>> The objective is to estimate a logit/propensity score and match observations from group A with group B. Any help on this matter would be very much appreciated.
>>
>> Many thanks,
>> Isobel Williams
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC