
st: confusion about psmatch2


From   "Lijun Song" <ls8@duke.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: confusion about psmatch2
Date   Sat, 27 Mar 2004 11:44:20 -0500

Hi,

As a beginner at Stata, I am confused about psmatch2.

Suppose I am interested in the causal effect of college (a 1-0 
treatment indicator) on income. I also assume that race, sex, and 
family background (such as mother's and father's SEI) influence 
treatment assignment. I cannot rule out that race, sex, and family 
background also influence the outcome of interest, income, directly.

Then, after regressing income on college, I arrive at an estimated 
coefficient. After that, I use "psmatch2 college race sex masei pasei".
I think the causal effect estimated by OLS should be higher 
than that estimated by propensity score matching, right?

But my results show that the causal effect estimated by OLS is smaller 
than that estimated by propensity score matching. Why?

In addition, is the causal effect estimated by OLS the ATT or the ATE?

Thanks.

Lijun
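
One way to make the comparison like for like is to give both estimators the same covariates. The following is only a sketch under assumptions: the outcome variable is taken to be called income, the other names are those in the question, and psmatch2 is a user-written command that has to be installed first. As far as I know, psmatch2 reports a treatment effect only when outcome() is specified, and the ate option asks for the ATE and ATU in addition to the ATT.

```stata
* psmatch2 is user-written; install once with: ssc install psmatch2
* income is an assumed name for the outcome variable

* OLS without and with the covariates that drive treatment assignment
regress income college
regress income college race sex masei pasei

* Propensity score matching on the same covariates
* outcome() names the outcome; ate also requests the ATU and ATE
psmatch2 college race sex masei pasei, outcome(income) ate
```

The regression of income on college alone does not adjust for race, sex, or family background, so it is the second regression that is the natural comparison for the matching estimate.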



---------- Original message ----------
From: n.j.cox@durham.ac.uk
To: statalist@hsphsun2.harvard.edu
Sent: Saturday, March 27, 2004 3:54:26 PM
Subject: st: RE: Date Data/Asserts

You identify the problems very clearly, and you 
are aware of some useful tools here. However, 
a variable so messy is not going to surrender
faced with just a single weapon of mess dissection. 

Here's one way of proceeding. 

I copied your data and argued as follows. 

1. The value is most commonly the first word
of the string, but we want it as a number. 

. gen fvalue = real(word(ferritin,1)) 
(140 missing values generated)

2. The date is most commonly the second word
of the string, interpreted as a (recent) daily date. 

. gen fdate = date(word(ferritin,2),"mdy",2050)
(162 missing values generated)

3. I see lots of values with "U"s. I want to check
that when "U" occurs it occurs only as itself. 

. assert ferritin == "U" if index(ferritin, "U") 

Here -assert- says nothing. No news is good news, 
i.e. Stata assents to the assertion. (In this 
case, all "U"s are bad news, i.e. no information
at all.) 

4. So I know all about the "U" problem. What values
are missing, but not because of "U"? 

. list ferritin fvalue if mi(fvalue) & ferritin != "U" 

     +------------------------+
     |      ferritin   fvalue |
     |------------------------|
178. |        normal        . |
225. |                      . |
286. | >1400 3/24/99        . |
324. |   > 2K 1/7/02        . |
     +------------------------+

You have to decide what to do about these. 
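
One possible treatment of the ">" cases, offered as a sketch rather than a recommendation: record that the value was reported as "greater than" and then read the number behind the sign. The names fcensored and fclean are made up for illustration.

```stata
* Flag values reported with a ">" sign
gen byte fcensored = index(ferritin, ">") > 0

* Remove every ">" and any leading or trailing blanks
gen fclean = trim(subinstr(ferritin, ">", "", .))

* Fill in the number where the first word is now numeric
replace fvalue = real(word(fclean, 1)) if mi(fvalue) & fcensored

* "normal", the empty string, and "2K" still yield missing
list ferritin fvalue fcensored if mi(fvalue) & ferritin != "U"
```

Whether a flagged value should then be analysed as the stated bound is a substantive decision that the code does not make.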

5. What dates are missing, but not because of 
"U"? 

. list ferritin fdate if mi(fdate) & ferritin != "U" 

     +---------------------------+
     |          ferritin   fdate |
     |---------------------------|
 66. |         5760 5/02       . |
 70. |              4968       . |
 98. |         4782 6/02       . |
117. |         1137 4/02       . |
121. |         3237 3/01       . |
     |---------------------------|
126. |         1060 8/02       . |
148. | 1422 ng/ml 3/1998       . |
154. |         1014 8/01       . |
178. |            normal       . |
209. |  646 mg/ml 9/2000       . |
     |---------------------------|
210. |              1938       . |
212. |               127       . |
225. |                         . |
278. |          129 6/00       . |
289. |              2000       . |
     |---------------------------|
300. |              4995       . |
301. |         2748 9/02       . |
302. |        187 3/2000       . |
303. |               489       . |
305. |        3564 11/01       . |
     |---------------------------|
307. |          862 1990       . |
324. |       > 2K 1/7/02       . |
333. |         3039 2/02       . |
338. |   1777 ng/ml 5/02       . |
352. |           40 3/02       . |
     |---------------------------|
357. |               953       . |
     +---------------------------+

Similarly, you have to decide what to do here. There 
are, it seems, various kinds of problem: 

* No date supplied. 
* A year only supplied. 
* A month and year only supplied. 
* A day, month and year supplied. 
* More than two words in the string. 

In fact, we would have been better off 
trying to extract the date from 
-word(ferritin,-1)-, and that would 
yield one further daily date. 
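
As a sketch of that alternative, the following takes the date from the last word and then counts words to size the remaining cases. The names fdate2 and nwords are made up for illustration. The lowercase "mdy" mask follows the syntax used earlier in this message; newer releases of Stata document the mask in uppercase, so it may need to be "MDY" there.

```stata
* Read the date from the last word instead of the second
gen fdate2 = date(word(ferritin, -1), "mdy", 2050)

* Daily dates found by the last-word rule but not by the second-word rule
list ferritin fdate fdate2 if mi(fdate) & !mi(fdate2)

* Number of words in each string, excluding the "U" observations
gen nwords = wordcount(ferritin)
tabulate nwords if ferritin != "U"
```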

You can use -assert- to test for truth 
or falsity. Thus 

. assert 42 == int(42) 

asserts that 42 is an integer. More 
usefully 

. assert x == int(x) 

asserts that -x- contains integer 
values only, as only for integers 
is -x- unchanged by applying the -int()- 
function. 

Similarly 

. assert date(stringvar, "mdy", 2050) < . 

asserts that all values of stringvar can 
be treated as daily dates. By itself -assert- has 
very little syntax, and using it requires knowledge
of other parts of Stata. 
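
A small sketch of how -assert- might be combined with other commands. Here -x-, -ferritin- and -fvalue- are the variables already mentioned in this thread. -capture- stops a failed assertion from halting a do-file, and _rc is nonzero when the assertion is false. -confirm- is a separate command that checks storage type, not content.

```stata
* Test the assertion without stopping on failure
capture assert x == int(x)
if _rc {
    * Show the offending observations
    list x if x != int(x)
}

* Check whether a variable is stored as string or numeric
confirm string variable ferritin
confirm numeric variable fvalue
```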

Nick 
n.j.cox@durham.ac.uk 

Ward Hagar
> 
> I am new to Stata and new to databases. I've been trying to 
> get an Access database "Stata ready". I can insheet it well 
> and have "cleaned up" most of the variables. However, I've 
> hit a snag with one variable named "Ferritin", reproduced 
> below for those interested. (n= 362). 
> 
> My goal is to separate the value from the date.
> 
> But note:
> 1. Most have a four-digit value
> 2. Many have a date of the test in an inconsistent format
> 3. Many have "U" for unknown
> 4. One has "normal" for a value
> 5. A few have the ">" operator before the value
> 6. One has the units for the measurement
> 
> 
> I've tried the date() functions, split, and word() and all 
> cause some other problem.
> 
> This is important because the ferritin is the first of a list 
> of similarly (mis)coded variables.
> 
> Is there a Stata-esque approach to this, or am I left with 
> line-by-line cleanup?
> 
> Second question is whether ASSERT can be used to test whether 
> a value is a date, integer, string, etc?
> 
> Many thanks.
> 
> . l ferritin
> 
>      +-------------------+
>      |          ferritin |
>      |-------------------|
>   1. |       540 11/6/02 |
>   2. |                 U |

> 348. |                 U |
> 349. |                 U |
> 350. |       132 2/20/02 |
>      |-------------------|
> 351. |                 U |
> 352. |           40 3/02 |
> 353. |        97 3/29/02 |
> 354. |      6151 5/22/02 |
> 355. |                 U |
>      |-------------------|
> 356. |       1390 5/8/02 |
> 357. |               953 |
> 358. |         54 8/2/02 |
> 359. |      2779 8/12/02 |
> 360. |                 U |
>      |-------------------|
> 361. |                 U |
> 362. |                 U |
>      +-------------------+

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

