Statalist The Stata Listserver



st: RE: RE: merging data unique identification problem


From   <ghuiber@ups.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: merging data unique identification problem
Date   Tue, 31 Jan 2006 16:26:45 -0500

Instead of
keep id year semester granttype`i'

Read
keep id year semester awardamt`i'

Sorry.
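
With that substitution applied, the whole sequence might look like the sketch below. Two things here are assumptions rather than part of the original message: that granttype is coded 1 and 2 (so that reshape creates awardamt1 and awardamt2), and that the drop line needs the same substitution as the keep line, since granttype`i' no longer exists after the reshape.

```stata
* Sketch only. Assumes dataset B is in memory in long form
* (one row per id-year-semester-granttype) and granttype is coded 1 and 2.
reshape wide awardamt, i(id year semester) j(granttype)

foreach i in 1 2 {
    preserve
    tempfile file`i'
    keep id year semester awardamt`i'
    * assumed fix: test awardamt`i', not granttype`i'
    drop if missing(awardamt`i')
    sort id year semester
    save "`file`i''", replace
    restore
}

use "`file1'", clear
* Merge syntax as used in this thread; Stata 11 and later prefer
* merge 1:1 id year semester using "`file2'"
merge id year semester using "`file2'"
tab _merge
```
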

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Huiber Gabi
(nat1gxh)
Sent: Tuesday, January 31, 2006 4:20 PM
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: merging data unique identification problem


Brute force method for cleaning dataset B of spurious missing
observations:

If you do this

reshape wide awardamt, i(id year semester) j(granttype)

foreach i in 1 2 {
    preserve
    tempfile file`i'
    keep id year semester granttype`i'
    drop if granttype`i'==.
    sort id year semester
    save "`file`i''", replace
    restore
}

*preserve
clear
use "`file1'"
merge id year semester using "`file2'"
tab _merge

Then you will get the "squished" file you're looking for. From tab
_merge you should see whether all your observations in that file
have two non-missing awardamt values each. There's no guarantee that they
should, right?

Gabi
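
A different technique from the split-and-merge above, offered only as a hedged sketch: if the data are already in the half-filled form shown in the original question (two rows per student-semester, each with one missing amount), collapse with the firstnm statistic may squish them in one step. The variable names amt1 and amt2 are hypothetical stand-ins for the two award-amount columns; the toy rows are copied from the question's example.

```stata
* Toy data in the "only form achieved so far" layout from the question.
* amt1 and amt2 are hypothetical names for the two award-amount columns.
clear
input id year str4 semester amt1 amt2
1 1990 "Fall" 500 .
1 1990 "Fall" .   200
end

* firstnm keeps the first nonmissing value of each variable
* within every id-year-semester group.
collapse (firstnm) amt1 amt2, by(id year semester)
list
```

On the toy rows this should leave a single observation holding both amounts. Note that firstnm silently keeps only the first value if a group has two different nonmissing amounts in the same column, so it is worth checking for that case before relying on the result.
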

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Jennifer
Delaney
Sent: Tuesday, January 31, 2006 4:08 PM
To: statalist@hsphsun2.harvard.edu
Subject: st: merging data unique identification problem


I have two datasets that I am trying to merge.  One dataset (A) is
uniquely identified by three variables (ID YEAR SEMESTER).  The other
dataset is uniquely identified by four variables (ID YEAR SEMESTER
GRANTTYPE).  In the second dataset (B) students can get more than one
GRANTTYPE in a given semester so it is important to keep this factor.

What I've been trying to do is to collapse the second dataset (B) so
that it is identified by only three variables, so that I can then merge
it with dataset A.

I think it's clearest if I can illustrate with a simple example...

Current form of dataset B:

ID  YEAR  SEMESTER  GRANTTYPE AWARDAMT
1   1990   Fall         1        500
1   1990   Fall         0        200


Form that I am attempting to convert dataset B into:

ID  YEAR  SEMESTER  GRANTTYPE1awardamt  GRANTTYPE2awardamt
1   1990   Fall          500                200


The only form that I have been able to achieve so far:
ID  YEAR  SEMESTER  GRANTTYPE1awardamt  GRANTTYPE2awardamt
1   1990   Fall          500                .
1   1990   Fall          .                  200


How do I "squish" the data so that I end up with one entry per kid per
year per term?

Thanks,

Jennifer

-- 
Jennifer Delaney
PhD candidate in Higher Education
Stanford University
delaneyj@stanford.edu
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
