Statalist


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: RE: merge one to many


From   "Amy Dunbar" <[email protected]>
To   <[email protected]>
Subject   st: RE: merge one to many
Date   Wed, 27 Jan 2010 08:40:10 -0500

Thank you so much for your response, Padmakumar.  You must have computed
discretionary accruals in your research, since you picked up on the dacc
mnemonic.  

I did as you suggested, but I get an error code at the merge 1:m.   I am
using Stata/SE 10.1, so I did a Google search and found that the merge
1:m in Stata 11.  I will be updating today.  THANK YOU again.

. use match, clear

. gen  c_gvkey= gvkey

. save control, replace
file control.dta saved

. use match, clear

. * drop _merge
. sort year sic2 rroa

. merge 1:m sic2 year rroa using control
1 invalid name

Date: Tue, 26 Jan 2010 10:36:58 -0600
From: Padmakumar Sivadasan <[email protected]>
Subject: Re: st: One to many merge

Pardon me if this a reposting.. I am not sure the last one made it to
the list as I posted that from the wrong email id.

Hope the following helps.

Step One
Duplicate match as control and rename only the gvkey as c_gvkey.

Step Two
Merge the match and control datasets.

merge 1:m sic2 year rroa using control

 Step three
Drop cases where gvkey==c_gvkey (these are the discretionary accruals
of the same firm that you don't want to include in calculating the
decile average or other decile stat)

 drop if gvkey==c_gvkey


Padmakumar

On Tue, Jan 26, 2010 at 8:50 AM, Amy Dunbar
<[email protected]> wrote:
> I want to create a dataset, call it 'match-control' that groups
together
> all the observations in the same industry-year-decile combination, but
> excludes the observation's own values for a particular variable. Thus
if
> there is only one firm in a decile, that firm will not be in the
> combined sample. The resulting sample has obs= N * (N-1). If there are
3
> firms, there will 3 *2 = 6 in the resulting match_control group for
the
> industry_year_decile.
>
> Assume I have a dataset (match) that includes a unique identifier
> (gvkey), industry (sic2), year, and decile (rroa), and the variable of
> interest (dacc); I duplicate the dataset (control) renaming the vars
as
> c_gvkey, c_sic2, c_year, c_rroa, and c_dacc. For example, assume that
> for the first decile I have the following for match and the same for
> control, except that the var names have c_:
>
> data = match
> N       gvkey   year    sic2    rroa         dacc
> 1       1860    1991    13              0       0.11
> 2       2304    1991    13              0       -0.43
> 3       3949    1991    13              0       -0.31
> 4       5140    1991    13              0       -0.38
> 5       5188    1991    13              0       -0.28
> 6       7159    1991    13              0       -0.16
> 7       8104    1991    13              0       0.04
> 8       8838    1991    13              0       -0.07
> 9       8974    1991    13              0       0.08
> 10      12042   1991    13              0       -0.97
> 11      13183   1991    13              0       -0.28
> 12      15306   1991    13              0       -0.30
> 13      22398   1991    13              0       -0.21
> 14      23129   1991    13              0       -0.27
> 15      27199   1991    13              0       -0.17
>
>
> Consider the SIC2=13 group. The first decile (coded 0) in the sample)
> has 15 firms in 1991.  The resulting group is 15 * 14 = 210 firms in
the
> match-control sample. The merging for the first gvkey is the
following:
>
> data = match_control
> gvkey   year    sic2    dacc    rroa    c_dacc
> 1860    1991    13      0.11    0       -0.43
> 1860    1991    13      0.11    0       -0.31
> 1860    1991    13      0.11    0       -0.38
> 1860    1991    13      0.11    0       -0.28
> 1860    1991    13      0.11    0       -0.16
> 1860    1991    13      0.11    0       0.04
> 1860    1991    13      0.11    0       -0.07
> 1860    1991    13      0.11    0       0.08
> 1860    1991    13      0.11    0       -0.97
> 1860    1991    13      0.11    0       -0.28
> 1860    1991    13      0.11    0       -0.30
> 1860    1991    13      0.11    0       -0.21
> 1860    1991    13      0.11    0       -0.27
> 1860    1991    13      0.11    0       -0.17
>
> In SAS, the following code does the trick.
> proc sql;
>      create table match_control as select
>      match.*, control.c_dacc
>      from match, control
>      where sic2 = c_sic2
>      and year = c_year
>      and rroa = c_rroa
>      and gvkey ne c_gvkey;   /* the magic words */
>
> I have tried merge and joinby, but I have had no success.  Can someone
> tell me how to accomplish this one to many merge in Stata?
>
> Thank you.
>
> Amy Dunbar
> University of Connecticut
> School of Business
> Department of Accounting
> 2100 Hillside Road Unit 1041
> Storrs, CT 06269-1041


-----Original Message-----
From: [email protected]
[mailto:[email protected]] 
Sent: Wednesday, January 27, 2010 2:33 AM
To: [email protected]
Subject: statalist-digest V4 #3682

statalist-digest     Wednesday, January 27 2010     Volume 04 : Number
3682


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2024 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index