
Re: st: differences between merge and joinby

From	"Sebastian Kruk" <[email protected]>
To	[email protected]
Subject	Re: st: differences between merge and joinby
Date	Wed, 12 Mar 2008 18:47:55 -0300

I have one observation per person. In one dataset I have personal
characteristics and in the other, labor variables.

SEbas.
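
With one observation per person in each file, a one-to-one match on the person identifier is the usual fit. The following is only a minimal sketch under stated assumptions: the file names personal and labor are hypothetical, number is assumed to be the person identifier, and the merge 1:1 syntax shown requires Stata 11 or later (older releases use a different -merge- syntax). The second part shows -joinby- for the case where the key does not uniquely identify observations.

```stata
* Hypothetical file names; "number" is assumed to identify each person.
use personal, clear

* Stops with an error if number does not uniquely identify observations.
isid number

* One-to-one match: adds the labor variables to each person.
merge 1:1 number using labor

* _merge: 1 = only in personal, 2 = only in labor, 3 = found in both.
tabulate _merge

* If the key were NOT unique in either file, -joinby- would instead form
* every pairing of observations within each number-sex group.
use personal, clear
joinby number sex using labor, unmatched(both)
```

Checking the key with -isid- first makes it clear which of the two situations applies before any data are combined.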

2008/3/11, Vladimir Vakhitov <[email protected]>:
> It depends on what you would like to get at the end. The first
> question is "what uniquely identifies your observations?" What is
> an observation, then?
>
> -merge- is used when you want to add some extra characteristics to
> your observations (for example, add regional characteristics to each
> observation based on its region ID).
> -joinby- forms all pairwise combinations of observations within each
> group defined by the key variables, kind of like an INNER JOIN in
> Access. Sometimes you don't want this because you get too many messy
> combinations.
>
>
>
> Volodymyr
>
>
> 2008/3/11, Sebastian Kruk <[email protected]>:
> > Dear statalist users,
> >
> >
> >  I have two datasets, A and B.
> >
> >  A) number, sex, age, citizen
> >
> >  B) number, sex, civil status, children
> >
> >  I have to form a new dataset, but number and sex do not uniquely
> >  identify observations.
> >  The number of observations in A can be >, < or = the number of
> >  observations in B.
> >
> >  Which is better, merge or joinby?
> >
> >  Bye,
> >
> >  Sebastian.
> >
> >  2008/3/5, Nick Cox <[email protected]>:
> >  > Two minute comments only:
> >  >
> >  >     local assets=qassets
> >  >
> >  >  this looks wrong: qassets[`i'] ?
> >  >
> >  >     quietly:use compustat if  `qtr'=obsqtr & `sic'=sic3 & qsales<=1.15*`sales'/*
> >  >     */ & qsales>=.85*`sales'&qassets<=1.2*`assets'&qassets>=.85*`assets', clear
> >  >
> >  >  tests for equality are ==, not =
> >  >
> >  >  Malcolm Wardlaw
> >  >
> >  >  I wanted to pose this question to Statalist regarding matching data to a
> >  >  range of values instead of exact values.  I kind of asked this question
> >  >  before, but I realized from the response that my question was somewhat
> >  >  ill-formed, so I'll try to be as explicit as possible.  I will use an
> >  >  example to illustrate the question.
> >  >
> >  >  Let's say I want to do a long-run event study on the changes in real
> >  >  growth of companies.  In order to do this, I need to appropriately match
> >  >  the company I am running the event study on to a group of comparable
> >  >  companies.  For this, I need a matched dataset of all companies that
> >  >  match in a range of accounting variables.
> >  >
> >  >  The match occurs as follows.  I have a data set (1) containing all of
> >  >  the companies I wish to perform the event study on.  I then need to
> >  >  create a dataset (2) that contains matching companies from a dataset of
> >  >  the larger Compustat universe of all firms (3).  To do this, I need to
> >  >  gather all firms that have the same SIC code, sales that are within 15%
> >  >  of the event company's sales, and assets that are within 20% of the
> >  >  event company's assets in the quarter of the event.  The new dataset must
> >  >  also have a marker for each of these groups of sample firms that
> >  >  corresponds to the event firm.
> >  >
> >  >  Here is how I originally dealt with the problem. In the program, Stata
> >  >  is continually cycling through the data, loading part of another dataset
> >  >  into memory, appending it to another dataset from disk, saving that
> >  >  dataset to disk, and then reloading the original dataset from disk each
> >  >  time.  It works, but it seems very inefficient.
> >  >
> >  >  Is there a best practice on how to do this, or is this basically as good
> >  >  as it's going to get?
> >  >
> >  >  ---------------------------------------
> >  >  local num = _N
> >  >  forval i = 1/`num' {
> >  >     /*The sales of Event Company i*/
> >  >     local sales=sales[`i']
> >  >     /*The quarter of the observation*/
> >  >     local qtr=eventquarter[`i']
> >  >     /*SIC code*/
> >  >     local sic=sic3[`i']
> >  >     /*Assets of the event company*/
> >  >     local assets=qassets
> >  >     /*A code that uniquely tags the event*/
> >  >     local code=code[`i']
> >  >     quietly:use compustat if  `qtr'=obsqtr & `sic'=sic3 & qsales<=1.15*`sales'/*
> >  >     */ & qsales>=.85*`sales'&qassets<=1.2*`assets'&qassets>=.85*`assets', clear
> >  >     gen code=`code'
> >  >     append using comparables
> >  >     quietly:save comparables,replace
> >  >     use events
> >  >  }
> >  >
> >  >  *
> >  >  *   For searches and help try:
> >  >  *   http://www.stata.com/support/faqs/res/findit.html
> >  >  *   http://www.stata.com/support/statalist/faq
> >  >  *   http://www.ats.ucla.edu/stat/stata/
> >  >
> >
>
>
> --
> __________________
> Volodymyr Vakhitov
> [email protected]
