Re: st: differences between merge and joinby

Wed, 12 Mar 2008 18:47:55 -0300

I have one observation per person. In one dataset I have personal
characteristics and in the other, labor variables.

Sebas.

2008/3/11, Vladimir Vakhitov <vvakhitov@gmail.com>:
> It depends on what you would like to get at the end. The first
> question is "what uniquely identifies your observations?" What is
> an observation, then?
>
> -merge- is used when you want to add some extra characteristics to
> your observations (for example, add a region's characteristics to each
> observation based on its region ID).
> -joinby- forms a set of all possible intersections, kinda like INNER
> JOIN in Access. Sometimes you don't want this because you get too many
> messy intersections.
>
> Volodymyr
>
> 2008/3/11, Sebastian Kruk <residuo.solow@gmail.com>:
> > Dear statalist users,
> >
> > I have two datasets, A and B.
> >
> > A) number, sex, age, citizen
> >
> > B) number, sex, civil status, children
> >
> > I have to form a new dataset but number and sex do not uniquely
> > identify observations.
> > The number of observations in A can be >, < or = the number of
> > observations in B.
> >
> > What's better, merge or joinby?
> >
> > Bye,
> >
> > Sebastian.
> >
> > 2008/3/5, Nick Cox <n.j.cox@durham.ac.uk>:
> > > Two minute comments only:
> > >
> > > local assets=qassets
> > >
> > > this looks wrong: qassets[`i'] ?
> > >
> > > quietly:use compustat if `qtr'=obsqtr & `sic'=sic3 &
> > > qsales<=1.15*`sales'/*
> > > */ &
> > > qsales>=.85*`sales'&qassets<=1.2*`assets'&qassets>=.85*`assets', clear
> > >
> > > tests for equality are ==, not =
> > >
> > > Malcolm Wardlaw
> > >
> > > I wanted to pose this question to Statalist regarding matching data to a
> > > range of values instead of exact values. I kind of asked this question
> > > before, but I realized from the response that my question was somewhat
> > > ill formed, so I'll try to be as explicit as possible. I will use an
> > > example to illustrate the question.
> > >
> > > Let's say I want to do a long-run event study on the changes in real
> > > growth of companies. In order to do this, I need to appropriately match
> > > the company I am running the event study on to a group of comparable
> > > companies. For this, I need a matched dataset of all companies that
> > > match in a range of accounting variables.
> > >
> > > The match occurs as follows. I have a data set (1) containing all of
> > > the companies I wish to perform the event study on. I need to then
> > > create a dataset (2) that contains matching companies from a dataset of
> > > the larger Compustat universe of all firms (3). To do this, I need to
> > > gather all firms that have the same SIC code, sales that are between 15%
> > > and -15% of the event company, and assets that are between 20% and -20%
> > > of the event company in the quarter of the event. The new dataset must
> > > also have a marker for each of these group of sample firms that
> > > corresponds to the event firm.
> > >
> > > Here is how I originally dealt with the problem. In the program, Stata
> > > is continually cycling through the data, loading part of another dataset
> > > into memory, appending it to another dataset from disk, saving that
> > > dataset to disk, and then reloading the original dataset from disk each
> > > time. It works, but it seems very inefficient.
> > >
> > > Is there a best practice on how to do this, or is this basically as good
> > > as it's going to get?
> > >
> > > ---------------------------------------
> > > local num = _N
> > > forval i = 1/`num' {
> > > /*The sales of Event Company i*/
> > > local sales=sales[`i']
> > > /*The quarter of the observation*/
> > > local qtr=eventquarter[`i']
> > > /*SIC code*/
> > > local sic=sic3[`i']
> > > /*Assets of the event company*/
> > > local assets=qassets
> > > /*A code that uniquely tags the event*/
> > > local code=code[`i']
> > > quietly:use compustat if `qtr'=obsqtr & `sic'=sic3 &
> > > qsales<=1.15*`sales'/*
> > > */ &
> > > qsales>=.85*`sales'&qassets<=1.2*`assets'&qassets>=.85*`assets', clear
> > > gen code=`code'
> > > append using comparables
> > > quietly:save comparables,replace
> > > use events
> > > }
> > >
> > > *
> > > * For searches and help try:
> > > * http://www.stata.com/support/faqs/res/findit.html
> > > * http://www.stata.com/support/statalist/faq
> > > * http://www.ats.ucla.edu/stat/stata/
>
> --
> __________________
> Volodymyr Vakhitov
> vvakhitov@gmail.com
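
For readers comparing the two commands discussed in this thread, here is
a minimal sketch on made-up data. It assumes Stata 11 or later (the
-merge 1:1- syntax is newer than this thread), and every variable name
and value other than "number" is hypothetical. The first part covers the
case of one observation per person in both files; the second shows
-joinby- forming every pairwise combination when the key repeats in both
files.

```stata
* Hypothetical toy data; names and values are illustrative assumptions.
clear
input number age
1 30
2 41
3 25
end
tempfile personal
save `personal'

clear
input number hours
1 40
2 20
4 35
end
tempfile labor
save `labor'

* Case 1: number is unique in both files, so a 1:1 merge simply adds
* the labor variables to each person. Unmatched persons are kept and
* flagged in _merge.
use `personal', clear
merge 1:1 number using `labor'
tabulate _merge

* Case 2: number repeats in both files. -joinby- pairs every
* observation in memory with every observation in the using file that
* shares the same number, and by default drops unmatched observations.
clear
input number str1 spell
1 "a"
1 "b"
2 "c"
end
tempfile spells
save `spells'

clear
input number str1 job
1 "x"
1 "y"
2 "z"
end
tempfile jobs
save `jobs'

use `spells', clear
joinby number using `jobs'
count
```

With one observation per person in each file, as described at the top of
this message, the first pattern is probably the one wanted; -joinby- is
mainly useful when all within-group pairings are really needed.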

