[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

From |
Nick Winter <nw53@cornell.edu> |

To |
Stata List <statalist@hsphsun2.harvard.edu> |

Subject |
Fwd: Re: st: Collapsing Over Limited Set |

Date |
Thu, 15 Sep 2005 15:52:47 -0400 |

Sorry ... sent the last email by mistake. Trying again:

I believe this will do it:

bysort product: egen n_p = count( markup<. & company<. )

by product: egen sum_p = sum( markup * !mi(company) )

bysort company product : egen n_cp = count( markup<. )

by company product: egen sum_cp = sum(markup)

gen averagemarkup = (sum_p-sum_cp) / (n_p-n_cp)

If you know you don't have missing values on company or on markup, then you can simplify:

bysort product: gen n_p = _N

by product: egen sum_p = sum(markup)

bysort company product : gen n_cp = _N

by company product: egen sum_cp = sum(markup)

gen averagemarkup = (sum_p-sum_cp) / (n_p-n_cp)

That is, get the sum and number of observations for each product overall, then get the sum and number of observations for each company-product pair...then subtract off these latter quantities from the former to calculate the average you want.

Note in the first approach, "markup<." means "nonmissing on markup", but the egen function count() doesn't allow the missing() function to appear within it.

--Nick Winter

________________________________________________________At 03:11 PM 9/15/2005, you wrote:Hi there, I would really appreciate someone's help with this question. I'm trying to generate a dataset of statistics by collapsing another dataset, but each of the id's I'm collapsing by, I want to use every observation in the dataset except the one for the id under consideration. Would there be a way to do this? For example, I have data of the form: company product markup 100 31 .3 100 55 .2 111 31 0 111 55 .1 120 31 .1 120 55 .1 Now I want to ask the question: for each company, calculate the average markup of each product it produces, where the average is taken over all companies that sell the product except the company itself. So I want to end up with company product averagemarkup 100 31 .05 100 55 .1 111 31 .2 111 55 .15 120 31 .15 120 55 .15 Obviously collapsing the data the standard way is not going to do this. I need to do this for hundreds of thousands of observations (hundreds of companies and thousands of products) so am looking for a way to do this that would be relatively quick. I would be grateful for any suggestions. Thanks very much. Jason * * For searches and help try: * http://www.stata.com/support/faqs/res/findit.html * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/________________________________________________________ Nicholas J. G. Winter 607.255.8819 t Assistant Professor 607.255.4530 f Department of Government nw53@cornell.edu e Cornell University falcon.arts.cornell.edu/nw53 w 308 White Hall Ithaca, NY 14853-4601

Nicholas J. G. Winter 607.255.8819 t

Assistant Professor 607.255.4530 f

Department of Government nw53@cornell.edu e

Cornell University falcon.arts.cornell.edu/nw53 w

308 White Hall

Ithaca, NY 14853-4601

*

* For searches and help try:

* http://www.stata.com/support/faqs/res/findit.html

* http://www.stata.com/support/statalist/faq

* http://www.ats.ucla.edu/stat/stata/

- Prev by Date:
**Re: st: Collinearity: Why two L's?** - Next by Date:
**st: Collinearity in 2SLS** - Previous by thread:
**Re: st: Collapsing Over Limited Set** - Next by thread:
**st: Collinearity: Why two L's?** - Index(es):

© Copyright 1996–2014 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |