(As expected) you were right: it was coming back wrong because I had not
dropped the observations corresponding to the products before launch.

But I realise I have another problem now. I want to count the number of
products that are generic in a subclass for a given quarter, but the
command is returning the number of presentations (id) that are generic in
a subclass for a given quarter. This is because each product has several
presentations, each with one observation per quarter.

How can I count the number of products in each subclass in a given
quarter that are generic, instead of the number of presentations (id)?

The dataset looks something like this:

id  product  subclass  generic  quarter
1   1        1         1        1
2   1        1         1        2
3   1        1         1        3
4   2        1         0        1
5   2        1         0        2
6   2        1         0        3
7   3        2         1        1
8   3        2         1        2
9   3        2         1        3

Many thanks in advance!

On Mon, Apr 12, 2010 at 8:45 PM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> I don't think so. The -by(subclass quarter)- subdivides observations
> according to combinations of the two variables, which I thought was what
> you wanted.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Rodrigo Refoios Camejo
>
> Thanks for your quick reply, Nick... however, your suggestion is
> returning only the generic==1 in each subclass and ignoring the by
> quarter, i.e. sum is the same for all obs within each subclass
> irrespective of the quarter it was observed at.
>
> Any idea what may be going wrong? I've grouped quarter, I've sorted by
> quarter subclass and still the same result...
>
> On 4/12/10, Nick Cox <n.j.cox@durham.ac.uk> wrote:
>> The larger question here is "How should I model this?" which is
>> difficult at the best of times and in any case better left to
>> subject-matter experts.
>>
>> The smaller question is more my thing.
>>
>> egen sum = total(generic == 1), by(subclass quarter)
>>
>> may be the sort of solution you need.
>>
>> Nick
>> n.j.cox@durham.ac.uk
>>
>> Rodrigo Refoios Camejo [edited]
>>
>> I have 10 years of panel data on pricing and quantities sold of pharmaceuticals.
>> For each presentation (i.e. dosage and package) of each product I have
>> data on the prices and quantities sold in each quarter. My idea is to
>> fit a regression model with price as the dependent variable and
>> independent variables related to competition like: # products in the
>> therapeutic class at time of launch; # generics in the therapeutic
>> class at time t; market share of market leader at time t; price of
>> market leader at time t-1; etc.
>>
>> How can I deal with the fact that some products were only launched
>> half-way through the data timeframe and some were discontinued after
>> launch, i.e. when price=="0" & quantity=="0" for all t before t launch
>> and for some after t launch. How can I have Stata include in the
>> regression only the data for which price and quantity are available?
>> Simply drop if price==0? Will it not treat it as missing if I do so?
>>
>> Another side question, how can I make Stata count the # products with
>> a particular characteristic (e.g. generic==1) marketed in each group
>> (subclass) of drugs in each quarter? And then assign to each product
>> the count corresponding to the quarter in which that product was
>> launched?
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
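
One possible approach to the question at the top of this message (counting products rather than presentations) is to tag a single observation per product within each subclass and quarter, and then total only the tagged observations. This is a minimal sketch, not a tested solution for the real dataset: it assumes the variable names of the example data (id, product, subclass, generic, quarter) and that generic is constant within a product; the new variable names tagprod and ngeneric are made up for illustration.

```stata
* Sketch only: variable names follow the example data in this thread;
* tagprod and ngeneric are hypothetical names introduced here.
clear
input id product subclass generic quarter
1 1 1 1 1
2 1 1 1 2
3 1 1 1 3
4 2 1 0 1
5 2 1 0 2
6 2 1 0 3
7 3 2 1 1
8 3 2 1 2
9 3 2 1 3
end

* Tag exactly one observation per product within each subclass-quarter cell
egen tagprod = tag(product subclass quarter)

* Total only tagged observations, so each generic product counts once
* however many presentations (id) it has in that quarter
egen ngeneric = total(tagprod * (generic == 1)), by(subclass quarter)

list, sepby(product)
```

With the nine example rows every subclass-quarter cell contains exactly one generic product, so ngeneric should be 1 throughout. The difference from the plain -total(generic == 1)- only shows up once a product has more than one presentation in the same quarter.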

