
# RE: st: Time series - what to do before a product launch?; how to count n obs showing an attribute in each year?

 From "Nick Cox" To Subject RE: st: Time series - what to do before a product launch?; how to count n obs showing an attribute in each year? Date Tue, 13 Apr 2010 14:28:37 +0100

```
One technique here is simply tag and sum.

1. Tag what you want to count with 1 and anything else with 0.

2. Sum the 1s and 0s. Evidently the result is the count.

Thus if the question were: how many distinct products are generic in
each subclass and each quarter?

egen tag = tag(product subclass quarter) if generic == 1

egen count = total(tag), by(subclass quarter)

should do the trick.

There's much more discussion within various FAQs and articles:

. search distinct

Keyword search

        Keywords:  distinct
          Search:  (1) Official help files, FAQs, Examples, SJs, and STBs

Search of official help files, FAQs, Examples, SJs, and STBs

[P]     levelsof  . . . . . . . . . . . . . . . . . . Levels of variable
        (help levelsof)

FAQ     . . . . . . . . . . Calculating the number of distinct values
        . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
        9/06    How do I calculate the number of distinct
                values seen so far?
                http://www.stata.com/support/faqs/data/distinctvalues.html

FAQ     . . . . . Counting distinct strings across a set of variables
        . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
        7/04    How do I count the number of distinct strings
                across a set of variables?
                http://www.stata.com/support/faqs/data/distinctstrings.html

FAQ     . . . . . . . . . . . . . . . Number of distinct observations
        . . . . . . . . . . . . . . . . . N. J. Cox and G. Longton
        4/02    How do I compute the number of distinct observations?
                http://www.stata.com/support/faqs/data/distinct.html

SJ-9-1  pr0046  . . . . . . . . . . . . . . Speaking Stata: Rowwise
        (help rowsort, rowranks if installed) . . . . . . N. J. Cox
        Q1/09   SJ 9(1):137--157
                shows how to exploit functions, egen functions, and Mata
                for working rowwise; rowsort and rowranks are introduced

SJ-8-4  dm0042  . . . . . . Speaking Stata: Distinct observations
        (help distinct if installed)  . . N. J. Cox and G. M. Longton
        Q4/08   SJ 8(4):557--568
                from first principles; provides a convenience command

Nick
n.j.cox@durham.ac.uk

Rodrigo Refoios Camejo

(As expected) you were right: it was coming back wrong because I
had not dropped the obs corresponding to the products before launch.
But I realised I have another problem now... I want to count the number
of products that are generic in a subclass for a given quarter, but
it's retrieving the number of presentations (id) that are generic in a
subclass for a given quarter. This is because each product has several
presentations, each with one obs per quarter. How can I count the
# products in each subclass in a given quarter that are generic instead
of the # presentations (id)? The dataset looks something like this:

id  product  subclass  generic  quarter
 1        1         1        1        1
 2        1         1        1        2
 3        1         1        1        3
 4        2         1        0        1
 5        2         1        0        2
 6        2         1        0        3
 7        3         2        1        1
 8        3         2        1        2
 9        3         2        1        3

On Mon, Apr 12, 2010 at 8:45 PM, Nick Cox <n.j.cox@durham.ac.uk> wrote:

> I don't think so. The -by(subclass quarter)- subdivides observations
> according to combinations of the two variables, which I thought was
> what you wanted.

Rodrigo Refoios Camejo

> returning only the generic==1 in each subclass and ignoring the by
> quarter, i.e. sum is the same for all obs within each subclass
> irrespective of the quarter it was observed at.
>
> Any idea what may be going wrong? I've grouped quarter, I've sorted by
> quarter subclass and still the same result...

On 4/12/10, Nick Cox <n.j.cox@durham.ac.uk> wrote:

>> The larger question here is "How should I model this?" which is
>> difficult at the best of times and in any case better left to
>> subject-matter experts.
>>
>> The smaller question is more my thing.
>>
>> egen sum = total(generic == 1), by(subclass quarter)
>>
>> may be the sort of solution you need.

Rodrigo Refoios Camejo [edited]

>> I have 10 years of panel data on pricing and quantities sold of
>> pharmaceuticals. For each presentation (i.e. dosage and package) of
>> each product I have data on the prices and quantities sold in each
>> quarter. My idea is to fit a regression model with price as the
>> dependent variable and independent variables related to competition
>> like: # products in the therapeutic class at time of launch; #
>> generics in the therapeutic class at time t; market share of market
>> leader at time t; price of market leader at time t-1; etc.
>>
>> How can I deal with the fact that some products were only launched
>> half-way through the data timeframe and some were discontinued after
>> launch, i.e. price=="0" & quantity=="0" for all t before launch and
>> for some t after launch? How can I have Stata include in the
>> regression only the data for which price and quantity are available?
>> Simply drop if price==0? Will it not treat it as missing if I do so?
>>
>> Another side question, how can I make Stata count the # products with
>> a particular characteristic (e.g. generic==1) marketed in each group
>> (subclass) of drugs in each quarter? And then assign to each product
>> the count corresponding to the quarter in which that product was
>> launched?

```
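The tag-and-sum technique discussed in this thread can be tried on a small made-up dataset. What follows is only a sketch under stated assumptions: the rows are invented for illustration (they are not the poster's data), `id` is taken to identify a presentation, and the variable names `n_pres`, `tag`, and `n_prod` are hypothetical. It contrasts counting generic presentations with counting distinct generic products within each subclass and quarter.

```stata
* Hypothetical data: product 1 has two presentations (id 1 and 2),
* every other product has a single presentation.
clear
input id product subclass generic quarter
1 1 1 1 1
1 1 1 1 2
2 1 1 1 1
2 1 1 1 2
3 2 1 0 1
3 2 1 0 2
4 3 1 1 1
4 3 1 1 2
5 4 2 1 1
5 4 2 1 2
end

* Counts generic observations (presentations) per subclass and quarter,
* so a product with several presentations is counted more than once.
egen n_pres = total(generic == 1), by(subclass quarter)

* Step 1: tag one observation per product-subclass-quarter combination
* among generics; untagged and non-generic observations get 0.
egen byte tag = tag(product subclass quarter) if generic == 1

* Step 2: sum the tags to count distinct generic products.
egen n_prod = total(tag), by(subclass quarter)

sort subclass quarter product id
list, sepby(subclass quarter)
```

With these invented rows, each quarter of subclass 1 has three generic presentations but only two generic products, so `n_pres` and `n_prod` differ there; in subclass 2 both are 1.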