RE: st: Time series - what to do before a product launch?; how to count n obs showing an attribute in each year?

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	RE: st: Time series - what to do before a product launch?; how to count n obs showing an attribute in each year?
Date	Tue, 13 Apr 2010 14:28:37 +0100

One technique here is simply tag and sum. 

1. Tag what you want to count with 1 and anything else with 0. 

2. Sum the 1s and 0s. Evidently the result is the count. 

Thus if the question were: how many distinct products are generic in
each subclass and each quarter?

egen tag = tag(product subclass quarter) if generic == 1 

egen count = total(tag), by(subclass quarter) 

should do the trick. 
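
To make the difference between counting observations and counting products concrete, here is a minimal sketch. It uses the nine-row example dataset posted later in this thread plus three hypothetical rows (ids 10 to 12) that give product 1 a second presentation; those extra rows and the variable names n_products and n_obs are assumptions made purely for illustration.

```stata
clear
input id product subclass generic quarter
 1 1 1 1 1
 2 1 1 1 2
 3 1 1 1 3
 4 2 1 0 1
 5 2 1 0 2
 6 2 1 0 3
 7 3 2 1 1
 8 3 2 1 2
 9 3 2 1 3
10 1 1 1 1
11 1 1 1 2
12 1 1 1 3
end
* rows 10-12 are hypothetical: a second presentation of product 1

* tag one observation per distinct product within each subclass-quarter cell,
* considering generic products only; all other observations get 0
egen tag = tag(product subclass quarter) if generic == 1

* summing the tags counts distinct generic products in each cell
egen n_products = total(tag), by(subclass quarter)

* for comparison: summing the indicator counts observations (presentations)
egen n_obs = total(generic == 1), by(subclass quarter)

sort subclass quarter product id
list id product subclass quarter generic tag n_products n_obs, sepby(subclass quarter)
```

With these data n_products is 1 in every subclass-quarter cell, whereas n_obs is 2 in subclass 1 because product 1 appears twice per quarter.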

There's much more discussion within various FAQs and articles: 

. search distinct

Keyword search

        Keywords:  distinct
          Search:  (1) Official help files, FAQs, Examples, SJs, and STBs

Search of official help files, FAQs, Examples, SJs, and STBs


[P]     levelsof  . . . . . . . . . . . . . . . . . . . . . Levels of variable
        (help levelsof)

FAQ     . . . . . . . . . . . . . .  Calculating the number of distinct values
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        9/06    How do I calculate the number of distinct
                values seen so far?
                http://www.stata.com/support/faqs/data/distinctvalues.html

FAQ     . . . . . . . . .  Counting distinct strings across a set of variables
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        7/04    How do I count the number of distinct strings
                across a set of variables?
                http://www.stata.com/support/faqs/data/distinctstrings.html

FAQ     . . . . . . . . . . . . . . . . . . .  Number of distinct observations
        . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox and G. Longton
        4/02    How do I compute the number of distinct observations?
                http://www.stata.com/support/faqs/data/distinct.html

SJ-9-1  pr0046  . . . . . . . . . . . . . . . . . . .  Speaking Stata: Rowwise
        (help rowsort, rowranks if installed) . . . . . . . . . . .  N. J. Cox
        Q1/09   SJ 9(1):137--157
        shows how to exploit functions, egen functions, and Mata
        for working rowwise; rowsort and rowranks are introduced

SJ-8-4  dm0042  . . . . . . . . . . . .  Speaking Stata: Distinct observations
        (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
        Q4/08   SJ 8(4):557--568
        shows how to answer questions about distinct observations
        from first principles; provides a convenience command


Nick 

Rodrigo Refoios Camejo

(As expected) you were right: it was coming back wrong because I had
not dropped the obs corresponding to the products before launch. But I
realise I have another problem now. I want to count the number of
products that are generic in a subclass for a given quarter, but the
command is retrieving the number of presentations (id) that are generic
in a subclass for a given quarter. This is because each product has
several presentations, each one with one obs per quarter. How can I
count the # products in each subclass in a given quarter that are
generic, instead of the # presentations (id)? The dataset looks
something like this:

id   product   subclass   generic   quarter
1    1         1          1         1
2    1         1          1         2
3    1         1          1         3
4    2         1          0         1
5    2         1          0         2
6    2         1          0         3
7    3         2          1         1
8    3         2          1         2
9    3         2          1         3

On Mon, Apr 12, 2010 at 8:45 PM, Nick Cox wrote:

> I don't think so. The -by(subclass quarter)- subdivides observations
> according to combinations of the two variables, which I thought was what
> you wanted.

Rodrigo Refoios Camejo

> Thanks for your quick reply, Nick... however, your suggestion is
> returning only the count of generic==1 in each subclass and ignoring the
> quarter, i.e. sum is the same for all obs within each subclass
> irrespective of the quarter in which they were observed.
>
> Any idea what may be going wrong? I've grouped quarter, I've sorted by
> quarter subclass and still the same result...

On 4/12/10, Nick Cox wrote:

>> The larger question here is "How should I model this?" which is
>> difficult at the best of times and in any case better left to
>> subject-matter experts.
>>
>> The smaller question is more my thing.
>>
>> egen sum = total(generic == 1), by(subclass quarter)
>>
>> may be the sort of solution you need.

Rodrigo Refoios Camejo [edited]

>> I have 10 years of panel data on pricing and quantities sold of
>> pharmaceuticals. For each presentation (i.e. dosage and package) of
>> each product I have data on the prices and quantities sold in each
>> quarter. My idea is to fit a regression model with price as the
>> dependent variable and independent variables related to competition
>> like: # products in the therapeutic class at time of launch; #
>> generics in the therapeutic class at time t; market share of market
>> leader at time t; price of market leader at time t-1; etc.
>>
>> How can I deal with the fact that some products were only launched
>> halfway through the data timeframe and some were discontinued after
>> launch, i.e. price=="0" & quantity=="0" for all t before launch and for
>> some t after launch? How can I have Stata include in the regression
>> only the data for which price and quantity are available? Simply drop
>> if price==0? Will it not treat it as missing if I do so?
>>
>> Another side question, how can I make Stata count the # products with
>> a particular characteristic (e.g. generic==1) marketed in each group
>> (subclass) of drugs in each quarter? And then assign to each product
>> the count corresponding to the quarter in which that product was
>> launched?
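
On the mechanical part of the question about zeros before launch and after discontinuation, the following is only a sketch under stated assumptions, not modelling advice. It assumes price and quantity are numeric (the post writes "0" as a string), that a zero means "not on the market", and it uses a tiny made-up dataset and a placeholder regression. Stata's estimation commands omit observations with missing values, so either restricting with -if- or recoding zeros to missing keeps such observations out of the fit without dropping them from the dataset.

```stata
clear
* hypothetical toy data: product 1 launches in quarter 2,
* product 2 is discontinued after quarter 2
input product quarter price quantity
1 1  0   0
1 2 10 100
1 3 11  90
2 1 20  50
2 2 21  55
2 3  0   0
end

* flag observations where the product is actually on the market
generate byte onmarket = price > 0 & quantity > 0

* Option 1: keep every observation, restrict the estimation sample
* (price on quantity is a placeholder, not a suggested model)
regress price quantity if onmarket

* Option 2: recode zeros to missing; regress then omits them automatically
mvdecode price quantity, mv(0)
regress price quantity
```

Both regressions use the same four observations. Note that onmarket is created before the recode: once zeros become missing, a condition such as price > 0 would be true for the missing values too, because Stata treats missing as larger than any number.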
