



Re: st: Extract year of maximum production


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Extract year of maximum production
Date   Sat, 20 Oct 2012 09:18:35 +0100

There is a review of working row-wise in

SJ-9-1  pr0046  . . . . . . . . . . . . . . . . . . .  Speaking Stata: Rowwise
        (help rowsort, rowranks if installed) . . . . . . . . . . .  N. J. Cox
        Q1/09   SJ 9(1):137--157
        shows how to exploit functions, egen functions, and Mata
        for working rowwise; rowsort and rowranks are introduced

which is accessible to all under the SJ's 3-year rule. -search
rowwise- to get a clickable link.

You have already alluded to two key principles in this territory:

1. A -reshape long- can make a row-wise problem easier.

2. There are -egen- functions for row problems.

Even more important than #2 is

3. You can solve many problems with a loop over variables.

as #2 is just an application of #3.
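
Principle #1 can be sketched as follows. This is a minimal illustration only, not the approach developed below: it uses a made-up toy dataset of three counties and three years, and the variable names ismax, nties and firstpeak are hypothetical. Taking the earliest year among ties is an arbitrary choice made for the example.

```stata
* Toy data (hypothetical): three counties, three years
clear
input county timber1910 timber1911 timber1912
1 5 9 3
2 7 7 2
3 0 4 4
end

* One observation per county-year
reshape long timber, i(county) j(year)

* County maximum; egen's max() ignores missing values
egen maxprod = max(timber), by(county)

* Flag the years in which the maximum occurred
gen byte ismax = timber == maxprod & !missing(timber)

* Number of tied years, and the earliest of them (an arbitrary tie rule)
egen nties = total(ismax), by(county)
egen firstpeak = min(cond(ismax, year, .)), by(county)

list county year timber maxprod nties firstpeak, sepby(county)
```

In the long layout the year is an ordinary variable, so no parsing of variable names is needed.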

With 3000-odd counties there may well be ties, i.e. counties for which
two or more years show the same maximum.

Let's be alert to that and use a slightly odd technique. (Others will
be suggested by my paper referred to above.) Starting with your idea

egen maxprod = rowmax(timber????)

let's initialise a string variable

gen whenmax = ""

Now whenever we find a year when the maximum occurred we add that year
to our string

* isn't that 101 years? no matter
qui forval y = 1910/2010 {
        replace whenmax = whenmax + "`y'  " if timber`y' == maxprod
}

Note the extra space in the " " above.

In most counties there will be a single year when there was a maximum

gen when = real(whenmax) if  wordcount(whenmax) == 1

And you can look at the cases with ties to think what you want to do with them:

l county whenmax if wordcount(whenmax) > 1
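
To see what the steps above produce, here is a small end-to-end run on invented toy data (three counties, three years, so the loop bounds are shortened to 1910/1912). The -trim()- inside -real()- is a defensive addition for this sketch, not part of the commands above.

```stata
* Toy data (hypothetical): county 2 and county 3 have tied maxima
clear
input county timber1910 timber1911 timber1912
1 5 9 3
2 7 7 2
3 1 4 4
end

egen maxprod = rowmax(timber????)
gen whenmax = ""

* Append each year in which the row maximum occurred
quietly forvalues y = 1910/1912 {
        replace whenmax = whenmax + "`y' " if timber`y' == maxprod
}

* A single word means a unique peak year
gen when = real(trim(whenmax)) if wordcount(whenmax) == 1

list county whenmax when
```

County 1 gets a numeric peak year; counties 2 and 3 keep both tied years in whenmax and are left missing in when.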

Notes.

To extract words from a string, use -word()-.
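
For instance, a sketch of pulling the first and last tied years out of a string like whenmax. The toy values and the names firsttie and lasttie are made up for illustration; a negative second argument to -word()- counts from the end of the string.

```stata
* Toy strings (hypothetical) in the format built by the loop above
clear
input str20 whenmax
"1910 1911 1950"
"1975"
end

* First and last year listed; -1 asks for the last word
gen firsttie = real(word(whenmax, 1))
gen lasttie  = real(word(whenmax, -1))

list
```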

If you have counties with 0 values, they might as well be -mvdecode-d
to missing before you calculate your maximum. Extreme case: a county
with 0s in every year; it is absurd to say that the maximum
production occurred in every year. (Or you might just -drop- such
counties.)
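
A sketch of the -mvdecode- route on invented toy data. One thing to watch: once a county is missing in every year, maxprod is missing too, and in Stata a missing value compared with a missing value counts as equal, so this sketch adds a !missing(maxprod) guard to the loop. The guard is an addition for this example.

```stata
* Toy data (hypothetical): county 1 has zero production throughout
clear
input county timber1910 timber1911 timber1912
1 0 0 0
2 0 6 2
end

* Turn zeros into system missing before taking the maximum
mvdecode timber????, mv(0)
egen maxprod = rowmax(timber????)

gen whenmax = ""
quietly forvalues y = 1910/1912 {
        * guard: without it, missing == missing would match every year
        replace whenmax = whenmax + "`y' " if timber`y' == maxprod & !missing(maxprod)
}

list county maxprod whenmax
```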

Similarly, this technique assumes that there aren't so many ties that
the years can't be fit in a -str244- variable and if that's wrong you
may need a different technique.
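
One such alternative, sketched on invented toy data: keep numeric summaries of the ties instead of a string, so there is no length limit to run into. The names nties, firstmax and lastmax are hypothetical, and recording only the first and last tied years is a simplification chosen for the example.

```stata
* Toy data (hypothetical): county 2 has a tie, county 3 is all missing
clear
input county timber1910 timber1911 timber1912
1 5 9 3
2 7 7 2
3 . . .
end

egen maxprod = rowmax(timber????)

gen nties = 0
gen firstmax = .
gen lastmax = .

quietly forvalues y = 1910/1912 {
        * count every year at the maximum
        replace nties = nties + 1 if timber`y' == maxprod & !missing(maxprod)
        * record the first such year once, and overwrite the last each time
        replace firstmax = `y' if timber`y' == maxprod & !missing(maxprod) & missing(firstmax)
        replace lastmax = `y' if timber`y' == maxprod & !missing(maxprod)
}

list county maxprod nties firstmax lastmax
```

A county with a unique peak has nties equal to 1 and firstmax equal to lastmax.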

Nick

On Sat, Oct 20, 2012 at 4:10 AM, Daniel Escher <descher@nd.edu> wrote:
> I have 100 years of timber production data for all counties in the US
> (~3,100). The data are currently in wide format - i.e., timber1910,
> timber1911 ... timber2010 (but I can switch them to long if needed).
>
> I would like to extract the year of maximum timber production for each
> county and put that year in a variable called "peakyr." Two things I
> thought might be helpful: 1) I can use -egen newvar = rowmax...- to get the
> maximum value of a row. 2) I can separate the stub and the year in the
> variable name using the -substr- function. Unfortunately, I don't know how
> to make those two processes  "talk" to each other - if they are even the
> right ones to use.
>
> Stata/IC 12.1 for Windows (32-bit)
> Revision 01 Oct 2012