
Re: st: Extract year of maximum production


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Extract year of maximum production
Date   Sat, 20 Oct 2012 19:59:53 +0100

For completeness, note

reshape long timber, i(county) j(year)
bysort county (timber) : gen whenmax = year[_N]
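
One caveat on the line above: Stata sorts missing values after all numbers, so if any timber values are missing (for example, zeros recoded to missing), year[_N] would pick up a year with missing data. A minimal sketch of one way to guard against that, starting from the wide data and assuming the variables are timber1910-timber2010 with county as the identifier. nonmiss is a hypothetical helper variable; peakyr is the name asked for in the original question.

```stata
* sketch only: start from the wide layout
reshape long timber, i(county) j(year)
* flag usable values so that missings sort first within each county
gen byte nonmiss = !missing(timber)
* within county: missings first, then ascending timber, then ascending year
bysort county (nonmiss timber year) : gen peakyr = year[_N] if nonmiss[_N]
```

With ties, the extra sort on year means the latest tied year is the one reported. Counties with no usable values get a missing peakyr.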

Nick

On Sat, Oct 20, 2012 at 3:36 PM, Daniel Escher <descher@nd.edu> wrote:
> Nick, that worked perfectly. Thank you so much!
>
> egen maxprod = rowmax(timber1910-timber2010)
> recode maxprod 0 = .  // I recoded counties without production to missing for now
> gen whenmax = ""
> qui forval y = 1910/2010 {
>         replace whenmax = whenmax + "`y'  " if timber`y' == maxprod
> }
> gen when = real(whenmax) if  wordcount(whenmax) == 1
>
> *Yes, 1910-2010 is 101 years.
> *I didn't have any duplicates/ties to worry about
>
> On Sat, Oct 20, 2012 at 4:18 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>> There is a review of working row-wise in
>>
>> SJ-9-1  pr0046  . . . . . . . . . . . . . . . . . . .  Speaking Stata: Rowwise
>>         (help rowsort, rowranks if installed) . . . . . . . . . . .  N. J. Cox
>>         Q1/09   SJ 9(1):137--157
>>         shows how to exploit functions, egen functions, and Mata
>>         for working rowwise; rowsort and rowranks are introduced
>>
>> which is accessible to all under the SJ's 3-year rule. -search
>> rowwise- to get a clickable link.
>>
>> You have already alluded to two key principles in this territory
>>
>> 1. A -reshape long- can make a row-wise problem easier.
>>
>> 2. There are -egen- functions for row problems.
>>
>> Even more important than #2 is
>>
>> 3. You can solve many problems with a loop over variables.
>>
>> as #2 is just an application of #3.
>>
>> With 3000-odd counties there may well be ties, i.e. counties for which
>> two or more years show the same maximum.
>>
>> Let's be alert to that and use a slightly odd technique. (Others will
>> be suggested by my paper referred to above.) Starting with your idea
>>
>> egen maxprod = rowmax(timber????)
>>
>> let's initialise a string variable
>>
>> gen whenmax = ""
>>
>> Now whenever we find a year when the maximum occurred we add that year
>> to our string
>>
>> * isn't that 101 years? no matter
>> qui forval y = 1910/2010 {
>>         replace whenmax = whenmax + "`y'  " if timber`y' == maxprod
>> }
>>
>> Note the extra space in the " " above.
>>
>> In most counties there will be a single year when there was a maximum
>>
>> gen when = real(whenmax) if  wordcount(whenmax) == 1
>>
>> And you can look at the cases with ties to think what you want to do with them:
>>
>> l county whenmax if wordcount(whenmax) > 1
>>
>> Notes.
>>
>> To extract words from a string, use -word()-.
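>>
>> For instance, a sketch for the tied counties, where firstmax and lastmax
>> are hypothetical variable names and whenmax was filled by the loop above
>> (so years appear in ascending order):
>>
>> ```stata
>> * sketch only: earliest and latest tied year; a negative n counts from the end
>> gen firstmax = real(word(whenmax, 1)) if wordcount(whenmax) > 1
>> gen lastmax = real(word(whenmax, -1)) if wordcount(whenmax) > 1
>> ```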
>>
>> If you have counties with 0 values, they might as well be -mvdecode-d
>> to missing before you calculate your maximum. Extreme case: a county
>> with 0s in every year; it is absurd to say that the maximum
>> production occurred in every year. (Or you might just -drop- such
>> counties.)
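>>
>> A minimal sketch of that preprocessing, assuming the variables are named
>> timber1910-timber2010 and stored in that order. One thing to watch: Stata
>> evaluates . == . as true, so a county with every year missing would still
>> match every year unless the comparison is guarded.
>>
>> ```stata
>> * sketch only: set zeros to missing before taking the row maximum
>> mvdecode timber1910-timber2010, mv(0)
>> egen maxprod = rowmax(timber1910-timber2010)
>> gen whenmax = ""
>> * the !missing() guard skips rows where maxprod itself is missing
>> qui forval y = 1910/2010 {
>>         replace whenmax = whenmax + "`y'  " if timber`y' == maxprod & !missing(maxprod)
>> }
>> ```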
>>
>> Similarly, this technique assumes that there aren't so many ties that
>> the years can't be fit in a -str244- variable and if that's wrong you
>> may need a different technique.
>>
>> Nick
>>
>> On Sat, Oct 20, 2012 at 4:10 AM, Daniel Escher <descher@nd.edu> wrote:
>>> I have 100 years of timber production data for all counties in the US
>>> (~3,100). The data are currently in wide format - i.e., timber1910,
>>> timber1911 ... timber2010 (but I can switch them to long if needed).
>>>
>>> I would like to extract the year of maximum timber production for each
>>> county and put that year in a variable called "peakyr." Two things I
>>> thought might be helpful: 1) I can use -egen newvar = rowmax...- to get the
>>> maximum value of a row. 2) I can separate the stub and the year in the
>>> variable name using the -substr- function. Unfortunately, I don't know how
>>> to make those two processes  "talk" to each other - if they are even the
>>> right ones to use.
>>>
>>> Stata/IC 12.1 for Windows (32-bit)
>>> Revision 01 Oct 2012