Statalist


Re: st: AW: egen(mean or suchlike) for a string variable


From   joe j <[email protected]>
To   [email protected]
Subject   Re: st: AW: egen(mean or suchlike) for a string variable
Date   Thu, 8 Oct 2009 14:36:20 +0200

Thanks so much, Martin. I learned something new!
JJ

On Thu, Oct 8, 2009 at 1:28 PM, Martin Weiss <[email protected]> wrote:
>
> <>
>
>
>
> *************
> clear*
>
> inp year str10(Uni Prof)
> 1990  Harvard   " S Smith"
> 1990   ""      "S Smith"
> 1990  UCLA      "P Williams"
> 1990  Yale       " K John"
> 1991   ""        "K Evert"
> 1991  Oxford     "K Evert"
> 1991  ""        "K Evert"
> end
>
> replace Uni=trim(Uni)
> replace Prof=trim(Prof)
> compress
>
> gen byte nonmiss=!mi(Uni)
>
> // within each year-Prof group, sort blank Uni first,
> // then copy the last (non-blank) Uni into the blank rows
> bys year Prof (nonmiss): /*
> */ replace Uni=Uni[_N]  /*
> */ if nonmiss==0
>
> list, noobs sepby(year Prof)
> *************
>
>
>
> HTH
> Martin
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On behalf of joe j
> Sent: Thursday, 8 October 2009 12:33
> To: [email protected]
> Subject: Re: st: AW: egen(mean or suchlike) for a string variable
>
> Thanks. (Your suggestion helped me create a variable that takes a
> numeric value, instead of the university name; this is definitely an
> improvement.)
>
> This is what the data look like:
>
> Year  University  Professor
>
> 1990  Harvard     S Smith
> 1990  ---------   S Smith
> 1990  UCLA        P Williams
> 1990  Yale        K John
>
> 1991  ---------   K Evert
> 1991  Oxford      K Evert
>
> What I want is to replace the missing names above, in 1990 with
> Harvard and in 1991 with Oxford.
>
> JJ
>
> On Thu, Oct 8, 2009 at 11:59 AM, Martin Weiss <[email protected]> wrote:
>>
>> <>
>>
>>
>>
>> You should turn the string into a numeric variable via -encode-. Then -egen-
>> can go to work. Also provide an excerpt of your data and show what you want
>> to happen to them...
>>
>>
>>
>> HTH
>> Martin
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On behalf of joe j
>> Sent: Thursday, 8 October 2009 11:57
>> To: [email protected]
>> Subject: st: egen(mean or suchlike) for a string variable
>>
>> In my data I have a string variable "University", which lists
>> university names. In some years the names are missing. Two other
>> variables I have are "Professor" and "Year". The same "Professor" and
>> "University" can occur multiple times in a year.
>>
>> The problem I have is that there are quite a few University names that
>> are missing. What I want to do is to replace as many missing
>> University names as possible, by assuming that: when a professor is
>> linked to a university at least once in a year, she is linked to the
>> same university during that year - so the missing university name when
>> her name occurs again in the same year can be replaced (why there are
>> missing university names is a complicated story:)).
>>
>>  Any suggestion would be appreciated.
>>
>> Best,
>> JJ
>>
>> I tried the following in Stata (it's foolish, I know):
>>
>>  bysort year professor: egen University_all=mean(University)
>>
>> But I get the error "type mismatch".
>> *
>>
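
Martin's first reply suggests -encode- followed by -egen-, but the thread never shows that route as code. Below is a minimal sketch of how it might look on the example data. The variable names uni_code, uni_filled and Uni_all are made up for illustration, and the sketch assumes each professor has at most one distinct university per year; with two or more, max() would silently keep the one that sorts last alphabetically.

```stata
clear
input year str10(Uni Prof)
1990 "Harvard" "S Smith"
1990 ""        "S Smith"
1990 "UCLA"    "P Williams"
1990 "Yale"    "K John"
1991 ""        "K Evert"
1991 "Oxford"  "K Evert"
end

* encode maps each distinct non-empty string to an integer code and
* stores the names in a value label; empty strings become missing (.)
encode Uni, gen(uni_code)

* max() ignores missing values, so every row of a year-Prof group
* receives the code of the university seen in that group
* (assumption: at most one distinct university per group)
bysort year Prof: egen uni_filled = max(uni_code)

* reuse the value label created by encode, then turn codes back into text
label values uni_filled uni_code
decode uni_filled, gen(Uni_all)

list, noobs sepby(year Prof)
```

This explains the "type mismatch" in the original attempt: the egen functions used here work on numeric variables, so the string has to be encoded first and decoded afterwards. Martin's -replace- solution reaches the same result without the round trip.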
