Statalist


AW: st: AW: egen(mean or suchlike) for a string variable


From   "Martin Weiss" <[email protected]>
To   <[email protected]>
Subject   AW: st: AW: egen(mean or suchlike) for a string variable
Date   Thu, 8 Oct 2009 13:28:49 +0200

<> 



*************
clear*

inp year str10(Uni Prof)
1990  Harvard   " S Smith"
1990   ""      "S Smith"
1990  UCLA      "P Williams"
1990  Yale       " K John"
1991   ""        "K Evert"
1991  Oxford     "K Evert"
1991   ""        "K Evert"
end

replace Uni=trim(Uni)
replace Prof=trim(Prof)
compress

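// flag observations that carry a university name
gen byte nonmiss=!mi(Uni)

// within each year-professor group, sort named observations last,
// then copy the last name into the observations that lack one
bys year Prof (nonmiss): ///
 replace Uni=Uni[_N] ///
 if nonmiss==0

// list the result, one block per year-professor group
l, noo sepby(year Prof)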
*************
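
A possible shortcut, offered as a sketch rather than as part of the code above: Stata sorts the empty string before any non-empty string, so sorting on Uni itself within each year-professor group should put a named observation last, and the nonmiss indicator is not needed. The toy data below only mirror the example. If a professor had two different universities in one year, this would pick the alphabetically last one, which may or may not be what is wanted.

```stata
clear
input year str10 Uni str10 Prof
1990 "Harvard" "S Smith"
1990 ""        "S Smith"
1990 "UCLA"    "P Williams"
1991 ""        "K Evert"
1991 "Oxford"  "K Evert"
1991 ""        "K Evert"
end

// "" sorts first, so Uni[_N] is a non-empty name whenever the group has one
bysort year Prof (Uni): replace Uni = Uni[_N] if missing(Uni)

list, noobs sepby(year Prof)
```

Groups in which every name is missing are left untouched, since Uni[_N] is then empty as well.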



HTH
Martin

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of joe j
Sent: Thursday, 8 October 2009 12:33
To: [email protected]
Subject: Re: st: AW: egen(mean or suchlike) for a string variable

Thanks. (Your suggestion helped me create a variable that takes a
numeric value, instead of the university name; this is definitely an
improvement.)
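
For reference, the -encode- route might look roughly like the sketch below. The variable names uni_code, uni_fill and Uni_filled and the toy data are made up for illustration; the sketch assumes each professor has at most one distinct university per year, since max() would otherwise pick the alphabetically last name.

```stata
clear
input year str10 Uni str10 Prof
1990 "Harvard" "S Smith"
1990 ""        "S Smith"
1991 ""        "K Evert"
1991 "Oxford"  "K Evert"
end

// empty strings become numeric missing; the value label is named uni_code
encode Uni, generate(uni_code)

// egen's max() ignores missing values within each year-professor group
bysort year Prof: egen uni_fill = max(uni_code)

// reattach the value label and turn the codes back into names
label values uni_fill uni_code
decode uni_fill, generate(Uni_filled)

list, noobs sepby(year Prof)
```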

This is what the data look like:

Year  University  Professor

1990  Harvard     S Smith
1990  ---------   S Smith
1990  UCLA        P Williams
1990  Yale        K John

1991  ---------   K Evert
1991  Oxford      K Evert

What I want is to replace the missing names above, in 1990 with
Harvard and in 1991 with Oxford.

JJ

On Thu, Oct 8, 2009 at 11:59 AM, Martin Weiss <[email protected]> wrote:
>
> <>
>
>
>
> You should turn the string into a numeric variable via -encode-. Then
> -egen- can go to work. Also provide an excerpt of your data and show what
> you want to happen to them...
>
>
>
> HTH
> Martin
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On behalf of joe j
> Sent: Thursday, 8 October 2009 11:57
> To: [email protected]
> Subject: st: egen(mean or suchlike) for a string variable
>
> In my data I have a string variable "University", which lists
> university names. In some years the names are missing. Two other
> variables I have are "Professor" and "Year". The same "Professor" and
> "University" can occur multiple times in a year.
>
> The problem I have is that there are quite a few University names that
> are missing. What I want to do is to replace as many missing
> University names as possible, by assuming that: when a professor is
> linked to a university at least once in a year, she is linked to the
> same university during that year - so the missing university name when
> her name occurs again in the same year can be replaced (why there are
> missing university names is a complicated story:)).
>
>  Any suggestion would be appreciated.
>
> Best,
> JJ
>
> I tried the following in Stata (it's foolish, I know):
>
>  bysort year professor: egen University_all=mean(University)
>
> But I get the warning "type mismatch".
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

© Copyright 1996–2024 StataCorp LLC