Statalist



Re: st: RE: Data Management


From   "Rijo John" <[email protected]>
To   [email protected]
Subject   Re: st: RE: Data Management
Date   Fri, 21 Nov 2008 11:35:01 -0600

Hi Nick,

I will read more into the tip you gave. When I ran the command you suggested

by id : gen prev = city[1]
by id : list prev city if prev != city & _n == 2

it just lists all the ids, one by one. It doesn't solve the problem.

Thanks.
Rijo.
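
A possible explanation, offered as a sketch rather than a confirmed diagnosis: under by:, list prints a group header for every id, even for groups in which no observation satisfies the if condition, so the output appears to show every id although only mismatches are actually listed. One way around this is to flag the mismatching ids first and then list once, without by:. The variable names id, city and year follow the thread; the flag name differs is hypothetical, and the code assumes year is present so that observations can be ordered within id.

```stata
* Sketch only: assumes variables id, city, year as described in the thread.
* "differs" is a hypothetical flag name, not something from the original posts.

* Within each id, sorted by year, compare the first and last city spellings.
bysort id (year) : gen byte differs = city[1] != city[_N]

* List only the ids whose spellings disagree, with a separator between ids.
list id year city if differs, sepby(id)
```

Because the flag is constant within id, both years of each mismatching id are shown side by side, which makes it easier to decide which spelling to keep.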



On Fri, Nov 21, 2008 at 11:26 AM, Nick Cox <[email protected]> wrote:
> This synthetic example shows that the command will list precisely those
> observations that differ from the previous observation. But this
> includes the first, as city[0] evaluates to string missing, i.e. "".
> More generally, varname[0] is regarded as missing in the sense of the
> variable's data type, i.e. numeric missing . or string missing "". So
> the first in each group will always be listed (unless its value is
> missing).
>
> . l
>
>     +------------+
>     |       city |
>     |------------|
>  1. | Durham, UK |
>  2. | Durham, UK |
>  3. | Durham, UK |
>  4. | Durham, NC |
>  5. | Durham, NC |
>     |------------|
>  6. | Durham, NH |
>  7. | Durham, NH |
>  8. | Durham, NH |
>  9. | Durham, NH |
> 10. | Durham, NH |
>     +------------+
>
> . list if city != city[_n-1]
>
>     +------------+
>     |       city |
>     |------------|
>  1. | Durham, UK |
>  4. | Durham, NC |
>  6. | Durham, NH |
>     +------------+
>
> You probably want
>
> by id : gen prev = city[1]
> by id : list prev city if prev != city & _n == 2
>
> There is no royal road to cleaning up string variables. The matter was
> discussed on the list earlier this year and written up as a Tip:
>
> SJ-8-3  dm0039  Stata tip 64: Cleaning up user-entered string variables
>         J. Herrin and E. Poen
>         Q3/08   SJ 8(3):444--445   (no commands)
>         tip on how to clean up user-entered string variables
>
> Nick
> [email protected]
>
> Rijo John
>
> I have a data set as follows
>
> ID  City          Year
> 1    City name   1
> 1    City name   2
>
>
> The data are supposed to have the same city name for each ID in years 1
> and 2, but there are many occasions where the city for year 1 is
> spelt differently from that for year 2. I just want to list or edit
> those cities whose names differ between years 1 and 2 for the
> same ID. When I issue the following command
>
> bysort ID : list if  City!=City[_n-1]
>
> it lists all observations in the data, whether or not the city is spelt
> differently in years 1 and 2. That's strange to me. Can someone
> tell me what I am doing wrong here?
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



-- 
Rijo M John,
Institute for Health Research and Policy (MC 275),
1747 West Roosevelt Road, Room 558,
Chicago, Illinois 60608.
Ph: 312-413-9057
Fax: 312-355-2801
URL: http://ihrp-web.ihrp.uic.edu/


