Statalist



RE: st: RE: Data Management


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: Data Management
Date   Fri, 21 Nov 2008 17:48:32 -0000

I am confused. You changed my code, which you should do if it is wrong. 
But I don't think it is wrong. 

Also, if the second command is exactly what you typed, it should -list-
at most one observation. _n == 2 is true for the whole dataset just
once, and the compound statement will be true either never or once. 
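
A minimal sketch of that distinction, using made-up data (the ids and city spellings below are illustrative only and are not from the original dataset):

```stata
clear
input id str12 city year
1 "Durham, UK" 1
1 "Durham, UK" 2
2 "Durham, NC" 1
2 "Durham NC"  2
3 "Durham, NH" 1
3 "Durham,NH"  2
end

sort id year
by id : gen prev = city[1]

* Without -by-, _n counts over the whole dataset, so only observation 2
* is ever tested; here its city equals prev, so nothing is listed.
list id prev city if prev != city & _n == 2

* With -by id:-, _n restarts within each id, so the second observation
* of every id is tested; ids 2 and 3 qualify in this toy dataset.
by id : list prev city if prev != city & _n == 2
```

As far as I know, -by id: list- also prints a header for each id whether or not any observation in it qualifies, so the output can look long even when few observations are actually listed.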

There is some misinformation somewhere. 

Nick 
n.j.cox@durham.ac.uk 

Rijo John

Thanks Nick.

 When I wrote the same command

by id : gen prev = city[1]
list prev city if prev != city & _n == 2

it gave me the solution.

If I use "by id" again with the second command it would not list what I
want.

Thanks,
Rijo.

On Fri, Nov 21, 2008 at 11:35 AM, Rijo John <rmjohn@gmail.com> wrote:
> Hi Nick,
>
>  I will read more into the tip you gave. When I gave the command you
> suggested
>
> by id : gen prev = city[1]
> by id : list prev city if prev != city & _n == 2
>
> it just lists all the ids... one by one... Doesn't solve the problem.
>
> Thanks.
> Rijo.
>
>
>
> On Fri, Nov 21, 2008 at 11:26 AM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
>> This synthetic example shows that the command will list precisely
>> those
>> observations that differ from the previous observation. But this
>> includes the first, as city[0] evaluates to string missing, i.e. "".
>> More generally, varname[0] is regarded as missing in the sense of the
>> variable's data type, i.e. numeric missing . or string missing "". So
>> the first in each group will always be listed (unless its value is
>> missing).
>>
>> . l
>>
>>     +------------+
>>     |       city |
>>     |------------|
>>  1. | Durham, UK |
>>  2. | Durham, UK |
>>  3. | Durham, UK |
>>  4. | Durham, NC |
>>  5. | Durham, NC |
>>     |------------|
>>  6. | Durham, NH |
>>  7. | Durham, NH |
>>  8. | Durham, NH |
>>  9. | Durham, NH |
>> 10. | Durham, NH |
>>     +------------+
>>
>> . list if city != city[_n-1]
>>
>>     +------------+
>>     |       city |
>>     |------------|
>>  1. | Durham, UK |
>>  4. | Durham, NC |
>>  6. | Durham, NH |
>>     +------------+
>>
>> You probably want
>>
>> by id : gen prev = city[1]
>> by id : list prev city if prev != city & _n == 2
>>
>> There is no royal road to cleaning up string variables. The matter was
>> discussed on the list earlier this year and written up as a Tip:
>>
>> SJ-8-3  dm0039  . . .  Stata tip 64: Cleaning up user-entered string variables
>>        . . . . . . . . . . . . . . . . . . . . . . . .  J. Herrin and E. Poen
>>        Q3/08   SJ 8(3):444--445                                 (no commands)
>>        tip on how to clean up user-entered string variables
>>
>> Nick
>> n.j.cox@durham.ac.uk
>>
>> Rijo John
>>
>> I have a data set as follows
>>
>> ID  City          Year
>> 1    City name   1
>> 1    City name   2
>>
>>
>> The data are supposed to have the same city name for each ID for years 1
>> and 2, but there are many occasions where the city for year 1 is
>> spelt differently than that for year 2. I just want to list out or edit
>> those cities where city names are different for year 1 and 2 for the
>> same ID. When I issue the following command
>>
>> bysort ID : list if  City!=City[_n-1]
>>
>> it lists all observations in the data whether or not the city is spelt
>> differently in years one and two. That's strange to me. Can someone
>> tell me what I am doing wrong here?
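
For reference, a hedged sketch of two ways to list only the IDs whose spellings differ. It assumes two observations per ID (years 1 and 2) and reuses the variable names ID, City and Year from the question. The toy data and the variable name differs are invented here for illustration.

```stata
clear
input ID str12 City Year
1 "Springfield" 1
1 "Springfield" 2
2 "Sprngfield"  1
2 "Springfield" 2
end

* Under -by ID:-, _n restarts at 1 in each group, so City[_n-1] is
* City[0], i.e. "", for the first observation of every ID. Excluding
* _n == 1 stops those first observations from being listed.
bysort ID (Year) : list ID City Year if City != City[_n-1] & _n > 1

* Alternatively, flag whole groups: after sorting on City within ID,
* the first and last values differ only if the spellings are not all equal.
bysort ID (City) : gen byte differs = City[1] != City[_N]
sort ID Year
list ID City Year if differs, sepby(ID)
```

The first form shows only the second-year observation of each mismatched ID. The second form shows both spellings side by side, which may be more convenient when deciding which one to keep.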



