Statalist



Re: st: AW: forvalues & replace not working under two 'not equal to' conditions


From   joe j <[email protected]>
To   [email protected]
Subject   Re: st: AW: forvalues & replace not working under two 'not equal to' conditions
Date   Wed, 11 Nov 2009 15:58:08 +0100

Thanks Martin. For now I can manage with what I have.

On Wed, Nov 11, 2009 at 2:29 PM, Martin Weiss <[email protected]> wrote:
>
> <>
>
> I am sure some combination of -duplicates tag- and -egen, group()- can get
> you there, but I am _way_ over my time limit on this one task. So I hope
> someone else can provide you with an answer.
>
>
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On behalf of joe j
> Sent: Wednesday, 11 November 2009 13:20
> To: [email protected]
> Subject: Re: st: AW: forvalues & replace not working under two 'not equal
> to' conditions
>
> Thank you. Yes, the definition for nongroup_f should have been what I
> wrote today, and last night in response to Tim's mail.
> The final goal is:
> (a) contract_id (b) firm_id (c) nation_id (d) group_d (e) group_f
> (f) nongroup_d (g) nongroup_f
> 1       2       US      1       0       0       0
> 1       2       US      1       0       0       0
> 4       3       UK      0       1       0       0
> 4       3       US      0       1       0       0
> 8       3       US      0       0       1       1
> 8       4       UK      0       1       0       1
> 8       4       US      0       1       1       0
> 9       3       US      0       0       1       1
> 9       4       UK      0       0       0       1
> 9       5       US      0       0       1       1
> 10      4       CH      0       1       0       1
> 10      4       UK      0       1       0       1
> 10      5       US      1       0       0       1
> 10      5       US      1       0       0       1
> 10      6       NL      0       0       1       1
> 10      7       NL      0       0       1       1
>
> And, the correct definitions of the last four 'output' variables are:
>
> (d) group_d = 1 when both firm_id and nation_id are the same for the
> given observation and at least one other observation with the same
> contract_id
> (e) group_f = 1 when firm_id is the same but nation_id is different for
> the given observation and at least one other observation with the same
> contract_id
> (f) nongroup_d = 1 when firm_id is different but nation_id is the same
> for the given observation and at least one other observation with the
> same contract_id
> (g) nongroup_f = 1 when both firm_id and nation_id are different for the
> given observation and at least one other observation with the same
> contract_id
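
One way to read these four definitions is as counts within each contract. The following is a minimal sketch, not code posted in this thread. The count variables n_c, n_f, n_n and n_fn are names assumed for illustration. For nongroup_f it relies on inclusion-exclusion: the number of observations differing in both ids equals all observations in the contract, minus those sharing the firm, minus those sharing the nation, plus those sharing both.

```stata
clear
input byte(contract_id firm_id) str2 nation_id
1 2 "US"
1 2 "US"
4 3 "UK"
4 3 "US"
8 4 "US"
8 4 "UK"
8 3 "US"
9 5 "US"
9 4 "UK"
9 3 "US"
10 5 "US"
10 5 "US"
10 6 "NL"
10 7 "NL"
10 4 "UK"
10 4 "CH"
end

* counts within contract; each count includes the observation itself
bysort contract_id: gen n_c = _N
bysort contract_id firm_id: gen n_f = _N
bysort contract_id nation_id: gen n_n = _N
bysort contract_id firm_id nation_id: gen n_fn = _N

* another observation shares firm and nation
gen byte group_d = n_fn > 1
* another observation shares the firm but not the nation
gen byte group_f = n_f > n_fn
* another observation shares the nation but not the firm
gen byte nongroup_d = n_n > n_fn
* another observation differs in both firm and nation
gen byte nongroup_f = (n_c - n_f - n_n + n_fn) > 0

sort contract_id firm_id nation_id
list contract_id firm_id nation_id group_d-nongroup_f, sepby(contract_id) noobs
```

On the example data this should reproduce the table above, including nongroup_f = 0 for firm 4 in the US under contract 8, and it needs no loop over offsets.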
>
> The first three variables could be derived following your logic, and
> for the last I'd see how to apply your suggestions (I'd also re-read
> Nick's paper).
>
> On Wed, Nov 11, 2009 at 12:40 PM, Martin Weiss <[email protected]> wrote:
>>
>> <>
>>
>>
>> Wait a minute! Seems to me you also changed the definition itself, which
>> triggers a different outcome for this last dummy? Anyway, provide your new
>> final goal, as you did yesterday, together with the correct definitions.
>>
>> I think you can safely omit the -forvalues- loops. Nick was not fond of
>> them yesterday, and neat solutions to such problems usually are derived
>> from a judicious combination of -bysort- and some -egen- function(s).
>> This is material covered comprehensively in Nick's seminal
>> http://www.stata-journal.com/sjpdf.html?articlenum=pr0004. Other commands
>> recently employed for insidious problems of this kind are -expand-,
>> -tempfile- and -merge-...
>>
>>
>> HTH
>> Martin
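
As an illustration of the pairing idea, here is a hedged sketch that uses -tempfile- together with -joinby- (swapped in here for the -merge- Martin mentions, since -joinby- forms every pair of observations within a contract). It is not code from the thread; obs, firm_id2, nation_id2, partner and both_differ are names assumed for the example, and it is meant to be run from a do-file.

```stata
clear
input byte(contract_id firm_id) str2 nation_id
1 2 "US"
1 2 "US"
4 3 "UK"
4 3 "US"
8 4 "US"
8 4 "UK"
8 3 "US"
9 5 "US"
9 4 "UK"
9 3 "US"
10 5 "US"
10 5 "US"
10 6 "NL"
10 7 "NL"
10 4 "UK"
10 4 "CH"
end
gen long obs = _n

* save a renamed copy of the ids to pair against
preserve
rename firm_id firm_id2
rename nation_id nation_id2
keep contract_id firm_id2 nation_id2
tempfile partner
save `partner'
restore

* form all pairs of observations within each contract
joinby contract_id using `partner'

* a pair counts if it differs in both ids
* (an observation paired with itself never does)
gen byte both_differ = firm_id != firm_id2 & nation_id != nation_id2

* back to one row per original observation
collapse (max) nongroup_f = both_differ, by(obs contract_id firm_id nation_id)
sort obs
list contract_id firm_id nation_id nongroup_f, sepby(contract_id) noobs
```

The pairwise data grow with the square of the contract size, so this is a sketch for small groups rather than a general recommendation.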
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On behalf of joe j
>> Sent: Wednesday, 11 November 2009 12:06
>> To: [email protected]
>> Subject: Re: st: AW: forvalues & replace not working under two 'not equal
>> to' conditions
>>
>> Just an update. I discovered that, given the definition of nongroup_f
>> as "equals 1 when both firm_id and nation_id are different for the
>> given observation relative to at least one other observation within
>> the same contract_id", the following should be the correct output for
>> contract_id 8 (the columns being contract_id, firm_id, nation_id and
>> nongroup_f):
>>
>> 8       3       US      1
>> 8       4       UK      1
>> 8       4       US      0
>> Note that for firm_id 4 in the US, the value of nongroup_f should
>> be 0. (Indeed I had made a mistake in the output I posted yesterday).
>> While I will use Martin's excellent code for the other three columns
>> (group_d, etc), for the nongroup_f column alone, following Nick's
>> pointers, I found that adding to the IF clause "nation_id[_n+`i']!=."
>> in my clunky code would yield the correct result.
>>
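>> * flag observations that differ in both ids from an earlier observation
>> * in the same contract; the last condition skips subscripts that fall
>> * outside the contract, which evaluate to missing
>> forvalues i=1/`=_N'{
>> bys contract_id: replace nongroup_f=1 if (firm_id~=firm_id[_n-`i']) & ///
>> (nation_id~=nation_id[_n-`i']) & (nation_id[_n-`i']!=.)
>> }
>> * same comparison against later observations in the contract
>> forvalues i=1/`=_N'{
>> bys contract_id: replace nongroup_f=1 if (firm_id~=firm_id[_n+`i']) & ///
>> (nation_id~=nation_id[_n+`i']) & (nation_id[_n+`i']!=.)
>> }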
>> (I know it doesn't make sense to use _N as the upper limit; I'd
>> perhaps use the number of records in the contract_id with the maximum
>> number of records. I'd also see if Martin's code could be used here as
>> well with modifications)
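
The extra condition matters because, under -by-, a subscript that points outside the by-group evaluates to missing, and a nonmissing value always compares as "not equal" to missing, so both inequality tests pass on their own. The sketch below is an illustration rather than code from the thread. It uses !missing() in place of !=. so that it also runs when nation_id is a string, and it caps the loop at the largest contract size minus one instead of _N. The names n_in_contract and maxoff are assumed here.

```stata
clear
input byte(contract_id firm_id) str2 nation_id
1 2 "US"
1 2 "US"
4 3 "UK"
4 3 "US"
8 4 "US"
8 4 "UK"
8 3 "US"
9 5 "US"
9 4 "UK"
9 3 "US"
10 5 "US"
10 5 "US"
10 6 "NL"
10 7 "NL"
10 4 "UK"
10 4 "CH"
end

* the largest contract bounds the offsets worth checking
bysort contract_id: gen long n_in_contract = _N
summarize n_in_contract, meanonly
local maxoff = r(max) - 1

gen byte nongroup_f = 0
forvalues i = 1/`maxoff' {
    * compare with the observation i places earlier in the same contract
    by contract_id: replace nongroup_f = 1 if firm_id != firm_id[_n-`i'] ///
        & nation_id != nation_id[_n-`i'] & !missing(nation_id[_n-`i'])
    * compare with the observation i places later in the same contract
    by contract_id: replace nongroup_f = 1 if firm_id != firm_id[_n+`i'] ///
        & nation_id != nation_id[_n+`i'] & !missing(nation_id[_n+`i'])
}

list contract_id firm_id nation_id nongroup_f, sepby(contract_id) noobs
```

Dropping the !missing() terms should bring back the symptom from the original post, with nongroup_f equal to 1 everywhere.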
>>
>> Thanks again for all the help.
>>
>> On Wed, Nov 11, 2009 at 12:18 AM, joe j <[email protected]> wrote:
>>> Sorry, I should have explained it better. nongroup_f = 1  when both
>>> firm_id and nation_id are different for the given observation relative
>>> to "at least one other observation" within the same contract_id. Thus
>>> in the following case of contract_id=10, we have value 1 for all
>>> observations for the nongroup_f variable. Martin's last response gives
>>> the correct result. Thanks, joe.
>>>
>>> 10      4       CH      0       1       0       1
>>> 10      4       UK      0       1       0       1
>>> 10      5       US      1       0       0       1
>>> 10      5       US      1       0       0       1
>>> 10      6       NL      0       0       1       1
>>> 10      7       NL      0       0       1       1
>>>
>>> On Tue, Nov 10, 2009 at 11:54 PM, Tim Wade <[email protected]> wrote:
>>>> Maybe I am missing something obvious here, but I can't follow what you
>>>> are trying to do either. This criterion:
>>>>
>>>>> 4. nongroup_f = 1  when both firm_id and nation_id are different for
>>>>> two or more observations with the same contract id
>>>>
>>>> does not seem to be consistent with this line listing:
>>>>
>>>>> 10      5       US      1       0       0       1
>>>>> 10      5       US      1       0       0       1
>>>>
>>>> Here are two observations with the same firm_id and nation_id, yet
>>>> nongroup_f is 1. However, you may want to try looking at some
>>>> combination of -duplicates tag- and -levelsof-; this might help as an
>>>> alternative approach.
>>>>
>>>> Tim
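
As a small illustration of the -duplicates tag- idea (an assumption about what was meant, not Tim's code): tagging on contract_id, firm_id and nation_id counts how many other observations share all three, which is the information group_d needs. The variable name dup is hypothetical.

```stata
clear
input byte(contract_id firm_id) str2 nation_id
1 2 "US"
1 2 "US"
4 3 "UK"
4 3 "US"
8 4 "US"
8 4 "UK"
8 3 "US"
9 5 "US"
9 4 "UK"
9 3 "US"
10 5 "US"
10 5 "US"
10 6 "NL"
10 7 "NL"
10 4 "UK"
10 4 "CH"
end

* dup = number of other observations with identical contract, firm and nation
duplicates tag contract_id firm_id nation_id, generate(dup)
gen byte group_d = dup > 0

list, sepby(contract_id) noobs
```

This covers only group_d; the other three dummies compare observations that are not exact duplicates, so they need counts on fewer variables or a pairwise comparison.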
>>>>
>>>>
>>>> On Tue, Nov 10, 2009 at 12:08 PM, joe j <[email protected]> wrote:
>>>>> Thanks. The last 4 columns (group_d; group_f; nongroup_d; nongroup_f)
>>>>> are the final output variables. Their definitions are below the table.
>>>>>
>>>>> ******
>>>>> contract_id; firm_id; nation_id; group_d; group_f; nongroup_d; nongroup_f
>>>>> 1       2       US      1       0       0       0
>>>>> 1       2       US      1       0       0       0
>>>>> 4       3       UK      0       1       0       0
>>>>> 4       3       US      0       1       0       0
>>>>> 8       3       US      0       0       1       1
>>>>> 8       4       UK      0       1       0       1
>>>>> 8       4       US      0       1       1       1
>>>>> 9       3       US      0       0       1       1
>>>>> 9       4       UK      0       0       0       1
>>>>> 9       5       US      0       0       1       1
>>>>> 10      4       CH      0       1       0       1
>>>>> 10      4       UK      0       1       0       1
>>>>> 10      5       US      1       0       0       1
>>>>> 10      5       US      1       0       0       1
>>>>> 10      6       NL      0       0       1       1
>>>>> 10      7       NL      0       0       1       1
>>>>> ******
>>>>> 1. group_d = 1 when both firm_id and nation_id are same for two or
>>>>> more observations with the same contract id
>>>>> 2. group_f = 1  when firm_id is same but nation_id is different for
>>>>> two or more observations with the same contract id
>>>>> 3. nongroup_d = 1  when firm_id is different but nation_id is same for
>>>>> two or more observations with the same contract id
>>>>> 4. nongroup_f = 1  when both firm_id and nation_id are different for
>>>>> two or more observations with the same contract id
>>>>>
>>>>>
>>>>> On Tue, Nov 10, 2009 at 5:47 PM, Martin Weiss <[email protected]> wrote:
>>>>>>
>>>>>> <>
>>>>>>
>>>>>>
>>>>>> For clarification, you could provide the solution, i.e. the dummies
>>>>>> that you actually want to see as your final output, for your chosen
>>>>>> example. Makes it considerably easier to work towards code for you...
>>>>>>
>>>>>>
>>>>>>
>>>>>> HTH
>>>>>> Martin
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: [email protected]
>>>>>> [mailto:[email protected]] On behalf of joe j
>>>>>> Sent: Tuesday, 10 November 2009 17:39
>>>>>> To: [email protected]
>>>>>> Subject: Re: st: AW: forvalues & replace not working under two 'not
>>>>>> equal to' conditions
>>>>>>
>>>>>> Thanks Martin. I think I wasn't clear enough in the last mail. I was
>>>>>> not looking at various combinations of firm_id, nation_id and
>>>>>> contract_id 'for each observation'. Rather I was looking at the
>>>>>> similarity or difference of firm_id/nation_id 'between two or more
>>>>>> observations' under each contract_id.
>>>>>>
>>>>>> Based on Martin's suggestion I could derive group_d (see below). But I
>>>>>> still can't get nongroup_f right. It should equal 1 (for all
>>>>>> observations in the contract) if firm_id and nation_id are different
>>>>>> for two or more observations under that contract_id, but it wrongly
>>>>>> takes the value 1 for all observations in the data.
>>>>>>
>>>>>> *deriving group_d (this works)
>>>>>> egen groups=group(firm_id nation_id)
>>>>>>
>>>>>> bys contract_id (groups):  /*
>>>>>> */ gen byte distinctcount_group_d= /*
>>>>>> */ (groups[_n]==groups[_n+1])
>>>>>>
>>>>>> bys contract_id (groups):  /*
>>>>>> */ replace distinctcount_group_d=1 /*
>>>>>> */ if (groups[_n]==groups[_n-1])
>>>>>>
>>>>>> *2 deriving nongroup_f doesn't work (e.g. it should be 0 for contract_id=1)
>>>>>> bys contract_id (groups):  /*
>>>>>> */ gen byte distinctcount_nongroup_f= /*
>>>>>> */ (groups[_n]~=groups[_n+1]) & (nation_id[_n]~=nation_id[_n+1])
>>>>>>
>>>>>> bys contract_id (groups):  /*
>>>>>> */ replace distinctcount_nongroup_f=1 /*
>>>>>> */ if (groups[_n]~=groups[_n-1]) & (nation_id[_n]~=nation_id[_n-1])
>>>>>>
>>>>>> On Tue, Nov 10, 2009 at 4:14 PM, Martin Weiss <[email protected]> wrote:
>>>>>>>
>>>>>>> <>
>>>>>>>
>>>>>>> I think a variable denoting the combinations between the three ids is
>>>>>>> a good place to start for you:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *************
>>>>>>> clear*
>>>>>>> inp byte(contract_id firm_id) nation_id:mylabel, auto
>>>>>>> 1   2   "US"
>>>>>>> 1   2   "US"
>>>>>>> 4   3   "UK"
>>>>>>> 4   3   "US"
>>>>>>> 8   4   "US"
>>>>>>> 8   4   "UK"
>>>>>>> 8   3   "US"
>>>>>>> 9   5   "US"
>>>>>>> 9   4   "UK"
>>>>>>> 9   3   "US"
>>>>>>> 10   5   "US"
>>>>>>> 10   5   "US"
>>>>>>> 10   6   "NL"
>>>>>>> 10   7   "NL"
>>>>>>> 10   4   "UK"
>>>>>>> 10   4   "CH"
>>>>>>> end
>>>>>>>
>>>>>>> egen groups=group(contract_id firm_id nation_id)
>>>>>>>
>>>>>>> l, sepby(con) noobs
>>>>>>> *************
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> HTH
>>>>>>> Martin
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: [email protected]
>>>>>>> [mailto:[email protected]] On behalf of joe j
>>>>>>> Sent: Tuesday, 10 November 2009 16:04
>>>>>>> To: [email protected]
>>>>>>> Subject: st: forvalues & replace not working under two 'not equal to'
>>>>>>> conditions
>>>>>>>
>>>>>>> My dataset has three variables: 1. contract_id, 2. firm_id and 3.
>>>>>>> nation_id. I want to create 4 variables, each of which gets a value
>>>>>>> of 1 if certain conditions are met. The variables I want to create
>>>>>>> are specific to the contract id, and are:
>>>>>>>
>>>>>>> 1. group_d = 1 when both firm_id and nation_id are same for two or
>>>>>>> more firms with the same contract id
>>>>>>> 2. group_f = 1  when firm_id is same but nation_id is different for
>>>>>>> two or more firms with the same contract id
>>>>>>> 3. nongroup_d = 1  when firm_id is different but nation_id is same
>>>>>>> for two or more firms with the same contract id
>>>>>>> 4. nongroup_f = 1  when both firm_id and nation_id are different for
>>>>>>> two or more firms with the same contract id
>>>>>>>
>>>>>>> The following code works well for the first three variables, but not
>>>>>>> for the last, nongroup_f; the value is 1 for all observations. I
>>>>>>> can't figure out why.
>>>>>>>
>>>>>>> This is a sample code:
>>>>>>>
>>>>>>> clear
>>>>>>> inp str10(contract_id firm_id   nation_id)
>>>>>>> 1   2   "US"
>>>>>>> 1   2   "US"
>>>>>>> 4   3   "UK"
>>>>>>> 4   3   "US"
>>>>>>> 8   4   "US"
>>>>>>> 8   4   "UK"
>>>>>>> 8   3   "US"
>>>>>>> 9   5   "US"
>>>>>>> 9   4   "UK"
>>>>>>> 9   3   "US"
>>>>>>> 10   5   "US"
>>>>>>> 10   5   "US"
>>>>>>> 10   6   "NL"
>>>>>>> 10   7   "NL"
>>>>>>> 10   4   "UK"
>>>>>>> 10   4   "CH"
>>>>>>> end
>>>>>>>
>>>>>>>
>>>>>>> *1.group_d . WORKS!
>>>>>>> gen group_d=.
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace group_d=1 if firm_id==firm_id[_n-`i'] & ///
>>>>>>> nation_id==nation_id[_n-`i']
>>>>>>> }
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace group_d=1 if firm_id==firm_id[_n+`i'] & ///
>>>>>>> nation_id==nation_id[_n+`i']
>>>>>>> }
>>>>>>>
>>>>>>> *2.group_f  WORKS!
>>>>>>> gen group_f=.
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace group_f=1 if firm_id==firm_id[_n-`i'] & ///
>>>>>>> nation_id!=nation_id[_n-`i']
>>>>>>> }
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace group_f=1 if firm_id==firm_id[_n+`i'] & ///
>>>>>>> nation_id!=nation_id[_n+`i']
>>>>>>> }
>>>>>>>
>>>>>>> *3. nongroup_d  WORKS!
>>>>>>> gen nongroup_d=.
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace nongroup_d=1 if firm_id!=firm_id[_n-`i'] & ///
>>>>>>> nation_id==nation_id[_n-`i']
>>>>>>> }
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace nongroup_d=1 if firm_id!=firm_id[_n+`i'] & ///
>>>>>>> nation_id==nation_id[_n+`i']
>>>>>>> }
>>>>>>>
>>>>>>> *4.nongroup_f DOESN'T WORK!!
>>>>>>> gen nongroup_f=.
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace nongroup_f=1 if (firm_id~=firm_id[_n-`i']) & ///
>>>>>>> (nation_id~=nation_id[_n-`i'])
>>>>>>> }
>>>>>>> forvalues i=1/`=_N'{
>>>>>>> bys contract_id: replace nongroup_f=1 if (firm_id~=firm_id[_n+`i']) & ///
>>>>>>> (nation_id~=nation_id[_n+`i'])
>>>>>>> }
>>>>>>> *
>>>>>>> *   For searches and help try:
>>>>>>> *   http://www.stata.com/help.cgi?search
>>>>>>> *   http://www.stata.com/support/statalist/faq
>>>>>>> *   http://www.ats.ucla.edu/stat/stata/


