Statalist



RE: st: RE: testing -duplicates tag-


From   "Martin Weiss" <martin.weiss@uni-tuebingen.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: testing -duplicates tag-
Date   Thu, 4 Sep 2008 09:58:18 +0200

OK, so let's try that again. The tag should now reliably flag a domestic
observation that is duplicated more times overall than within the domestic
subgroup, which implies that it has at least one match in the foreign group...

*********

sysuse auto, clear
g id=_n

duplicates tag headroom trunk if foreign==0, generate(dupdom)
duplicates tag headroom trunk, generate(dupall)

* tag domestic obs with at least one headroom-trunk match among foreign cars
g byte tag = foreign==0 & dupall>dupdom

* let's see
list tag id foreign if foreign==0, noobs header(25)
*********
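
As a cross-check on the comparison above, the same tag can be built without -duplicates- by counting the foreign cars within each headroom-trunk combination. This is only a sketch: the names nforeign and tag2 are made up for illustration, and it relies on foreign having no missing values, as is the case in the auto data.

```stata
sysuse auto, clear
gen id = _n

* the tag as constructed above
duplicates tag headroom trunk if foreign==0, generate(dupdom)
duplicates tag headroom trunk, generate(dupall)
gen byte tag = foreign==0 & dupall>dupdom

* cross-check: number of foreign cars in each headroom-trunk combination
* (nforeign and tag2 are hypothetical names)
bysort headroom trunk: egen nforeign = total(foreign==1)
gen byte tag2 = foreign==0 & nforeign>0

* stops with an error if the two tags ever disagree
assert tag == tag2
```

For a domestic observation, dupall exceeds dupdom exactly when its combination contains at least one foreign car, so the -assert- should pass silently.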

HTH
Martin


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Michael McCulloch
Sent: Thursday, September 04, 2008 6:30 AM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: testing -duplicates tag-

The code suggested by Martin gets me closer, but the pattern is still
not exclusive. I'm trying to identify observations in DOMESTIC that
are duplicates (in headroom & trunk) of observations in FOREIGN. Here
are two sets of those duplicates. In the first set, 20 is a duplicate
of 57, and the missing values and zeros in dupfor and dupdom seem to
form a pattern; that pattern, however, is contradicted in the next set,
where 53, 71 and 72 are duplicates of 32.

Any ideas would be appreciated!

id   foreign    headroom   trunk   dupall   dupfor   dupdom
20   Domestic          2       8        1        .        0
57   Foreign           2       8        1        0        .
*    *                 *       *        *        *        *
32   Domestic          3      15        3        .        0
53   Foreign           3      15        3        2        .
71   Foreign           3      15        3        2        .
72   Foreign           3      15        3        2        .
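
One way to read the table is that dupfor and dupdom are each filled in only for their own group, so neither can be compared with dupall across all observations. A possible sketch, assuming the three tag variables are generated as in the quoted code below: merge the two within-group counts into one variable and subtract it from dupall. The names within and cross are made up for illustration.

```stata
sysuse auto, clear
gen id = _n
duplicates tag headroom trunk if foreign==1, generate(dupfor)
duplicates tag headroom trunk if foreign==0, generate(dupdom)
duplicates tag headroom trunk, generate(dupall)

* matches within the observation's own group (hypothetical variable name)
gen within = cond(foreign==1, dupfor, dupdom)

* matches in the other group = all matches minus own-group matches
gen cross = dupall - within

sort headroom trunk id
list id foreign headroom trunk dupall within cross if cross>0, noobs sepby(headroom trunk)
```

With the values shown in the table, observation 20 would get cross = 1 - 0 = 1 and observation 53 would get cross = 3 - 2 = 1, so both sets are picked up by the same rule.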




>Try this:
>
>sysuse auto, clear
>duplicates tag headroom trunk if foreign==1, generate(dupfor)
>*duplicates tag headroom trunk if foreign==0, generate(dupdom)
>duplicates tag headroom trunk, generate(dupall)
>l if dupfor==0 & dupall>0
>
>
>HTH
>Martin
>
>
>Quoting Michael McCulloch <mm@pinest.org>:
>
>>One other question, if I may:
>>How would I modify the list command as re-written below, to identify
>>only those duplicates where:
>>	headroom and trunk are duplicated, but
>>	foreign is not,
>>so that I could find only those Foreign cars that have duplicates in the
>>set of Domestic cars (in this case observations #7 and #8)?
>>
>>clear
>>sysuse auto
>>list foreign headroom trunk
>>duplicates tag headroom trunk, generate(dup)
>>sort headroom trunk
>>list foreign headroom trunk dup if dup>0 & trunk==8, clean noobs
>>
>>
>>
>>
>>>Well, as -help duplicates- shows, a -varlist- is allowed with all
>>>of the five commands. If you had the *OR* operator, this would be
>>>pointless. -duplicates tag- looks for unique combinations of
>>>the variables in your -varlist- and then tags each observation with
>>>the number of other observations sharing its combination.
>>>
>>>sysuse auto, clear
>>>duplicates tag head mpg, gen(dup)
>>>duplicates report headroom mpg
>>>ta dup
>>>
>>>duplicates tag head mpg tru, gen(dup1)
>>>duplicates report headroom mpg tru
>>>ta dup1
>>>
>>>
>>>HTH
>>>Martin
>>>
>>>Quoting Michael McCulloch <mm@pinest.org>:
>>>
>>>>Thanks Martin. Am I correct in understanding that, in this revised
>>>>example immediately below, the command:
>>>>
>>>>	. duplicates tag headroom trunk, generate(dup)
>>>>
>>>>would tag as dup>0 all sets of observations for which there are
>>>>duplicates of:
>>>>	headroom *AND* trunk
>>>>and not just those for which there are duplicates of:
>>>>	headroom *OR* trunk
>>>>?
>>>>It looks that way on visual inspection of this example's output, but I
>>>>want to make sure before applying it to my much larger dataset.
>>>>
>>>>
>>>>clear
>>>>sysuse auto
>>>>list foreign headroom trunk
>>>>duplicates tag headroom trunk, generate(dup)
>>>>sort headroom trunk
>>>>list foreign headroom trunk dup if dup>0, clean
>>>>
>>>>Michael
>>>>
>>>>>Well, the question is not much clearer now, at least to me. I
>>>>>suspect you want something like
>>>>>
>>>>>count if duptag > 0
>>>>>
>>>>>after your commands. Just replace duptag with the name of the tag
>>>>>variable you generated, and be aware that two observations sharing
>>>>>the same covariate pattern would both be counted (58 and 59 would
>>>>>each count under this rule). If that is not what you want,
>>>>>clarify!
>>>>>
>>>>>
>>>>>HTH
>>>>>Martin
>>>>>
>>>>>Quoting Michael McCulloch <mm@pinest.org>:
>>>>>
>>>>>>Apologies, I wasn't clear in my question. What I want to do is find
>>>>>>records for which *both* trunk and headroom are duplicates. So
>>>>>>following the command suggested by Martin and Nick, I get:
>>>>>>
>>>>>>
>>>>>>. list foreign headroom trunk if trunk==8, clean
>>>>>>
>>>>>>       foreign   headroom   trunk
>>>>>>20.   Domestic        2.0       8
>>>>>>45.   Domestic        1.5       8
>>>>>>57.    Foreign        2.0       8
>>>>>>58.    Foreign        2.5       8
>>>>>>59.    Foreign        2.5       8
>>>>>>
>>>>>>Note that:
>>>>>>	observations 20 and 57 both have headroom==2.0, trunk==8
>>>>>>	observations 58 and 59 both have headroom==2.5, trunk==8
>>>>>>
>>>>>>Since I'm developing this command for use in a large dataset, how
>>>>>>would I follow up -duplicates tag- to identify those unique sets of
>>>>>>records,
>>>>>>where two variables are duplicates simultaneously, without having to
>>>>>>search manually?
>>>>>>
>>>>>>>I cannot see your point. Stata does tag these observations
>>>>>>>with tag 1. Just -list- after -duplicates tag-.
>>>>>>>
>>>>>>>**********
>>>>>>>clear
>>>>>>>sysuse auto
>>>>>>>list foreign headroom trunk if trunk==8
>>>>>>>duplicates tag headroom trunk, generate(dup_admission_id)
>>>>>>>*Let's see...
>>>>>>>list dup_* foreign headroom trunk if trunk==8
>>>>>>>**********
>>>>>>>
>>>>>>>HTH
>>>>>>>Martin
>>>>>>>
>>>>>>>-----Original Message-----
>>>>>>>From: owner-statalist@hsphsun2.harvard.edu
>>>>>>>[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Michael McCulloch
>>>>>>>Sent: Wednesday, September 03, 2008 6:29 PM
>>>>>>>To: Statalist
>>>>>>>Subject: st: testing -duplicates tag-
>>>>>>>
>>>>>>>Hello,
>>>>>>>I'm testing -duplicates tag-, and puzzled as to why it won't show the
>>>>>>>two observations where headroom==2.0 and trunk==8.
>>>>>>>
>>>>>>>clear
>>>>>>>sysuse auto
>>>>>>>list foreign headroom trunk if trunk==8
>>>>>>>duplicates tag headroom trunk, generate(dup_admission_id)
>>>>>>>
>>>>>>>--
>>>>>>>
>>>>>>>Best wishes,
>>>>>>>Michael McCulloch
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>Pine Street Foundation
>>>>>>>124 Pine St., San Anselmo, CA 94960-2674
>>>>>>>Tel:	(415) 407-1357
>>>>>>>Fax:	(415) 485-1065
>>>>>>>mcculloch@pinestreetfoundation.org
>>>>>>>www.pinestreetfoundation.org
>>>>>>>*
>>>>>>>*   For searches and help try:
>>>>>>>*   http://www.stata.com/help.cgi?search
>>>>>>>*   http://www.stata.com/support/statalist/faq
>>>>>>>*   http://www.ats.ucla.edu/stat/stata/



