The Stata listserver

Re: st: RE: Creating a dummy variable that 'marks out' useless


From   "Clive Nicholas" <[email protected]>
To   [email protected]
Subject   Re: st: RE: Creating a dummy variable that 'marks out' useless
Date   Wed, 15 Dec 2004 02:11:14 -0000 (GMT)

(Many apologies for not reporting back much earlier.)

Scott Merryman replied:

[...]

> . mark mark2 if edconch != .

> . replace mark2 = 0 if conch == .
  (0 real changes made)

> . replace mark2 = 0 if marker == 0
  (6 real changes made)

[...]

This worked perfectly: thanks!

Nick Cox replied:

[...]

> gen mark = conch < . & edconch < . & marker != 0

[...]

Thanks for this, which worked almost as well. Why the difference?
Well, consider this block of data for Plaid Cymru (National Party of
Wales) net votes, where -mark4- holds the result of Scott's routine,
-nmark- holds the result of Nick's routine, and -marker- is an indicator
generated by -egen, tag()- to mark out unavoidable duplicate cases:

         natch    ednatch    marker     mark4      nmark
  1. -4.727228    21.4088         1         1          1
  2.  .7938523   9.794947         1         1          1
  3. -1.833212  -4.599777         1         1          1
  4.  2.022674   24.89381         1         1          1
  5.  1.016729   5.512588         1         1          1
  6.  3.982042   5.620074         1         1          1
  7. -1.833212  -4.599777         0         0          0
  8.  2.022674  -2.766565         0         0          0
  9.  2.022674  -2.766565         0         0          0
 10. -1.833212   21.08436         0         0          0
 11.         .          .         1         0          1
 12.         .          .         1         0          1
 13.         .          .         1         0          1
 14.         .          .         1         0          1
 15.         .          .         1         0          1
 16.         .          .         1         0          1
 17.         .          .         0         0          0
 18.         .          .         0         0          0
 19.         .          .         0         0          0
 20.         .          .         0         0          0
 21.         .          .         0         0          0
 22.         .          .         0         0          0
 23.         .          .         0         0          0
 24. -1.250048   9.806999         1         1          1
 25. -1.039213  -1.973771         1         1          1
 26.  .0781536  -.9345583         1         1          1
 27.  .1249657  -1.012712         1         1          1
 28.  .6216435          .         1         0          0
 29.  1.577682   5.098733         1         1          1
 30.         .          .         1         0          1

Scott's routine correctly marks out the missing observations 11-16 and 30;
Nick's routine marks them _in_ because marker = 1. I don't need these
tagged cases if they're missing on at least one of the vote variables.

However, Nick's approach is attractive because it does everything in a
single line, so the question is whether Nick's code can be tweaked to
mark out these cases correctly. This isn't to disparage Scott's
approach just because it takes three lines.
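
One possible tweak, offered as a sketch rather than a confirmed answer:
the code quoted above refers to -conch- and -edconch-, while the listing
shows -natch- and -ednatch-, so this assumes the same one-line expression
is simply pointed at the Plaid Cymru variables. The names -nmark2- and
-nmark3- are hypothetical. Since "x < ." is false whenever x is missing,
a case with a missing vote variable cannot be marked in, whatever the
value of -marker-.

```stata
* Sketch only: assumes natch, ednatch, marker and mark4 from the
* listing above are in memory; nmark2 and nmark3 are hypothetical names.

* 1 only if both vote variables are nonmissing and the case is tagged
gen byte nmark2 = natch < . & ednatch < . & marker != 0

* Equivalent form: missing() returns 1 if any of its arguments is missing
gen byte nmark3 = !missing(natch, ednatch) & marker != 0

* Compare with the three-step result; a count of 0 means they agree
count if nmark2 != mark4
count if nmark3 != nmark2
```

On the 30 observations listed, this expression gives 1 for observations
1-6, 24-27 and 29 and 0 everywhere else, which is the same pattern as
-mark4-. Whether it agrees on the rest of the data is not something the
listing can show.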

Thanks to Scott for introducing me to -mark- and to Nick for thinking up
such a lateral way to -generate- (I would never have thought to use "<.").
:)
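
For anyone else who has not met the "< ." idiom: Stata treats numeric
missing values as larger than any nonmissing number, so "x < ." is true
only when x holds a real value. The toy data below are invented purely
for illustration, and the names x, lt_dot and ne_dot are hypothetical.
The example also shows the one situation where "< ." and "!= ." differ,
namely the extended missing values .a to .z.

```stata
* Toy data, invented for illustration only
clear
input x
 1.5
 .
 .a
-3
end

* 1 only where x is nonmissing
gen byte lt_dot = x < .

* also 1 for .a, which is missing but not equal to the system missing value
gen byte ne_dot = x != .

list
```

Here lt_dot is 1, 0, 0, 1 and ne_dot is 1, 0, 1, 1, so the two tests
disagree only on the .a row.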

CLIVE NICHOLAS        |t: 0(044)7903 397793
Politics              |e: [email protected]
Newcastle University  |http://www.ncl.ac.uk/geps



