
# Re: st: limitations of "generate" with missing data

From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: limitations of "generate" with missing data
Date: Mon, 11 Apr 2011 23:19:51 +0100

```
Add ) at end in #3.

On Mon, Apr 11, 2011 at 11:15 PM, Nick Cox <[email protected]> wrote:
> The underlying problem can be illustrated by sorting. Suppose we
> -sort- a variable, which contains missings, in numeric order. Where do
> the missings go? We need a decision: either missing is regarded as
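> larger than any non-missing, or smaller than any non-missing. Stata
> takes the first view: numeric missing counts as larger than any
> non-missing value, so a test such as "greater than .79" is also true
> for missing values.
>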
> Anyway, here are some solutions:
>
>
>
>
>
> (5. don't throw away information by turning a measure into an indicator!)
>
> Nick
>
> On Mon, Apr 11, 2011 at 11:01 PM, Michael Costello
> <[email protected]> wrote:
>> Statalisters,
>>
>> I recently ran into a problem with the following dataset:
>>
>>  score_pcnt |      Freq.     Percent        Cum.
>> ------------+-----------------------------------
>>          0 |        150        7.50        7.50
>>         .2 |         85        4.25       11.75
>>         .4 |         97        4.85       16.60
>>         .6 |         82        4.10       20.70
>>         .8 |         72        3.60       24.30
>>          1 |         15        0.75       25.05
>>          . |      1,499       74.95      100.00
>> ------------+-----------------------------------
>>      Total |      2,000      100.00
>>
>> The high number of "missing" is by design, a by-product of a
>> horizontally structured dataset that I have yet to rectify.
>>
>> When I run the command:
>> I am left with
>>
>> score_pcnt8 |
>>          0 |      Freq.     Percent        Cum.
>> ------------+-----------------------------------
>>          0 |        414       20.70       20.70
>>          1 |      1,586       79.30      100.00
>> ------------+-----------------------------------
>>      Total |      2,000      100.00
>>
>> As you can see, the 87 values above .79 were set to 1, but so were all
>> the missing values!!  I have toyed with the code a bit, trying
>> variations such as
>> but that converts all the missing to 0's, which is only marginally better.
>>
>> So the question is, is there some way to use a single, precise line of
>> code to create eighty-seven 1's, four hundred fourteen  0's and 1499
>> Missing values in one dummy variable?  I know I can do it with several
>> lines of code, but I'm looking for something more concise, as it needs
>> to run many hundreds of times.
>>
>

```
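
For anyone who wants to try this, here is a minimal sketch in Stata. It does not reconstruct the numbered solutions mentioned in the thread. It only shows two standard idioms that keep missing values missing: an `if` qualifier with `missing()`, and the `cond()` function. The data are synthetic, built only to match the frequencies posted above, and the variable names `naive`, `high_if` and `high_cond` are made up for illustration.

```stata
* Synthetic data matching the posted frequencies (501 non-missing, 1,499 missing)
clear
set obs 2000
gen double score_pcnt = .
replace score_pcnt = 0  in 1/150
replace score_pcnt = .2 in 151/235
replace score_pcnt = .4 in 236/332
replace score_pcnt = .6 in 333/414
replace score_pcnt = .8 in 415/486
replace score_pcnt = 1  in 487/501

* Naive indicator: missing counts as larger than .79, so it is coded 1
gen byte naive = score_pcnt > .79
count if naive == 1
assert r(N) == 1586

* Idiom A: restrict with an if qualifier; excluded observations stay missing
gen byte high_if = score_pcnt > .79 if !missing(score_pcnt)

* Idiom B: the same result in one expression using cond()
gen byte high_cond = cond(missing(score_pcnt), ., score_pcnt > .79)

* Check: 87 ones, 414 zeros, 1,499 missing, and the two idioms agree
count if high_if == 1
assert r(N) == 87
count if high_if == 0
assert r(N) == 414
count if missing(high_if)
assert r(N) == 1499
assert high_if == high_cond
```

Either idiom is a single `generate` line, so it should fit inside a loop over many variables without extra `replace` steps.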