Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Replace with returned results in svy loop
From: Richard Williams <[email protected]>
To: [email protected], [email protected]
Subject: Re: st: Replace with returned results in svy loop
Date: Mon, 30 May 2011 13:59:03 -0500
At 09:59 AM 5/30/2011, Steven Samuels wrote:
Nick is right and I was wrong.  The code does work with `r(N)'. I 
apologize, but the question remains: what led to Kate's error message?
First, let me say that I hope Nick is having a good holiday weekend. 
:-) Second, I don't think Kate showed us the final winning code. My 
own guess was that the mysterious $fooditemall included a string 
variable. Third, as a sidelight, while it may be fine in this 
specific case, I try to avoid coding like priceperkilo !=. Better is 
priceperkilo < . or else !missing(priceperkilo). This covers you in 
case .a, .b, etc. have been used as missing data codes.
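To make the sidelight concrete, here is a minimal sketch using a made-up three-observation dataset (the variable x and its values are assumptions for illustration, not Kate's data). Stata sorts numeric missing values above every nonmissing number, with . < .a < .b and so on, so a test against . alone lets extended missing values through:

```stata
* Hypothetical toy data: one real value, one system missing, one extended missing
clear
input x
1
.
.a
end

* x != . is true for .a, so the extended missing value is counted as data
count if x != .

* Both of these exclude every kind of missing value
count if x < .
count if !missing(x)
```

The first count returns 2 and the other two return 1, which is why x < . or !missing(x) is the safer habit.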
Steve
On May 30, 2011, at 1:45 AM, Nick Cox wrote:
I agree with Steve's general advice.
Actually, r(N) does have a local macro persona `r(N)' . If you use
`r(N)'  the results will in this case be identical. Compare
. sysuse auto
(1978 Automobile Data)
. count if foreign
  22
. di r(N)
22
. di `r(N)'
22
This is not illegal, but just a little indirect when a reference to
r(N) will do fine.
Note that the intermediate scalar is unnecessary and that, in any
case, the scalar created was -vmean- but there was a later attempt to
use -varmean-. So, the code could have been
foreach X of varlist $fooditemall {
        quietly count if `X'==1 & priceperkilo != .
        if r(N) > 0 {
                display "Variable is: " "`X'"
                svy, subpop(`X'): mean priceperkilo
                mat meanpriceperkilo = e(b)
        }
        quietly count if `X'==1 & priceperkilo ==.
        if `r(N)'>0 {
                replace priceperkilo = meanpriceperkilo[1,1] if `X'==1 & priceperkilo ==.
        }
}
I can't verify that this is exactly as intended and I won't try to
undermine the strong advice that this is not a good idea.
I will mention a surprising initial assumption on this list that
"everyone" on this international list celebrates U.S. Memorial Day
weekend!
Nick
On Mon, May 30, 2011 at 4:56 AM, Steven Samuels <[email protected]> wrote:
> Kate Schneider:
>
> I would put it more strongly than Richard: substituting the mean
> for all the data has no justification.
>
> If you are going to work in Stata, then I suggest you read the
> error messages and do some digging. You haven't helped us by
> showing (as requested in the FAQ) exactly what Stata typed, so we
> could see where the error took place.
>
>
> But in your case, the error message makes the problem easy to
> find: A type mismatch can occur only in some kind of comparison,
> and you have only two lines that contain these: One is
> "if `X' == 1 & priceperkilo !=.". But you already know that clause
> is OK. That leaves: "if `r(N)'> 0". What could be wrong? If you
> look at the -help- for "return", you will see that it never refers
> to saved results as local macros. And only local macros get single
> quotes around them (like `X'). If you didn't know that, but did a
> Google search for "stata r(N)" you would be led to:
> http://www.ats.ucla.edu/stat/stata/faq/returned_results.htm where
> returned results are used without quotes. So the answer is: remove
> the single quotes from around r(N).
>
> But although that solves the immediate problem, you are still
> left with the fact that there is also no justification for
> substituting the mean for missing values. There are two reasons.
> First, mean substitution distorts most other sample statistics, for
> example the standard deviation, skewness, kurtosis, some quantiles,
> and correlations with other variables. Second, there are good
> imputation alternatives, as Richard has already stated.
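
As a rough illustration of the first point, the sketch below uses the auto dataset shipped with Stata, where rep78 is missing for 5 of the 74 cars. The variable name rep78_filled is made up for this example, and the survey design from the original question is ignored:

```stata
sysuse auto, clear

* Copy rep78 into a new variable and fill its missing values with the mean
generate rep78_filled = rep78
summarize rep78, meanonly
replace rep78_filled = r(mean) if missing(rep78_filled)

* Same mean, but the filled variable has a smaller standard deviation
summarize rep78 rep78_filled
```

The filled values all sit exactly at the mean, so they add observations without adding spread, and the standard deviation shrinks.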
>
>
> Steve
> [email protected]
>
>
> On May 29, 2011, at 10:16 PM, Richard Williams wrote:
>
> At 06:53 PM 5/29/2011, Kate Schneider wrote:
>> Thank you so much Richard! It totally works! We can't get it to only
>> replace the missing values, but I think it is more valid to create a
>> new variable with the estimated value anyways. Thanks so much!
>>
>> Kate
>
> Richard wrote:
>
> I'll take your word for it. :) I only looked at a small portion
> of the code and didn't try to figure out the rest. I'd be careful,
> though, if you aren't getting what you expected. You have to be
> pretty lucky to accidentally get a result that is better than the
> one you intended. Also, while I am not crazy about substituting the
> mean for missing data, I am even less crazy about substituting the
> mean for all the data, if that is in fact what you have done.
>
> Kate Schneider wrote:
> Also, we have amended our code based on your recommendations but are
> getting a "type mismatch" error.
>
> Here is what we're trying:
>
> foreach X of varlist $fooditemall {
> quietly count if `X'==1 & priceperkilo !=.
>       if `r(N)'>0 {
>       display "Variable is: " "`X'"
>       svy, subpop(`X'): mean priceperkilo
>       mat meanpriceperkilo = e(b)
>       scalar vmean = meanpriceperkilo[1,1]
>       }
> quietly count if `X'==1 & priceperkilo ==.
>       if `r(N)'>0 {
>       replace priceperkilo = varmean if `X'==1 & priceperkilo ==.
>       }
> }
>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  [email protected]
WWW:    http://www.nd.edu/~rwilliam