
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: RE: st: Some kind of count or tabulation?
Date: Fri, 6 Jan 2006 21:28:58 -0000

I certainly and Scott probably overlooked the fact that you were using "." as a personal code for missing. By and large, Stata commands do not treat "." as meaning missing. The main, and perhaps only, exception is -destring-, which is working on the assumption that a string variable is really a numeric variable trapped in a string body. (-compare- used to be another exception.)

It follows that counting missings, whether using -egen- or my more direct approach, won't work for you until you re-code "." as "", hence Svend's suggestion. Otherwise, Scott's and Svend's suggestions are suggesting complementary -egen- functions.

I can't explain why Scott's and my suggestions give different results unless you have other variables that are not captured by -Var*-. I used -*- as a wildcard, not -Var*-.

In your code, the -sort- and the -by:- do no harm but are completely irrelevant. It would be easier to count "." rather than cycle through all the other values. With your previous set-up,

    gen nperiod = 0
    foreach v of var Var* {
        replace nperiod = nperiod + (`v' == ".")
    }

gives you a count of period missings, after which

    gen allpresent = nperiod == 0

gives what I think you want. You could also count occurrences != ".".

For this and other reasons, -foreach- and -forval- are strongly recommended. The usual searches point to tutorials on those constructs.

Nick
n.j.cox

(much editing in this digest)

barleywater is using Stata 8.2, and asked

> My data set looks like this:
>
> obs   Var1   Var2    Var3    Var(nth)
> 1     jacn   clstr   lnreg   pval
> 2     bstr   .       lgreg   nopval
> 3     .      rct     .       nopval
> 4     jacn   clstr   anova   .

> I want to find out how many observations contained all the variables.
> In this example, only the first observation contained all the variables.

Scott Merryman suggested

> egen all_var = rmiss(Var*)
>
> count if all_var == 0

Nick Cox commented

> It can also be done without generating a new variable.

> unab var : *
> local var : subinstr local var " " ",", all
> count if !mi(`var')

barleywater replied

> I understand what Scott tried to do but looking at his commands made
> me realised that perhaps he, and by extension also Nick too,
> misunderstood my question, which could be better expressed.

> I have less understanding of Nick's commands which use macros
> (afraid my Stata fluency doesn't go that far yet).
>
> However, running Scott's command and Nick's showed a difference of 1,
> e.g. Scott's would return a value of 78 whilst Nick's 77.
> I am not sure why that is the case. But
> neither was what I was looking for.

> Here's what I did to get what i want.
>
> gen dumvar1=.
> gen dumvar2=.
> .
> .
> .
> sort var1
> by var1: replace dumvar1 = 1 if var1 == "jacn"
> by var1: replace dumvar1 = 1 if var1 == "bstr"
> sort dumvar1
> replace dumvar1 = 0 if dumvar1==.
> .
> .
> .
> sort var2
> by var2: replace dumvar2 = 1 if var2 == "clstr"
> by var2: replace dumvar2 = 1 if var2 == "rct"
> by var2: replace dumvar2 = 1 if var2 == "xovr"
> sort dumvar2
> replace dumvar1 = 0 if dumvar1==.
> .
> .
> gen total = var1 + var2 +...
> sort total
> l obs total
>
> but...
>
> 1. not elegant (not a problem since it does the job)
> 2. it loses the information the variables conveyed by
>    replacing with 1's (not ideal)
>
> I would appreciate further help/advice to shorten the do-file
> if possible (i think it needed -foreach val- at the beginning).

Svend Juul suggested

> I understand that your var1-varn are string variables. For strings,
> the missing value typically is a blank, not a period, so I would
> first:

> foreach V of varlist var1-varn {
>     replace `V' = "" if `V' == "."
> }

> If you feel unsecure about the above construct, you might instead give
> as many -replace- commands as you have variables:
>
> replace var1="" if var1=="."
> ...

> Now you can use egen's -robs()- function with the -strok- option:
>
> egen nonmiss = robs(var1-varn) , strok

> In Stata 9 the -robs()- function got the more telling name -rownonmiss()-.

> Now, the variable -nonmiss- tells the number of nonmissing (i.e.
> non-blank) values for each observation.

barleywater replied

> Your -egen- suggestion worked. Earlier on, at the stage of inputting
> the data, I indeed used many
> replace var if ...
> in a do-file to replace blanks with a period before running my
> small do-file to count but I appreciate your -foreach- help suggestion.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
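The two approaches in this thread can be tried side by side on a toy copy of the example data. What follows is only a minimal sketch, not anyone's actual do-file: the variable name Var4 is an assumption standing in for "Var(nth)", the string widths are guesses, and -rownonmiss()- assumes Stata 9 or later (under Stata 8.2, -robs()- would be used instead, as Svend notes).

```stata
* Toy data modelled on the example in the thread.
* Var4 is a hypothetical stand-in for "Var(nth)".
clear
input str4 Var1 str5 Var2 str5 Var3 str6 Var4
"jacn" "clstr" "lnreg" "pval"
"bstr" "."     "lgreg" "nopval"
"."    "rct"   "."     "nopval"
"jacn" "clstr" "anova" "."
end

* Approach 1: count the "." codes directly in each observation.
gen nperiod = 0
foreach v of varlist Var* {
    replace nperiod = nperiod + (`v' == ".")
}
gen allpresent = nperiod == 0
count if allpresent

* Approach 2: recode "." to the empty string, then let -egen- count
* the nonmissing string values in each observation.
foreach v of varlist Var* {
    replace `v' = "" if `v' == "."
}
egen nonmiss = rownonmiss(Var*), strok
count if nonmiss == 4
```

With this toy data both -count- commands should report 1, because only the first observation has no "." codes. Note that the first approach leaves the original codes untouched, while the second changes the data, so it may be worth running it on a copy.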

