# st: RE: how to count non-numeric obs

From: "Nick Cox"
Subject: st: RE: how to count non-numeric obs
Date: Thu, 5 Dec 2002 18:43:25 -0000

```
Radu Ban
>
> I have a flat (ASCII) dataset of 130 columns and roughly 250,000 lines.
> In theory all the observations should be numeric, but just by visual
> inspection I can tell that's not the case. So now I want to count the
> number of non-numeric observations by column.
> I'm trying to use -infix-, i.e.
> infix var1 1 var2 2 .... var130 130 using ../rawdata/raw.txt
>
> Is there a quick way to put each column into a variable (other than
> typing all the individual variable names and column numbers), and
> read the dataset only once?
> I know I can do something like
>
> forvalues i=1/130 {
>     infix var`i' `i' using ../rawdata/raw.txt, clear
>     sum var`i'   // to see how many numeric obs I have
>     }
>
> but this would mean having to read in a sizeable dataset 130 times
> which would take a long time.

If you have Stata/SE you can -infix-
your data as a single str130 variable.

If you don't, you can -infix- them
as a str80 and a str50 variable.
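
* A minimal sketch of those two -infix- calls, assuming the 130
* one-character columns occupy positions 1-130 of every line. The names
* data, data1 and data2 are chosen to match the loops further down.
* It is worth checking afterwards that lines beginning with blanks
* have kept their character positions.

* Stata/SE:
infix str data 1-130 using ../rawdata/raw.txt, clear

* otherwise:
infix str data1 1-80 str data2 81-130 using ../rawdata/raw.txt, clear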

Then within Stata, you can do something
like this

forval i = 1/130 {
    gen str1 s`i' = substr(data,`i',1)
    qui gen n`i' = real(s`i')
}

or

forval i = 1/80 {
    gen str1 s`i' = substr(data1,`i',1)
    qui gen n`i' = real(s`i')
}

forval i = 81/130 {
    gen str1 s`i' = substr(data2,`i'-80,1)
    qui gen n`i' = real(s`i')
}

Then -summarize- on the numeric variables
will show you how many missings you have.
Or there are many other ways of getting
at that, e.g. -nmissing- from STB-60.
And you can look at the string
variables to see why the numerics have
missings.
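
* One possible way to do that tallying, sketched on the assumption that
* s1-s130 and n1-n130 were created by one of the loops above. For each
* column with problems it reports the count and tabulates the offending
* characters (blanks included, via the -missing- option).

forval i = 1/130 {
    qui count if missing(n`i')
    if r(N) > 0 {
        di "column `i': " r(N) " non-numeric or blank"
        tab s`i' if missing(n`i'), missing
    }
}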

Nick
n.j.cox@durham.ac.uk

```