
st: RE: how to count non-numeric obs

From   "Nick Cox" <>
To   <>
Subject   st: RE: how to count non-numeric obs
Date   Thu, 5 Dec 2002 18:43:25 -0000

Radu Ban
> I have a flat (ASCII) dataset of 130 columns and roughly 250,000 lines. In
> theory all the observations should be numeric, but just by visual inspection
> I can tell that's not the case. So now I want to count the number of
> non-numeric observations by column.
> I'm trying to use -infix-, i.e.
> infix var1 1 var2 2 .... var130 130 using ../rawdata/raw.txt
> Is there a quick way to put each column into a variable (other than typing
> all the individual variable names and column numbers), and reading in the
> dataset only once?
> I know I can do something like
> forvalues i=1/130 {
>     infix var`i' `i' using ../rawdata/raw.txt, clear
>     sum var`i' *to see how many numeric obs i have
>     }
> but this would mean having to read in a sizeable dataset 130 times, which
> would take a long time.

If you have Stata/SE you can -infix-
your data as a single str130 variable.

If you don't, you can -infix- them
as a str80 and a str50 variable.
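
As a sketch only, assuming the file path from the question and
that every line is 130 characters wide, the two cases might be
read like this (run one or the other, not both). The names data,
data1 and data2 are the ones used in the loops that follow.

	* Stata/SE: one 130-character string per line
	infix str data 1-130 using ../rawdata/raw.txt, clear

	* otherwise: two strings covering columns 1-80 and 81-130
	infix str data1 1-80 str data2 81-130 using ../rawdata/raw.txt, clear

It is worth checking that any leading blanks in a line survive
the read, since substr() relies on character positions.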

Then within Stata, you can do something
like this

forval i = 1/130 {
	gen str1 s`i' = substr(data,`i',1)
	qui gen n`i' = real(s`i')
}

or, with the two string variables,

forval i = 1/80 {
	gen str1 s`i' = substr(data1,`i',1)
	qui gen n`i' = real(s`i')
}

forval i = 81/130 {
	gen str1 s`i' = substr(data2,`i'-80,1)
	qui gen n`i' = real(s`i')
}

Then -summarize- on the numeric variables
will show you how many missings you have.
Or there are many other ways of getting
at that, e.g. -nmissing- from STB-60.
And you can look at the string
variables to see why the numerics have
missing values.
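
For example, a rough sketch that counts the missings column by
column, assuming the n1-n130 variables created above. Blanks
and "." also come out of real() as missing, so they are counted
here too.

	forval i = 1/130 {
		* r(N) holds the number of observations matched by -count-
		qui count if missing(n`i')
		di "column `i': " r(N) " non-numeric"
	}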

