
st: RE: how to count non-numeric obs


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: how to count non-numeric obs
Date   Thu, 5 Dec 2002 18:43:25 -0000

Radu Ban
>
> I have a flat (ASCII) dataset of 130 columns and roughly 250,000 lines.
> In theory all the observations should be numeric, but just by visual
> inspection I can tell that's not the case. So now I want to count the
> number of non-numeric observations by column.
> I'm trying to use -infix-, i.e.
> infix var1 1 var2 2 .... var130 130 using ../rawdata/raw.txt
>
> Is there a quick way to put each column into a variable (other than
> typing all the individual variable names and column numbers), and
> reading in the dataset only once?
> I know I can do something like
>
> forvalues i=1/130 {
>     infix var`i' `i' using ../rawdata/raw.txt, clear
>     sum var`i' /* to see how many numeric obs I have */
>     }
>
> but this would mean having to read in a sizeable dataset 130 times,
> which would take a long time.

If you have Stata/SE you can -infix-
your data as a single str130 variable.

If you don't, you can -infix- them
as a str80 and a str50 variable.
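
As a sketch of that first step, assuming each line of the file is
exactly 130 characters wide, and using the names data, data1 and data2
that the loops below expect (the path is the one from the question):

```stata
* Stata/SE: read each 130-character line into a single string variable
infix str data 1-130 using ../rawdata/raw.txt, clear

* Otherwise: split each line into an 80-character and a 50-character piece
infix str data1 1-80 str data2 81-130 using ../rawdata/raw.txt, clear
```

Only one of the two commands is needed; either way the file is read
just once.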

Then within Stata, you can do something
like this (where data is the str130 variable):

forval i = 1/130 {
	gen str1 s`i' = substr(data,`i',1)
	qui gen n`i' = real(s`i')
}

or, if you read the data as data1 (str80) and data2 (str50):

forval i = 1/80 {
	gen str1 s`i' = substr(data1,`i',1)
	qui gen n`i' = real(s`i')
}

forval i = 81/130 {
	gen str1 s`i' = substr(data2,`i'-80,1)
	qui gen n`i' = real(s`i')
}

Then -summarize- on the numeric variables
will show you how many missings you have.
Or there are many other ways of getting
at that, e.g. -nmissing- from STB-60.
And you can look at the string
variables to see why the numerics have
missings.
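
One way to do both at once, as a rough sketch: it assumes the s* and n*
variables from the loops above exist and that your Stata has the
missing() function (in older versions, n`i' == . does the same job
here). Note that real() also returns missing for a blank or a lone
period, so those characters get counted too.

```stata
forval i = 1/130 {
	* count observations whose character in column `i' is not a digit
	quietly count if missing(n`i')
	if r(N) > 0 {
		display "column `i': " r(N) " non-numeric"
		* show which characters caused the missings
		tabulate s`i' if missing(n`i'), missing
	}
}
```

Columns that are entirely numeric produce no output, so the listing
stays short if the problems are confined to a few columns.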

Nick
n.j.cox@durham.ac.uk



