Title: Numeric variables input as string
Author: Nicholas J. Cox, Durham University, UK
Users often find that Stata has read most, or even all, of their variables as string variables when most, or even all, are (or should be) numeric. If a variable is string, Stata typically refuses to do calculations with it. You may even get the cryptic message "no observations", which here means “no numeric values on which to do that”. A command such as tabulate will also list the values in alphanumeric rather than numeric order, so that 1, 11, and 2 appear in that order. Most directly, describe will show string variables as having a string storage type (for example, str1 or str12) and a display format ending in s, such as %9s.
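As a quick check, assuming your dataset is already in memory, something like the following shows the storage type of every variable and then names only the string variables:

. describe
. ds, has(type string)

Any variable named by ds that you expected to be numeric is a candidate for the remedies discussed here.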
See [D] data types to learn more.
One common reason for this problem is that the data have been imported from a spreadsheet or something similar. Some users of Excel or similar programs get in the habit of putting several lines of header material before the body of their data. Although Stata tries to detect such lines, it is not always successful, and this may have caused each variable to be treated as string. For a discussion of this and several other possible problems, see the FAQ “How do I get information from Excel into Stata?”. Even if your data have been nowhere near Excel, that FAQ may still be helpful.
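If you know how many header lines precede the data, you can often skip them when importing. A minimal sketch, assuming a hypothetical file mydata.xlsx in which the variable names are in row 4 and the data begin in row 5, and a hypothetical mydata.csv with the same layout:

. import excel using "mydata.xlsx", cellrange(A4) firstrow clear
. import delimited using "mydata.csv", varnames(4) clear

The file names and row numbers are illustrative only; adjust them to match your own file.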
The most general remedy is to use the destring command. For example, typing
. destring, replace
will try to convert every string variable to numeric, leaving unchanged any variable that contains nonnumeric characters. First, however, see [D] destring to learn about its options for special problems, and, especially, check that there are no header lines in the body of the data. The easiest way to remove those may be to use the Data Editor.
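Before converting, it can help to look at the values that cannot be read as numbers. A sketch, assuming hypothetical string variables named income and price:

. tabulate income if missing(real(income))
. destring income, replace ignore(",")
. destring price, generate(price_n) force

The first command tabulates the values of income that real() cannot convert. The second strips commas before converting. The third creates a new numeric variable and sets any nonnumeric value to missing; because force discards information, use it only after inspecting what will be lost.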
Note, however, that destring is not a good idea for dates or times recorded as strings. At best, with the ignore() option, it could convert "2004Q4" and "2005Q1" to 20044 and 20051, which are not at all what you want. The story on what to do with string dates and times is best continued in
. help datetime
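As a minimal sketch of what that help file leads to, assuming a hypothetical string variable qstr that holds values such as "2004Q4", the quarterly() function converts the strings to Stata quarterly dates, which can then be given a %tq display format:

. generate qdate = quarterly(qstr, "YQ")
. format qdate %tq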