Why does the command complain that there are no observations?

Title   Encoding a string variable
Author James Hardin, StataCorp

The most common cause of this error message is that you are trying to use a string variable with a command that supports only numeric variables. You can check the storage type of a variable with the describe command.
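
As a quick check before running a command, the following sketch shows one way to find string variables. It assumes a variable named a, as in the example further down; the message text is only illustrative.

```stata
* List every variable in memory that is stored as a string
ds, has(type string)

* Test a single variable: confirm leaves a nonzero _rc if a is not numeric
capture confirm numeric variable a
if _rc {
    display "a is not numeric; convert it before using a numeric-only command"
}
```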

This is easy to fix.

If you have a string variable and want to convert it to a numeric variable, you can use the encode command. If the string variable has only numbers in it, use the real() function instead, because encode would assign integer codes (1, 2, 3, ...) to the sorted strings rather than recover the numbers themselves.

 . describe
        
 Contains data
   obs:             4                          
  vars:             2                          
  size:            48 
 ------------------------------------------------------------------------
               storage  display     value
 variable name   type   format      label      variable label
 ------------------------------------------------------------------------
 a               str4   %9s                    
 b               str4   %9s                    
 ------------------------------------------------------------------------
 Sorted by:  
      Note:  dataset has changed since last saved
 
 . list
 
      +-------+
      | a   b |
      |-------|
   1. | 1   a |
   2. | 2   b |
   3. | 3   c |
   4. | 4   d |
      +-------+
 
 . gen na = real(a)
 
 . encode b, gen(nb)
 
 . describe
        
 Contains data
   obs:             4                          
  vars:             4                          
  size:            80 
 ------------------------------------------------------------------------
               storage  display     value
 variable name   type   format      label      variable label
 ------------------------------------------------------------------------
 a               str4   %9s                    
 b               str4   %9s                    
 na              float  %9.0g                  
 nb              long   %8.0g       nb         
 ------------------------------------------------------------------------
 Sorted by:  
      Note:  dataset has changed since last saved
 
 . list

      +-----------------+
      | a   b   na   nb |
      |-----------------|
   1. | 1   a    1    a |
   2. | 2   b    2    b |
   3. | 3   c    3    c |
   4. | 4   d    4    d |
      +-----------------+

Although nb is a numeric variable, it looks like a string variable because the encode command added value labels to it.

 . list nb, nolab

      +----+
      | nb |
      |----|
   1. |  1 |
   2. |  2 |
   3. |  3 |
   4. |  4 |
      +----+
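
For a string variable that holds only numbers, such as a above, destring is another option alongside real(). The sketch below assumes the variables a and na from the example; na2 is a hypothetical name chosen for illustration.

```stata
* Convert the numeric-looking string a into a new numeric variable
* (na2 is a hypothetical variable name)
destring a, generate(na2)

* real() returns missing for text it cannot parse, so count observations
* where the string was nonempty but the conversion produced missing
count if missing(na) & !missing(a)
```

A nonzero count would suggest that some values of a contain nonnumeric characters and need cleaning before conversion.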

Warning:   If you have more than 65,536 unique values of the string variable that you are encoding, encode will issue an error. If that is the case, then you can use
        . egen nb = group(b)
which will generate a numeric variable nb that does not have value labels.
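
To see how many distinct values a string variable has before choosing between encode and egen, something like the following sketch may help. It uses the hypothetical name gb because nb already exists in the example above.

```stata
* group() numbers the distinct values of b as 1, 2, ..., k in sorted order
* (gb is a hypothetical variable name)
egen long gb = group(b)

* The largest group number equals the number of distinct nonmissing values
summarize gb, meanonly
display "distinct values of b: " r(max)
```

Observations where b is empty receive a missing value in gb unless the missing option of group() is specified.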