Treatment of missing values

Speaker:   Jose Maria Sánchez Sáez

Syntax

        missing varlist, analysis method(method-option) [time-series-option]

Description

missing examines and replaces missing values for the variables in varlist.

Options

analysis displays a table of association measures: the simple matching and Jaccard coefficients, together with their significance levels. High coefficient values indicate a strong association between variables.
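
As an illustration only, the two coefficients can be sketched in Python. The sketch assumes that they are computed on binary missingness indicators (missing or not missing) for a pair of variables, which this help file does not state explicitly. The function name association is hypothetical, None stands for a missing value, and significance levels are not computed.

```python
from typing import Optional, Sequence, Tuple


def association(x: Sequence[Optional[float]],
                y: Sequence[Optional[float]]) -> Tuple[float, Optional[float]]:
    """Return (simple matching, Jaccard) for the missingness patterns of x and y.

    Assumption: None marks a missing value, and the coefficients compare
    the binary "is missing" indicators of the two variables.
    """
    a = b = c = d = 0
    for xi, yi in zip(x, y):
        mx, my = xi is None, yi is None
        if mx and my:
            a += 1          # both missing
        elif mx:
            b += 1          # only x missing
        elif my:
            c += 1          # only y missing
        else:
            d += 1          # both observed
    n = a + b + c + d
    if n == 0:
        raise ValueError("no observations")
    simple = (a + d) / n    # share of observations with matching status
    # Jaccard ignores the observations where both values are present.
    jaccard = a / (a + b + c) if (a + b + c) else None
    return simple, jaccard


x = [1.0, None, None, 4.0, 5.0]
y = [2.0, None, 3.0, 4.0, None]
simple, jaccard = association(x, y)   # a=1, b=1, c=1, d=2
```

Here the simple matching coefficient is 3/5 and the Jaccard coefficient is 1/3: the second is lower because it discards the two observations in which both variables are observed.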

method(method-option) specifies the method used for replacing missing values in varlist. Available methods are

  • drop drops every observation in which any variable in varlist has a missing value.
  • impute uses the impute ado-file to perform best-subset regression. Because regression does not assume causality, each variable is modelled as a combination of the others.
  • inter[varname] replaces missing values by linear interpolation between the existing values within each group defined by varname. When missing values fall at the beginning (end) of a group, the first (last) available value of that group is repeated. Interpolation only makes sense for time series, so the time-series option date(date-variable) is required in order to sort the observations by date.
  • mean[varname] replaces missing values with the mean value for each group defined by varname.
  • predict fits a linear model (fit command) to all the variables in varlist and replaces missing values with the predicted values (predict command). Note that this method does not guarantee that every missing value is filled in.
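
The inter and mean rules described above can be sketched in Python for a single variable. This is a minimal illustration, not the ado-file's implementation. The function names interpolate_group and mean_fill are hypothetical, None stands for a missing value, and the interpolation assumes that the values of one group are already sorted by date and equally spaced.

```python
from typing import Hashable, List, Optional, Sequence


def interpolate_group(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Fill None entries of one group by linear interpolation.

    Assumption: values are sorted by date and equally spaced, so the
    position in the sequence plays the role of time.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    out = list(values)
    if not known:
        return out                       # nothing to interpolate from
    first, last = known[0], known[-1]
    for i in range(first):               # leading gap: repeat first value
        out[i] = values[first]
    for i in range(last + 1, len(values)):   # trailing gap: repeat last value
        out[i] = values[last]
    for lo, hi in zip(known, known[1:]):     # interior gaps
        step = (values[hi] - values[lo]) / (hi - lo)
        for i in range(lo + 1, hi):
            out[i] = values[lo] + step * (i - lo)
    return out


def mean_fill(values: Sequence[Optional[float]],
              groups: Sequence[Hashable]) -> List[Optional[float]]:
    """Replace None with the mean of the nonmissing values of its group."""
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        if v is not None:
            sums[g] = sums.get(g, 0.0) + v
            counts[g] = counts.get(g, 0) + 1
    out = []
    for v, g in zip(values, groups):
        if v is None and g in counts:
            out.append(sums[g] / counts[g])
        else:
            out.append(v)                # observed, or group has no data
    return out


filled = interpolate_group([None, 1.0, None, None, 4.0, None])
means = mean_fill([1.0, None, 3.0, None, 10.0], ["a", "a", "a", "b", "b"])
```

In this sketch a group with no observed values is left unchanged, since there is nothing to interpolate or average from.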

Reference

Jain, A. K., and R. C. Dubes. 1988. Algorithms for Clustering Data. Prentice Hall.