Distinct values of a column in current H2O frame


    _h2oframe _unique columnname [if] [in] [, options]

    options                Description
    clean                  display string values without compound double quotes
    missing                include missing values of columnname in calculation
    separate(separator)    separator to serve as punctuation for the values of the
                             returned list; default is a space


_h2oframe _unique displays a list of the distinct values of the column columnname. columnname may not be a string column.


clean displays string values without compound double quotes. By default, each distinct string value is displayed within compound double quotes, because these are the most general delimiters. If you know that the string values in columnname do not include embedded spaces or embedded quotes, then clean is an appropriate option. clean does not affect the display of values from numeric columns.

missing specifies that missing values of columnname be included in the calculation. The default is to exclude them.

separate(separator) specifies a separator to serve as punctuation for the values of the returned list. The default is a space. A useful alternative is a comma.
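One place a comma separator helps is when the returned list is passed to a function that takes comma-separated arguments. The following is a minimal sketch, not a prescribed workflow. It assumes that Stata is already connected to a running H2O cluster, that no H2O frame named auto exists yet, and that the values of rep78 come back as plain numbers. The local macro name vals is made up for the illustration.

```stata
* Assumes an H2O cluster is already running and connected.
sysuse auto, clear
_h2oframe _put, into(auto)
_h2oframe _change auto

* Request the distinct values of rep78 as a comma-separated list.
_h2oframe _unique rep78, separate(,)

* Copy the list right away; later r-class commands overwrite r().
local vals `r(uniques)'

* Use the list as the argument list of inlist() on the data in memory.
count if inlist(rep78, `vals')
```

Note that inlist() accepts only a limited number of arguments, so this pattern suits columns with few distinct values.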


_h2oframe _unique serves two different functions. First, it gives a compact display of the distinct values of columnname. Second, it is useful when you want to cycle through the distinct values of columnname with, for example, foreach. _h2oframe _unique leaves behind a list in r(uniques) that may be used in a subsequent command.
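The second use can be sketched as follows. This is an illustration under stated assumptions rather than a definitive recipe: it assumes the auto dataset has already been put into an H2O frame and made the current frame, as in the examples in this entry, and the local macro name levels is hypothetical.

```stata
* Assumes the current H2O frame holds the auto data.
_h2oframe _unique rep78

* Save the results first; any later r-class command replaces r().
local levels `r(uniques)'
local n = r(r)

display "rep78 has `n' distinct values"

* Cycle through the distinct values one at a time.
foreach v of local levels {
    display "processing rep78 = `v'"
}
```

Copying r(uniques) into a local macro before the loop keeps the list intact even if a command inside the loop stores its own results in r().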

_h2oframe _unique may hit the limits imposed by your Stata (for example, the maximum length of a macro) when columnname has a very large number of distinct values. In practice, it is typically used when the number of distinct values of columnname is modest.


 . sysuse auto
 . _h2oframe _put, into(auto)
 . _h2oframe _change auto

 . _h2oframe _unique rep78
 . display "`r(uniques)'"

 . _h2oframe _unique rep78, sep(,)
 . display "`r(uniques)'"

Stored results

 _h2oframe _unique stores the following in r():

   r(r)           number of distinct values

   r(uniques)     list of distinct values