Create a new H2O frame

Syntax

    _h2oframe create newframename [, options]

 options                        Description
 -----------------------------------------------------------------------------------
 rows(#)                        specify number of rows
 cols(#)                        specify number of columns
 norandomize                    specify not to generate the data values randomly
 value(#)                       specify the value for all numeric columns when
                                  norandomize is specified
 realfraction(#)                specify the fraction of real columns
 realrange(#)                   specify the range of values for real columns
 catfraction(#)                 specify the fraction of categorical columns
 factors(#)                     specify the number of factor levels in each
                                  categorical column
 intfraction(#)                 specify the fraction of int columns
 intrange(#)                    specify the range of values for int columns
 binfraction(#)                 specify the fraction of binary-valued categorical
                                  columns
 binonefraction(#)              specify the fraction of ones for
                                  binary-valued categorical columns
 timefraction(#)                specify the fraction of time columns
 strfraction(#)                 specify the fraction of string columns
 missfraction(#)                specify the fraction of total entries in the frame
                                  to be missing
 response                       prepend an additional response column to the frame
 resfactors(#)                  specify the number of factor levels in the response
                                  column
 rseed(#)                       specify the random-number seed used to generate the
                                  random values
 rseedcoltype(#)                specify the random-number seed used to generate the
                                  random column types
 -----------------------------------------------------------------------------------

Description

_h2oframe create creates a new H2O frame with random data. The new H2O frame may contain real, int, enum (categorical), time, and string columns. If you are not familiar with H2O frames, see "What is an H2O frame?".

Options

rows(#) specifies the number of rows to generate in the destination H2O frame. The default is 10,000.

cols(#) specifies the number of columns to generate in the destination H2O frame. The default is 10.

norandomize specifies not to randomly generate the data values in the numeric columns of the destination H2O frame.

If norandomize is specified, every numeric data value in the destination H2O frame equals the value specified in value(), except for entries set to missing when the fraction specified in missfraction() is greater than 0.

value(#) specifies the value for the numeric columns of the destination H2O frame when norandomize is specified. The default is 0.

realfraction(#) specifies the fraction of real columns in the destination H2O frame. The default is 0.5.

realrange(#) specifies the range of data values for all real columns. The default is 100.0, which means that all data values in real columns are between -100.0 and 100.0, inclusive.

catfraction(#) specifies the fraction of enum (categorical) columns in the destination H2O frame. The default is 0.2.

factors(#) specifies the number of factor levels in each enum column. The default is 100.

intfraction(#) specifies the fraction of int columns in the destination H2O frame. The default is 0.2.

intrange(#) specifies the range of data values for all int columns. The default is 100, which means that all data values in int columns are between -100 and 100, inclusive.

binfraction(#) specifies the fraction of binary-valued enum columns in the destination H2O frame. The default is 0.1.

binonefraction(#) specifies the fraction of ones in a binary-valued enum column. The default is 0.02.

timefraction(#) specifies the fraction of time columns in the destination H2O frame. The default is 0.

strfraction(#) specifies the fraction of string columns in the destination H2O frame. The default is 0.

missfraction(#) specifies the fraction of total entries in the destination H2O frame to be missing. The default is 0 if norandomize is specified and is 0.01 otherwise.

response specifies that an additional response column be prepended to the destination H2O frame, which makes the total number of columns cols() + 1.

resfactors(#) specifies the number of factor levels in the response column added with the option response.

rseed(#) sets the random-number seed used to generate data values in the destination H2O frame. This option can be used to reproduce the data in the H2O frame.

rseedcoltype(#) sets the random-number seed used to generate column types in the destination H2O frame. This option can be used to reproduce the column types in the H2O frame.

Examples

 Create a new H2O frame with the default 10,000 rows and 10 columns
     . _h2oframe create frame1, rseed(17) rseedcoltype(17)
     . _h2oframe change frame1
     . _h2oframe describe
     . _h2oframe list in 1/10

 Same as above, but include a string column
     . _h2oframe create frame2, strfraction(0.1) rseed(17) rseedcoltype(17)
     . _h2oframe change frame2
     . _h2oframe describe
     . _h2oframe list in 1/10
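
 A sketch beyond the original examples, assuming an H2O cluster is already
 running and connected; frame4 is a hypothetical frame name.  Set every
 column-type fraction explicitly so that the fractions sum to 1, with
 one-tenth of the columns requested as time columns
     . _h2oframe create frame4, realfraction(0.4) catfraction(0.2) intfraction(0.2) binfraction(0.1) timefraction(0.1) rseed(17) rseedcoltype(17)
     . _h2oframe change frame4
     . _h2oframe describe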

 Create a new H2O frame with all numeric values set to 5
     . _h2oframe create frame3, norandomize value(5)
     . _h2oframe change frame3
     . _h2oframe describe
     . _h2oframe list in 1/10
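
 A sketch beyond the original examples, assuming an H2O cluster is already
 running and connected; frame5 is a hypothetical frame name.  Prepend a
 response column with three factor levels and request that 5% of the entries
 be missing; because response adds one column, describe should report
 cols() + 1 = 6 columns
     . _h2oframe create frame5, rows(500) cols(5) missfraction(0.05) response resfactors(3) rseed(17) rseedcoltype(17)
     . _h2oframe change frame5
     . _h2oframe describe
     . _h2oframe list in 1/10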