Frame (sfi.Frame)

class sfi.Frame

This class provides access to Stata frames. A Python object of type Frame wraps a Stata frame and exposes methods for accessing it. If the underlying frame is renamed from Stata, Mata, etc., the object loses access to the frame. For more information about Stata frames, see help frames in Stata.

All variable and observation numbering begins at 0. The allowed values for the variable index var and the observation index obs are

-nvar <= var < nvar

and

-nobs <= obs < nobs

Here nvar is the number of variables defined in the underlying Stata frame, which is returned by getVarCount(). nobs is the number of observations defined in the underlying Stata frame, which is returned by getObsTotal().

Negative values for var and obs are allowed and are interpreted in the usual way for Python indexing. In all functions that take var as an argument, var can be specified as either the variable index or the variable name. Passing the variable index is more efficient because it avoids looking up the index for the variable name on each function call.
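
The index rule above can be sketched in plain Python. The helper name normalize_index is hypothetical and is not part of sfi; it only mirrors the documented bounds -count <= index < count and the usual Python treatment of negative indices.

```python
def normalize_index(index, count):
    """Map a possibly negative index onto 0..count-1, as Python sequences do.

    Hypothetical helper for illustration; it is not an sfi function.
    """
    if not -count <= index < count:
        raise ValueError(f"index {index} is out of range for count {count}")
    return index + count if index < 0 else index


# With 12 variables, -1 refers to the last variable (index 11).
print(normalize_index(-1, 12))  # 11
print(normalize_index(0, 12))   # 0
```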

Method Summary

connect(name) Connect to an existing frame in Stata and return a new Frame instance that can be used to access it.
create(name) Create a new frame in Stata and return a new Frame instance that can be used to access it.
addObs(n[, nofill]) Add n observations to the frame.
addVarByte(name) Add a variable of type byte to the frame.
addVarDouble(name) Add a variable of type double to the frame.
addVarFloat(name) Add a variable of type float to the frame.
addVarInt(name) Add a variable of type int to the frame.
addVarLong(name) Add a variable of type long to the frame.
addVarStr(name, length) Add a variable of type str to the frame.
addVarStrL(name) Add a variable of type strL to the frame.
allocateStrL(sc, size[, binary]) Allocate a strL so that a buffer can be stored using writeBytes(); the contents of the strL will not be initialized.
changeToCWF() Set the Frame as the current working frame in Stata.
clone(newName) Create a new Frame instance by cloning the current Frame and its contents.
drop() Drop the frame in Stata.
dropVar(var) Drop the specified variables from the frame.
get([var, obs, selectvar, valuelabel, …]) Read values from the frame.
getAsDict([var, obs, selectvar, valuelabel, …]) Read values from the frame and store them in a dictionary.
getAt(var, obs) Read a value from the frame.
getFormattedValue(var, obs, bValueLabel) Read a value from the frame, applying its display format.
getFrameAt(index) Utility method for getting the name of a Stata frame at a given index.
getFrameCount() Utility method for getting the number of frames in Stata.
getObsTotal() Get the number of observations in the frame.
getStrVarWidth(var) Get the width of the variable of type str.
getVarCount() Get the number of variables in the frame.
getVarFormat(var) Get the format for the variable in the frame.
getVarIndex(name) Look up the variable index for the specified name in the frame.
getVarLabel(var) Get the label for the Stata variable.
getVarName(index) Get the name for the variable in the frame.
getVarType(var) Get the storage type for the variable in the frame, such as byte, int, long, float, double, strL, str18, etc.
isVarTypeStr(var) Test if a variable is of type str.
isVarTypeString(var) Test if a variable is of type string.
isVarTypeStrL(var) Test if a variable is of type strL.
keepVar(var) Keep the specified variables in the frame.
list([var, obs]) List values from the frame.
readBytes(sc, length) Read a sequence of bytes from a strL in the frame.
rename(newName) Rename the frame in Stata.
renameVar(var, name) Rename a variable.
setObsTotal(nobs) Set the number of observations in the frame.
setVarFormat(var, format) Set the format for a Stata variable.
setVarLabel(var, label) Set the label for a Stata variable.
store(var, obs, val[, selectvar]) Store values in the frame.
storeAt(var, obs, val) Store a value in the frame.
storeBytes(sc, b, binary) Store a byte buffer to a strL in the frame.
writeBytes(sc, b[, off, length]) Write length bytes from the specified byte buffer starting at offset off to a strL in the frame; the strL must be allocated using allocateStrL() before calling this method.

Method Detail

classmethod connect(name)

Connect to an existing frame in Stata and return a new Frame instance that can be used to access it.

Parameters:

name (str) – Name of an existing Stata frame.

Returns:

A Frame that corresponds to the existing frame in Stata.

Return type:

Frame

Raises:

FrameError – This error can be raised if

  • the frame name does not already exist in Stata.
  • Python fails to connect to the frame.
classmethod create(name)

Create a new frame in Stata and return a new Frame instance that can be used to access it.

Parameters:name (str) – Name of the Stata frame to create.
Returns:A new Frame that corresponds to the new frame in Stata.
Return type:Frame
Raises:FrameError – If the creation of the new frame in Stata fails.
addObs(n, nofill=False)

Add n observations to the frame. By default, the added observations are filled with the appropriate missing-value code. If nofill is specified and equal to True, the added observations are not filled, which speeds up the process. Setting nofill to True is not recommended. If you choose this setting, it is your responsibility to ensure that the added observations are ultimately filled in or removed before control is returned to Stata.

There need not be any variables defined to add observations. If you are attempting to create a frame from nothing, you can add the observations first and then add the variables.

Parameters:
  • n (int) – Number of observations to add.
  • nofill (bool, optional) – Do not fill the added observations. Default is False.
Raises:

ValueError – If the number of observations to add, n, exceeds the limit of observations.
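
As a minimal sketch of building a frame from nothing, the function below adds the observations first and the variables afterward. The function name build_numeric_frame is hypothetical; it expects an object with the Frame methods addObs(), addVarDouble(), getObsTotal(), and getVarCount() documented here.

```python
def build_numeric_frame(frame, nobs, names):
    """Add nobs observations, then one double variable per name.

    `frame` is assumed to be an sfi.Frame (or any object with the same
    methods). Returns the resulting (observation count, variable count).
    """
    frame.addObs(nobs)  # filled with missing values by default
    for name in names:
        frame.addVarDouble(name)
    return frame.getObsTotal(), frame.getVarCount()
```

Inside Stata this might be called as build_numeric_frame(Frame.create('scratch'), 100, ['x', 'y']), where the frame name scratch is only an example.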

addVarByte(name)

Add a variable of type byte to the frame.

Parameters:name (str) – Name of the variable to be created.
Raises:ValueError – If name is not a valid Stata variable name.
addVarDouble(name)

Add a variable of type double to the frame.

Parameters:name (str) – Name of the variable to be created.
Raises:ValueError – If name is not a valid Stata variable name.
addVarFloat(name)

Add a variable of type float to the frame.

Parameters:name (str) – Name of the variable to be created.
Raises:ValueError – If name is not a valid Stata variable name.
addVarInt(name)

Add a variable of type int to the frame.

Parameters:name (str) – Name of the variable to be created.
Raises:ValueError – If name is not a valid Stata variable name.
addVarLong(name)

Add a variable of type long to the frame.

Parameters:name (str) – Name of the variable to be created.
Raises:ValueError – If name is not a valid Stata variable name.
addVarStr(name, length)

Add a variable of type str to the frame.

Parameters:
  • name (str) – Name of the variable to be created.
  • length (int) – Initial size of the variable. If the length is greater than Data.getMaxStrLength(), then a variable of type strL will be created.
Raises:

ValueError – This error can be raised if

  • name is not a valid Stata variable name.
  • length is not a positive integer.
addVarStrL(name)

Add a variable of type strL to the frame.

Parameters:name (str) – Name of the variable to be created.
Raises:ValueError – If name is not a valid Stata variable name.
allocateStrL(sc, size, binary=True)

Allocate a strL so that a buffer can be stored using writeBytes(); the contents of the strL will not be initialized.

Parameters:
  • sc (StrLConnector) – The StrLConnector representing a strL.
  • size (int) – The size in bytes.
  • binary (bool, optional) – Mark the data as binary. Note that if the data are not marked as binary, Stata expects that the data be UTF-8 encoded. An alternate approach is to call storeAt(), where the encoding is automatically handled. Default is True.
changeToCWF()

Set the Frame as the current working frame in Stata. The current working frame in Stata can be accessed using Data if desired.

clone(newName)

Create a new Frame instance by cloning the current Frame and its contents. This results in a new frame in Stata.

Parameters:newName (str) – The name of the new frame to be created.
Returns:A Frame that corresponds to the newly cloned frame in Stata.
Return type:Frame
Raises:FrameError – If the cloning of the frame fails.
drop()

Drop the frame in Stata. You may not drop a frame if it is the current working frame in Stata.

dropVar(var)

Drop the specified variables from the frame.

Parameters:var (int, str, or list-like) – Variables to drop. It can be specified as a single variable index or name, or an iterable of variable indices or names.
Raises:ValueError – If any of the variable indices or names specified in var is out of range or not found.
get(var=None, obs=None, selectvar=None, valuelabel=False, missingval=_DefaultMissing())

Read values from the frame.

Parameters:
  • var (int, str, or list-like, optional) – Variables to access. It can be specified as a single variable index or name, or an iterable of variable indices or names. If var is not specified, all the variables are specified.
  • obs (int or list-like, optional) – Observations to access. It can be specified as a single observation index or an iterable of observation indices. If obs is not specified, all the observations are specified.
  • selectvar (int or str, optional) – Observations for which selectvar!=0 will be selected. If selectvar is an integer, it is interpreted as a variable index. If selectvar is a string, it should contain the name of a Stata variable. Specifying selectvar as “” has the same result as not specifying selectvar, which means no observations are excluded. Specifying selectvar as -1 means that observations with missing values for the variables specified in var are to be excluded.
  • valuelabel (bool, optional) – Use the value label when available. Default is False.
  • missingval (_DefaultMissing, optional) – If missingval is specified, all the missing values in the returned list are replaced by this value. If it is not specified, the numeric value of the corresponding missing value in Stata is returned.
Returns:

A list of lists containing the values from the frame. Each sublist contains values for one observation.

Return type:

List

Raises:

ValueError – This error can be raised if

  • any of the variable indices or names specified in var is out of range or not found.
  • any of the observation indices specified in obs is out of range.
  • selectvar is out of range or not found.
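
To illustrate what replacing missing values looks like, here is a pure-Python sketch that post-processes a list of lists such as the one get() returns. The names STATA_SYSMISS and replace_missing are hypothetical. The threshold is the number printed for a missing rep78 in the Examples section, and treating every float at or above it as missing is an assumption of this sketch, not documented behavior of missingval.

```python
# Value printed for a numeric missing value in the Examples section.
STATA_SYSMISS = 8.98846567431158e+307


def replace_missing(rows, replacement=None, threshold=STATA_SYSMISS):
    """Return a copy of rows with floats >= threshold replaced.

    Strings and ordinary numbers are passed through unchanged.
    """
    return [
        [replacement if isinstance(v, float) and v >= threshold else v
         for v in row]
        for row in rows
    ]
```

When reading from a real frame, passing missingval to get() does the replacement directly and avoids this extra pass.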
getAsDict(var=None, obs=None, selectvar=None, valuelabel=False, missingval=_DefaultMissing())

Read values from the frame and store them in a dictionary. The keys are the variable names. The values are the data values for the corresponding variables.

Parameters:
  • var (int, str, or list-like, optional) – Variables to access. It can be specified as a single variable index or name, or an iterable of variable indices or names. If var is not specified, all the variables are specified.
  • obs (int or list-like, optional) – Observations to access. It can be specified as a single observation index or an iterable of observation indices. If obs is not specified, all the observations are specified.
  • selectvar (int or str, optional) – Observations for which selectvar!=0 will be selected. If selectvar is an integer, it is interpreted as a variable index. If selectvar is a string, it should contain the name of a Stata variable. Specifying selectvar as “” has the same result as not specifying selectvar, which means no observations are excluded. Specifying selectvar as -1 means that observations with missing values for the variables specified in var are to be excluded.
  • valuelabel (bool, optional) – Use the value label when available. Default is False.
  • missingval (_DefaultMissing, optional) – If missingval is specified, all the missing values in the returned dictionary are replaced by this value. If it is not specified, the numeric value of the corresponding missing value in Stata is returned.
Returns:

A dictionary containing the data values from the frame.

Return type:

dict

Raises:

ValueError – This error can be raised if

  • any of the variable indices or names specified in var is out of range or not found.
  • any of the observation indices specified in obs is out of range.
  • selectvar is out of range or not found.
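
A dictionary keyed by variable name is column oriented. The sketch below turns such a dictionary into one dictionary per observation. It assumes each value is a list with one entry per observation; the helper name columns_to_rows is hypothetical.

```python
def columns_to_rows(columns):
    """Convert {name: [values...]} into a list of {name: value} rows.

    Assumes every column list has the same length.
    """
    names = list(columns)
    return [
        dict(zip(names, values))
        for values in zip(*(columns[name] for name in names))
    ]
```

For example, columns_to_rows(f.getAsDict(var=['price', 'mpg'])) would yield one small dictionary per observation under that assumption.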
getAt(var, obs)

Read a value from the frame.

Parameters:
  • var (int or str) – Variable to access. It can be specified as the variable index or name.
  • obs (int) – Observation to access.
Returns:

The value.

Return type:

float or str

Raises:

ValueError – This error can be raised if

  • var is out of range or not found.
  • obs is out of range.
getFormattedValue(var, obs, bValueLabel)

Read a value from the frame, applying its display format.

Parameters:
  • var (int or str) – Variable to access. It can be specified as the variable index or name.
  • obs (int) – Observation to access.
  • bValueLabel (bool) – Use the value label when available.
Returns:

The formatted value as a string.

Return type:

str

Raises:

ValueError – This error can be raised if

  • var is out of range or not found.
  • obs is out of range.
static getFrameAt(index)

Utility method for getting the name of a Stata frame at a given index.

Parameters:index (int) – The index for a frame.
Returns:The name of the frame for the specified index.
Return type:str
static getFrameCount()

Utility method for getting the number of frames in Stata.

Returns:The number of frames.
Return type:int
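
The two static utility methods combine naturally to list every frame name. In this sketch the helper name list_frame_names is hypothetical, and the class is passed in as an argument so the function can be exercised outside Stata; inside Stata you would pass sfi.Frame.

```python
def list_frame_names(frame_cls):
    """Return the names of all frames, in index order.

    `frame_cls` is assumed to expose the static methods getFrameCount()
    and getFrameAt(index), as sfi.Frame does.
    """
    return [frame_cls.getFrameAt(i) for i in range(frame_cls.getFrameCount())]
```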
getObsTotal()

Get the number of observations in the frame.

Returns:The number of observations.
Return type:int
getStrVarWidth(var)

Get the width of the variable of type str.

Parameters:var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns:The width of the variable.
Return type:int
Raises:ValueError – If var is out of range or not found.
getVarCount()

Get the number of variables in the frame.

Returns:The number of variables.
Return type:int
getVarFormat(var)

Get the format for the variable in the frame.

Parameters:var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns:The variable format.
Return type:str
Raises:ValueError – If var is out of range or not found.
getVarIndex(name)

Look up the variable index for the specified name in the frame.

Parameters:name (str) – Variable to access.
Returns:The variable index.
Return type:int
Raises:ValueError – If name is not found.
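
Because passing a name triggers a lookup on each call, one option is to resolve the index once and reuse it in a loop. The sketch below does this with getVarIndex(), getObsTotal(), and getAt(); the helper name column_values is hypothetical.

```python
def column_values(frame, name):
    """Read one variable observation by observation, resolving its index once.

    `frame` is assumed to be an sfi.Frame or an object with the same methods.
    """
    index = frame.getVarIndex(name)  # single name lookup
    return [frame.getAt(index, obs) for obs in range(frame.getObsTotal())]
```

For whole columns, get(var=...) is likely the simpler choice; the loop form is mainly useful when each value needs individual handling.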
getVarLabel(var)

Get the label for the Stata variable.

Parameters:var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns:The variable label.
Return type:str
Raises:ValueError – If var is out of range or not found.
getVarName(index)

Get the name for the variable in the frame.

Parameters:index (int) – Variable to access.
Returns:The variable name at the given index.
Return type:str
Raises:ValueError – If index is out of range.
getVarType(var)

Get the storage type for the variable in the frame, such as byte, int, long, float, double, strL, str18, etc.

Parameters:var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns:The storage type of the variable.
Return type:str
Raises:ValueError – If var is out of range or not found.
isVarTypeStr(var)

Test if a variable is of type str.

Parameters:var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns:True if the variable is of type str.
Return type:bool
Raises:ValueError – If var is out of range or not found.
isVarTypeString(var)

Test if a variable is of type string.

Parameters:var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns:True if the variable is of type str or strL.
Return type:bool
Raises:ValueError – If var is out of range or not found.
isVarTypeStrL(var)

Test if a variable is of type strL.

Parameters:var (int or str) – Variable to access. It can be specified as the variable index or name.
Returns:True if the variable is of type strL.
Return type:bool
Raises:ValueError – If var is out of range or not found.
keepVar(var)

Keep the specified variables in the frame.

Parameters:var (int, str, or list-like) – Variables to keep. It can be specified as a single variable index or name, or an iterable of variable indices or names.
Raises:ValueError – If any of the variable indices or names specified in var is out of range or not found.
list(var=None, obs=None)

List values from the frame. The values are displayed using their corresponding variable formats.

Parameters:
  • var (int, str, or list-like, optional) – Variables to display. It can be specified as a single variable index or name, or an iterable of variable indices or names. If var is not specified, all the variables are specified.
  • obs (int or list-like, optional) – Observations to display. It can be specified as a single observation index or an iterable of observation indices. If obs is not specified, all the observations are specified.
Raises:

ValueError – This error can be raised if

  • any of the variable indices or names specified in var is out of range or not found.
  • any of the observation indices specified in obs is out of range.
readBytes(sc, length)

Read a sequence of bytes from a strL in the frame.

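Parameters:
  • sc (StrLConnector) – The StrLConnector representing a strL.
  • length (int) – The number of bytes to read.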
Returns:

The array of bytes. An empty array of bytes is returned if there are no more data because the end has been reached.

Return type:

bytes

Raises:
  • ValueError – If length is not a positive integer.
  • IOError – If failure occurred when attempting to read a sequence of bytes.
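
Since an empty result signals the end of the data, a strL can be drained in a loop. This is a sketch only: it assumes that successive readBytes() calls on the same StrLConnector continue where the previous call stopped, which the description implies but does not state outright. The helper name read_all and the default chunk size are hypothetical.

```python
def read_all(frame, sc, chunk_size=4096):
    """Read a strL to the end by repeated readBytes() calls.

    Assumes each call continues from where the previous one stopped and
    that an empty result means no data remain.
    """
    chunks = []
    while True:
        chunk = frame.readBytes(sc, chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)
```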
rename(newName)

Rename the frame in Stata.

Parameters:newName (str) – The new name for the frame.
renameVar(var, name)

Rename a variable.

Parameters:
  • var (str or int) – Name or index of the variable to rename.
  • name (str) – New variable name.
Raises:

ValueError – This error can be raised if

  • var is not found or out of range.
  • name is not a valid Stata variable name.
setObsTotal(nobs)

Set the number of observations in the frame.

Parameters:nobs (int) – The number of observations to set.
Raises:ValueError – If the number of observations to set, nobs, exceeds the limit of observations.
setVarFormat(var, format)

Set the format for a Stata variable.

Parameters:
  • var (int or str) – Index or name of the variable to format.
  • format (str) – New format.
Raises:

ValueError – This error can be raised if

  • var is out of range or not found.
  • format is not a valid Stata format.
setVarLabel(var, label)

Set the label for a Stata variable.

Parameters:
  • var (int or str) – Index or name of the variable to label.
  • label (str) – New label.
Raises:

ValueError – If var is out of range or not found.

store(var, obs, val, selectvar=None)

Store values in the frame.

Parameters:
  • var (int, str, list-like, or None) – Variables to access. It can be specified as a single variable index or name, an iterable of variable indices or names, or None. If None is specified, all the variables are specified.
  • obs (int, list-like, or None) – Observations to access. It can be specified as a single observation index, an iterable of observation indices, or None. If None is specified, all the observations are specified.
  • val (array-like) – Values to store. The dimensions of val should match the dimensions implied by var and obs. Each of the values can be numeric or string based on the corresponding variable data types.
  • selectvar (int or str, optional) – Only store values for observations with selectvar!=0. If selectvar is an integer, it is interpreted as a variable index. If selectvar is a string, it should contain the name of a Stata variable. Specifying selectvar as “” has the same result as not specifying selectvar, which means values are stored for all observations specified. Specifying selectvar as -1 means that observations with missing values for the variables specified in var are to be skipped.
Raises:
  • ValueError – This error can be raised if

    • any of the variable indices or names specified in var is out of range or not found.
    • any of the observation indices specified in obs is out of range.
    • dimensions of val do not match the dimensions implied by var and obs.
    • selectvar is out of range or not found.
  • TypeError – If any of the values specified in val does not match the corresponding variable data type.

storeAt(var, obs, val)

Store a value in the frame.

Parameters:
  • var (int or str) – Variable to access. It can be specified as the variable index or name.
  • obs (int) – Observation to access.
  • val (float or str) – Value to store. The value data type depends on the corresponding variable data type.
Raises:

ValueError – This error can be raised if

  • var is out of range or not found.
  • obs is out of range.
storeBytes(sc, b, binary)

Store a byte buffer to a strL in the frame. You do not need to call allocateStrL() before using this method.

Parameters:
  • sc (StrLConnector) – The StrLConnector representing a strL.
  • b (bytes or bytearray) – Bytes to store.
  • binary (bool) – Mark the data as binary.
writeBytes(sc, b, off=None, length=None)

Write length bytes from the specified byte buffer starting at offset off to a strL in the frame; the strL must be allocated using allocateStrL() before calling this method.

Parameters:
  • sc (StrLConnector) – The StrLConnector representing a strL.
  • b (bytes or bytearray) – The buffer holding the data to store.
  • off (int, optional) – The offset into the buffer. If not specified, 0 is used.
  • length (int, optional) – The number of bytes to write. If not specified, the size of b is used.
Raises:

ValueError – This error can be raised if

  • off is negative.
  • length is not a positive integer.
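
The allocate-then-write sequence can be sketched as follows. The helper name write_in_chunks and the default chunk size are hypothetical, and the sketch assumes that each writeBytes() call continues after the bytes written by the previous call; verify this against your Stata version before relying on it.

```python
def write_in_chunks(frame, sc, data, chunk_size=4096, binary=True):
    """Allocate a strL for `data`, then write it piece by piece.

    `off` indexes into the buffer `data`, as documented for writeBytes().
    Returns the number of writeBytes() calls made.
    """
    if not data:
        raise ValueError("data must contain at least one byte")
    frame.allocateStrL(sc, len(data), binary)
    calls = 0
    for off in range(0, len(data), chunk_size):
        length = min(chunk_size, len(data) - off)
        frame.writeBytes(sc, data, off, length)
        calls += 1
    return calls
```

When the whole buffer is already in memory, storeBytes() is simpler because it needs no prior allocation.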

Examples

The following provides a few quick examples illustrating how to use this class:

>>> from sfi import Frame
>>> stata: sysuse auto, clear
(1978 Automobile Data)
>>> d = Frame.connect('default')
>>> f = d.clone('myauto')
>>> Frame.getFrameCount()
2
>>> Frame.getFrameAt(0)
'default'
>>> Frame.getFrameAt(1)
'myauto'
>>> f.get(0, 0)
[['AMC Concord']]
>>> f.getAt(0, 0)
'AMC Concord'
>>> f.get(var='price')
[4099, 4749, 3799, 4816, 7827, 5788, 4453, 5189, 10372, 4082, 11385, 14500, 15906, 3299, 5705,
 4504, 5104, 3667, 3955, 3984, 4010, 5886, 6342, 4389, 4187, 11497, 13594, 13466, 3829, 5379,
 6165, 4516, 6303, 3291, 8814, 5172, 4733, 4890, 4181, 4195, 10371, 4647, 4425, 4482, 6486,
 4060, 5798, 4934, 5222, 4723, 4424, 4172, 9690, 6295, 9735, 6229, 4589, 5079, 8129, 4296, 5799,
 4499, 3995, 12990, 3895, 3798, 5899, 3748, 5719, 7140, 5397, 4697, 6850, 11995]
>>>
>>> f.get(obs=0)
['AMC Concord', 4099, 22, 3, 2.5, 11, 2930, 186, 40, 121, 3.5799999237060547, 0]
>>>
>>> f.get([0,2,3], [0,2,4,6])
[['AMC Concord', 22, 3], ['AMC Spirit', 22, 8.98846567431158e+307], ['Buick Electra', 15, 4], ['Buick Opel', 26, 8.98846567431158e+307]]
>>>
>>> f.getVarLabel(0)
'Make and Model'
>>> f.getVarLabel('price')
'Price'
>>> f.setVarLabel(1, 'Retail Price')
>>> f.setVarLabel('mpg', 'Mileage per Gallon')
>>> f.renameVar(0, 'make2')
>>> f.renameVar('price', 'price2')
>>> f.dropVar("make2")
>>> f.dropVar("price2 mpg rep78")
>>> f.dropVar(0)
>>> f.dropVar([0,2,3])

Next we show a more advanced example of communicating between Stata and Python using this class. Suppose we have a dataset in memory, and we want to create a new frame in Stata that clones the variables and data values from the dataset. Instead of using the clone() method above, we will create the frame from scratch using various methods in this class.

First, we load the data containing information on various automobiles into Stata.

 . webuse auto, clear
(1978 Automobile Data)
 . describe

Contains data from https://www.stata-press.com/data/r16/auto.dta
  obs:            74                          1978 Automobile Data
 vars:            12                          13 Apr 2018 17:45
                                          (_dta has notes)
--------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
--------------------------------------------------------------------------------
make            str18   %-18s                 Make and Model
price           int     %8.0gc                Price
mpg             int     %8.0g                 Mileage (mpg)
rep78           int     %8.0g                 Repair Record 1978
headroom        float   %6.1f                 Headroom (in.)
trunk           int     %8.0g                 Trunk space (cu. ft.)
weight          int     %8.0gc                Weight (lbs.)
length          int     %8.0g                 Length (in.)
turn            int     %8.0g                 Turn Circle (ft.)
displacement    int     %8.0g                 Displacement (cu. in.)
gear_ratio      float   %6.2f                 Gear Ratio
foreign         byte    %8.0g      origin     Car type
--------------------------------------------------------------------------------
Sorted by: foreign

Then, we write a Python script file, say, frameex.py, that creates a new empty frame named myauto in Stata, clones all the variables and data values from the current dataset in memory, and stores them in this frame.

import sys
from sfi import Data, Frame

# clone variables
def clone_var(f):
    nvar = Data.getVarCount()

    for i in range(nvar):
        varname = Data.getVarName(i)
        vartype = Data.getVarType(i)
        if vartype=="byte":
            f.addVarByte(varname)
        elif vartype=="double":
            f.addVarDouble(varname)
        elif vartype=="float":
            f.addVarFloat(varname)
        elif vartype=="int":
            f.addVarInt(varname)
        elif vartype=="long":
            f.addVarLong(varname)
        elif vartype=="strL":
            f.addVarStrL(varname)
        else:
            f.addVarStr(varname, 10)

        f.setVarFormat(i, Data.getVarFormat(i))
        f.setVarLabel(i, Data.getVarLabel(i))

# clone data values
def clone_data(f):
    f.setObsTotal(Data.getObsTotal())
    nvar = Data.getVarCount()

    for i in range(nvar):
        f.store(i, None, Data.get(var=i))

# create the new frame; the frame name is passed through
# the args() option of -python script-
newFrame = sys.argv[1]
fr = Frame.create(newFrame)

clone_var(fr)
clone_data(fr)
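
As a variation on the branch chain in clone_var, the choice of method can be driven by a lookup table. This is a sketch, not part of the original script: the names NUMERIC_ADDERS and add_var_like are hypothetical, and the width is parsed from storage types of the form str18, the format that getVarType() is described as returning.

```python
# Storage type -> name of the Frame method that adds a variable of that type.
NUMERIC_ADDERS = {
    "byte": "addVarByte",
    "double": "addVarDouble",
    "float": "addVarFloat",
    "int": "addVarInt",
    "long": "addVarLong",
}


def add_var_like(frame, name, vartype):
    """Add a variable called `name` whose storage type matches `vartype`."""
    if vartype in NUMERIC_ADDERS:
        getattr(frame, NUMERIC_ADDERS[vartype])(name)
    elif vartype == "strL":
        frame.addVarStrL(name)
    elif vartype.startswith("str") and vartype[3:].isdigit():
        # "str18" -> width 18, so the new variable keeps the source width
        frame.addVarStr(name, int(vartype[3:]))
    else:
        raise ValueError(f"unrecognized storage type: {vartype}")
```

With this helper, the loop body could call add_var_like(f, varname, vartype), and string variables would be created with their original width rather than a fixed length.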

Next, we run this script file in Stata, clear the dataset in memory, and load the frame myauto into Stata as the current working dataset.

 . python script frameex.py, args("myauto")
 . clear
 . frames change myauto
 . describe

Contains data
  obs:            74
 vars:            12
--------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
--------------------------------------------------------------------------------
make            str10   %-18s                 Make and Model
price           int     %8.0gc                Price
mpg             int     %8.0g                 Mileage (mpg)
rep78           int     %8.0g                 Repair Record 1978
headroom        float   %6.1f                 Headroom (in.)
trunk           int     %8.0g                 Trunk space (cu. ft.)
weight          int     %8.0gc                Weight (lbs.)
length          int     %8.0g                 Length (in.)
turn            int     %8.0g                 Turn Circle (ft.)
displacement    int     %8.0g                 Displacement (cu. in.)
gear_ratio      float   %6.2f                 Gear Ratio
foreign         byte    %8.0g                 Car type
--------------------------------------------------------------------------------
Sorted by:

 . frames change default
 . frames reset