The stata module (pystata.stata)

This module contains core functions used to interact with Stata.

Method Summary

`run`(cmd[, quietly, echo, inline])	Run a single line or a block of Stata commands.
`nparray_to_data`(arr[, prefix, force])	Load a NumPy array into Stata’s memory, making it the current dataset.
`pdataframe_to_data`(df[, force])	Load a pandas DataFrame into Stata’s memory, making it the current dataset.
`nparray_from_data`([var, obs, selectvar, …])	Export values from the current Stata dataset into a NumPy array.
`pdataframe_from_data`([var, obs, selectvar, …])	Export values from the current Stata dataset into a pandas DataFrame.
`nparray_to_frame`(arr, stfr[, prefix, force])	Load a NumPy array into a specified frame in Stata.
`pdataframe_to_frame`(df, stfr[, force])	Load a pandas DataFrame into a specified frame in Stata.
`nparray_from_frame`(stfr[, var, obs, …])	Export values from a Stata frame into a NumPy array.
`pdataframe_from_frame`(stfr[, var, obs, …])	Export values from a Stata frame into a pandas DataFrame.
`get_return`()	Retrieve current r() results and store them in a Python dictionary.
`get_ereturn`()	Retrieve current e() results and store them in a Python dictionary.
`get_sreturn`()	Retrieve current s() results and store them in a Python dictionary.

Method Detail

pystata.stata.run(cmd, quietly=False, echo=None, inline=None)

Run a single line or a block of Stata commands.

If a single-line Stata command is specified, the command is run through Stata directly. To run a multiple-line command or a block of Stata commands, enclose the commands within triple quotes, """ or '''. The set of commands is placed in a temporary do-file and executed all at once. Because the commands are executed from a do-file, you can include comments and delimiters among the commands.

Parameters

cmd (str) – The commands to execute.
quietly (bool, optional) – Suppress output from Stata commands when set to True. Default is False.
echo (None, True, or False, optional) – Specify whether to echo the command(s). By default, when a single-line Stata command is specified, only the output is displayed; when a block of Stata commands is specified, each command and its output are displayed in sequence. If echo is not specified or is specified as None, the global setting specified with set_command_show() is applied.
inline (None, True, or False, optional) – Specify whether to export and display the graphs generated by the commands, if there are any. If inline is not specified or specified as None, the global setting specified with set_graph_show() is applied.

Raises

SystemError – This error can be raised if any of the specified Stata commands result in an error.
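
Because run() reports a failed Stata command as a Python exception, a caller that wants to continue after a failure can catch SystemError. The following is a minimal sketch, not part of pystata: run_checked is a hypothetical helper that accepts any callable with the same signature as stata.run, and fake_run is a stand-in used only so the logic can be exercised without a Stata installation.

```python
def run_checked(run, cmd, quietly=False):
    """Run cmd through the given callable and report whether it succeeded.

    `run` is expected to behave like pystata.stata.run, raising
    SystemError when a Stata command fails. This helper is hypothetical
    and is not part of pystata.
    """
    try:
        run(cmd, quietly=quietly)
    except SystemError as err:
        # Return the message instead of propagating the exception.
        return False, str(err)
    return True, ""


def fake_run(cmd, quietly=False, echo=None, inline=None):
    """Stand-in for stata.run, used only to demonstrate the helper."""
    if "nosuchvar" in cmd:
        raise SystemError("variable nosuchvar not found")


ok_result = run_checked(fake_run, "summarize mpg")
bad_result = run_checked(fake_run, "summarize nosuchvar")
```

With a configured pystata installation, the equivalent call would pass stata.run as the first argument, for example run_checked(stata.run, "summarize mpg").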

pystata.stata.nparray_to_data(arr, prefix='v', force=False)

Load a NumPy array into Stata’s memory, making it the current dataset.

When the data type of the array conforms to a Stata variable type, this variable type will be used in Stata. Otherwise, each column of the array will be converted into a string variable in Stata.

By default, v1, v2, … are used as the variable names in Stata. If prefix is specified, it will be used as the variable prefix for all the variables loaded into Stata.

If there is a dataset in memory and it has been changed since it was last saved, an attempt to load a NumPy array into Stata will raise an exception. The force argument will force loading of the array, replacing the dataset in memory.

Parameters

arr (NumPy array) – The array to be loaded.
prefix (str, optional) – The string to be used as the variable prefix. Default is v.
force (bool, optional) – Force loading of the array into Stata. Default is False.

Raises

SystemError – This error can be raised if there is a dataset in memory that has changed since it was last saved, and force is False.
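
The naming rule described above (prefix followed by a column counter that starts at 1) can be emulated in plain Python to predict the variable names before loading an array. This is only an illustrative sketch; default_varnames is a hypothetical helper, not a pystata function.

```python
def default_varnames(ncols, prefix="v"):
    """Return prefix1, prefix2, ..., one name per array column."""
    return [f"{prefix}{i}" for i in range(1, ncols + 1)]


names_default = default_varnames(5)        # names for a 5-column array
names_x = default_varnames(5, prefix="x")  # same array with prefix="x"
```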

pystata.stata.pdataframe_to_data(df, force=False)

Load a pandas DataFrame into Stata’s memory, making it the current dataset.

Each column of the DataFrame will be stored as a variable. If the column type conforms to a Stata variable type, the variable type will be used in Stata. Otherwise, the column will be converted into a string variable in Stata.

The variable names will correspond to the column names of the DataFrame. If the column name is a valid Stata name, it will be used as the variable name. If it is not a valid Stata name, a valid variable name is created by using the makeVarName() method of the SFIToolkit class in the Stata Function Interface (sfi) module.

If there is a dataset in memory and it has been changed since it was last saved, an attempt to load a DataFrame into Stata will raise an exception. The force argument will force loading of the DataFrame, replacing the dataset in memory.

Parameters

df (pandas DataFrame) – The DataFrame to be loaded.
force (bool, optional) – Force loading of the DataFrame into Stata. Default is False.

Raises

SystemError – This error can be raised if there is a dataset in memory that has been changed since it was last saved, and force is False.
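
To anticipate which DataFrame columns are likely to be renamed on load, the column names can be screened beforehand. The sketch below is a rough approximation and rests on assumptions: it applies an ASCII-only version of Stata's naming rule (1 to 32 characters; letters, digits, and underscores; no leading digit) together with a list of reserved words taken from Stata's naming conventions. It does not reproduce what makeVarName() returns, and looks_like_stata_name is a hypothetical helper.

```python
import re

# Assumption: simplified, ASCII-only form of Stata's naming rule.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,31}")

# Assumption: reserved words that cannot be used as variable names.
_RESERVED = {
    "_all", "_b", "byte", "_coef", "_cons", "double", "float", "if",
    "in", "int", "long", "_n", "_N", "_pi", "_pred", "_rc", "_skip",
    "strL", "using", "with",
}


def looks_like_stata_name(name):
    """Return True if name passes the simplified validity check."""
    if name in _RESERVED or re.fullmatch(r"str[0-9]+", name):
        return False
    return _NAME_RE.fullmatch(name) is not None


columns = ["name", "state", "age", "2020 sales", "unit-price", "in"]
needs_rename = [c for c in columns if not looks_like_stata_name(c)]
```

Columns flagged by such a check are the ones to inspect after calling pdataframe_to_data(), since their Stata variable names will differ from the DataFrame column names.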

pystata.stata.nparray_from_data(var=None, obs=None, selectvar=None, valuelabel=False, missingval=_DefaultMissing())

Export values from the current Stata dataset into a NumPy array.

Parameters

var (int, str, or list-like, optional) – Variables to access. It can be specified as a single variable index or name, or an iterable of variable indices or names. If var is not specified, all variables are selected.
obs (int or list-like, optional) – Observations to access. It can be specified as a single observation index or an iterable of observation indices. If obs is not specified, all observations are selected.
selectvar (int or str, optional) – Observations for which selectvar!=0 will be selected. If selectvar is an integer, it is interpreted as a variable index. If selectvar is a string, it should contain the name of a Stata variable. Specifying selectvar as "" has the same result as not specifying selectvar: no observations are excluded. Specifying selectvar as -1 excludes observations with missing values for the variables specified in var.
valuelabel (bool, optional) – Use the value label when available. Default is False.
missingval (_DefaultMissing, optional) – If missingval is specified, all the missing values in the returned array are replaced by this value. If it is not specified, the numeric value of the corresponding missing value in Stata is returned.

Returns

A NumPy array containing the values from the dataset in memory.

Return type

NumPy array

Raises

ValueError – This error can be raised for three possible reasons. One is if any of the variable indices or names specified in var are out of range or not found. Another is if any of the observation indices specified in obs are out of range. Last, it may be raised if selectvar is out of range or not found.

Notes

The definition of the utility class _DefaultMissing is as follows:

class _DefaultMissing:
    def __repr__(self):
        return "_DefaultMissing()"

This class is defined only to provide the default value for the missingval parameter of the above function. Using this class for any other purpose is not recommended.
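
When missingval is left unspecified, numeric missing values come back as very large doubles. The sketch below shows one way to replace them after the fact in plain Python. It rests on an assumption that is not stated in this module's documentation: that Stata's system missing value (.) is stored as the double 2**1023 (about 8.98846567e+307) and that every larger double is an extended missing value. replace_missing is a hypothetical helper and handles numeric values only.

```python
# Assumption: smallest double that Stata treats as missing.
STATA_MISSING_MIN = 2.0 ** 1023


def replace_missing(rows, missingval):
    """Return a copy of rows with Stata missing values replaced."""
    return [
        [missingval if value >= STATA_MISSING_MIN else value for value in row]
        for row in rows
    ]


# Two illustrative observations; the second has a missing second value.
rows = [[4099.0, 3.0], [3799.0, STATA_MISSING_MIN]]
cleaned = replace_missing(rows, -100)
```

Passing missingval directly to the export function achieves the same result in one step and is the simpler choice when the replacement value is known in advance.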

pystata.stata.pdataframe_from_data(var=None, obs=None, selectvar=None, valuelabel=False, missingval=_DefaultMissing())

Export values from the current Stata dataset into a pandas DataFrame.

Parameters

var (int, str, or list-like, optional) – Variables to access. It can be specified as a single variable index or name, or an iterable of variable indices or names. If var is not specified, all variables are selected.
obs (int or list-like, optional) – Observations to access. It can be specified as a single observation index or an iterable of observation indices. If obs is not specified, all observations are selected.
selectvar (int or str, optional) – Observations for which selectvar!=0 will be selected. If selectvar is an integer, it is interpreted as a variable index. If selectvar is a string, it should contain the name of a Stata variable. Specifying selectvar as "" has the same result as not specifying selectvar: no observations are excluded. Specifying selectvar as -1 excludes observations with missing values for the variables specified in var.
valuelabel (bool, optional) – Use the value label when available. Default is False.
missingval (_DefaultMissing, optional) – If missingval is specified, all the missing values in the returned DataFrame are replaced by this value. If it is not specified, the numeric value of the corresponding missing value in Stata is returned.

Returns

A pandas DataFrame containing the values from the dataset in memory.

Return type

pandas DataFrame

Raises

ValueError – This error can be raised for three possible reasons. One is if any of the variable indices or names specified in var are out of range or not found. Another is if any of the observation indices specified in obs are out of range. Last, it may be raised if selectvar is out of range or not found.
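
The three selectvar cases described above can be emulated on plain Python data to make the rule concrete. This is an illustration of the documented behavior, not the library's implementation: select_rows is a hypothetical helper, the rows are made-up values shaped like the auto data, integer variable indices are not handled, and the missing-value threshold of 2**1023 is an assumption.

```python
# Assumption: smallest double that Stata treats as missing.
STATA_MISSING_MIN = 2.0 ** 1023


def is_missing(value):
    """Return True for numeric values at or above the missing threshold."""
    return isinstance(value, (int, float)) and value >= STATA_MISSING_MIN


def select_rows(rows, var, selectvar=None):
    """Emulate the selectvar rule on a list of {name: value} rows."""
    if selectvar is None or selectvar == "":
        # No selection variable: every observation is kept.
        return list(rows)
    if selectvar == -1:
        # Drop observations with a missing value in any requested variable.
        return [r for r in rows if not any(is_missing(r[v]) for v in var)]
    # Otherwise keep observations where the named variable is nonzero.
    return [r for r in rows if r[selectvar] != 0]


rows = [
    {"price": 4099, "rep78": 3.0, "foreign": 0},
    {"price": 3799, "rep78": STATA_MISSING_MIN, "foreign": 0},
    {"price": 9690, "rep78": 5.0, "foreign": 1},
]
wanted = ["price", "rep78"]

all_rows = select_rows(rows, wanted)
foreign_only = select_rows(rows, wanted, selectvar="foreign")
complete = select_rows(rows, wanted, selectvar=-1)
```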

pystata.stata.nparray_to_frame(arr, stfr, prefix='v', force=False)

Load a NumPy array into a specified frame in Stata.

When the data type of the array conforms to a Stata variable type, this variable type will be used in the frame. Otherwise, each column of the array will be converted into a string variable in the frame.

By default, v1, v2, … are used as the variable names in the frame. If prefix is specified, it will be used as the variable prefix for all the variables loaded into the frame.

If the frame of the specified name already exists in Stata, an attempt to load a NumPy array into the frame will raise an exception. The force argument will force loading of the array, replacing the original frame.

Parameters

arr (NumPy array) – The array to be loaded.
stfr (str) – The frame in which to store the array.
prefix (str, optional) – The string to be used as the variable prefix. Default is v.
force (bool, optional) – Force loading of the array into the frame if the frame already exists. Default is False.

Raises

SystemError – This error can be raised if the specified frame already exists in Stata, and force is False.

pystata.stata.pdataframe_to_frame(df, stfr, force=False)

Load a pandas DataFrame into a specified frame in Stata.

Each column of the DataFrame will be stored as a variable in the frame. If the column type conforms to a Stata variable type, the variable type will be used in the frame. Otherwise, the column will be converted into a string variable in the frame.

The variable names will correspond to the column names of the DataFrame. If the column name is a valid Stata name, it will be used as the variable name. If it is not a valid Stata name, a valid variable name is created by using the makeVarName() method of the SFIToolkit class in the Stata Function Interface (sfi) module.

If the frame of the specified name already exists in Stata, an attempt to load a pandas DataFrame into the frame will raise an exception. The force argument will force loading of the DataFrame, replacing the original frame.

Parameters

df (pandas DataFrame) – The DataFrame to be loaded.
stfr (str) – The frame in which to store the DataFrame.
force (bool, optional) – Force loading of the DataFrame into the frame if the frame already exists. Default is False.

Raises

SystemError – This error can be raised if the specified frame already exists in Stata, and force is False.

pystata.stata.nparray_from_frame(stfr, var=None, obs=None, selectvar=None, valuelabel=False, missingval=_DefaultMissing())

Export values from a Stata frame into a NumPy array.

Parameters

stfr (str) – The Stata frame to export.
var (int, str, or list-like, optional) – Variables to access. It can be specified as a single variable index or name, or an iterable of variable indices or names. If var is not specified, all variables are selected.
obs (int or list-like, optional) – Observations to access. It can be specified as a single observation index or an iterable of observation indices. If obs is not specified, all observations are selected.
selectvar (int or str, optional) – Observations for which selectvar!=0 will be selected. If selectvar is an integer, it is interpreted as a variable index. If selectvar is a string, it should contain the name of a Stata variable. Specifying selectvar as "" has the same result as not specifying selectvar: no observations are excluded. Specifying selectvar as -1 excludes observations with missing values for the variables specified in var.
valuelabel (bool, optional) – Use the value label when available. Default is False.
missingval (_DefaultMissing, optional) – If missingval is specified, all the missing values in the returned array are replaced by this value. If it is not specified, the numeric value of the corresponding missing value in Stata is returned.

Returns

A NumPy array containing the values from the Stata frame.

Return type

NumPy array

Raises

ValueError – This error can be raised for three possible reasons. One is if any of the variable indices or names specified in var are out of range or not found. Another is if any of the observation indices specified in obs are out of range. Last, it may be raised if selectvar is out of range or not found.
FrameError – This error can be raised if the frame stfr does not already exist in Stata, or if Python fails to connect to the frame.

pystata.stata.pdataframe_from_frame(stfr, var=None, obs=None, selectvar=None, valuelabel=False, missingval=_DefaultMissing())

Export values from a Stata frame into a pandas DataFrame.

Parameters

stfr (str) – The Stata frame to export.
var (int, str, or list-like, optional) – Variables to access. It can be specified as a single variable index or name, or an iterable of variable indices or names. If var is not specified, all variables are selected.
obs (int or list-like, optional) – Observations to access. It can be specified as a single observation index or an iterable of observation indices. If obs is not specified, all observations are selected.
selectvar (int or str, optional) – Observations for which selectvar!=0 will be selected. If selectvar is an integer, it is interpreted as a variable index. If selectvar is a string, it should contain the name of a Stata variable. Specifying selectvar as "" has the same result as not specifying selectvar: no observations are excluded. Specifying selectvar as -1 excludes observations with missing values for the variables specified in var.
valuelabel (bool, optional) – Use the value label when available. Default is False.
missingval (_DefaultMissing, optional) – If missingval is specified, all the missing values in the returned DataFrame are replaced by this value. If it is not specified, the numeric value of the corresponding missing value in Stata is returned.

Returns

A pandas DataFrame containing the values from the Stata frame.

Return type

pandas DataFrame

Raises

ValueError – This error can be raised for three possible reasons. One is if any of the variable indices or names specified in var are out of range or not found. Another is if any of the observation indices specified in obs are out of range. Last, it may be raised if selectvar is out of range or not found.
FrameError – This error can be raised if the frame stfr does not already exist in Stata, or if Python fails to connect to the frame.

pystata.stata.get_return()

Retrieve current r() results and store them in a Python dictionary.

The keys are Stata’s macro and scalar names, and the values are their corresponding values. Stata’s matrices are converted into NumPy arrays.

Returns

A dictionary containing current r() results.

Return type

Dictionary

pystata.stata.get_ereturn()

Retrieve current e() results and store them in a Python dictionary.

The keys are Stata’s macro and scalar names, and the values are their corresponding values. Stata’s matrices are converted into NumPy arrays.

Returns

A dictionary containing current e() results.

Return type

Dictionary

pystata.stata.get_sreturn()

Retrieve current s() results and store them in a Python dictionary.

The keys are Stata’s macro and scalar names, and the values are their corresponding values. Stata’s matrices are converted into NumPy arrays.

Returns

A dictionary containing current s() results.

Return type

Dictionary
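
Because scalars, macros, and matrices share one flat dictionary, it can be convenient to regroup the entries by value type after retrieval. The following is a minimal sketch under stated assumptions: split_results is a hypothetical helper, it treats strings as macros and plain numbers as scalars, and it places everything else (NumPy arrays in real results, a nested list in this stand-in sample) under matrices.

```python
def split_results(results):
    """Group a get_return()-style dictionary by value type."""
    groups = {"scalars": {}, "macros": {}, "matrices": {}}
    for key, value in results.items():
        if isinstance(value, str):
            groups["macros"][key] = value
        elif isinstance(value, (int, float)):
            groups["scalars"][key] = value
        else:
            # Anything else is assumed to be a matrix-like object.
            groups["matrices"][key] = value
    return groups


# Stand-in for a dictionary returned by get_ereturn(); values abbreviated.
sample = {
    "e(N)": 74.0,
    "e(cmd)": "regress",
    "e(b)": [[-0.000959, 0.0, 5.245271, 25.65058]],
}
grouped = split_results(sample)
```

With a configured pystata installation, the same helper could be applied to the dictionary returned by stata.get_ereturn() after an estimation command.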

Examples

In the following, we provide a few quick examples illustrating how to use this module. The example code was run in a command-line Python environment. Before running the following code, make sure that you have configured the pystata Python package; see Configuration for more information. Also see Example 5: Call Stata using API functions for more examples of how to use the above API functions.

>>> from pystata import stata
>>> stata.run("sysuse auto, clear")
(1978 automobile data)
>>> stata.run("summarize mpg")

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
         mpg |         74     21.2973    5.785503         12         41
>>> stata.run('''
...   // run a linear regression
...   regress mpg price i.foreign
...   ereturn list
... ''')

.
.    // run a linear regression
.    regress mpg price i.foreign

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     23.01
       Model |  960.866305         2  480.433152   Prob > F        =    0.0000
    Residual |  1482.59315        71  20.8815937   R-squared       =    0.3932
-------------+----------------------------------   Adj R-squared   =    0.3761
       Total |  2443.45946        73  33.4720474   Root MSE        =    4.5696

------------------------------------------------------------------------------
         mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       price |   -.000959   .0001815    -5.28   0.000     -.001321    -.000597
             |
     foreign |
    Foreign  |   5.245271   1.163592     4.51   0.000     2.925135    7.565407
       _cons |   25.65058   1.271581    20.17   0.000     23.11512    28.18605
------------------------------------------------------------------------------

.    ereturn list

scalars:
                  e(N) =  74
               e(df_m) =  2
               e(df_r) =  71
                  e(F) =  23.00749448574634
                 e(r2) =  .3932401256962295
               e(rmse) =  4.569638248831391
                e(mss) =  960.8663049714787
                e(rss) =  1482.593154487981
               e(r2_a) =  .3761482982510528
                 e(ll) =  -215.9083177127538
               e(ll_0) =  -234.3943376482347
               e(rank) =  3

macros:
            e(cmdline) : "regress mpg price i.foreign"
              e(title) : "Linear regression"
          e(marginsok) : "XB default"
                e(vce) : "ols"
             e(depvar) : "mpg"
                e(cmd) : "regress"
         e(properties) : "b V"
            e(predict) : "regres_p"
              e(model) : "ols"
          e(estat_cmd) : "regress_estat"

matrices:
                  e(b) :  1 x 4
                  e(V) :  4 x 4
               e(beta) :  1 x 3

functions:
             e(sample)

.
>>>
>>>
>>> # Loading Stata's results into Python
>>> # access e() results using sfi module
>>> from sfi import Scalar, Matrix
>>> import numpy as np
>>> Scalar.getValue('e(r2)')
0.39324012569622946
>>> np.array(Matrix.get('e(V)'))
array([[ 3.29592449e-08,  0.00000000e+00, -1.02918123e-05,
        -2.00142479e-04],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
        -0.00000000e+00],
       [-1.02918123e-05,  0.00000000e+00,  1.35394617e+00,
        -3.39072871e-01],
       [-2.00142479e-04, -0.00000000e+00, -3.39072871e-01,
         1.61691892e+00]])
>>>
>>> # store e() results into a dictionary
>>> stata.get_ereturn()
{'e(N)': 74.0, 'e(df_m)': 2.0, 'e(df_r)': 71.0, 'e(F)': 23.007494485746342,
'e(r2)': 0.39324012569622946, 'e(rmse)': 4.569638248831391, 'e(mss)': 960.8663049714787,
'e(rss)': 1482.5931544879809, 'e(r2_a)': 0.3761482982510528, 'e(ll)': -215.90831771275379,
'e(ll_0)': -234.39433764823468, 'e(rank)': 3.0, 'e(cmdline)': 'regress mpg price i.foreign',
'e(title)': 'Linear regression', 'e(marginsprop)': 'minus', 'e(marginsok)': 'XB default',
'e(vce)': 'ols', 'e(_r_z_abs__CL)': '|t|', 'e(_r_z__CL)': 't', 'e(depvar)': 'mpg',
'e(cmd)': 'regress', 'e(properties)': 'b V', 'e(predict)': 'regres_p', 'e(model)': 'ols',
'e(estat_cmd)': 'regress_estat',
'e(b)': array([[-9.59034169e-04,  0.00000000e+00,  5.24527100e+00,
         2.56505843e+01]]), 'e(V)': array([[ 3.29592449e-08,  0.00000000e+00, -1.02918123e-05,
        -2.00142479e-04],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
        -0.00000000e+00],
       [-1.02918123e-05,  0.00000000e+00,  1.35394617e+00,
        -3.39072871e-01],
       [-2.00142479e-04, -0.00000000e+00, -3.39072871e-01,
         1.61691892e+00]]), 'e(beta)': array([[-0.4889233,  0.       ,  0.4172175]])}
>>>
>>>
>>> # Loading data into Stata's current dataset and frames
>>> # load a NumPy array into Stata as the current dataset
>>> np.random.seed(17)
>>> npa = np.random.randint(100, size=(1000,5))
>>> stata.nparray_to_data(npa, force=True)
>>> stata.run("list in 1/3")

     +------------------------+
     | v1   v2   v3   v4   v5 |
     |------------------------|
  1. | 15    6   22   57   45 |
  2. | 22   31   68   39   84 |
  3. | 44    7    1   17   41 |
     +------------------------+
>>> stata.nparray_to_data(npa, prefix="x", force=True)
>>> stata.run("list in 1/3")

     +------------------------+
     | x1   x2   x3   x4   x5 |
     |------------------------|
  1. | 15    6   22   57   45 |
  2. | 22   31   68   39   84 |
  3. | 44    7    1   17   41 |
     +------------------------+
>>>
>>> # load a NumPy array into Stata as a frame
>>> stata.nparray_to_frame(npa, "stfr1", force=True)
>>> stata.run("frame stfr1: list in 1/3")

     +------------------------+
     | v1   v2   v3   v4   v5 |
     |------------------------|
  1. | 15    6   22   57   45 |
  2. | 22   31   68   39   84 |
  3. | 44    7    1   17   41 |
     +------------------------+
>>> stata.nparray_to_frame(npa, "stfr2", prefix="x", force=True)
>>> stata.run("frame stfr2: list in 1/3")

     +------------------------+
     | x1   x2   x3   x4   x5 |
     |------------------------|
  1. | 15    6   22   57   45 |
  2. | 22   31   68   39   84 |
  3. | 44    7    1   17   41 |
     +------------------------+
>>>
>>> # load a pandas DataFrame into Stata as the current dataset
>>> import pandas as pd
>>> data = {
...     'name': ['Ann', 'Jane', 'Summer', 'Joy', 'Robin'],
...     'state': ['CA', 'TX', 'PA', 'NY', 'WA'],
...     'age': [28, 29, 21, 35, 50]
... }
>>> df = pd.DataFrame(data)
>>> df
     name state  age
0     Ann    CA   28
1    Jane    TX   29
2  Summer    PA   21
3     Joy    NY   35
4   Robin    WA   50
>>> stata.pdataframe_to_data(df, force=True)
>>> stata.run("list")

     +----------------------+
     |   name   state   age |
     |----------------------|
  1. |    Ann      CA    28 |
  2. |   Jane      TX    29 |
  3. | Summer      PA    21 |
  4. |    Joy      NY    35 |
  5. |  Robin      WA    50 |
     +----------------------+
>>>
>>> # load a pandas DataFrame into Stata as a frame
>>> stata.pdataframe_to_frame(df, "stfr1", force=True)
>>> stata.run("frame stfr1: list")

     +----------------------+
     |   name   state   age |
     |----------------------|
  1. |    Ann      CA    28 |
  2. |   Jane      TX    29 |
  3. | Summer      PA    21 |
  4. |    Joy      NY    35 |
  5. |  Robin      WA    50 |
     +----------------------+
>>>
>>>
>>> # Loading Stata's current dataset and frames into Python
>>> # push Stata's dataset to a NumPy array
>>> stata.run("sysuse auto, clear")
(1978 automobile data)
>>> np1 = stata.nparray_from_data(var="price rep78", obs=range(5))
>>> np1
array([[4.09900000e+003, 3.00000000e+000],
       [4.74900000e+003, 3.00000000e+000],
       [3.79900000e+003, 8.98846567e+307],
       [4.81600000e+003, 3.00000000e+000],
       [7.82700000e+003, 4.00000000e+000]])
>>> np2 = stata.nparray_from_data(var="price rep78", obs=range(5), missingval=-100)
>>> np2
array([[4099,    3],
       [4749,    3],
       [3799, -100],
       [4816,    3],
       [7827,    4]])
>>>
>>> # push Stata's dataset to a pandas DataFrame
>>> df1 = stata.pdataframe_from_data()
>>> df1.head()
            make  price  mpg          rep78  headroom  trunk  weight  length  turn  displacement  gear_ratio  foreign
0    AMC Concord   4099   22   3.000000e+00       2.5     11    2930     186    40           121        3.58        0
1      AMC Pacer   4749   17   3.000000e+00       3.0     11    3350     173    40           258        2.53        0
2     AMC Spirit   3799   22  8.988466e+307       3.0     12    2640     168    35           121        3.08        0
3  Buick Century   4816   20   3.000000e+00       4.5     16    3250     196    40           196        2.93        0
4  Buick Electra   7827   15   4.000000e+00       4.0     20    4080     222    43           350        2.41        0
>>> df2 = stata.pdataframe_from_data(var="price rep78", obs=range(10), missingval=-100)
>>> df2.head()
   price  rep78
0   4099      3
1   4749      3
2   3799   -100
3   4816      3
4   7827      4
>>> df3 = stata.pdataframe_from_data(var="price rep78 foreign", obs=range(10), valuelabel=True)
>>> df3.head()
   price          rep78   foreign
0   4099   3.000000e+00  Domestic
1   4749   3.000000e+00  Domestic
2   3799  8.988466e+307  Domestic
3   4816   3.000000e+00  Domestic
4   7827   4.000000e+00  Domestic
>>>
>>> # push Stata's frame to a NumPy array
>>> # copy the default frame (current dataset) to a new frame named myauto
>>> stata.run("frame copy default myauto")
>>> np1 = stata.nparray_from_frame("myauto", var="mpg price foreign", obs=range(5))
>>> np1
array([[  22, 4099,    0],
       [  17, 4749,    0],
       [  22, 3799,    0],
       [  20, 4816,    0],
       [  15, 7827,    0]])
>>> np2 = stata.nparray_from_frame("myauto", var="mpg price foreign", obs=range(5), valuelabel=True)
>>> np2
array([['22', '4099', 'Domestic'],
       ['17', '4749', 'Domestic'],
       ['22', '3799', 'Domestic'],
       ['20', '4816', 'Domestic'],
       ['15', '7827', 'Domestic']], dtype='<U11')
>>>
>>> # push Stata's frame to a pandas DataFrame
>>> df1 = stata.pdataframe_from_frame("myauto")
>>> df1.head()
            make  price  mpg          rep78  headroom  trunk  weight  length  turn  displacement  gear_ratio  foreign
0    AMC Concord   4099   22   3.000000e+00       2.5     11    2930     186    40           121        3.58        0
1      AMC Pacer   4749   17   3.000000e+00       3.0     11    3350     173    40           258        2.53        0
2     AMC Spirit   3799   22  8.988466e+307       3.0     12    2640     168    35           121        3.08        0
3  Buick Century   4816   20   3.000000e+00       4.5     16    3250     196    40           196        2.93        0
4  Buick Electra   7827   15   4.000000e+00       4.0     20    4080     222    43           350        2.41        0
>>> df2 = stata.pdataframe_from_frame("myauto", missingval=-100, valuelabel=True)
>>> df2.head()
            make  price  mpg  rep78  headroom  trunk  weight  length  turn  displacement  gear_ratio   foreign
0    AMC Concord   4099   22      3       2.5     11    2930     186    40           121        3.58  Domestic
1      AMC Pacer   4749   17      3       3.0     11    3350     173    40           258        2.53  Domestic
2     AMC Spirit   3799   22   -100       3.0     12    2640     168    35           121        3.08  Domestic
3  Buick Century   4816   20      3       4.5     16    3250     196    40           196        2.93  Domestic
4  Buick Electra   7827   15      4       4.0     20    4080     222    43           350        2.41  Domestic
>>> df3 = stata.pdataframe_from_frame("myauto", var="mpg price foreign", obs=range(5))
>>> df3.head()
   mpg  price  foreign
0   22   4099        0
1   17   4749        0
2   22   3799        0
3   20   4816        0
4   15   7827        0
>>> df4 = stata.pdataframe_from_frame("myauto", var="mpg price foreign", obs=range(5), valuelabel=True)
>>> df4
   mpg  price   foreign
0   22   4099  Domestic
1   17   4749  Domestic
2   22   3799  Domestic
3   20   4816  Domestic
4   15   7827  Domestic