Call Stata using API functions
You can also interact with Stata by using the config and stata modules from the pystata Python package. The config module defines functions for initializing and configuring Stata. The stata module defines functions for interacting with Stata. For more information about these two modules, see API functions.
Previously, we initialized Stata’s environment within Python. Once we have done that, we can use the stata module to call Stata. Below, we import the module:
[18]:
from pystata import stata
The run() function executes Stata commands. You can pass a single command or, in a multiline string, several commands at once.
[19]:
stata.run('sysuse auto, clear')
(1978 automobile data)
[20]:
stata.run('''
summarize
reg mpg price i.foreign
ereturn list
''')
.
. summarize
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        make |          0
       price |         74    6165.257    2949.496       3291      15906
         mpg |         74     21.2973    5.785503         12         41
       rep78 |         69    3.405797    .9899323          1          5
    headroom |         74    2.993243    .8459948        1.5          5
-------------+---------------------------------------------------------
       trunk |         74    13.75676    4.277404          5         23
      weight |         74    3019.459    777.1936       1760       4840
      length |         74    187.9324    22.26634        142        233
        turn |         74    39.64865    4.399354         31         51
displacement |         74    197.2973    91.83722         79        425
-------------+---------------------------------------------------------
  gear_ratio |         74    3.014865    .4562871       2.19       3.89
     foreign |         74    .2972973    .4601885          0          1
. reg mpg price i.foreign
      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     23.01
       Model |  960.866305         2  480.433152   Prob > F        =    0.0000
    Residual |  1482.59315        71  20.8815937   R-squared       =    0.3932
-------------+----------------------------------   Adj R-squared   =    0.3761
       Total |  2443.45946        73  33.4720474   Root MSE        =    4.5696
------------------------------------------------------------------------------
         mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       price |   -.000959   .0001815    -5.28   0.000     -.001321    -.000597
             |
     foreign |
    Foreign  |   5.245271   1.163592     4.51   0.000     2.925135    7.565407
       _cons |   25.65058   1.271581    20.17   0.000     23.11512    28.18605
------------------------------------------------------------------------------
. ereturn list
scalars:
                  e(N) =  74
              e(sum_w) =  74
               e(df_m) =  2
               e(df_r) =  71
                  e(F) =  23.00749448574634
                 e(r2) =  .3932401256962295
               e(rmse) =  4.569638248831391
                e(mss) =  960.8663049714787
                e(rss) =  1482.593154487981
               e(r2_a) =  .3761482982510528
                 e(ll) =  -215.9083177127538
               e(ll_0) =  -234.3943376482347
               e(rank) =  3
macros:
            e(cmdline) : "regress mpg price i.foreign"
              e(title) : "Linear regression"
          e(marginsok) : "XB default"
                e(vce) : "ols"
             e(depvar) : "mpg"
                e(cmd) : "regress"
         e(properties) : "b V"
            e(predict) : "regres_p"
              e(model) : "ols"
          e(estat_cmd) : "regress_estat"
matrices:
                  e(b) :  1 x 4
                  e(V) :  4 x 4
               e(beta) :  1 x 3
functions:
             e(sample)
.
You can use the get_return(), get_ereturn(), and get_sreturn() functions to store Stata’s r(), e(), and s() results in Python as dictionaries.
[21]:
stata.get_ereturn()
[21]:
{'e(N)': 74.0,
 'e(sum_w)': 74.0,
 'e(df_m)': 2.0,
 'e(df_r)': 71.0,
 'e(F)': 23.007494485746342,
 'e(r2)': 0.39324012569622946,
 'e(rmse)': 4.569638248831391,
 'e(mss)': 960.8663049714787,
 'e(rss)': 1482.5931544879809,
 'e(r2_a)': 0.3761482982510528,
 'e(ll)': -215.90831771275379,
 'e(ll_0)': -234.39433764823468,
 'e(rank)': 3.0,
 'e(cmdline)': 'regress mpg price i.foreign',
 'e(title)': 'Linear regression',
 'e(marginsprop)': 'minus',
 'e(marginsok)': 'XB default',
 'e(vce)': 'ols',
 'e(_r_z_abs__CL)': '|t|',
 'e(_r_z__CL)': 't',
 'e(depvar)': 'mpg',
 'e(cmd)': 'regress',
 'e(properties)': 'b V',
 'e(predict)': 'regres_p',
 'e(model)': 'ols',
 'e(estat_cmd)': 'regress_estat',
 'e(b)': array([[-9.59034169e-04,  0.00000000e+00,  5.24527100e+00,
          2.56505843e+01]]),
 'e(V)': array([[ 3.29592449e-08,  0.00000000e+00, -1.02918123e-05,
         -2.00142479e-04],
        [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         -0.00000000e+00],
        [-1.02918123e-05,  0.00000000e+00,  1.35394617e+00,
         -3.39072871e-01],
        [-2.00142479e-04, -0.00000000e+00, -3.39072871e-01,
          1.61691892e+00]]),
 'e(beta)': array([[-0.4889233,  0.       ,  0.4172175]])}
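The returned dictionary can be used like any other Python dictionary. As a minimal sketch, the helper below (coef_table is a hypothetical name, not part of pystata) recomputes standard errors and t statistics from e(b) and e(V). To stay self-contained, it works on a hand-copied subset of the output above rather than a live call, and the coefficient labels are typed in by hand because the dictionary shown holds only the numeric arrays.

```python
import math

# Hand-copied subset of the get_ereturn() output shown above. In a live
# session you would use `eret = stata.get_ereturn()` instead.
eret = {
    'e(df_r)': 71.0,
    'e(b)': [[-9.59034169e-04, 0.0, 5.24527100e+00, 2.56505843e+01]],
    'e(V)': [
        [3.29592449e-08, 0.0, -1.02918123e-05, -2.00142479e-04],
        [0.0, 0.0, 0.0, -0.0],
        [-1.02918123e-05, 0.0, 1.35394617e+00, -3.39072871e-01],
        [-2.00142479e-04, -0.0, -3.39072871e-01, 1.61691892e+00],
    ],
}

# Assumed column labels, written by hand to follow the regression table:
# price, the omitted base level of foreign, Foreign, and the constant.
labels = ['price', '0b.foreign', '1.foreign', '_cons']


def coef_table(eret, labels):
    """Return {label: (coef, std_err, t)} built from e(b) and e(V)."""
    b = eret['e(b)'][0]   # e(b) is a 1 x k row vector
    V = eret['e(V)']      # e(V) is the k x k covariance matrix
    rows = {}
    for i, name in enumerate(labels):
        var = V[i][i]
        if var == 0:
            # Omitted base level: zero variance, so no t statistic.
            continue
        se = math.sqrt(var)
        rows[name] = (b[i], se, b[i] / se)
    return rows


table = coef_table(eret, labels)
for name, (coef, se, t) in table.items():
    print(f'{name:>12} {coef:12.6f} {se:10.6f} {t:8.2f}')
```

The indexing used here (`[0]`, `[i][i]`) works on both nested lists and the NumPy arrays that get_ereturn() returns, so the same helper should accept the live dictionary. The t statistics it prints should agree with the regression table above, for example about -5.28 for price.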
You can also push Stata datasets to Python as NumPy arrays or pandas DataFrames. Below, we store Stata’s current dataset into a pandas DataFrame, myauto.
[22]:
myauto = stata.pdataframe_from_data()
myauto.head()
[22]:
| | make | price | mpg | rep78 | headroom | trunk | weight | length | turn | displacement | gear_ratio | foreign |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AMC Concord | 4099 | 22 | 3.000000e+00 | 2.5 | 11 | 2930 | 186 | 40 | 121 | 3.58 | 0 |
| 1 | AMC Pacer | 4749 | 17 | 3.000000e+00 | 3.0 | 11 | 3350 | 173 | 40 | 258 | 2.53 | 0 |
| 2 | AMC Spirit | 3799 | 22 | 8.988466e+307 | 3.0 | 12 | 2640 | 168 | 35 | 121 | 3.08 | 0 |
| 3 | Buick Century | 4816 | 20 | 3.000000e+00 | 4.5 | 16 | 3250 | 196 | 40 | 196 | 2.93 | 0 |
| 4 | Buick Electra | 7827 | 15 | 4.000000e+00 | 4.0 | 20 | 4080 | 222 | 43 | 350 | 2.41 | 0 |
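Note the rep78 value 8.988466e+307 in the third row. The same observation is displayed as `.` when the data are listed in Stata later on this page, so this large number appears to be the numeric code Stata uses for a system missing value, roughly 2^1023 for a double. Below is a minimal sketch of converting such codes to NaN in plain Python. The helper name stata_missing_to_nan and the threshold constant are illustrative assumptions, not part of pystata. The pystata documentation also describes a missingval argument to pdataframe_from_data() for this purpose, which is worth checking for your version.

```python
import math

# Assumption: Stata stores the system missing value "." for doubles as
# 2**1023, and extended missing values (.a to .z) as even larger numbers.
STATA_MISSING_MIN = 2.0 ** 1023


def stata_missing_to_nan(values):
    """Replace Stata missing-value codes in a sequence of floats with NaN."""
    return [math.nan if v >= STATA_MISSING_MIN else v for v in values]


# rep78 for the first five cars, with the missing code as printed above
# (the printed value is rounded, but still at or above the threshold).
rep78 = [3.0, 3.0, 8.988466e+307, 3.0, 4.0]
cleaned = stata_missing_to_nan(rep78)
print(cleaned)
```

Converting to NaN early keeps the missing code from distorting means or other summaries computed on the Python side.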
You can instead choose to store just a subset of the data in Python. Below, we store the first 10 observations of the variables mpg and price into a pandas DataFrame.
[23]:
stata.pdataframe_from_data('mpg price', range(10))
[23]:
| | mpg | price |
|---|---|---|
| 0 | 22 | 4099 |
| 1 | 17 | 4749 |
| 2 | 22 | 3799 |
| 3 | 20 | 4816 |
| 4 | 15 | 7827 |
| 5 | 18 | 5788 |
| 6 | 26 | 4453 |
| 7 | 20 | 5189 |
| 8 | 16 | 10372 |
| 9 | 19 | 4082 |
Conversely, you can load data from Python into Stata, either as the current dataset or into a specific Stata frame. Below, we load the pandas DataFrame myauto into Stata, making it the current dataset, and then list the first three observations. Here force is specified as True to clear Stata’s memory before the DataFrame is loaded.
[24]:
stata.pdataframe_to_data(myauto, force=True)
stata.run('list in 1/3')
     +------------------------------------------------------------------------+
  1. |        make | price | mpg | rep78 | headroom | trunk | weight | length |
     | AMC Concord |  4099 |  22 |     3 |      2.5 |    11 |   2930 |    186 |
     |------------------------------------------------------------------------|
     |     turn     |     displa~t     |     gear_ra~o     |     foreign      |
     |       40     |          121     |     3.5799999     |           0      |
     +------------------------------------------------------------------------+
     +------------------------------------------------------------------------+
  2. |        make | price | mpg | rep78 | headroom | trunk | weight | length |
     |   AMC Pacer |  4749 |  17 |     3 |        3 |    11 |   3350 |    173 |
     |------------------------------------------------------------------------|
     |     turn     |     displa~t     |     gear_ra~o     |     foreign      |
     |       40     |          258     |          2.53     |           0      |
     +------------------------------------------------------------------------+
     +------------------------------------------------------------------------+
  3. |        make | price | mpg | rep78 | headroom | trunk | weight | length |
     |  AMC Spirit |  3799 |  22 |     . |        3 |    12 |   2640 |    168 |
     |------------------------------------------------------------------------|
     |     turn     |     displa~t     |     gear_ra~o     |     foreign      |
     |       35     |          121     |     3.0799999     |           0      |
     +------------------------------------------------------------------------+
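In the listing above, gear_ratio appears as 3.5799999 rather than 3.58. A likely explanation (an assumption, not something this page states) is that the variable was stored with single precision in the original dataset, and the nearest single-precision value to 3.58 is slightly smaller than 3.58. The sketch below reproduces the effect with Python's standard struct module. as_float32 is a hypothetical helper, and rounding to eight significant digits is only a rough stand-in for Stata's display format.

```python
import struct


def as_float32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack('<f', struct.pack('<f', x))[0]


# The three gear_ratio values from the listing above.
for value in (3.58, 2.53, 3.08):
    stored = as_float32(value)
    # Show the single-precision value and an eight-significant-digit rendering.
    print(value, '->', repr(stored), '->', format(stored, '.8g'))
```

Under this assumption, 3.58 and 3.08 land just below their decimal values and show the trailing 9s, while 2.53 rounds back to 2.53 at eight digits, which matches the pattern in the listing.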