StrLConnector (sfi.StrLConnector)

class sfi.StrLConnector(*argv)

This class facilitates access to Stata’s strL datatype. The allowed values for the variable index var and the observation index obs are

-nvar <= var < nvar

and

-nobs <= obs < nobs

Here nvar is the number of variables defined in the dataset currently loaded in Stata or in the specified frame, which is returned by getVarCount(). nobs is the number of observations defined in the dataset currently loaded in Stata or in the specified frame, which is returned by getObsTotal().

Negative values for var and obs are allowed and are interpreted in the usual way for Python indexing, so -1 refers to the last variable or observation. var can be specified as either the variable name or the variable index. Passing the index is more efficient because it avoids looking up the index for the variable name.
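The index rule above can be sketched in plain Python. The helper name resolve_index is hypothetical and is not part of sfi; it only illustrates how a negative var or obs would map to a zero-based position, assuming the usual Python convention that the text refers to.

```python
def resolve_index(index, count):
    """Map a possibly negative index to its zero-based position.

    Hypothetical helper, not part of sfi: it mirrors the documented
    range -count <= index < count and raises ValueError otherwise.
    """
    if not -count <= index < count:
        raise ValueError(f"index {index} out of range for {count} items")
    # Negative values count back from the end, as in Python sequences.
    return index + count if index < 0 else index
```

With five observations, resolve_index(-1, 5) gives 4 (the last observation), and resolve_index(5, 5) raises a ValueError.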

There are two ways to create a StrLConnector instance:

  • StrLConnector(var, obs)

    Creates a StrLConnector and connects it to a specific strL in the Stata dataset; see Data.

    var : int or str

    Variable to access.

    obs : int

    Observation to access.

    A ValueError can be raised if

    • var is out of range or not found.
    • obs is out of range.

  • StrLConnector(frame, var, obs)

    Creates a StrLConnector and connects it to a specific strL in the specified Frame.

    frame : Frame

    The Frame to reference.

    var : int or str

    Variable to access.

    obs : int

    Observation to access.

    A ValueError can be raised if

    • frame does not already exist in Stata.
    • var is out of range or not found.
    • obs is out of range.
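As a minimal sketch of handling the ValueError cases listed above, the function below opens a connector, reports the strL size, and always closes the connection. The name strl_size and the connector_cls parameter are assumptions made for illustration; the class is passed in as an argument so the sketch can be imported outside Stata, where the sfi module is not available.

```python
from contextlib import closing


def strl_size(connector_cls, var, obs):
    """Return the size in bytes of one strL, or None if it cannot be opened.

    connector_cls is expected to be sfi.StrLConnector (an assumption of
    this sketch); var and obs are passed through unchanged.
    """
    try:
        sc = connector_cls(var, obs)
    except ValueError:
        # var was out of range or not found, or obs was out of range.
        return None
    # closing() calls sc.close() on exit, even if getSize() raises.
    with closing(sc):
        return sc.getSize()
```

Inside Stata, this could be called as strl_size(StrLConnector, "mystrl", 0) after from sfi import StrLConnector.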

Method Summary

close()            Close the connection and release any resources.
getPosition()      Get the current access position.
getSize()          Get the total number of bytes available in the strL.
isBinary()         Determine if the attached strL has been marked as binary.
reset()            Reset the access position to its initial value.
setPosition(pos)   Set the access position.

Attributes

obs   Return the observation number of the attached strL.
pos   Return the current position.
var   Return the variable index of the attached strL.

Method Detail

close()

Close the connection and release any resources.

getPosition()

Get the current access position.

Returns: The position.
Return type: int

getSize()

Get the total number of bytes available in the strL.

Returns: The total number of bytes available.
Return type: int

isBinary()

Determine if the attached strL has been marked as binary.

Returns: True if the strL has been marked as binary.
Return type: bool

obs

Return the observation number of the attached strL.

pos

Return the current position.

reset()

Reset the access position to its initial value.

setPosition(pos)

Set the access position.

Parameters: pos (int) – The new position.

var

Return the variable index of the attached strL.
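The position methods getPosition(), setPosition(), and getSize() might be combined to read part of a strL, as in the sketch below. The function read_range is hypothetical, and it assumes that positions are zero-based byte offsets and that Data.readBytes(sc, length) reads from the current position; neither detail is stated explicitly on this page. The data and sc arguments are expected to be sfi.Data and an open StrLConnector.

```python
def read_range(data, sc, start, length):
    """Read up to `length` bytes beginning at position `start`.

    Assumptions of this sketch: positions are zero-based byte offsets,
    and data.readBytes(sc, n) reads n bytes from the current position.
    """
    size = sc.getSize()
    if not 0 <= start <= size:
        raise ValueError(f"start {start} is outside 0..{size}")
    # Never request more bytes than remain after `start`.
    count = min(length, size - start)
    if count <= 0:
        return b""
    saved = sc.getPosition()
    sc.setPosition(start)
    try:
        return data.readBytes(sc, count)
    finally:
        # Restore the position the caller had before the read.
        sc.setPosition(saved)
```

Inside Stata, read_range(Data, sc, 0, 8) would return the first eight bytes of the attached strL under these assumptions, which for a PNG file is its signature.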

Examples

In this section, we will show you how to access Stata’s strL datatype in two ways:

  • Write to strL variables in Python.
  • Read from strL variables in Python.

We will use Stata to generate a PNG file for the purpose of this illustration. Then, we will use Python to store the PNG file into a dataset. Lastly, we will demonstrate how to read the PNG image from the dataset and store it into a new file. To begin, we generate a scatterplot and save it as sc1.png.

. sysuse auto, clear
. scatter mpg price
. graph export sc1.png, replace
. clear

In the Python script that follows, strlex1.py, we create a dataset with one strL variable named mystrl and one observation. Next, we create an instance of the StrLConnector class, sc, which is connected to the first observation of mystrl. The next step is to allocate enough space for the data that we need to store in our strL. We do this by passing the size of our image file to the Data.allocateStrL() method. Then, we use a loop to read the contents of the image file in 2,048-byte chunks, writing each chunk to our strL in Stata.

from sfi import Data, StrLConnector, SFIToolkit
import sys
import os

#create a new dataset with one strL variable
#mystrl and one observation
Data.addVarStrL("mystrl")
Data.setObsTotal(1)

mpv = Data.getVarIndex("mystrl")
obs = 0

#create an instance of StrLConnector
sc = StrLConnector(mpv, obs)

filename = sys.argv[1]
flen = os.path.getsize(filename)

#allocate flen bytes for the strL and mark it as binary
Data.allocateStrL(sc, flen, True)

#read the image file and store the contents
#in the strL variable
chunk_size = 2048
with open(filename, "rb") as fin:
    b = fin.read(chunk_size)
    while b:
        Data.writeBytes(sc, b)
        b = fin.read(chunk_size)

sc.close()
SFIToolkit.displayln(filename + " stored in dataset")

Now that we have shown how to store binary data from an image file in a strL, we can reverse the process by reading the data from the strL and storing it on disk. In the Python script that follows, strlex2.py, we create an instance of the StrLConnector class, sc, to connect to the previously stored strL data. Remember that we stored sc1.png in our strL. We read the data in our strL in 2,048-byte chunks with Data.readBytes(), writing each chunk to a new image file whose name is passed as the script argument (sc2.png in the run below).

from sfi import Data, StrLConnector, SFIToolkit
import sys

filename = sys.argv[1]

#create an instance of StrLConnector
#to connect to the strL variable
var = Data.getVarIndex("mystrl")
obs = 0
sc = StrLConnector(var, obs)

#read the image file from the strL variable and
#store the contents in a new image file
chunk_size = 2048
with open(filename, "wb") as fout:
    b = Data.readBytes(sc, chunk_size)
    while b:
        fout.write(b)
        b = Data.readBytes(sc, chunk_size)

sc.close()
SFIToolkit.displayln(filename + " retrieved from dataset")

Using Stata, we execute both Python scripts. Next, we checksum both the original image file and the file produced by reading the data from Stata, to confirm that they are equivalent.

. python script strlex1.py, args(sc1.png)
sc1.png stored in dataset
. python script strlex2.py, args(sc2.png)
sc2.png retrieved from dataset
. checksum sc1.png
Checksum for sc1.png = 1777154237, size = 37386
. checksum sc2.png
Checksum for sc2.png = 1777154237, size = 37386
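
The same comparison could also be made from Python. The sketch below hashes each file in 2,048-byte chunks with SHA-256 from the standard library; this is not the algorithm used by Stata's checksum command, so the digests will not match the numbers printed above. The function names file_digest and same_contents are hypothetical.

```python
import hashlib


def file_digest(path, chunk_size=2048):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_contents(path_a, path_b):
    """Return True if both files have the same SHA-256 digest."""
    return file_digest(path_a) == file_digest(path_b)
```

For the files above, same_contents("sc1.png", "sc2.png") would be expected to return True if the round trip through the strL preserved every byte.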