Creating and using Stata plugins

Stata plugin interface version: 3.0

You can create, load, and execute your own Stata plugins.

Note that any new features will be backwards compatible, meaning that plugins created using this current documentation should continue to work under newer versions of the Stata plugin interface (described in 2. What is the Stata plugin interface?).

The following topics are discussed.

1.	What is a plugin?
2.	What is the Stata plugin interface?
3.	Advantages and disadvantages of plugins
4.	When are plugins most beneficial?
	4a.	Looping over the observations in your data
	4b.	Looping over the elements of a matrix
	4c.	Frequently called routines
	4d.	Linking with existing code
5.	Creating a Stata plugin
	5a.	Compiling under Windows
	5b.	Compiling under Unix
	5c.	Compiling under Mac
6.	Loading a Stata plugin
7.	Executing a Stata plugin
8.	Stata routines available to plugins
	8a.	Routines for handling data
	8b.	Routines for handling Stata matrices
	8c.	Stata macros and scalars
	8d.	Routines for displaying results in Stata
	8e.	Routines for handling missing values
9.	Compiling C++ plugins
	9a.	Under Windows
	9b.	Under Unix
	9c.	Under Mac

1. What is a plugin?

A plugin is a piece of software that adds extra features to a software package. In Stata, a plugin consists of compiled code (written using the C programming language) that you can attach to the Stata executable. This process of attaching the plugin to Stata, known as dynamically loading, creates a new customized Stata executable that enables you to perform highly specialized tasks.

Plugins can serve as useful and integral parts of Stata community-contributed commands. Because they consist of compiled code, plugins run much faster than equivalent code written in the ado-language, where each command must be interpreted each time it is executed. The amount of speed gained depends on how much interpretation time is saved; see 4. When are plugins most beneficial?

When describing plugins, one often uses the term dynamically linked library, or DLL for short. While it is common to use the terms DLL and plugin interchangeably, a DLL is a piece of code designed to be called by various applications, while a plugin is specific to one application (in this case, Stata).

2. What is the Stata plugin interface?

The Stata plugin interface (SPI) is the system by which plugins are compiled, loaded into Stata, and called from Stata. The SPI consists of four components:

(1) A set of routines within the Stata executable that have been exposed so that they can be called from plugins.
(2) The C source file, stplugin.c, and C header file, stplugin.h, which are used to create plugins.
(3) The plugin option to Stata's program command, which loads a compiled plugin into Stata.
(4) The Stata command plugin call, which executes a plugin from within Stata.

These four components work together, so a capability added to (1) would require a change to stplugin.h in (2).

A note on version control

Stata 14.1 and later uses the current version of the SPI, version 3.0. This webpage documents that version of the SPI. Earlier versions of Stata up through Stata 14.0 used version 2.0 of the SPI.

Keeping your version of the SPI current is easy. You keep current with components (1), (3), and (4) by merely keeping your Stata up to date. You keep current with (2) by consulting this web page. If the SPI version listed at the top of this page has changed, download the new stplugin.c and stplugin.h from this page, and consult this documentation for details on any newly added features.

For documentation about version 2.0 of the SPI, click here.

Plugins created under older versions of the SPI will generally continue to work under newer versions of Stata/SPI. When you create a plugin using the files in (2), the SPI version associated with those files is embedded into the plugin. When you attempt to load the plugin into Stata, the plugin's SPI version is compared with the SPI version for the Stata that you are running. If they are incompatible, Stata issues an error. An incompatibility error often means that you need to update your Stata executable.

3. Advantages and disadvantages of plugins

Advantages of plugins:

Plugins generally run faster than equivalent Stata ado-code.
Because plugins consist of compiled code, they can be used to keep proprietary any community-contributed additions to Stata. That is, you can choose to distribute the plugin but not the source code used to produce it.
Plugins can be used to link Stata to preexisting C code written to perform specific tasks.

Disadvantages of plugins:

Plugins are platform specific, whereas Stata is platform independent. A plugin compiled under Windows, for example, will not run under Linux, so, if you want to distribute programs that use plugins, you are limited to the platforms where you can compile your plugins. If you distribute the plugin source code instead, those running your program will still require access to a C compiler.
C is a much lower-level language than Stata ado. Thus the payoff in speed obtained by going to C is offset by the increased time it takes to code tasks in C.
Plugin code is harder to debug than Stata ado-code. Errors in plugin code can cause Stata to crash or badly corrupt your data, and there is really no way to prevent this.

4. When are plugins most beneficial?

The Stata ado-language is a rich language that allows programmers to easily perform a myriad of tasks: parse syntax, set options, mark observations to be used for estimation, perform vectorized calculations on your data, perform complex matrix manipulations, save estimation results, etc. For all its richness, Stata ado-code is also remarkably fast. For this reason, we recommend using plugins in only a limited set of circumstances. When considering whether to write a plugin, ask yourself

(1) How many lines of interpreted ado-code is this plugin replacing?
(2) How complex is the task that the ado-code is performing?

If the answer to (1) is "a lot", and the answer to (2) is "not very", then you have more to gain by using a plugin to perform the task in question. By asking yourself these two questions, you are analyzing the trade-off between time saved by a faster-running program versus the additional time spent writing your code in a lower-level C program (e.g., think of how long it would take you to write a C program that could do what syntax can). The most speed to be gained by writing a plugin is in the time saved by not having to interpret line after line of ado-code. Realize, however, that the number of lines of interpreted ado-code must be quite large for you to notice a gain in speed when using a plugin.

4a. Looping over the observations in your data

When answering (1), count how many lines of ado-code require interpretation, rather than just counting the absolute number of lines in your block of ado-code. Loops in Stata are reinterpreted at each iteration. Thus an ado-loop consisting of five lines of code that is iterated 100 times really counts as 500 lines to be interpreted, not 5 (or 7, if you count the opening and closing of the loop).

Because loops are usually employed to perform simple tasks many times, rewriting an ado-loop as a plugin can result in a much faster execution time. This is especially true if you are looping over the observations in your data, because the number of observations can become arbitrarily large and is limited only by the amount of memory in your computer.

Note, however, that Stata goes to a lot of trouble to ensure that you rarely have to loop over the observations in your data. Almost all of Stata's mathematical operators and functions are vectorized, meaning that they are overloaded to implicitly perform calculations over all the observations on your variables. When you code

        replace x = y + z

you are taking each observation in variable y, adding it to each respective observation in z, and placing the result in variable x. The looping over the observations is done implicitly in the Stata executable, and thus this command runs very quickly.

You would never code the above as

        forvalues i = 1/`=_N' {
                replace x = y + z in `i'
        }

because this would be very slow by comparison. Similarly, you should always try first to find a vectorized solution for whatever you are calculating.
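
To picture what the implicit loop amounts to, here is a minimal C sketch of an elementwise addition over three arrays. It is an illustration only: the function name add_columns and the use of plain C arrays are assumptions made for this example, not Stata's internal code, and Stata's treatment of missing values is ignored.

```c
#include <stddef.h>

/* Hypothetical illustration: x[i] = y[i] + z[i] for every observation.
   The loop body is compiled once; nothing is reinterpreted on each pass. */
void add_columns(const double *y, const double *z, double *x, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        x[i] = y[i] + z[i];
    }
}
```

A plugin that loops over observations does the same kind of work, except that it reads and writes the values through the routines described in section 8a rather than through plain arrays.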

Still, loops over the observations are sometimes unavoidable and often occur in Stata programs. However, you rarely see them coded in the ado-language in programs that ship with official Stata. What you see instead is a call to an internal Stata routine, a piece of the Stata executable. A plugin is essentially the same thing; the only difference is that a plugin is written by you, the user, instead of by those of us at StataCorp.

4b. Looping over the elements of a matrix

Plugins are also useful in matrix calculations. Despite all of Stata's matrix operators and functions, it is sometimes necessary to loop over all the elements of a matrix and calculate them one by one. For example, you might need to construct the Hessian matrix for a nonlinear-form likelihood with an arbitrarily large number of equations.

4c. Frequently called routines

Plugins are also useful when you are writing a routine that will be called very often from other Stata programs, including your own programs and those from official Stata. For example, suppose that you are writing a method d0 likelihood evaluator for ml. Given the number of numerical derivatives required, these types of programs are called quite often by ml. In this case, the savings in execution time come from replacing the majority of the body of the evaluator program with a call to the plugin.

4d. Linking with existing code

A plugin could also come in handy when one is taking an existing routine written in C and writing a wrapper program so that the routine can be called from Stata. The plugin code in this case would consist of the original routine coupled with the wrapper program, which would take input from Stata, perform the appropriate call to the routine, and then take the output from the routine and return it to Stata in the appropriate way. In this case, the savings in time are obvious—the routine already exists.
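
The wrapper idea can be sketched as follows, using routines that are described in section 8. Everything specific here is an assumption made for illustration: legacy_mean() stands in for the preexisting routine, BUILD_AS_PLUGIN is a hypothetical macro that keeps the Stata-facing part out of a standalone build so the routine can be tested on its own, and 498 is used simply as a generic nonzero return code.

```c
#include <stddef.h>

/* Stand-in for a preexisting routine: the mean of n values (n > 0). */
double legacy_mean(const double *v, size_t n)
{
    double total = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        total += v[i];
    }
    return total / (double) n;
}

#ifdef BUILD_AS_PLUGIN   /* hypothetical macro, set with -DBUILD_AS_PLUGIN */
#include <stdio.h>
#include <stdlib.h>
#include "stplugin.h"

/* Wrapper: gather variable 1 of varlist, call the routine, show the result. */
STDLL stata_call(int argc, char *argv[])
{
    ST_int     j, first = SF_in1(), last = SF_in2();
    ST_double  z;
    ST_retcode rc;
    size_t     n = 0;
    double    *buf;
    char       msg[64];

    if (SF_nvars() < 1) return(102);          /* too few variables */
    if (last < first)   return(2000);         /* no observations */

    buf = malloc((size_t) (last - first + 1) * sizeof *buf);
    if (buf == NULL) return(498);             /* generic failure code */

    for (j = first; j <= last; j++) {
        if (!SF_ifobs(j)) continue;           /* respect any if condition */
        rc = SF_vdata(1, j, &z);
        if (rc) { free(buf); return(rc); }
        if (SF_is_missing(z)) continue;       /* missing values: section 8e */
        buf[n++] = z;
    }
    if (n == 0) { free(buf); return(2000); }

    sprintf(msg, "mean = %g\n", legacy_mean(buf, n));
    SF_display(msg);
    free(buf);
    return(0);
}
#endif
```

Keeping the existing routine free of any Stata calls means it can be compiled and checked outside Stata; only the thin wrapper depends on stplugin.h.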

The above are only a few examples, as there are many other situations where plugins prove useful. In summary, when considering writing a plugin, simply ask yourself how much interpretation time you would save by resorting to precompiled code, while factoring in the additional time required to write the C code. If you are an inexperienced C programmer, it may take a lot of time to write a plugin. Also consider how often you plan on using your program. If you are performing one analysis on a set of data, a plugin is probably not worth your time.

5. Creating a Stata plugin

The process of creating a Stata plugin involves using the header file, stplugin.h; the C source file, stplugin.c; and the files containing your plugin code to form a shared library, or DLL. You should not make any changes to stplugin.h or stplugin.c. Any code you write should be confined to files other than these.

Note: If you are compiling C++ source code, see section 9. Compiling C++ plugins.

In what follows, we consider what to do when your plugin code consists of only one source file, filename.c, yet the procedure we describe also applies when you have multiple plugin source files. For example, consider the simplest of all plugin programs, hello.c, which contains the following code:

        #include "stplugin.h"

        STDLL stata_call(int argc, char *argv[])
        {
                SF_display("Hello World\n") ;
                return(0) ;
        }

The type declaration STDLL defines the routine as one that returns a Stata return code. The name of the routine, stata_call(), is constant, as are the arguments argc and argv, but we'll get to that later. For now, just know that this is how you begin any routine you wish to call directly from Stata. Stata keeps track of multiple plugins, based on the name of the created shared library for each. The routine SF_display() displays output to the Stata results window. Thus our program simply displays the message "Hello World" to the Stata results window and returns a return code of 0 upon completion.

5a. Compiling under Windows

As previously stated, the process of creating a plugin involves taking the files stplugin.h, stplugin.c, and hello.c, and forming a DLL. The instructions for doing so depend on which C compiler you are using.

If you are using Microsoft Visual Studio 2017 you would do the following:

1. Select File > New > Project.
2. Expand Visual C++ in the left-hand tree view pane, select Windows Desktop, and then select Windows Desktop Wizard. Next, give the project a name (hello, for instance).
3. In the Windows Desktop Project Wizard, for Application type, select Dynamic Link Library (.dll). You should select Empty project to open a project with no existing files.
4. Next, in the Solution Explorer, right-click on Source files, and select Add > Existing Items.
5. Add the files stplugin.h, stplugin.c, and hello.c, which you have presumably downloaded from this page.
6. Build the DLL by selecting Build > Build hello (for a 64-bit DLL/Stata, select x64 for the Solution Platform). If all goes well (it should), you will now find the file hello.dll within the x64/Debug subdirectory of your project directory.
7. You may rename hello.dll to hello.plugin. This will save you from having to specify the using() option when you load your plugin, but this is not required.

If you are using Cygwin and the gcc compiler 3.XX under Windows, type

        $ gcc -shared -mno-cygwin stplugin.c hello.c -o hello.plugin

Note: You must use the "-mno-cygwin" compiler flag, which causes the gcc compiler to use MinGW (Minimalist GNU For Windows). Using MinGW will instruct the gcc compiler to link to the Microsoft Run-time libraries instead of to the Cygwin dll. Linking to the Cygwin dll will cause problems when Stata attempts to reload the plugin.

If you are using a new version of Cygwin and want to compile a native Windows application/library you will need to install the MinGW compiler (both the 64-bit and 32-bit compilers) using the Cygwin package manager. Once installed, type

        $ x86_64-w64-mingw32-gcc -shared -fPIC stplugin.c hello.c -o hello.plugin

If you are using some other C compiler under Windows, follow that compiler's instructions for creating a DLL. For Windows, doubles are aligned on 0 mod 8 boundaries.

5b. Compiling under Unix

If you are using gcc, simply type

        $ gcc -shared -fPIC -DSYSTEM=OPUNIX stplugin.c hello.c -o hello.plugin

Note that in the above example, we create the shared library hello.plugin. There is no significance to the file extension .plugin other than to load the plugin more conveniently into Stata, as discussed in the next section.

If you are using a compiler other than gcc, follow the instructions for that particular compiler. Note that you need to (a) compile under ANSI C, (b) specify the compiler flag -DSYSTEM=OPUNIX and (c) specify any additional flags necessary for your plugin (such as -lm).

Also note that if you are using a 64-bit version of Stata, you must specify the appropriate compiler flags to produce a plugin compatible with the 64-bit architecture.

5c. Compiling under Mac

To build a universal plugin that will work on both Macs with Apple Silicon and Macs with Intel processors, you need the latest version of Xcode and its command line utilities installed. Once Xcode and its command line utilities are installed, type

        $ clang -bundle -DSYSTEM=APPLEMAC stplugin.c hello.c -o hello.plugin.x86_64 -target x86_64-apple-macos10.12
        $ clang -bundle -DSYSTEM=APPLEMAC stplugin.c hello.c -o hello.plugin.arm64 -target arm64-apple-macos11

        $ lipo -create -output hello.plugin hello.plugin.x86_64 hello.plugin.arm64

If you are using a compiler other than clang, follow the instructions for that particular compiler. Note that you need to (a) compile under ANSI C, (b) specify the compiler flag -DSYSTEM=APPLEMAC, and (c) specify any additional flags necessary for your plugin (such as -lm).

Once you have compiled the plugin, you should have the shared library file hello.plugin. Copy or move this file to a place where it can be accessed by Stata, such as your current working directory or somewhere along your personal ado-path.

If you are using an Xcode project, you want to create the project as a BSD Dynamic Library and change the target to a bundle. After you have added the source files to the project, open the target settings for your plugin and change the following:

Linking
       Compatibility Version           (remove value)
       Current Library Version         (remove value)
       Mach-O Type                     Bundle

Packaging
       Executable Extension            plugin
       Executable Prefix               

Apple Clang - Custom Compiler Flags
       Other C Flags                   -DSYSTEM=APPLEMAC
       Other C++ Flags                 -DSYSTEM=APPLEMAC

(remove value) means to clear the text field for the setting so that its value is empty. If you do not see any Apple Clang custom compiler flag settings, it is because you have not added your source files to the project. The Other C++ Flags setting is automatically set when you set the Other C Flags setting.

We recommend that you compile the plugin using the latest version of Xcode. In the target's build settings, set the Architectures setting to include both Apple Silicon and Intel. You will build one plugin that will work on any Mac.

6. Loading a Stata plugin

Think of loading a Stata plugin as taking the plugin file (created in the previous section) and attaching it to the Stata executable. Once loaded, the plugin can be executed.

Plugins are loaded into Stata using the program command with the plugin option. For example, assume we created hello.plugin and placed it in our Stata ado-path. To interactively load this plugin into Stata, type

        . program hello, plugin

It is that simple. Also, because your "program" is really a plugin, there is no end statement following this program definition.

The syntax for loading a plugin is

        . program handle, plugin [using(filespec)]

where handle is how you want your plugin to be named within Stata. If using() is not specified, Stata will attempt to locate the file handle.plugin within your ado-path and will load it if it exists or issue an error if it doesn't.

You specify using() when either (a) your plugin has a file extension other than .plugin (such as .dll), (b) your plugin is not located within your ado-path, or (c) you wish to use a handle that does not coincide with the name of your plugin file. For example,

        . program myhello, plugin using("/home/rgg/hello.dll")

would load the hello.dll that is located in the directory /home/rgg (presumably not in the current ado-path), and this plugin would be named myhello within Stata.

program, plugin follows most of the same rules that program follows; see [P] program. For instance, if a plugin is used interactively or within a do-file, you can unload it via

        . program drop handle

You can also define a plugin as a subroutine within an ado-file. For instance, consider the ado-file sayhello.ado, which contains

        program sayhello
                version 9.2

                command to execute hello plugin
        end

        program hello, plugin

When used in this manner, the plugin (like its parent ado-file) is loaded only as needed, namely when we run sayhello.

Although you may define a plugin as a subroutine to an ado-file, you may not define a plugin as its own ado-file. If, for instance, you had the file sayhello.ado containing nothing but

        program sayhello, plugin using("hello.plugin")

subsequently, typing sayhello from within Stata would result in the following error:

        unrecognized command:  sayhello not defined by sayhello.ado
        r(199);

The reason this doesn't work is technical and is a result of how plugins are actually executed (see the next section). In any case, having a wrapper ado-program, such as the original version of sayhello.ado, is easy enough.

7. Executing a Stata plugin

Once you have loaded your plugin, you execute it using the Stata command

        plugin call handle

where handle is the name by which Stata refers to the plugin.

For example, using the plugin hello.plugin created above, you can execute it interactively by typing

        . program hello, plugin
        . plugin call hello
        Hello World

Alternatively, you can execute the plugin from within an ado-file. If we have the file sayhello.ado, which contains

        program sayhello
                version 9.2

                plugin call hello
        end

        program hello, plugin

then within Stata you can run sayhello, which, in turn, automatically loads and executes the plugin.

        . sayhello
        Hello World

At this point, we admit that we have been a bit simplistic in describing the syntax for plugin call. We were able to get away with this because hello.plugin wasn't very complicated.

The full syntax for plugin call is

        plugin call handle [varlist] [if exp] [in range] [, arglist]

That is, its syntax is very similar to a typical Stata estimation command.

This syntax allows Stata and your plugin to easily communicate because it allows the Stata command-line parser to do much of the work for you.

When you specify varlist, you are telling your plugin which subset of your data it is allowed to operate on, whether it is reading from it or writing to it. You can think of varlist as specifying a "data array" indexed by the variables you specify and the observation number. Your plugin would presumably operate on only the data in this array. The next section covers how you do this.
Specifying if exp or in range enables your plugin to easily operate on only a subset of your data.
Specifying arglist passes other arguments to your plugin, whether they are quoted strings, names of Stata matrices, the name of a scalar to be defined by your plugin, or some value. arglist is treated as a space-delimited set of literal strings, where binding by quotes (") and compound double quotes (`" "') is respected.

These arguments are passed to your plugin in a way that is very familiar to C programmers. The number of arguments is passed to stata_call() as argc and the arguments themselves as elements of the string vector argv. That is, the first argument in arglist is passed as argv[0], the second as argv[1], and so on.

Consider the file showargs.c, which contains the following code:

        #include "stplugin.h"

        STDLL stata_call(int argc, char *argv[])
        {
        	int i ;

        	for(i=0;i < argc; i++) {
        		SF_display(argv[i]) ;
        		SF_display("\n") ;
        	}
        	return(0) ;
        }

When executed, this program merely replays each argument supplied to plugin call via arglist.

        $ gcc -shared -fPIC -DSYSTEM=OPUNIX stplugin.c showargs.c -o showargs.plugin

        . program showargs, plugin

        . plugin call showargs, A "this is the second argument" scalarname 4.5
        A
        this is the second argument
        scalarname
        4.5
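
Because every element of argv arrives as a string, an argument such as 4.5 must be converted before it can be used as a number. The following is a minimal sketch of one way to do that with the standard C library. The helper name parse_double_arg is hypothetical, and returning 198 (Stata's code for invalid syntax) is a choice made for this example.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert one argv[] string to a double.
   Returns 0 on success, or 198 if the string is not a valid number. */
int parse_double_arg(const char *s, double *out)
{
    char  *end;
    double val;

    if (s == NULL || *s == '\0') return 198;      /* nothing to parse */

    errno = 0;
    val = strtod(s, &end);
    if (errno != 0 || *end != '\0') return 198;   /* range error or trailing text */

    *out = val;
    return 0;
}
```

Inside stata_call(), one would first check that argc is large enough and then call, for example, parse_double_arg(argv[3], &x), returning any nonzero result to Stata as the plugin's return code.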

8. Stata routines available to plugins

In this section, we describe the routines that make it possible to communicate results between Stata and your plugin.

A note on Stata types

The following type definitions are within stplugin.h:

            typedef signed char     ST_sbyte ;
            typedef unsigned char   ST_ubyte ;
            typedef int             ST_int ;
            typedef unsigned        ST_unsigned ;
            typedef short int       ST_int2 ;
            typedef int             ST_int4 ;
            typedef long            ST_long ;
            typedef unsigned int    ST_uint4 ;
            typedef float           ST_float ;
            typedef double          ST_double ;
            typedef unsigned char   ST_boolean ;
            typedef int             ST_retcode ;
            typedef double *        ST_dmkey ;

Stata defines its own types so that data types within Stata, such as Stata return codes, are easy to define across different platforms.

For example, a Stata double (ST_double) is used to define Stata's double data type. This currently coincides with C's concept of a double for all currently supported platforms, yet this may not always be the case. Imagine some future awkward platform or compiler for which a Stata double is really a float. If this were to happen, the type definitions in stplugin.h would be changed accordingly, and any plugin code that referred to an ST_double would compile without any trouble on the new platform.

In other words, when communicating between Stata and your plugin, it is good programming practice to use Stata's data types so that your source code will be compatible with future versions of the SPI. A change to these definitions is admittedly unlikely for most platforms, but be aware of the above type definitions because the routines we describe below are presented in terms of Stata types.

8a. Routines for handling data

When you type

        . plugin call handle varlist

the variables in varlist are passed, in order, to the plugin. The following routines are used to manipulate these variables.

	ST_retcode SF_vdata(ST_int i, ST_int j, ST_double *z); 
	ST_retcode SF_vstore(ST_int i, ST_int j, ST_double val);
	
	ST_retcode SF_sdata(ST_int i, ST_int j, char *s); 
	ST_retcode SF_sstore(ST_int i, ST_int j, char *s);

	ST_int     SF_sdatalen(ST_int i, ST_int j);
	ST_retcode SF_strldata(ST_int i, ST_int j, char *s, ST_int len); 

	ST_boolean SF_var_is_string(ST_int i);
	ST_boolean SF_var_is_strl(ST_int i);
	ST_boolean SF_var_is_binary(ST_int i, ST_int j);

For numeric data, SF_vdata() reads the jth observation of variable i in varlist and places this value in z. SF_vstore() takes val and stores it in the jth observation of variable i in varlist.

You can determine whether a particular variable is a numeric variable or a string variable with SF_var_is_string(). It checks the ith variable and returns 1 if the variable is a string variable (meaning str# or strL) and 0 if the variable is a numeric variable. You can further use SF_var_is_strl() to check whether a particular string variable is a str# variable or a strL variable. It returns 1 if the variable is a strL and 0 otherwise.

If you have string variables (but not very long string variables, or strLs), use SF_sdata() and SF_sstore() to access the values of the variables. These routines can handle strings up to 2045 characters in length. These routines return 0 if successful or issue a nonzero return code if an error occurs. For example, if you attempt to exceed either the number of observations in your data or the number of variables you have specified, both routines will return error code 498.

If you have strL variables, use three functions to deal with them. Values of strL variables can be text or binary, so first use SF_var_is_binary() to check the jth observation of the ith variable. It returns 1 if the value is binary and 0 otherwise.

Then use the pair of functions SF_sdatalen() and SF_strldata() to access the value in each observation of a strL variable. SF_sdatalen() returns the length, in bytes, of the jth observation of the ith variable. If the value is binary, as determined by SF_var_is_binary(), the length is the number of bytes. If the value is not binary, it is text, and the length is the number of bytes not including the terminating \0 character.

Once you know the length of the value, you use SF_strldata() to retrieve it. Each value of a strL variable can be up to 2 GB in size. So, you may wish to first obtain the length of a particular value with SF_sdatalen(), then allocate enough storage space for that value, and then retrieve the value with SF_strldata(). Or, if you know that none of your values are longer than a particular size, or you don't care to retrieve any data past that particular size, you can use the fourth argument, len, to SF_strldata() to specify the number of bytes available for storage in the third argument, s. If the value is longer than that, SF_strldata() will truncate it to the maximum length you specify. SF_strldata() returns the number of bytes copied into s, not including the terminating \0 character for text strings. It returns -1 if an error occurs.
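
The length-then-allocate-then-retrieve sequence can be sketched as follows. The helper names strl_buffer_size and fetch_strl are hypothetical, as is the BUILD_AS_PLUGIN macro, which keeps the Stata-facing part out of a standalone build. The sketch also assumes that the len argument of SF_strldata() must leave room for the terminating \0 of a text value, so one extra byte is always allocated.

```c
#include <limits.h>
#include <stddef.h>

/* Bytes to allocate for a value of the given length. One extra byte leaves
   room for the terminating \0 of a text value; it is harmless for binary.
   Returns 0 if the length is unusable. */
size_t strl_buffer_size(int len)
{
    if (len < 0 || len == INT_MAX) return 0;
    return (size_t) len + 1;
}

#ifdef BUILD_AS_PLUGIN   /* hypothetical macro, set with -DBUILD_AS_PLUGIN */
#include <stdlib.h>
#include "stplugin.h"

/* Hypothetical helper: fetch observation j of strL variable i into a newly
   allocated buffer. Returns NULL on failure; the caller frees the result.
   *nbytes receives the byte count reported by SF_strldata(). */
char *fetch_strl(ST_int i, ST_int j, ST_int *nbytes)
{
    ST_int len  = SF_sdatalen(i, j);        /* step 1: length of the value */
    size_t size = strl_buffer_size(len);
    char  *buf;

    if (size == 0) return NULL;
    buf = malloc(size);                     /* step 2: allocate storage */
    if (buf == NULL) return NULL;

    *nbytes = SF_strldata(i, j, buf, (ST_int) size);   /* step 3: retrieve */
    if (*nbytes < 0) {
        free(buf);
        return NULL;
    }
    return buf;
}
#endif
```

For binary values, use the returned byte count rather than strlen() to decide how much of the buffer is meaningful, because binary data may itself contain \0 bytes.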

All data manipulation occurs in double precision. That is, suppose that you pass an integer variable as part of varlist. When you obtain a value of this variable, you obtain it as a double. When you store a value of this variable, you store it as a double. In Stata, whatever doubles you stored are cast back to integers, the original storage type of your variable, which will probably result in a loss of precision. If you wish to retain precision upon returning to Stata, be sure that any variables in which you store values are of storage type double. If you are dealing with strings, make sure that the Stata string variable is large enough for the string value that you pass it when using SF_sstore(), and make sure you have a large enough string to store the value returned by SF_sdata().

Note that a plugin cannot create a new Stata variable but will happily fill in one that you have created in Stata and set it to either missing, zero, or whatever.

The following routines are used to obtain the dimensions of your data array:

        ST_int SF_nobs(void);    /* returns the number of observations in your data */
        ST_int SF_nvars(void);   /* returns the number of variables specified in varlist */
        ST_int SF_nvar(void);    /* returns the total number of variables in your data */

Alternatively, you may wish to operate only on selected observations in your data. When you specify an if condition or an in range to plugin call, these conditions are handled by the following routines:

        ST_boolean SF_ifobs(ST_int j);   /* evaluates to 1 when if condition is true in obs. j, */
                                         /* 0 otherwise */
        ST_int SF_in1(void);             /* returns first observation number for in range */
        ST_int SF_in2(void);             /* returns last observation number for in range */

When you do not specify an if condition, you intend to work on all the observations in your data, so SF_ifobs() will evaluate to 1 everywhere. Similarly, if you do not specify an in range, SF_in1() returns 1, and SF_in2() returns _N.

Note that SF_ifobs() does not automatically evaluate to 0 outside any in range you specify. SF_ifobs() is concerned with only the veracity of the if condition, should you specify one.

Putting all this together, consider the following code, varsum.c, which will take the first k-1 of k specified variables, sum them across each observation, and store the results in the last specified variable. Naturally, we want to also respect any if condition or in range specified by the user.

        #include "stplugin.h"

        STDLL stata_call(int argc, char *argv[])         
        {
            ST_int          j, k ;
            ST_double       sum, z ;
            ST_retcode      rc ;

            if(SF_nvars() < 2) {
                 return(102) ;              /* not enough variables specified */
            }

            for(j = SF_in1(); j <= SF_in2(); j++) {
                if(SF_ifobs(j)) {
                    sum = 0.0 ;
                    for(k=1; k < SF_nvars(); k++) {
                        if((rc = SF_vdata(k,j,&z))) return(rc) ;
                        sum += z ;
                    }
                    if((rc = SF_vstore(SF_nvars(), j, sum))) return(rc) ;
                }
            }
                }
            }
            return(0) ;
        }

We'll leave it to you to try this out.

8b. Routines for handling Stata matrices

The following routines are used to handle Stata matrices:

        ST_retcode SF_mat_el(char *mat, ST_int i, ST_int j, ST_double *z); 

             SF_mat_el() takes the [i,j] element of Stata matrix mat 
             and stores it into z.  SF_mat_el() returns a nonzero   
             return code if an error is encountered.

        ST_retcode SF_mat_store(char *mat, ST_int i, ST_int j, ST_double val);

             SF_mat_store() stores val as the [i,j] element of Stata matrix
             mat.  SF_mat_store() returns a nonzero return code if an error 
             is encountered. 	

        ST_int	 SF_col(char *mat);

             SF_col() returns the number of columns of Stata matrix mat, 
             or 0 if the matrix doesn't exist or some other error.

        ST_int	 SF_row(char *mat);

             SF_row() returns the number of rows of Stata matrix mat, 
             or 0 if the matrix doesn't exist or some other error.

As with data variables, plugins cannot create new Stata matrices, but they can be used to fill in matrices you have previously defined in Stata and set to, say, all missing.

For example,

        SF_mat_store("A", 1, 2, 3.4) ;

would store 3.4 as the [1,2] element of Stata matrix A. In general, however, a plugin would not know the name of the matrix ahead of time. For instance, suppose the matrix name was a Stata tempname. In this case, you would use the arglist feature of plugin call to pass in the name of the matrix:

        . tempname mymat
        . mat `mymat' = (1,2 \ 3,4) 
        . plugin call change12, `mymat'

where change12 is a plugin that essentially performs the following:

        SF_mat_store(argv[0], 1, 2, 3.4) ;
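Because SF_row() and SF_col() return 0 when the matrix does not exist, a plugin such as change12 can check that the target element is valid before storing into it. Below is a minimal sketch of such a check, written as plain C so that it can be compiled and tested outside Stata. The helper name check_element() and the choice of return codes 111 and 503 are our assumptions for illustration; they are not part of the plugin interface.

```c
/* Hypothetical helper, not part of stplugin.h: decide whether element
   [i,j] can be stored in a matrix with the given dimensions.
   rows and cols are meant to come from SF_row(mat) and SF_col(mat). */
int check_element(int rows, int cols, int i, int j)
{
    if (rows <= 0 || cols <= 0) {
        return 111;             /* matrix not found (assumed code) */
    }
    if (i < 1 || i > rows || j < 1 || j > cols) {
        return 503;             /* index out of range (assumed code) */
    }
    return 0;                   /* safe to call SF_mat_store() */
}
```

Inside stata_call(), change12 could first confirm that argc is 1, then pass SF_row(argv[0]) and SF_col(argv[0]) to check_element() along with the indices 1 and 2, and call SF_mat_store() only if the result is 0.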

8c. Stata macros and scalars

The following routines are used to store and use Stata macros and scalars. By macros, we mean both global macros and local macros (local to the program calling the plugin). Internally, global macros and local macros share the same namespace, with the names of local macros preceded by an underscore (_).

        ST_retcode SF_macro_save(char *mac, char *tosave); 

             SF_macro_save() creates/recreates a Stata macro named
             by the string mac and stores into it the string tosave.
             SF_macro_save() returns a nonzero return code in the case
             of an error (an invalid name, for example).

        ST_retcode SF_macro_use(char *mac, char *contents, ST_int maxlen); 

             SF_macro_use() takes the first maxlen characters of 
             what is contained in the Stata macro named mac and
             places them into the character array contents.
             SF_macro_use() returns a nonzero return code in the case
             of an error (an invalid name, for example).  SF_macro_use() also
             copies a null byte into contents after these maxlen characters,
             so the size of contents needs to be at least maxlen+1.

        ST_retcode SF_scal_save(char *scal, double val);

             SF_scal_save() creates/recreates the Stata scalar named
             by the string scal and stores val in it.
             SF_scal_save() returns a nonzero return code in the case
             of an error (an invalid name, for example).

        ST_retcode SF_scal_use(char *scal, double *z);

             SF_scal_use() takes the value of the Stata scalar named
             scal and places it into z.
             SF_scal_use() returns a nonzero return code in the case
             of an error (an invalid name or nonexistent scalar, 
             for example).

As with matrix names, your plugin typically obtains the names of scalars and macros from arglist.

For example, consider a plugin that takes two arguments, a scalar name and a local macro name. The plugin takes the value of the scalar and places a string-formatted version of it into the local macro.

Below we list the code to do this, contained in scaltomac.c:

        #include <stdio.h>
        #include <string.h>

        #include "stplugin.h"

        STDLL stata_call(int argc, char *argv[])
        {
            ST_retcode rc ;
            ST_double d ;
            char macname[40] ;	    /* 32 would be enough */
            char buf[40] ;              

            if(argc != 2) { 
                return(198) ;  	    /* syntax error */
            }

            if(rc = SF_scal_use(argv[0],&d)) return(rc) ;   /* read scalar */

            strcpy(macname,"_") ;	        /* local macro  */
            strcat(macname,argv[1]) ;
	
            snprintf(buf, sizeof(buf), "%lf", d) ;   /* convert to string; bounded so a very large value cannot overflow buf */

            if(rc = SF_macro_save(macname,buf)) return(rc) ; /* save macro */

            return(0) ;
        }

If this code were compiled to form scaltomac.plugin, we could do the following interactively within Stata:

        . program scaltomac, plugin
        . scalar jean = 45.999
        . plugin call scaltomac, jean marie
        . di "`marie'"
             45.999000
        . plugin call scaltomac, notdefined x1
        r(111);

Notice that the second time we called the plugin, we fed it a nonexistent scalar. As a result, SF_scal_use() complained, and our plugin passed the error code along to Stata.

We could have just as easily passed the local macro name with the underscore included in arglist, and, in fact, if we did it this way, we would be more general. We could use our plugin to define a global or local macro.
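Note that scaltomac.c copies argv[1] into a 40-byte buffer with strcpy() and strcat() without checking its length. Below is a minimal sketch of a bounded alternative that also covers the more general case just described, building either a local or a global macro name. The helper build_macro_name() and its is_local flag are hypothetical names of our own, and returning 198 for a name that does not fit is an assumption borrowed from the syntax-error code used above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of stplugin.h: build the internal name of
   a macro in dst.  When is_local is nonzero, an underscore is prepended,
   following the shared-namespace convention described in this section.
   Returns 0 on success, or 198 if name is empty or the result does not
   fit in dst (dstlen bytes, including the terminating null byte). */
int build_macro_name(char *dst, size_t dstlen, const char *name, int is_local)
{
    size_t need = strlen(name) + (is_local ? 1 : 0) + 1;   /* +1 for '\0' */

    if (name[0] == '\0' || need > dstlen) {
        return 198;
    }
    if (is_local) {
        dst[0] = '_';
        strcpy(dst + 1, name);      /* length already checked above */
    } else {
        strcpy(dst, name);
    }
    return 0;
}
```

In scaltomac.c, the strcpy()/strcat() pair could be replaced by a call such as build_macro_name(macname, sizeof(macname), argv[1], 1), with the plugin returning the nonzero code to Stata if the call fails.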

8d. Displaying results in Stata

The following routines are used to display results in Stata.

        ST_retcode 	SF_display(char *);
        ST_retcode	SF_error(char *);

SF_display() takes the given string and outputs it to the Stata results window. Before the string is printed, however, it is run through the Stata SMCL interpreter. For example,

        SF_display("for more help, see {help stcox}\n");

would appear in Stata as

        for more help, see stcox

with "for more help, see" displayed in green (or your default text color) and "stcox" displayed in blue and shown as a link. See [P] smcl for more details.

In debugging your plugin code, it is useful to print out the values of the variables used in your code. To do so, use SF_display() after first printing your results into a string, such as

        char   buf[80] ;

        snprintf(buf, 80, "The value of z is %lf\n", z) ;
        SF_display(buf);

SF_error() works the same way as SF_display(), except that its output is shown even when the plugin is run quietly from within Stata. Hence, SF_error() is ideal for error messages.
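If you print many such messages, it can be convenient to wrap the formatting step in one function. Below is a minimal sketch using the standard C routine vsnprintf(); the helper name format_message() and its return convention are our own assumptions and are not part of the plugin interface. The resulting buffer would then be passed to SF_display() or SF_error().

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of stplugin.h: printf-style formatting into
   a caller-supplied buffer.  Returns 0 if the whole message fit, 1 if it
   was truncated, and -1 on a formatting error.  When size > 0, buf is
   always left null-terminated. */
int format_message(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    if (n < 0) {
        if (size > 0) buf[0] = '\0';    /* leave a valid empty string */
        return -1;
    }
    return ((size_t) n >= size) ? 1 : 0;
}
```

For example, after format_message(buf, sizeof(buf), "The value of z is %f\n", z) returns 0, buf can be handed directly to SF_display(). A return value of 1 signals that the message was cut short and that a larger buffer is needed.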

8e. Missing values

Within your plugin, calculations on the data and on Stata matrices occur in double precision. When performing mathematical operations on data and matrix elements obtained from Stata, it is good programming practice to check for missing values, because they can wreak havoc on your calculations. The routine

        ST_boolean	SF_is_missing(ST_double z);

can be used to check if z is what Stata would call "missing". Conversely, you can set data and matrix elements to missing by using the constant SV_missval, for example

        SF_vstore(i, j, SV_missval) ;
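As an illustration of one possible policy, the sketch below sums a set of values while skipping missing ones and returns missing only when no nonmissing value was found. It is written as plain C so that it can be tested outside Stata. The names sum_nonmissing(), demo_is_missing(), and DEMO_MISSVAL are hypothetical, and DEMO_MISSVAL is an arbitrary sentinel, not Stata's actual missing-value encoding. In a plugin, the predicate would wrap SF_is_missing(), and SV_missval would be passed as missval.

```c
#include <stddef.h>

/* Hypothetical helper, not part of stplugin.h: sum the n values in vals,
   skipping those for which is_missing() returns nonzero.  If every value
   is missing (or n is 0), the result is missval. */
double sum_nonmissing(const double *vals, size_t n,
                      int (*is_missing)(double), double missval)
{
    double sum = 0.0;
    size_t used = 0;            /* number of nonmissing values added */
    size_t k;

    for (k = 0; k < n; k++) {
        if (!is_missing(vals[k])) {
            sum += vals[k];
            used++;
        }
    }
    return (used > 0) ? sum : missval;
}

/* Stand-ins for testing outside Stata ONLY: an arbitrary sentinel and a
   matching predicate.  A real plugin would use SV_missval and
   SF_is_missing() instead. */
#define DEMO_MISSVAL 1.0e300
int demo_is_missing(double z) { return z >= DEMO_MISSVAL; }
```

Whether missing values should be skipped or should make the whole result missing depends on what your plugin is meant to compute; the point is that the decision is made explicitly rather than left to arithmetic on missing-value codes.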

9. Creating C++ plugins

This section describes how to compile plugins written in C++. Just as with compiling C-style plugins, you will require the header file, stplugin.h; the C source file, stplugin.c; and the files containing your plugin code. You should not make any changes to stplugin.h or stplugin.c. Any code you write should be confined to files other than these.

Note: Make sure you have the most current version of stplugin.h and stplugin.c. Both of these are available on this page.

Consider the program varsum.cpp, which contains the following:

        #include "stplugin.h"

        // Note that the use of an object in this example is entirely unnecessary
        // from a design standpoint, but it is used to provide an interesting example 
        // of C++ syntax.

        // Simple class declaration with a routine to compute varsum, and a routine
        // to access the stored sum.
        //

        class VarLogic {
            public:
                VarLogic(void) { sum = 0.0 ; }  // constructor 
                ~VarLogic(void) { return ; }    // destructor
                ST_retcode computeSum(void) ;   // method defined below 

            protected:
                ST_double sum ;     // object's sum variable
        };

        // Method to compute our varsum
        //
        ST_retcode VarLogic::computeSum(void) {
            ST_int      j, k ;
            ST_double   z ;
            ST_retcode  rc ;

            if(SF_nvars() < 2) { 
                return((ST_retcode) 102) ; // not enough variables specified 
            }

            for(j = SF_in1(); j <= SF_in2(); j++) {
                if(SF_ifobs(j)) {
                    sum = 0.0 ;
                    for(k=1; k < SF_nvars(); k++) {
                        if(rc = SF_vdata(k,j,&z)) return(rc) ;  
                        sum += z ;
                    }
                    if(rc = SF_vstore(SF_nvars(), j, sum)) return(rc) ;
                }
            }
            return((ST_retcode) 0) ;
        }
        
        // Regular C-style stata_call()
        //
        STDLL stata_call(int argc, char *argv[]) { 
            VarLogic    vlogic ;        // Object with varsum logic
        
            return vlogic.computeSum() ;        // Calculate varsum
        }

9a. Compiling under Windows

As previously stated, the process of creating a plugin involves taking the files stplugin.h, stplugin.c, and varsum.cpp and forming a DLL. The instructions for doing so depend on which C++ compiler you are using.

If you are using Microsoft Visual Studio .NET, version 2017, for example, the procedure will be the same as described in 5a. Compiling under Windows, with the following exception:

stplugin.c will need to be compiled using C++ instead of C. You can do this by renaming the file to stplugin.cpp or by setting the necessary compiler option. When using Microsoft Visual Studio .NET, version 2017, this option can be set by right-clicking stplugin.c in the Solution Explorer and selecting Properties. Once the Properties box is open, expand the Configuration Properties tree, expand C/C++, and then select Advanced. Within the Advanced settings, you will be able to set the Compile As field to "Compile as C++ Code".

If you are using Cygwin and the MinGW compiler under Windows, type

        $ x86_64-w64-mingw32-g++ -shared -fPIC stplugin.c varsum.cpp -o varsum.plugin

If you are using the Borland C compiler, type

        $ bcc32 -u- -tWD -evarsum.plugin  varsum.cpp stplugin.c

Borland uses a different linker-naming convention than Visual C++ and gcc, so the -u- compiler option must be set. By default, the compiler automatically adds an underscore character (_) in front of every global identifier. That behavior must be disabled for your plugin to work with Stata. Unfortunately, using the -u- compiler option can have an unwanted side effect; when this option is used, standard C library functions, such as strcmp, are no longer referenced correctly. To correct this problem, these library functions will need an underscore character (_) added to the beginning of the function in your source code.

If you are using some other C++ compiler under Windows, follow that compiler's instructions for creating a DLL. For Windows, doubles are aligned on 0 mod 8 boundaries.

9b. Compiling under Unix

If you are using gcc, simply type

        $ g++ -shared -fPIC -DSYSTEM=OPUNIX stplugin.c varsum.cpp -o varsum.plugin

9c. Compiling under Mac

If you are using gcc, type

        $ g++ -bundle -DSYSTEM=APPLEMAC stplugin.c varsum.cpp -o varsum.plugin

If you are using a compiler other than gcc, follow the instructions for that particular compiler. Note that you need to (a) compile under ANSI C, (b) specify the compiler flag -DSYSTEM=APPLEMAC, and (c) specify any additional flags necessary for your plugin (such as -lm).

Once you have compiled the plugin, you should have the shared library file varsum.plugin. Copy or move this file to a place where it can be accessed by Stata, such as your current working directory or somewhere along your personal ado-path.

The project described above was supported by Grant Number R44 RR12435 from the National Institutes of Health, National Center for Research Resources. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the National Center for Research Resources.