Package com.stata.sfi

Class Data

java.lang.Object
com.stata.sfi.Data

public final class Data extends Object
This class provides access to the current Stata dataset. All variable and observation numbering begins at 1 unless otherwise stated.

Example:

This example shows how to handle a Stata varlist together with the if and in qualifiers, which restrict the observations used. It calculates summary statistics and displays a table similar to the output of Stata's summarize command.


import com.stata.sfi.Data;
import com.stata.sfi.Missing;
import com.stata.sfi.SFIToolkit;

public class Examples
{
	// call directly from integrated environment using java or java:
	public static void summarize() {
		int rc = summarize(null);
		SFIToolkit.setRC(rc);
	}
	
	// call directly from a plugin
	public static int summarize(String args[]) {
		int parsedVariables = Data.getParsedVarCount();
		long obsStart = Data.getObsParsedIn1();
		long obsEnd = Data.getObsParsedIn2();

		if (parsedVariables <= 0) {
			SFIToolkit.errorln("varlist required");
			return 100;
		}

		// display the header
		SFIToolkit.displayln("\n" +   "    " +
				"Variable {c |}        Obs        Mean    Std. Dev.       Min        Max");

		for (int v = 1; v <= parsedVariables; v++) {
			double sum = 0;
			double max = Double.NEGATIVE_INFINITY;
			double min = Double.POSITIVE_INFINITY;
			double mean = 0;
			double stddev = 0;
			long count = 0;

			// get the dataset variable index for the vth parsed variable
			int varIndex = Data.mapParsedVarIndex(v);

			if (!Data.isVarTypeStr(varIndex)) {

				// calculate mean
				for (long obs = obsStart; obs <= obsEnd; obs++) {
					if (! Data.isParsedIfTrue(obs)) {
						continue;
					}
					double value = Data.getNum(varIndex, obs);
					if (Missing.isMissing(value)) {
						continue ;
					}
					max = Math.max(max, value);
					min = Math.min(min, value);
					sum += value;
					count++;
				}
				mean = sum / count;

				// calculate std. dev.
				double d2sum = 0;
				for (long obs = obsStart; obs <= obsEnd; obs++) {
					if (! Data.isParsedIfTrue(obs)) {
						continue;
					}
					double value = Data.getNum(varIndex, obs);
					if (Missing.isMissing(value)) {
						continue ;
					}
					d2sum += Math.pow(value-mean,2);
				}
				// sample standard deviation; 0/0 yields NaN when count is 1
				stddev = Math.sqrt(d2sum/(count-1));
			}

			// write out the results
			if (v % 5 == 1) {
				SFIToolkit.displayln("{hline 13}{c +}{hline 57}");
			}
			String out = String.format("%12s {c |}%11s", 
					Data.getVarName(varIndex),
					SFIToolkit.formatValue(count, "%11.0fc"));
			if (count>0) {
				out += String.format("   %9s   %9s  %9s  %9s",
						SFIToolkit.formatValue(mean,  "%9.0g"),
						SFIToolkit.formatValue(stddev,"%9.0g"),
						SFIToolkit.formatValue(min,   "%9.0g"),
						SFIToolkit.formatValue(max,   "%9.0g"));
			}
			SFIToolkit.displayln(out);
			SFIToolkit.pollnow();
			// outer loop; poll each time to update display
			// avoid polling too often; use pollstd() when possible
		}
		return 0;
	}
}
  
From Stata...

. sysuse auto, clear
(1978 Automobile Data)

// call summarize from the integrated Java environment; assumes that the
// Examples class is defined in a Stata do-file that has already been run
. java rep if mpg > 22 in 12/50: Examples.summarize()

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+----------------------------------------------------------
       rep78 |          8        3.25     1.38873          1          5


// call summarize as a Java plugin; assumes that the Examples class
// is compiled and archived to examples.jar
. javacall Examples summarize rep if mpg > 22 in 12/50, jar(examples.jar)

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+----------------------------------------------------------
       rep78 |          8        3.25     1.38873          1          5

// compare with built-in summarize command
. summarize rep if mpg > 22 in 12/50

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       rep78 |          8        3.25     1.38873          1          5
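
The example above reads each variable twice, once for the mean and once for the standard deviation. As a hedged alternative, the same statistics can be accumulated in a single pass with Welford's method. The RunningStats class below is a hypothetical helper, not part of com.stata.sfi; inside Stata, add() would be called with each value returned by Data.getNum(varIndex, obs) that passes the Data.isParsedIfTrue and Missing.isMissing checks.

```java
// Hypothetical helper (not part of the SFI): one-pass summary statistics
// using Welford's update for the mean and the sum of squared deviations.
class RunningStats {
    private long count = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // running sum of squared deviations from the mean
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

    /** Fold one nonmissing value into the running statistics. */
    public void add(double value) {
        count++;
        double delta = value - mean;
        mean += delta / count;
        m2 += delta * (value - mean);
        min = Math.min(min, value);
        max = Math.max(max, value);
    }

    public long getCount() { return count; }

    /** Mean of the values seen so far; NaN when no values were added. */
    public double getMean() { return count > 0 ? mean : Double.NaN; }

    /** Sample standard deviation (n - 1 denominator); NaN for fewer than two values. */
    public double getStdDev() {
        return count > 1 ? Math.sqrt(m2 / (count - 1)) : Double.NaN;
    }

    public double getMin() { return min; }
    public double getMax() { return max; }
}
```

This sketch trades the second loop for a slightly more involved update step; whether that is worthwhile depends on how expensive dataset access is in a given setting.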

  • Field Summary

    Fields
    Modifier and Type    Field
    static final int     TYPE_BYTE
    static final int     TYPE_DOUBLE
    static final int     TYPE_FLOAT
    static final int     TYPE_INT
    static final int     TYPE_LONG
    static final int     TYPE_STR
    static final int     TYPE_STRL
    static final int     TYPE_UNKNOWN
  • Method Summary

    Modifier and Type
    Method
    Description
    static int
    addVarByte(String name)
    Add a variable of type byte to the current Stata dataset.
    static int
    addVarDouble(String name)
    Add a variable of type double to the current Stata dataset.
    static int
    addVarFloat(String name)
    Add a variable of type float to the current Stata dataset.
    static int
    addVarInt(String name)
    Add a variable of type int to the current Stata dataset.
    static int
    addVarLong(String name)
    Add a variable of type long to the current Stata dataset.
    static int
    addVarStr(String name, int length)
    Add a variable of type str to the current Stata dataset.
    static int
    addVarStrL(String name)
    Add a variable of type strL to the current Stata dataset.
    static int
    allocateStrL(StrLConnector sc, long size)
    Allocate a strL so that a buffer can be stored using writeBytes; the contents of the strL will not be initialized.
    static int
    allocateStrL(StrLConnector sc, long size, boolean binary)
    Allocate a strL so that a buffer can be stored using writeBytes; the contents of the strL will not be initialized.
    static int
    dropVar(int var)
    Drop the variable at the specified variable index.
    static int
    getBestType(double value)
    Get the best numeric data type for the specified value.
    static String
    getFormattedValue(int var, long obs, boolean bValueLabel)
    Read a value from the current Stata dataset, applying its display format.
    static int
    getMaxStrLength()
    Get the maximum length of a Stata string variable of type str.
    static int
    getMaxVars()
    Get the maximum number of variables Stata currently allows.
    static double
    getNum(int var, long obs)
    Read a numeric value from the current Stata dataset.
    static long
    getObsParsedIn1()
    Get the first in a range of observations if Java was invoked with the in qualifier.
    static long
    getObsParsedIn2()
    Get the last in a range of observations if Java was invoked with the in qualifier.
    static long
    getObsTotal()
    Get the number of observations in the current Stata dataset.
    static int
    getParsedVarCount()
    Get the number of variables specified when Java was invoked.
    static Double
    getRealOfString(String s)
    Get the double representation of a String using Stata's real() function.
    static String
    getStr(int var, long obs)
    Read a string value from the current Stata dataset; this method can be used to read str or strL data types.
    static String
    getStrf(int var, long obs)
    Read a string value from the current Stata dataset; this method can be used to read str data types.
    static int
    getStrVarWidth(int var)
    Get the width of a variable of type str.
    static int
    getType(int var)
    Get the data type for the specified variable.
    static int
    getVarCount()
    Get the number of variables in the current Stata dataset.
    static String
    getVarFormat(int var)
    Get the format for a Stata variable.
    static int
    getVarIndex(String varname)
    Look up the variable index for the specified name in the current Stata dataset.
    static String
    getVarLabel(int var)
    Get the label for a Stata variable.
    static String
    getVarName(int var)
    Get the variable name at a given variable index.
    static boolean
    isAlias(int var)
    Determine if a variable is an alias for a variable in another frame.
    static boolean
    isParsedIfTrue(long obs)
    Determine whether the if expression specified when Java was invoked is true for an observation.
    static boolean
    isVarlistSpecified()
    Determine if a varlist was specified when Java was invoked.
    static boolean
    isVarTypeStr(int var)
    Test if a variable is of type str.
    static boolean
    isVarTypeString(int var)
    Test if a variable's type is string.
    static boolean
    isVarTypeStrL(int var)
    Test if a variable is of type strL.
    static String
    makeVarName(String s, boolean retainCase)
    Attempt to form a valid variable name from a string.
    static int
    mapParsedVarIndex(int var)
    Map the variable index from the parsed varlist.
    static int
    readBytes(StrLConnector sc, byte[] b)
    Read a sequence of bytes from a strL.
    static int
    readBytes(StrLConnector sc, byte[] b, int off, int len)
    Read a sequence of bytes from a strL.
    static int
    renameVar(int var, String newname)
    Rename a Stata variable.
    static int
    setObsTotal(long obs)
    Set the number of observations in the current Stata dataset.
    static int
    setVarFormat(int var, String format)
    Set the format for a Stata variable.
    static int
    setVarLabel(int var, String label)
    Set the label for a Stata variable.
    static int
    storeBytes(StrLConnector sc, byte[] bytes, boolean binary)
    Store a byte buffer to a strL.
    static int
    storeNum(int var, long obs, double value)
    Store a numeric value in the current Stata dataset.
    static int
    storeNumFast(int var, long obs, double value)
    Store a numeric value in the current Stata dataset.
    static int
    storeStr(int var, long obs, String value)
    Store a string value in the current Stata dataset; this method can be used to store str or strL data types.
    static int
    storeStrf(int var, long obs, String value)
    Store a string value in the current Stata dataset; this method can be used to store str data types.
    static int
    storeStrfFast(int var, long obs, String value)
    Store a string value in the current Stata dataset; this method can be used to store str data types.
    static void
    updateModified()
    Inform Stata that its data has been modified.
    static int
    writeBytes(StrLConnector sc, byte[] b)
    Write a byte buffer to a strL; the strL must be allocated using allocateStrL before calling this method.
    static int
    writeBytes(StrLConnector sc, byte[] b, int off, int len)
    Write len bytes from the specified byte buffer starting at offset off to a strL; the strL must be allocated using allocateStrL before calling this method.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

  • Method Details

    • addVarByte

      @Synchronized public static int addVarByte(String name)
      Add a variable of type byte to the current Stata dataset.
      Parameters:
      name - Name of the variable to be created.
      Returns:
      Return code from Stata; 0 if successful.
    • addVarDouble

      @Synchronized public static int addVarDouble(String name)
      Add a variable of type double to the current Stata dataset.
      Parameters:
      name - Name of the variable to be created.
      Returns:
      Return code from Stata; 0 if successful.
    • addVarFloat

      @Synchronized public static int addVarFloat(String name)
      Add a variable of type float to the current Stata dataset.
      Parameters:
      name - Name of the variable to be created.
      Returns:
      Return code from Stata; 0 if successful.
    • addVarInt

      @Synchronized public static int addVarInt(String name)
      Add a variable of type int to the current Stata dataset.
      Parameters:
      name - Name of the variable to be created.
      Returns:
      Return code from Stata; 0 if successful.
    • addVarLong

      @Synchronized public static int addVarLong(String name)
      Add a variable of type long to the current Stata dataset.
      Parameters:
      name - Name of the variable to be created.
      Returns:
      Return code from Stata; 0 if successful.
    • addVarStr

      @Synchronized public static int addVarStr(String name, int length)
      Add a variable of type str to the current Stata dataset.
      Parameters:
      name - Name of the variable to be created.
      length - Initial size of the variable. If the length is greater than getMaxStrLength(), then a variable of type strL will be created.
      Returns:
      Return code from Stata; 0 if successful.
    • addVarStrL

      @Synchronized public static int addVarStrL(String name)
      Add a variable of type strL to the current Stata dataset.
      Parameters:
      name - Name of the variable to be created.
      Returns:
      Return code from Stata; 0 if successful.
    • allocateStrL

      @Synchronized public static int allocateStrL(StrLConnector sc, long size)
      Allocate a strL so that a buffer can be stored using writeBytes; the contents of the strL will not be initialized. By default, the data will be marked as binary.
      Parameters:
      sc - The StrLConnector representing a strL.
      size - The size in bytes.
      Returns:
      Return code from Stata; 0 if successful.
    • allocateStrL

      @Synchronized public static int allocateStrL(StrLConnector sc, long size, boolean binary)
      Allocate a strL so that a buffer can be stored using writeBytes; the contents of the strL will not be initialized.
      Parameters:
      sc - The StrLConnector representing a strL.
      size - The size in bytes.
      binary - Mark the data as binary. Note that if the data are not marked as binary, Stata expects that the data be UTF-8 encoded. An alternate approach is to call storeStr, where the encoding is automatically handled.
      Returns:
      Return code from Stata; 0 if successful.
    • dropVar

      @Synchronized public static int dropVar(int var)
      Drop the variable at the specified variable index.
      Parameters:
      var - Variable to drop.
      Returns:
      Return code from Stata; 0 if successful.
    • getBestType

      @ThreadSafe public static int getBestType(double value)
      Get the best numeric data type for the specified value.
      Parameters:
      value - The value to test.
      Returns:
      The field value representing the data type; may be TYPE_BYTE, TYPE_INT, TYPE_LONG, TYPE_FLOAT, or TYPE_DOUBLE.
    • getFormattedValue

      @Synchronized public static String getFormattedValue(int var, long obs, boolean bValueLabel)
      Read a value from the current Stata dataset, applying its display format.
      Parameters:
      var - Variable to access.
      obs - Observation to access.
      bValueLabel - Use the value label when available.
      Returns:
      The formatted value as a String.
    • getMaxStrLength

      @ThreadSafe public static int getMaxStrLength()
      Get the maximum length of a Stata string variable of type str.
      Returns:
      The maximum length.
    • getMaxVars

      @ThreadSafe public static int getMaxVars()
      Get the maximum number of variables Stata currently allows.
      Returns:
      The maximum number of variables.
    • getNum

      @ThreadSafe public static double getNum(int var, long obs)
      Read a numeric value from the current Stata dataset.
      Parameters:
      var - Variable to access.
      obs - Observation to access.
      Returns:
      The value.
    • getObsParsedIn1

      @Synchronized public static long getObsParsedIn1()
      Get the first in a range of observations if Java was invoked with the in qualifier. If in was not specified, then the range will reflect the entire dataset.
      Returns:
      The first observation's number.
    • getObsParsedIn2

      @Synchronized public static long getObsParsedIn2()
      Get the last in a range of observations if Java was invoked with the in qualifier. If in was not specified, then the range will reflect the entire dataset.
      Returns:
      The last observation's number.
    • getObsTotal

      @ThreadSafe public static long getObsTotal()
      Get the number of observations in the current Stata dataset.
      Returns:
      The number of observations.
    • getParsedVarCount

      @Synchronized public static int getParsedVarCount()
      Get the number of variables specified when Java was invoked. If a varlist was not specified, then all the variables are implied.
      Returns:
      The number of variables.
    • getRealOfString

      @ThreadSafe public static Double getRealOfString(String s)
      Get the double representation of a String using Stata's real() function.
      Parameters:
      s - The string to convert.
      Returns:
      The numeric value. Returns null if an error occurs.
    • getStr

      @Synchronized public static String getStr(int var, long obs)
      Read a string value from the current Stata dataset; this method can be used to read str or strL data types.
      Parameters:
      var - Variable to access.
      obs - Observation to access.
      Returns:
      The String. Returns null if an error occurs.
    • getStrf

      @ThreadSafe public static String getStrf(int var, long obs)
      Read a string value from the current Stata dataset; this method can be used to read str data types.
      Parameters:
      var - Variable to access.
      obs - Observation to access.
      Returns:
      The String. Returns null if an error occurs.
    • getStrVarWidth

      @ThreadSafe public static int getStrVarWidth(int var)
      Get the width of a variable of type str.
      Parameters:
      var - The index of the variable to test.
      Returns:
      The width if the variable is of type str.
    • getType

      @ThreadSafe public static int getType(int var)
      Get the data type for the specified variable.
      Parameters:
      var - Variable to access.
      Returns:
      The value representing the data type; may be TYPE_BYTE, TYPE_INT, TYPE_LONG, TYPE_FLOAT, TYPE_DOUBLE, TYPE_STR, TYPE_STRL, or TYPE_UNKNOWN.
    • getVarCount

      @ThreadSafe public static int getVarCount()
      Get the number of variables in the current Stata dataset.
      Returns:
      The number of variables.
    • getVarFormat

      @ThreadSafe public static String getVarFormat(int var)
      Get the format for a Stata variable.
      Parameters:
      var - Index of the variable to look up.
      Returns:
      The variable's format.
    • getVarIndex

      @Synchronized public static int getVarIndex(String varname)
      Look up the variable index for the specified name in the current Stata dataset.
      Parameters:
      varname - Name of the variable.
      Returns:
      The variable index. If the variable does not exist, 0 is returned.

      Note: When Stata version control is less than 15.0 and the variable does not exist, the number of variables plus one will be returned.
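
Because the "not found" result of getVarIndex depends on version control (0, or the number of variables plus one), a range check covers both conventions. The following is a minimal sketch; VarLookup and isValidIndex are hypothetical names, and inside Stata the arguments would be Data.getVarIndex(varname) and Data.getVarCount().

```java
// Hypothetical helper (not part of the SFI): decide whether a looked-up
// variable index refers to an existing variable.
class VarLookup {
    /**
     * Returns true when index lies in 1..varCount. This rejects both 0
     * (the documented "not found" result) and varCount + 1 (the result
     * documented for version control below 15.0).
     */
    public static boolean isValidIndex(int index, int varCount) {
        return index >= 1 && index <= varCount;
    }
}
```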
    • getVarLabel

      @ThreadSafe public static String getVarLabel(int var)
      Get the label for a Stata variable.
      Parameters:
      var - Index of the variable to look up.
      Returns:
      The variable's label.
    • getVarName

      @ThreadSafe public static String getVarName(int var)
      Get the variable name at a given variable index.
      Parameters:
      var - Index of the variable to look up.
      Returns:
      The name of the Stata variable.
    • isAlias

      @ThreadSafe public static boolean isAlias(int var)
      Determine if a variable is an alias for a variable in another frame.
      Parameters:
      var - Variable to access.
      Returns:
      True for alias variables.
    • isParsedIfTrue

      @ThreadSafe public static boolean isParsedIfTrue(long obs)
      Determine whether the if expression specified when Java was invoked is true for an observation.
      Parameters:
      obs - The observation to test.
      Returns:
      True when the if expression evaluates to true for the specified observation. If no if expression was specified when Java was invoked, this method returns true.
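
As an illustration of combining the in range with the if test, the sketch below counts the selected observations. QualifierCount and countSelected are hypothetical names; the predicate stands in for Data::isParsedIfTrue so that the loop can be exercised outside Stata, and first and last would come from Data.getObsParsedIn1() and Data.getObsParsedIn2().

```java
import java.util.function.LongPredicate;

// Hypothetical helper (not part of the SFI): count the observations in an
// inclusive range that satisfy the if test.
class QualifierCount {
    /**
     * Count observations obs with first <= obs <= last for which ifTest
     * returns true. Observation numbers are 1-based, as in the Data class.
     */
    public static long countSelected(long first, long last, LongPredicate ifTest) {
        long n = 0;
        for (long obs = first; obs <= last; obs++) {
            if (ifTest.test(obs)) {
                n++;
            }
        }
        return n;
    }
}
```

Inside Stata, a call might look like QualifierCount.countSelected(Data.getObsParsedIn1(), Data.getObsParsedIn2(), Data::isParsedIfTrue).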
    • isVarlistSpecified

      @Synchronized public static boolean isVarlistSpecified()
      Determine if a varlist was specified when Java was invoked.
      Returns:
      True if a varlist was specified when Java was invoked.
    • isVarTypeStr

      @ThreadSafe public static boolean isVarTypeStr(int var)
      Test if a variable is of type str.
      Parameters:
      var - The index of the variable to test.
      Returns:
      True if the variable is of type str.
    • isVarTypeString

      @ThreadSafe public static boolean isVarTypeString(int var)
      Test if a variable's type is string.
      Parameters:
      var - The index of the variable to test.
      Returns:
      True if the variable is a string variable of either type str or type strL.
    • isVarTypeStrL

      @ThreadSafe public static boolean isVarTypeStrL(int var)
      Test if a variable is of type strL.
      Parameters:
      var - The index of the variable to test.
      Returns:
      True if the variable is of type strL.
    • makeVarName

      @ThreadSafe public static String makeVarName(String s, boolean retainCase)
      Attempt to form a valid variable name from a string.
      Parameters:
      s - Source string.
      retainCase - If set, the case will not be converted to lowercase.
      Returns:
      The new variable name. Returns null if a valid name was not created.
    • mapParsedVarIndex

      @ThreadSafe public static int mapParsedVarIndex(int var)
      Map the variable index from the parsed varlist. For example, if Java was invoked with three variables, loop over 1, 2, and 3. For each iteration, use this method to translate 1, 2, and 3 to the correct variable index within the dataset.
      Parameters:
      var - Parsed variable index.
      Returns:
      The actual variable index in the dataset.
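
The translation described above can be sketched as follows. ParsedVarlist and resolve are hypothetical names; the mapper argument stands in for Data::mapParsedVarIndex, and parsedCount for the value of Data.getParsedVarCount(), so the loop can be run without a Stata session.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical helper (not part of the SFI): translate parsed varlist
// positions 1..parsedCount into dataset variable indices.
class ParsedVarlist {
    /**
     * Returns an array whose element v - 1 holds the dataset index of the
     * vth parsed variable, as reported by mapper.
     */
    public static int[] resolve(int parsedCount, IntUnaryOperator mapper) {
        int[] indices = new int[Math.max(parsedCount, 0)];
        for (int v = 1; v <= parsedCount; v++) {
            indices[v - 1] = mapper.applyAsInt(v);
        }
        return indices;
    }
}
```

Inside Stata, a call might look like ParsedVarlist.resolve(Data.getParsedVarCount(), Data::mapParsedVarIndex).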
    • readBytes

      @Synchronized public static int readBytes(StrLConnector sc, byte[] b) throws IOException
      Read a sequence of bytes from a strL.
      Parameters:
      sc - The StrLConnector representing a strL.
      b - The buffer into which the data are read.
      Returns:
      The total number of bytes read into the buffer, or -1 if there are no more data because the end has been reached. May return a negative Stata return code if an error occurs.
      Throws:
      IOException - If an error occurs.
    • readBytes

      @Synchronized public static int readBytes(StrLConnector sc, byte[] b, int off, int len) throws IOException
      Read a sequence of bytes from a strL.
      Parameters:
      sc - The StrLConnector representing a strL.
      b - The buffer into which the data are read.
      off - The start offset in the destination array b.
      len - The maximum number of bytes to read.
      Returns:
      The total number of bytes read into the buffer, or -1 if there are no more data because the end has been reached. May return a negative Stata return code if an error occurs.
      Throws:
      IOException - If an error occurs.
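
A typical way to consume a strL with this method is to read fixed-size chunks until -1 is returned. The sketch below makes several assumptions that the text above does not confirm: negative results other than -1 are treated as Stata error codes, and a result of 0 is treated as a reason to stop. StrLReadLoop, ChunkReader, and readAll are hypothetical names; inside Stata the reader would be (b, off, len) -> Data.readBytes(sc, b, off, len) for an existing StrLConnector sc.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical helper (not part of the SFI): drain a chunked byte source.
class StrLReadLoop {
    /** Stand-in for (b, off, len) -> Data.readBytes(sc, b, off, len). */
    @FunctionalInterface
    public interface ChunkReader {
        int read(byte[] b, int off, int len) throws IOException;
    }

    /** Read chunks of at most chunkSize bytes until the reader reports the end. */
    public static byte[] readAll(ChunkReader reader, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[chunkSize];
        while (true) {
            int n = reader.read(buf, 0, buf.length);
            if (n == -1) {
                break; // end of data
            }
            if (n < 0) {
                // Assumption: any other negative value is an error code.
                throw new IOException("read failed with code " + n);
            }
            if (n == 0) {
                break; // Assumption: stop defensively rather than spin.
            }
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```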
    • renameVar

      @Synchronized public static int renameVar(int var, String newname)
      Rename a Stata variable.
      Parameters:
      var - Index of the variable to rename.
      newname - New variable name.
      Returns:
      Return code from Stata; 0 if successful.
    • setObsTotal

      @Synchronized public static int setObsTotal(long obs)
      Set the number of observations in the current Stata dataset.
      Parameters:
      obs - The number of observations to set.
      Returns:
      Return code from Stata; 0 if successful.
    • setVarFormat

      @Synchronized public static int setVarFormat(int var, String format)
      Set the format for a Stata variable.
      Parameters:
      var - Index of the variable to format.
      format - New format.
      Returns:
      Return code from Stata; 0 if successful.
    • setVarLabel

      @Synchronized public static int setVarLabel(int var, String label)
      Set the label for a Stata variable.
      Parameters:
      var - Index of the variable to label.
      label - New label.
      Returns:
      Return code from Stata; 0 if successful.
    • storeBytes

      @Synchronized public static int storeBytes(StrLConnector sc, byte[] bytes, boolean binary)
      Store a byte buffer to a strL. You do not need to call allocateStrL before using this method.
      Parameters:
      sc - The StrLConnector representing a strL.
      bytes - Bytes to store.
      binary - Mark the data as binary.
      Returns:
      Return code from Stata; 0 if successful.
    • storeNum

      @Synchronized public static int storeNum(int var, long obs, double value)
      Store a numeric value in the current Stata dataset. Variable-type promotion happens automatically if the value you are storing is larger than what the variable can currently store.
      Parameters:
      var - Variable to access.
      obs - Observation to access.
      value - Value to store.
      Returns:
      Return code from Stata; 0 if successful.
    • storeNumFast

      @ThreadSafe public static int storeNumFast(int var, long obs, double value)
      Store a numeric value in the current Stata dataset. This method does not perform variable-type promotion and does not update the modified state of the data. To mark the dataset as changed, you should make a single call to updateModified().
      Parameters:
      var - Variable to access.
      obs - Observation to access.
      value - Value to store.
      Returns:
      Return code from Stata; 0 if successful.
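
The pattern described above, many fast stores followed by a single updateModified() call, can be sketched as follows. BatchStore, NumStore, and storeAll are hypothetical names; the store argument stands in for Data::storeNumFast and markModified for Data::updateModified, so the control flow can be checked outside Stata.

```java
// Hypothetical helper (not part of the SFI): store a run of values and
// mark the dataset as modified once, after the loop.
class BatchStore {
    /** Stand-in for Data::storeNumFast. */
    @FunctionalInterface
    public interface NumStore {
        int store(int varIndex, long obs, double value);
    }

    /**
     * Store values[i] at observation firstObs + i. Stops at the first
     * nonzero return code and returns it; returns 0 on success.
     * markModified runs once if at least one store succeeded.
     */
    public static int storeAll(int varIndex, long firstObs, double[] values,
                               NumStore store, Runnable markModified) {
        int rc = 0;
        boolean wrote = false;
        for (int i = 0; i < values.length; i++) {
            rc = store.store(varIndex, firstObs + i, values[i]);
            if (rc != 0) {
                break;
            }
            wrote = true;
        }
        if (wrote) {
            markModified.run(); // single call, outside the loop
        }
        return rc;
    }
}
```

Inside Stata, a call might look like BatchStore.storeAll(varIndex, 1, values, Data::storeNumFast, Data::updateModified).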
    • storeStr

      @Synchronized public static int storeStr(int var, long obs, String value)
      Store a string value in the current Stata dataset; this method can be used to store str or strL data types. Variable-type promotion happens automatically if the string you are storing is longer than what the variable can currently store.
      Parameters:
      var - Variable to access.
      obs - Observation to access.
      value - Value to store.
      Returns:
      Return code from Stata; 0 if successful.
    • storeStrf

      @Synchronized public static int storeStrf(int var, long obs, String value)
      Store a string value in the current Stata dataset; this method can be used to store str data types. Variable-type promotion happens automatically if the string you are storing is longer than what the variable can currently store. This method will not promote the type to a strL. If the string is longer than the maximum length of a str, then the string will be truncated.
      Parameters:
      var - Variable to access.
      obs - Observation to access.
      value - Value to store.
      Returns:
      Return code from Stata; 0 if successful.
    • storeStrfFast

      @ThreadSafe public static int storeStrfFast(int var, long obs, String value)
      Store a string value in the current Stata dataset; this method can be used to store str data types. This method does not perform variable-type promotion and does not update the modified state of the data. To mark the dataset as changed, you should make a single call to updateModified(). If the string is longer than the current storage length, then the string will be truncated.
      Parameters:
      var - Variable to access.
      obs - Observation to access.
      value - Value to store.
      Returns:
      Return code from Stata; 0 if successful.
    • updateModified

      @Synchronized public static void updateModified()
      Inform Stata that its data has been modified. Most methods automatically invoke this function as needed. Avoid calling this method from within a loop.
    • writeBytes

      @Synchronized public static int writeBytes(StrLConnector sc, byte[] b)
      Write a byte buffer to a strL; the strL must be allocated using allocateStrL before calling this method. The buffer size may be smaller than the allocation size for the strL so that calling this method multiple times will write the data in chunks. The current position of each write will be automatically maintained. Writing beyond the allocation size is not permitted.
      Parameters:
      sc - The StrLConnector representing a strL.
      b - The buffer holding the data to store.
      Returns:
      Return code from Stata; 0 if successful.
    • writeBytes

      @Synchronized public static int writeBytes(StrLConnector sc, byte[] b, int off, int len)
      Write len bytes from the specified byte buffer starting at offset off to a strL; the strL must be allocated using allocateStrL before calling this method. The buffer size may be smaller than the allocation size for the strL so that calling this method multiple times will write the data in chunks. The current position of each write will be automatically maintained. Writing beyond the allocation size is not permitted.
      Parameters:
      sc - The StrLConnector representing a strL.
      b - The buffer holding the data to store.
      off - The offset into the buffer.
      len - The number of bytes to write.
      Returns:
      Return code from Stata; 0 if successful.
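
Writing a large buffer in chunks, as described above, amounts to stepping an offset through the array and stopping at the first nonzero return code. The following is a minimal sketch under stated assumptions: StrLWriteLoop, ChunkWriter, and writeInChunks are hypothetical names, and the writer stands in for (b, off, len) -> Data.writeBytes(sc, b, off, len) on a StrLConnector sc that has already been allocated with allocateStrL for data.length bytes.

```java
// Hypothetical helper (not part of the SFI): write a buffer in fixed-size
// chunks through a writer that returns 0 on success.
class StrLWriteLoop {
    /** Stand-in for (b, off, len) -> Data.writeBytes(sc, b, off, len). */
    @FunctionalInterface
    public interface ChunkWriter {
        int write(byte[] b, int off, int len);
    }

    /**
     * Write data in chunks of at most chunkSize bytes. Returns 0 when all
     * chunks were written, otherwise the first nonzero return code.
     */
    public static int writeInChunks(byte[] data, int chunkSize, ChunkWriter writer) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int off = 0;
        while (off < data.length) {
            int len = Math.min(chunkSize, data.length - off);
            int rc = writer.write(data, off, len);
            if (rc != 0) {
                return rc; // stop at the first failure
            }
            off += len;
        }
        return 0;
    }
}
```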