Package com.stata.sfi

Class Data


  • public final class Data
    extends Object
    This class provides access to the current Stata dataset. All variable and observation numbering begins at 1 unless otherwise stated.

    Example:

    This example shows how to handle a Stata varlist along with if and in to restrict observations. The example calculates summary statistics and displays a table similar to Stata's summarize command.

    
    import com.stata.sfi.*;

    public class Examples
    {
    	// call directly from integrated environment using java or java:
    	public static void summarize() {
    		int rc = summarize(null);
    		SFIToolkit.setRC(rc);
    	}
    	
    	// call directly from a plugin
    	public static int summarize(String args[]) {
    		int parsedVariables = Data.getParsedVarCount();
    		long obsStart = Data.getObsParsedIn1();
    		long obsEnd = Data.getObsParsedIn2();
    
    		if (parsedVariables <= 0) {
    			SFIToolkit.errorln("varlist required");
    			return 100;
    		}
    
    		// display the header
    		SFIToolkit.displayln("\n" +   "    " +
    				"Variable {c |}        Obs        Mean    Std. Dev.       Min        Max");
    
    		for (int v = 1; v <= parsedVariables; v++) {
    			double sum = 0;
    			double max = Double.NEGATIVE_INFINITY;
    			double min = Double.POSITIVE_INFINITY;
    			double mean = 0;
    			double stddev = 0;
    			long count = 0;
    
    			// get the dataset variable index for the vth parsed variable
    			int varIndex = Data.mapParsedVarIndex(v);
    
    			if (!Data.isVarTypeStr(varIndex)) {
    
    				// first pass: accumulate count, sum, min, and max, then calculate mean
    				for (long obs = obsStart; obs <= obsEnd; obs++) {
    					if (! Data.isParsedIfTrue(obs)) {
    						continue;
    					}
    					double value = Data.getNum(varIndex, obs);
    					if (Missing.isMissing(value)) {
    						continue ;
    					}
    					max = Math.max(max, value);
    					min = Math.min(min, value);
    					sum += value;
    					count++;
    				}
    				mean = sum / count;
    
    				// calculate std. dev.
    				double d2sum = 0;
    				for (long obs = obsStart; obs <= obsEnd; obs++) {
    					if (! Data.isParsedIfTrue(obs)) {
    						continue;
    					}
    					double value = Data.getNum(varIndex, obs);
    					if (Missing.isMissing(value)) {
    						continue ;
    					}
    					d2sum += Math.pow(value-mean,2);
    				}
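    				// the sample std. dev. is undefined for fewer than two values; report
    				// Stata's system missing value instead of NaN (this assumes that
    				// Missing.getValue() returns the system missing value)
    				stddev = (count > 1) ? Math.sqrt(d2sum/(count-1)) : Missing.getValue();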
    			}
    
    			// write out the results
    			if (v % 5 == 1) {
    				SFIToolkit.displayln("{hline 13}{c +}{hline 57}");
    			}
    			String out = String.format("%12s {c |}%11s", 
    					Data.getVarName(varIndex),
    					SFIToolkit.formatValue(count, "%11.0fc"));
    			if (count>0) {
    				out += String.format("   %9s   %9s  %9s  %9s",
    						SFIToolkit.formatValue(mean,  "%9.0g"),
    						SFIToolkit.formatValue(stddev,"%9.0g"),
    						SFIToolkit.formatValue(min,   "%9.0g"),
    						SFIToolkit.formatValue(max,   "%9.0g"));
    			}
    			SFIToolkit.displayln(out);
    			SFIToolkit.pollnow();
    			// outer loop; poll each time to update display
    			// avoid polling too often; use pollstd() when possible
    		}
    		return 0;
    	}
    }
      
    From Stata...
    
    . sysuse auto, clear
    (1978 Automobile Data)
    
    // call summarize from integrated Java environment; assumes that Examples
    // class is defined in Stata do-file that has already been run
    . java rep if mpg > 22 in 12/50: Examples.summarize()
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+----------------------------------------------------------
           rep78 |          8        3.25     1.38873          1          5
    
    
    // call summarize as a Java plugin; assumes that Examples class
    // is compiled and archived to examples.jar
    . javacall Examples summarize rep if mpg > 22 in 12/50, jar(examples.jar)
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+----------------------------------------------------------
           rep78 |          8        3.25     1.38873          1          5
    
    // compare with built-in summarize command
    . summarize rep if mpg > 22 in 12/50
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
           rep78 |          8        3.25     1.38873          1          5
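
The example above reads each variable twice: once for the mean and once for the standard deviation. As a hedged alternative, the sketch below uses Welford's single-pass update, which is a different technique from the one in the example. It runs on a plain array so that it can be tried outside Stata; the class name `OnePassStats` and the `isMissing` predicate are illustrative assumptions. Inside Stata the values would come from `Data.getNum` and the predicate would be `v -> Missing.isMissing(v)`.

```java
import java.util.function.DoublePredicate;

/** Single-pass mean and sample standard deviation (Welford's method). */
final class OnePassStats {

    /**
     * Returns {count, mean, stddev}. The mean is NaN when no value remains
     * after skipping the ones flagged by isMissing, and the standard
     * deviation is NaN when fewer than two values remain.
     */
    static double[] summarize(double[] values, DoublePredicate isMissing) {
        long count = 0;
        double mean = 0.0;
        double m2 = 0.0; // running sum of squared deviations from the mean

        for (double value : values) {
            if (isMissing.test(value)) {
                continue; // skip missing values, as the two-pass example does
            }
            count++;
            double delta = value - mean;
            mean += delta / count;
            m2 += delta * (value - mean);
        }

        double sd = (count > 1) ? Math.sqrt(m2 / (count - 1)) : Double.NaN;
        return new double[] { count, (count > 0) ? mean : Double.NaN, sd };
    }

    public static void main(String[] args) {
        // NaN stands in for a Stata missing value in this stand-alone demo.
        double[] r = summarize(new double[] { 1, 2, Double.NaN, 3, 4, 5 },
                v -> Double.isNaN(v));
        System.out.println("count=" + r[0] + " mean=" + r[1] + " sd=" + r[2]);
    }
}
```

The single pass halves the number of reads from the dataset, at the cost of slightly less obvious arithmetic.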
    
    
    • Method Summary

      Modifier and Type Method Description
      static int addVarByte​(String name)
      Add a variable of type byte to the current Stata dataset.
      static int addVarDouble​(String name)
      Add a variable of type double to the current Stata dataset.
      static int addVarFloat​(String name)
      Add a variable of type float to the current Stata dataset.
      static int addVarInt​(String name)
      Add a variable of type int to the current Stata dataset.
      static int addVarLong​(String name)
      Add a variable of type long to the current Stata dataset.
      static int addVarStr​(String name, int length)
      Add a variable of type str to the current Stata dataset.
      static int addVarStrL​(String name)
      Add a variable of type strL to the current Stata dataset.
      static int allocateStrL​(StrLConnector sc, long size)
      Allocate a strL so that a buffer can be stored using writeBytes; the contents of the strL will not be initialized.
      static int allocateStrL​(StrLConnector sc, long size, boolean binary)
      Allocate a strL so that a buffer can be stored using writeBytes; the contents of the strL will not be initialized.
      static int dropVar​(int var)
      Drop the variable at the specified variable index.
      static int getBestType​(double value)
      Get the best numeric data type for the specified value.
      static String getFormattedValue​(int var, long obs, boolean bValueLabel)
      Read a value from the current Stata dataset, applying its display format.
      static int getMaxStrLength()
      Get the maximum length of a Stata string variable of type str.
      static int getMaxVars()
      Get the maximum number of variables Stata currently allows.
      static double getNum​(int var, long obs)
      Read a numeric value from the current Stata dataset.
      static long getObsParsedIn1()
      Get the first observation in the range specified by the in qualifier when Java was invoked.
      static long getObsParsedIn2()
      Get the last observation in the range specified by the in qualifier when Java was invoked.
      static long getObsTotal()
      Get the number of observations in the current Stata dataset.
      static int getParsedVarCount()
      Get the number of variables specified when Java was invoked.
      static Double getRealOfString​(String s)
      Get the double representation of a String using Stata's real() function.
      static String getStr​(int var, long obs)
      Read a string value from the current Stata dataset; this method can be used to read str or strL data types.
      static String getStrf​(int var, long obs)
      Read a string value from the current Stata dataset; this method can be used to read str data types.
      static int getStrVarWidth​(int var)
      Get the width of a variable of type str.
      static int getType​(int var)
      Get the data type for the specified variable.
      static int getVarCount()
      Get the number of variables in the current Stata dataset.
      static String getVarFormat​(int var)
      Get the format for a Stata variable.
      static int getVarIndex​(String varname)
      Look up the variable index for the specified name in the current Stata dataset.
      static String getVarLabel​(int var)
      Get the label for a Stata variable.
      static String getVarName​(int var)
      Get the variable name at a given variable index.
      static boolean isParsedIfTrue​(long obs)
      Determine whether the if expression specified when Java was invoked is true for an observation.
      static boolean isVarlistSpecified()
      Determine if a varlist was specified when Java was invoked.
      static boolean isVarTypeStr​(int var)
      Test if a variable is of type str.
      static boolean isVarTypeString​(int var)
      Test if a variable's type is string.
      static boolean isVarTypeStrL​(int var)
      Test if a variable is of type strL.
      static String makeVarName​(String s, boolean retainCase)
      Attempt to form a valid variable name from a string.
      static int mapParsedVarIndex​(int var)
      Map the variable index from the parsed varlist.
      static int readBytes​(StrLConnector sc, byte[] b)
      Read a sequence of bytes from a strL.
      static int readBytes​(StrLConnector sc, byte[] b, int off, int len)
      Read a sequence of bytes from a strL.
      static int renameVar​(int var, String newname)
      Rename a Stata variable.
      static int setObsTotal​(long obs)
      Set the number of observations in the current Stata dataset.
      static int setVarFormat​(int var, String format)
      Set the format for a Stata variable.
      static int setVarLabel​(int var, String label)
      Set the label for a Stata variable.
      static int storeBytes​(StrLConnector sc, byte[] bytes, boolean binary)
      Store a byte buffer to a strL.
      static int storeNum​(int var, long obs, double value)
      Store a numeric value in the current Stata dataset.
      static int storeNumFast​(int var, long obs, double value)
      Store a numeric value in the current Stata dataset.
      static int storeStr​(int var, long obs, String value)
      Store a string value in the current Stata dataset; this method can be used to store str or strL data types.
      static int storeStrf​(int var, long obs, String value)
      Store a string value in the current Stata dataset; this method can be used to store str data types.
      static int storeStrfFast​(int var, long obs, String value)
      Store a string value in the current Stata dataset; this method can be used to store str data types.
      static void updateModified()
      Inform Stata that its data has been modified.
      static int writeBytes​(StrLConnector sc, byte[] b)
      Write a byte buffer to a strL; the strL must be allocated using allocateStrL before calling this method.
      static int writeBytes​(StrLConnector sc, byte[] b, int off, int len)
      Write len bytes from the specified byte buffer starting at offset off to a strL; the strL must be allocated using allocateStrL before calling this method.
    • Method Detail

      • addVarByte

        @Synchronized
        public static int addVarByte​(String name)
        Add a variable of type byte to the current Stata dataset.
        Parameters:
        name - Name of the variable to be created.
        Returns:
        Return code from Stata; 0 if successful.
      • addVarDouble

        @Synchronized
        public static int addVarDouble​(String name)
        Add a variable of type double to the current Stata dataset.
        Parameters:
        name - Name of the variable to be created.
        Returns:
        Return code from Stata; 0 if successful.
      • addVarFloat

        @Synchronized
        public static int addVarFloat​(String name)
        Add a variable of type float to the current Stata dataset.
        Parameters:
        name - Name of the variable to be created.
        Returns:
        Return code from Stata; 0 if successful.
      • addVarInt

        @Synchronized
        public static int addVarInt​(String name)
        Add a variable of type int to the current Stata dataset.
        Parameters:
        name - Name of the variable to be created.
        Returns:
        Return code from Stata; 0 if successful.
      • addVarLong

        @Synchronized
        public static int addVarLong​(String name)
        Add a variable of type long to the current Stata dataset.
        Parameters:
        name - Name of the variable to be created.
        Returns:
        Return code from Stata; 0 if successful.
      • addVarStr

        @Synchronized
        public static int addVarStr​(String name,
                                    int length)
        Add a variable of type str to the current Stata dataset.
        Parameters:
        name - Name of the variable to be created.
        length - Initial size of the variable. If the length is greater than getMaxStrLength(), then a variable of type strL will be created.
        Returns:
        Return code from Stata; 0 if successful.
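
The length rule described for addVarStr can be sketched in plain Java. This is only an illustration of the documented rule, not SFI code: the class and method names are hypothetical, and the limit of 2045 used in the demo is an assumed value that should be obtained from getMaxStrLength() inside Stata.

```java
/** Illustrates which storage type a requested string length leads to. */
final class StrTypeSketch {

    /**
     * Returns "strL" when the requested length exceeds the maximum str
     * length, and "str" followed by the length otherwise. Lengths below 1
     * are not covered by the documentation and are not handled here.
     */
    static String resultingType(int length, int maxStrLength) {
        return (length > maxStrLength) ? "strL" : "str" + length;
    }

    public static void main(String[] args) {
        int maxStrLength = 2045; // assumed; use Data.getMaxStrLength() in Stata
        System.out.println(resultingType(20, maxStrLength));
        System.out.println(resultingType(5000, maxStrLength));
    }
}
```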
      • addVarStrL

        @Synchronized
        public static int addVarStrL​(String name)
        Add a variable of type strL to the current Stata dataset.
        Parameters:
        name - Name of the variable to be created.
        Returns:
        Return code from Stata; 0 if successful.
      • allocateStrL

        @Synchronized
        public static int allocateStrL​(StrLConnector sc,
                                       long size)
        Allocate a strL so that a buffer can be stored using writeBytes; the contents of the strL will not be initialized. By default, the data will be marked as binary.
        Parameters:
        sc - The StrLConnector representing a strL.
        size - The size in bytes.
        Returns:
        Return code from Stata; 0 if successful.
      • allocateStrL

        @Synchronized
        public static int allocateStrL​(StrLConnector sc,
                                       long size,
                                       boolean binary)
        Allocate a strL so that a buffer can be stored using writeBytes; the contents of the strL will not be initialized.
        Parameters:
        sc - The StrLConnector representing a strL.
        size - The size in bytes.
        binary - Mark the data as binary. If the data are not marked as binary, Stata expects them to be UTF-8 encoded. An alternative approach is to call storeStr, which handles the encoding automatically.
        Returns:
        Return code from Stata; 0 if successful.
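
Because size is given in bytes and non-binary data must be UTF-8 encoded, the number of bytes to allocate for a text value can differ from the number of characters in the String. A minimal sketch of that conversion, using only the Java standard library (the class and method names are hypothetical):

```java
import java.nio.charset.StandardCharsets;

/** Prepares text for a non-binary strL by encoding it as UTF-8. */
final class StrLSizeSketch {

    /** Encodes the text as UTF-8; the array length is the size in bytes. */
    static byte[] encodeForStrL(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "na\u00efve caf\u00e9"; // two non-ASCII characters
        byte[] bytes = encodeForStrL(text);
        // The byte count, not text.length(), is what allocateStrL would need.
        System.out.println(text.length() + " chars, " + bytes.length + " bytes");
    }
}
```

In Stata, the resulting array length would be passed as size to allocateStrL with binary set to false, and the array itself to writeBytes.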
      • dropVar

        @Synchronized
        public static int dropVar​(int var)
        Drop the variable at the specified variable index.
        Parameters:
        var - Variable to drop.
        Returns:
        Return code from Stata; 0 if successful.
      • getBestType

        @ThreadSafe
        public static int getBestType​(double value)
        Get the best numeric data type for the specified value.
        Parameters:
        value - The value to test.
        Returns:
        The field value representing the data type; may be TYPE_BYTE, TYPE_INT, TYPE_LONG, TYPE_FLOAT, or TYPE_DOUBLE.
      • getFormattedValue

        @Synchronized
        public static String getFormattedValue​(int var,
                                               long obs,
                                               boolean bValueLabel)
        Read a value from the current Stata dataset, applying its display format.
        Parameters:
        var - Variable to access.
        obs - Observation to access.
        bValueLabel - Use the value label when available.
        Returns:
        The formatted value as a String.
      • getMaxStrLength

        @ThreadSafe
        public static int getMaxStrLength()
        Get the maximum length of a Stata string variable of type str.
        Returns:
        The maximum length.
      • getMaxVars

        @ThreadSafe
        public static int getMaxVars()
        Get the maximum number of variables Stata currently allows.
        Returns:
        The maximum number of variables.
      • getNum

        @ThreadSafe
        public static double getNum​(int var,
                                    long obs)
        Read a numeric value from the current Stata dataset.
        Parameters:
        var - Variable to access.
        obs - Observation to access.
        Returns:
        The value.
      • getObsParsedIn1

        @Synchronized
        public static long getObsParsedIn1()
        Get the first observation in the range specified by the in qualifier when Java was invoked. If in was not specified, the range covers the entire dataset.
        Returns:
        The first observation's number.
      • getObsParsedIn2

        @Synchronized
        public static long getObsParsedIn2()
        Get the last observation in the range specified by the in qualifier when Java was invoked. If in was not specified, the range covers the entire dataset.
        Returns:
        The last observation's number.
      • getObsTotal

        @ThreadSafe
        public static long getObsTotal()
        Get the number of observations in the current Stata dataset.
        Returns:
        The number of observations.
      • getParsedVarCount

        @Synchronized
        public static int getParsedVarCount()
        Get the number of variables specified when Java was invoked. If a varlist was not specified, then all the variables are implied.
        Returns:
        The number of variables.
      • getRealOfString

        @ThreadSafe
        public static Double getRealOfString​(String s)
        Get the double representation of a String using Stata's real() function.
        Parameters:
        s - The string to convert.
        Returns:
        The numeric value. Returns null if an error occurs.
      • getStr

        @Synchronized
        public static String getStr​(int var,
                                    long obs)
        Read a string value from the current Stata dataset; this method can be used to read str or strL data types.
        Parameters:
        var - Variable to access.
        obs - Observation to access.
        Returns:
        The String. Returns null if an error occurs.
      • getStrf

        @ThreadSafe
        public static String getStrf​(int var,
                                     long obs)
        Read a string value from the current Stata dataset; this method can be used to read str data types.
        Parameters:
        var - Variable to access.
        obs - Observation to access.
        Returns:
        The String. Returns null if an error occurs.
      • getStrVarWidth

        @ThreadSafe
        public static int getStrVarWidth​(int var)
        Get the width of a variable of type str.
        Parameters:
        var - The index of the variable to test.
        Returns:
        The width if the variable is of type str.
      • getVarCount

        @ThreadSafe
        public static int getVarCount()
        Get the number of variables in the current Stata dataset.
        Returns:
        The number of variables.
      • getVarFormat

        @ThreadSafe
        public static String getVarFormat​(int var)
        Get the format for a Stata variable.
        Parameters:
        var - Index of the variable to look up.
        Returns:
        The variable's format.
      • getVarIndex

        @Synchronized
        public static int getVarIndex​(String varname)
        Look up the variable index for the specified name in the current Stata dataset.
        Parameters:
        varname - Name of the variable.
        Returns:
        The variable index. If the variable does not exist, 0 is returned.

        Note: When Stata's version control is set to less than 15.0 and the variable does not exist, the number of variables plus one is returned.
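
Since the "not found" result depends on version control (0, or the number of variables plus one), a range check covers both conventions. The helper below is a hypothetical sketch in plain Java; in Stata, index would come from getVarIndex and varCount from getVarCount().

```java
/** Version-independent check of a variable index returned by a lookup. */
final class VarIndexSketch {

    /**
     * Returns true only for indices 1..varCount. Both documented
     * "not found" results (0 and varCount + 1) fall outside that range.
     */
    static boolean refersToExistingVar(int index, int varCount) {
        return index >= 1 && index <= varCount;
    }

    public static void main(String[] args) {
        int varCount = 12; // illustrative dataset size
        System.out.println(refersToExistingVar(0, varCount));
        System.out.println(refersToExistingVar(13, varCount));
        System.out.println(refersToExistingVar(5, varCount));
    }
}
```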
      • getVarLabel

        @ThreadSafe
        public static String getVarLabel​(int var)
        Get the label for a Stata variable.
        Parameters:
        var - Index of the variable to look up.
        Returns:
        The variable's label.
      • getVarName

        @ThreadSafe
        public static String getVarName​(int var)
        Get the variable name at a given variable index.
        Parameters:
        var - Index of the variable to look up.
        Returns:
        The name of the Stata variable.
      • isParsedIfTrue

        @ThreadSafe
        public static boolean isParsedIfTrue​(long obs)
        Determine whether the if expression specified when Java was invoked is true for an observation.
        Parameters:
        obs - The observation to test.
        Returns:
        True if the if expression evaluates to true for the specified observation. If no if expression was specified when Java was invoked, this method returns true.
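
The usual pattern combines the in range with the if test: loop from the first to the last parsed observation and skip those for which the if expression is false. The sketch below shows that loop in plain Java; the LongPredicate is a stand-in for Data::isParsedIfTrue, and the class and method names are hypothetical.

```java
import java.util.function.LongPredicate;

/** Counts the observations selected by an in range and an if test. */
final class ObsSelectionSketch {

    /**
     * Walks observations first..last (inclusive, 1-based) and counts those
     * for which ifTrue holds. In Stata the arguments would be
     * Data.getObsParsedIn1(), Data.getObsParsedIn2(), Data::isParsedIfTrue.
     */
    static long countSelected(long first, long last, LongPredicate ifTrue) {
        long selected = 0;
        for (long obs = first; obs <= last; obs++) {
            if (ifTrue.test(obs)) {
                selected++;
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Stand-in for "in 12/50" with an if test that keeps even observations.
        System.out.println(countSelected(12, 50, obs -> obs % 2 == 0));
    }
}
```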
      • isVarlistSpecified

        @Synchronized
        public static boolean isVarlistSpecified()
        Determine if a varlist was specified when Java was invoked.
        Returns:
        True if a varlist was specified when Java was invoked.
      • isVarTypeStr

        @ThreadSafe
        public static boolean isVarTypeStr​(int var)
        Test if a variable is of type str.
        Parameters:
        var - The index of the variable to test.
        Returns:
        True if the variable is of type str.
      • isVarTypeString

        @ThreadSafe
        public static boolean isVarTypeString​(int var)
        Test if a variable's type is string.
        Parameters:
        var - The index of the variable to test.
        Returns:
        True if the variable is a string variable of either type str or type strL.
      • isVarTypeStrL

        @ThreadSafe
        public static boolean isVarTypeStrL​(int var)
        Test if a variable is of type strL.
        Parameters:
        var - The index of the variable to test.
        Returns:
        True if the variable is of type strL.
      • makeVarName

        @ThreadSafe
        public static String makeVarName​(String s,
                                         boolean retainCase)
        Attempt to form a valid variable name from a string.
        Parameters:
        s - Source string.
        retainCase - If true, the name is not converted to lowercase.
        Returns:
        The new variable name. Returns null if a valid name was not created.
      • mapParsedVarIndex

        @ThreadSafe
        public static int mapParsedVarIndex​(int var)
        Map the variable index from the parsed varlist. For example, if Java was invoked with three variables, loop over 1, 2, and 3. For each iteration, use this method to translate 1, 2, and 3 to the correct variable index within the dataset.
        Parameters:
        var - Parsed variable index.
        Returns:
        The actual variable index in the dataset.
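
The translation loop described above can be sketched as follows. To keep the block runnable outside Stata, the mapping is passed in as an IntUnaryOperator, which Data::mapParsedVarIndex would satisfy; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.function.IntUnaryOperator;

/** Translates parsed varlist positions into dataset variable indices. */
final class ParsedVarlistSketch {

    /**
     * Loops over parsed positions 1..parsedCount and maps each one to its
     * dataset index. In Stata the arguments would be
     * Data.getParsedVarCount() and Data::mapParsedVarIndex.
     */
    static int[] datasetIndices(int parsedCount, IntUnaryOperator mapParsedVarIndex) {
        int[] indices = new int[parsedCount];
        for (int v = 1; v <= parsedCount; v++) {
            indices[v - 1] = mapParsedVarIndex.applyAsInt(v);
        }
        return indices;
    }

    public static void main(String[] args) {
        // Illustrative mapping: a varlist naming dataset variables 4, 2, and 9.
        int[] varlist = { 4, 2, 9 };
        int[] mapped = datasetIndices(varlist.length, v -> varlist[v - 1]);
        System.out.println(Arrays.toString(mapped));
    }
}
```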
      • readBytes

        @Synchronized
        public static int readBytes​(StrLConnector sc,
                                    byte[] b)
                             throws IOException
        Read a sequence of bytes from a strL.
        Parameters:
        sc - The StrLConnector representing a strL.
        b - The buffer into which the data are read.
        Returns:
        The total number of bytes read into the buffer, or -1 if there are no more data because the end has been reached. May return a negative Stata return code if an error occurs.
        Throws:
        IOException - If an error occurs.
      • readBytes

        @Synchronized
        public static int readBytes​(StrLConnector sc,
                                    byte[] b,
                                    int off,
                                    int len)
                             throws IOException
        Read a sequence of bytes from a strL.
        Parameters:
        sc - The StrLConnector representing a strL.
        b - The buffer into which the data are read.
        off - The start offset in the destination array b.
        len - The maximum number of bytes read.
        Returns:
        The total number of bytes read into the buffer, or -1 if there are no more data because the end has been reached. May return a negative Stata return code if an error occurs.
        Throws:
        IOException - If an error occurs.
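
Because readBytes returns -1 at the end of the data, a strL can be read in fixed-size chunks much like a java.io.InputStream. The sketch below is an assumption-laden illustration: the ChunkReader interface is a hypothetical stand-in for `(b, off, len) -> Data.readBytes(sc, b, off, len)`, and any negative result other than -1 is treated as an error code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

/** Reads all bytes from a chunked source that signals the end with -1. */
final class ChunkedReadSketch {

    /** Hypothetical stand-in for a readBytes call bound to one strL. */
    interface ChunkReader {
        int read(byte[] b, int off, int len) throws IOException;
    }

    static byte[] readAll(ChunkReader reader, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[chunkSize];
        while (true) {
            int n = reader.read(buffer, 0, buffer.length);
            if (n == -1) {
                break; // end of the data
            }
            if (n < 0) {
                // assumed: other negative values are error codes
                throw new IOException("read failed with code " + n);
            }
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // A ByteArrayInputStream follows the same -1 convention.
        ByteArrayInputStream in = new ByteArrayInputStream(new byte[10]);
        System.out.println(readAll(in::read, 4).length);
    }
}
```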
      • renameVar

        @Synchronized
        public static int renameVar​(int var,
                                    String newname)
        Rename a Stata variable.
        Parameters:
        var - Index of the variable to rename.
        newname - New variable name.
        Returns:
        Return code from Stata; 0 if successful.
      • setObsTotal

        @Synchronized
        public static int setObsTotal​(long obs)
        Set the number of observations in the current Stata dataset.
        Parameters:
        obs - The number of observations to set.
        Returns:
        Return code from Stata; 0 if successful.
      • setVarFormat

        @Synchronized
        public static int setVarFormat​(int var,
                                       String format)
        Set the format for a Stata variable.
        Parameters:
        var - Index of the variable to format.
        format - New format.
        Returns:
        Return code from Stata; 0 if successful.
      • setVarLabel

        @Synchronized
        public static int setVarLabel​(int var,
                                      String label)
        Set the label for a Stata variable.
        Parameters:
        var - Index of the variable to label.
        label - New label.
        Returns:
        Return code from Stata; 0 if successful.
      • storeBytes

        @Synchronized
        public static int storeBytes​(StrLConnector sc,
                                     byte[] bytes,
                                     boolean binary)
        Store a byte buffer to a strL. You do not need to call allocateStrL before using this method.
        Parameters:
        sc - The StrLConnector representing a strL.
        bytes - Bytes to store.
        binary - Mark the data as binary.
        Returns:
        Return code from Stata; 0 if successful.
      • storeNum

        @Synchronized
        public static int storeNum​(int var,
                                   long obs,
                                   double value)
        Store a numeric value in the current Stata dataset. Variable-type promotion happens automatically if the value you are storing is larger than what the variable can currently store.
        Parameters:
        var - Variable to access.
        obs - Observation to access.
        value - Value to store.
        Returns:
        Return code from Stata; 0 if successful.
      • storeNumFast

        @ThreadSafe
        public static int storeNumFast​(int var,
                                       long obs,
                                       double value)
        Store a numeric value in the current Stata dataset. This method does not perform variable-type promotion and does not update the modified state of the data. To mark the dataset as changed, you should make a single call to updateModified().
        Parameters:
        var - Variable to access.
        obs - Observation to access.
        value - Value to store.
        Returns:
        Return code from Stata; 0 if successful.
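
A typical use of storeNumFast is to fill many observations in a loop and then mark the dataset as changed once. The sketch below shows that shape in plain Java; NumStore is a hypothetical stand-in for Data::storeNumFast and the Runnable for Data::updateModified, so the block runs without Stata.

```java
/** Stores a column of values, then flags the data as modified once. */
final class FastStoreSketch {

    /** Hypothetical stand-in with the same shape as storeNumFast. */
    interface NumStore {
        int store(int var, long obs, double value);
    }

    /**
     * Stores values[i] in observation i + 1 of the given variable and
     * returns the first nonzero return code, or 0 if every store worked.
     */
    static int fillColumn(int var, double[] values, NumStore storeFast,
                          Runnable markModified) {
        int rc = 0;
        for (int i = 0; i < values.length; i++) {
            rc = storeFast.store(var, i + 1L, values[i]); // 1-based observations
            if (rc != 0) {
                break;
            }
        }
        if (values.length > 0) {
            markModified.run(); // one call, outside the loop
        }
        return rc;
    }

    public static void main(String[] args) {
        double[] column = new double[3];
        int rc = fillColumn(1, new double[] { 1.5, 2.5, 3.5 },
                (var, obs, value) -> { column[(int) obs - 1] = value; return 0; },
                () -> System.out.println("data marked as modified"));
        System.out.println("rc=" + rc);
    }
}
```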
      • storeStr

        @Synchronized
        public static int storeStr​(int var,
                                   long obs,
                                   String value)
        Store a string value in the current Stata dataset; this method can be used to store str or strL data types. Variable-type promotion happens automatically if the string you are storing is longer than what the variable can currently store.
        Parameters:
        var - Variable to access.
        obs - Observation to access.
        value - Value to store.
        Returns:
        Return code from Stata; 0 if successful.
      • storeStrf

        @Synchronized
        public static int storeStrf​(int var,
                                    long obs,
                                    String value)
        Store a string value in the current Stata dataset; this method can be used to store str data types. Variable-type promotion happens automatically if the string you are storing is longer than what the variable can currently store. This method will not promote the type to a strL. If the string is longer than the maximum length of a str, then the string will be truncated.
        Parameters:
        var - Variable to access.
        obs - Observation to access.
        value - Value to store.
        Returns:
        Return code from Stata; 0 if successful.
      • storeStrfFast

        @ThreadSafe
        public static int storeStrfFast​(int var,
                                        long obs,
                                        String value)
        Store a string value in the current Stata dataset; this method can be used to store str data types. This method does not perform variable-type promotion and does not update the modified state of the data. To mark the dataset as changed, you should make a single call to updateModified(). If the string is longer than the current storage length, then the string will be truncated.
        Parameters:
        var - Variable to access.
        obs - Observation to access.
        value - Value to store.
        Returns:
        Return code from Stata; 0 if successful.
      • updateModified

        @Synchronized
        public static void updateModified()
        Inform Stata that its data has been modified. Most methods automatically invoke this method as needed. Avoid calling this method from within a loop.
      • writeBytes

        @Synchronized
        public static int writeBytes​(StrLConnector sc,
                                     byte[] b)
        Write a byte buffer to a strL; the strL must be allocated using allocateStrL before calling this method. The buffer size may be smaller than the allocation size for the strL so that calling this method multiple times will write the data in chunks. The current position of each write will be automatically maintained. Writing beyond the allocation size is not permitted.
        Parameters:
        sc - The StrLConnector representing a strL.
        b - The buffer holding the data to store.
        Returns:
        Return code from Stata; 0 if successful.
      • writeBytes

        @Synchronized
        public static int writeBytes​(StrLConnector sc,
                                     byte[] b,
                                     int off,
                                     int len)
        Write len bytes from the specified byte buffer starting at offset off to a strL; the strL must be allocated using allocateStrL before calling this method. The buffer size may be smaller than the allocation size for the strL so that calling this method multiple times will write the data in chunks. The current position of each write will be automatically maintained. Writing beyond the allocation size is not permitted.
        Parameters:
        sc - The StrLConnector representing a strL.
        b - The buffer holding the data to store.
        off - The offset into the buffer.
        len - The number of bytes to write.
        Returns:
        Return code from Stata; 0 if successful.
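
The chunked-write behavior described for writeBytes can be sketched as a loop over offsets. ChunkWriter below is a hypothetical stand-in for `(b, off, len) -> Data.writeBytes(sc, b, off, len)`, and the sketch assumes that the strL was already allocated with at least data.length bytes.

```java
import java.io.ByteArrayOutputStream;

/** Writes a buffer to a pre-allocated target in fixed-size chunks. */
final class ChunkedWriteSketch {

    /** Hypothetical stand-in for a writeBytes call bound to one strL. */
    interface ChunkWriter {
        int write(byte[] b, int off, int len);
    }

    /** Returns 0 on success or the first nonzero return code. */
    static int writeInChunks(byte[] data, int chunkSize, ChunkWriter writer) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int off = 0;
        while (off < data.length) {
            int len = Math.min(chunkSize, data.length - off);
            int rc = writer.write(data, off, len);
            if (rc != 0) {
                return rc; // stop at the first failure
            }
            off += len; // the target tracks its own write position
        }
        return 0;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int rc = writeInChunks(new byte[10], 4,
                (b, off, len) -> { sink.write(b, off, len); return 0; });
        System.out.println("rc=" + rc + " bytes=" + sink.size());
    }
}
```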