Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

st: Stata 12 Announcement

From   "William Gould, StataCorp LP" <>
Subject   st: Stata 12 Announcement
Date   Sun, 26 Jun 2011 18:07:58 -0500

Following long tradition, we are informing Statalist first:

    Stata 12 begins shipping Monday, July 25.  

    Orders are now being accepted at

Below are some highlights.

Automatic memory management

    Automatic memory management means that you no longer have to 
    -set memory- and never again will you be told that there is no
    room because you set too little!  Stata automatically adjusts its
    memory usage up or down according to current requirements.

    The memory manager is tunable.  You can set a maximum if you wish.
    Old do-files can still -set memory-.  Stata merely responds, "-set
    memory- ignored".
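
    As a minimal sketch of that tuning, assuming the setting is named
    -max_memory- as it is in the shipped release (the 16g limit is an
    arbitrary illustrative value):

```stata
* Cap automatic memory growth at 16 GB for this session
* (16g is only an example; pick a limit that suits your machine)
set max_memory 16g

* Display the current memory settings
query memory

* Remove the cap again (. means no limit)
set max_memory .
```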

    We have tested the memory manager on systems with 1 TB (the largest 
    currently available), and it is designed to scale to even more
    memory.

Import Excel files, export PDFs, and new interface features

    Importing Excel files is easy.  And the new Import Preview Tool
    lets you see the file's contents and adjust import settings before
    you import it.

    You can now directly export PDFs of graphs and logs. 
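
    The command-line counterparts can be sketched as follows.  The file
    names are placeholders chosen for this example, and PDF output is
    assumed to be supported on your platform:

```stata
* Write a small Excel file from the auto dataset that ships with Stata,
* then read it back, treating the first row as variable names
sysuse auto, clear
export excel using "auto.xlsx", firstrow(variables) replace
import excel using "auto.xlsx", firstrow clear
describe

* Draw a graph and export it directly to PDF
scatter mpg weight
graph export "mpg_weight.pdf", replace

* Record a short SMCL log and convert it to PDF
log using "session", smcl replace
summarize mpg weight
log close
translate "session.smcl" "session.pdf", replace
```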

    Stata's windows are now laid out to fit wider screens better.  You
    can still get back the old layout from Edit -> Preferences.

    A new Properties window -- always available -- lets you manage
    your variables, including their names, labels, value labels,
    notes, formats, and storage types.

    The Viewer is now tabbed, and it has buttons at the top to access
    dialogs, to jump within the document, and to jump to Also See
    references.

    The Data Editor also has a new Properties window; has another tool
    that lets you Hide, Show, Filter, and Reorder the variables; and
    has the new Clipboard Preview tool, which lets you see and prepare 
    your raw data before pasting.

Structural equation modeling (SEM)

    -sem- is a new estimation command, itself the subject of
    an entire manual.

    If you are new to SEM, you should be interested if you fit linear
    regressions, multivariate regressions, seemingly unrelated
    regressions, or simultaneous systems, or if you're interested in
    generalized method of moments (GMM).  And if you think you are
    still not interested, take a look anyway.  SEM is a remarkably
    flexible framework.

    If you know about SEM, you will be more interested in path
    analysis models, single- and multiple-factor measurement models,
    MIMIC models, latent growth models, correlated uniqueness models,
    and more, all of which can be fit by -sem-.  You will also be
    interested in -sem-'s standardized and unstandardized coefficients,
    direct and indirect effects, goodness-of-fit statistics,
    modification indices, predicted values and factor scores, and
    groupwise analysis with tests of invariance.

    You can use the GUI or command language to specify your model.
    The command language is a variation on standard path notation.
    You can type

        . sem (L1 -> m1 m2 m3)
              (L2 -> m4 m5) 
              (L1 -> L2) 

    In -sem-, lowercase names refer to variables in the data and
    uppercase names are latent variables.  The above corresponds to

              m1 = a1 + b1*L1 + e1
              m2 = a2 + b2*L1 + e2    
              m3 = a3 + b3*L1 + e3

              m4 = a4 + b4*L2 + e4
              m5 = a5 + b5*L2 + e5

              L2 = c1 + d1*L1 + e6

    Maximum likelihood (ML) and asymptotic distribution free (ADF)
    estimation methods are provided.  ADF is generalized method of
    moments (GMM).  Robust estimates of standard errors and SEs for
    clustered samples are available, as is full support for survey
    data via the -svy:- prefix.  Missing at random (MAR) data are
    supported via FIML.
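
    A rough sketch of how these options might be combined with the
    path example above.  The simulated data, seed, and coefficients
    are assumptions made only so the commands have something to run
    on; they are not part of the announcement:

```stata
* Simulate data that follow the two-factor model shown earlier
* (all coefficients and the seed are arbitrary illustrative choices)
clear
set seed 12345
set obs 1000
generate f1 = rnormal()
generate f2 = 0.5*f1 + rnormal()
generate m1 = f1 + rnormal()
generate m2 = 0.8*f1 + rnormal()
generate m3 = 1.2*f1 + rnormal()
generate m4 = f2 + rnormal()
generate m5 = 0.9*f2 + rnormal()

* Default ML fit, then redisplay with standardized coefficients
sem (L1 -> m1 m2 m3) (L2 -> m4 m5) (L1 -> L2)
sem, standardized

* Goodness of fit, modification indices, direct and indirect effects
estat gof, stats(all)
estat mindices
estat teffects

* ADF estimation instead of ML
sem (L1 -> m1 m2 m3) (L2 -> m4 m5) (L1 -> L2), method(adf)

* FIML, intended for data with values missing at random
sem (L1 -> m1 m2 m3) (L2 -> m4 m5) (L1 -> L2), method(mlmv)

* Robust standard errors
sem (L1 -> m1 m2 m3) (L2 -> m4 m5) (L1 -> L2), vce(robust)
```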

Survey, cluster robust, and mixed models 

    -xtmixed- now supports sampling weights and robust and cluster-
    robust standard errors for use with survey data, although you do 
    *NOT* use the -svy:- prefix as you might have expected. 

    That is because multilevel models with survey data differ from 
    standard models in that sampling weights need to be specified at 
    each modeling level rather than just at the observation level.  
    Sampling weights must reflect selection probability conditional on 
    selection at the next highest level.

    Thus, -xtmixed- expects you to specify a weight for each level in 
    your model and warns you if you do not. 
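
    A minimal sketch of a two-level model with a weight at each level.
    The data are simulated and the variable names (score, ses, school,
    w_student, w_school) are hypothetical:

```stata
* Simulate 50 schools with 20 students each (illustrative values only)
clear
set seed 2011
set obs 50
generate school = _n
generate u = rnormal(0, 3)              // school-level random intercept
generate w_school = 1 + 4*runiform()    // school weight, constant within school
expand 20
generate ses = rnormal()
generate score = 50 + 2*ses + u + rnormal(0, 5)
generate w_student = 1 + 2*runiform()   // student weight, conditional on school

* Level-1 weights go in [pweight=]; level-2 weights go in pweight()
xtmixed score ses [pweight = w_student] || school:, pweight(w_school)

* Optionally rescale the level-1 weights
xtmixed score ses [pweight = w_student] || school:, pweight(w_school) pwscale(size)
```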

Multiple imputation

    -mi impute- now supports

        1.  Chained equations.
            Chained equations are used to impute missing values when
            variables may be of different types and missing-value
            patterns are arbitrary.  The first variable could be
            imputed using logit, the second using linear regression,
            and the third using multinomial logistic regression.

        2.  Conditional imputation.
            Conditional imputation is customized imputation within
            group when group itself might be imputed.  You can
            restrict imputation of number of pregnancies to females
            even when female itself contains missing values and so is
            being imputed.

        3.  Imputation by groups. 
            Australians could have their missing values imputed using
            data from other Australians only.

    -mi estimate- now

        1.  Supports panel-data and multilevel models, so you can use
            -mi- with -xtreg- or -xtmixed-.

        2.  Allows you to measure the amount of simulation error in
            your final model, so you can decide whether you need more
            imputations.
    -mi predict- and -mi predictnl- create linear and nonlinear
    predictions in the original (m=0) data, and not just for complete
    observations but also for observations with missing values.
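
    The pieces above might fit together roughly as follows.  All
    variable names (female, npreg, bmi, age) and the file name miest
    are hypothetical, and a dataset containing those variables is
    assumed to be in memory:

```stata
* Declare the mi style and the variables to be imputed
mi set wide
mi register imputed female npreg bmi

* Chained equations with a different method per variable;
* npreg (number of pregnancies) is imputed only where female==1
mi impute chained (logit) female                         ///
                  (poisson, conditional(if female==1)) npreg ///
                  (regress) bmi = age, add(20) rseed(12345)

* Fit the model on each imputation, report Monte Carlo error
* estimates, and save the results for later prediction
mi estimate, mcerror saving(miest, replace): regress bmi age female

* Linear prediction in the original (m=0) data
mi predict bmi_hat using miest
```

    Adding by(country), with country a hypothetical grouping variable,
    to the -mi impute- options would impute separately within each
    group.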

Time series

    Check out the

        1.  New estimators for 
                a.  Multivariate GARCH
                b.  ARFIMA
                c.  UCM

        2.  New postestimation command -psdensity- to estimate the
            spectral density of a stationary process using the
            parameters of a previously estimated parametric model.

        3.  New command -tsfilter-, which filters a series to keep only
            selected periodicities (frequencies) and which can be used
            to separate a series into trend and cyclical components.

    Multivariate GARCH deals with models of time-varying volatility in
    multiple series.  These models allow the conditional covariance
    matrix of the dependent variables to follow a flexible dynamic
    structure and the conditional mean to follow a
    vector-autoregressive (VAR) structure.

    ARFIMA is a generalization of the ARMA and ARIMA models.  ARMA
    models assume short memory.  ARIMA models assume shocks are
    permanent.  ARFIMA provides the middle ground.  ARFIMA stands for
    autoregressive, fractionally integrated moving average.

    UCM stands for unobserved component model and decomposes a series
    into trend, seasonal, cyclic, and idiosyncratic components after
    controlling for optional exogenous variables.
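
    In command form, these might look something like the following.
    The series y, y1, and y2 are hypothetical, the data are assumed to
    be -tsset- already, and the model orders are arbitrary:

```stata
* Multivariate GARCH (dynamic conditional correlation) for two series
mgarch dcc (y1 y2 = , noconstant), arch(1) garch(1)

* ARFIMA with one AR and one MA term
arfima y, ar(1) ma(1)

* Spectral density implied by the ARFIMA estimates just fit;
* sdens holds the density and omega the frequencies
psdensity sdens omega

* Unobserved-components model: random-walk trend plus one cycle
ucm y, model(rwalk) cycle(1)

* Hodrick-Prescott filter: cyclical component in y_cyc, trend in y_trend
tsfilter hp y_cyc = y, trend(y_trend)
```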

Business calendars

    There is a new %t format:  %tb.  The b stands for business
    calendars.  Business calendars allow you to define your own
    calendars so that dates display correctly and lags and leads work
    as they should.

    You could create file lse.stbcal that records the days the London
    Stock Exchange is open (or closed) and then Stata would understand
    format %tblse just as it understands the usual date format %td.
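
    A sketch of what such a file could contain.  The date range and
    the short holiday list are illustrative assumptions, not a
    complete London Stock Exchange calendar:

```stata
* lse.stbcal (illustrative sketch; the holiday list is incomplete)
version 12
purpose "London Stock Exchange"
dateformat dmy
range 04jan2011 30dec2011
centerdate 04jan2011
omit dayofweek (Sa Su)
omit date 22apr2011
omit date 25apr2011
omit date 26dec2011
```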

    Once you define a calendar, Stata deeply understands it.  You can,
    for instance, easily convert between %tblse and %td values.
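
    A minimal sketch of that conversion, assuming lse.stbcal exists
    where Stata can find it and that the data contain a regular daily
    date variable named date (%td) and a hypothetical price variable
    named close:

```stata
* Convert regular dates to business dates on the lse calendar
generate bdate = bofd("lse", date)
format bdate %tblse
tsset bdate

* L.close is now the previous trading day's close, so weekends
* and holidays do not create gaps
generate ret = ln(close / L.close)

* Convert back to a regular %td date
generate date2 = dofb(bdate, "lse")
format date2 %td
```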

Contrasts and pairwise comparisons

    We were tempted to call this "Stata for Experimentalists" except
    that the features are useful to Stata users of all disciplines.

    Contrasts, pairwise comparisons, and margins plots are about
    understanding and communicating results from your model.  How does
    a covariate affect the response?  Is the effect nonlinear?  Does
    the effect depend on other covariates?

    New commands -contrast-, -pwcompare-, and -marginsplot- join
    -margins-:

        1.  -contrast- compares effects of factor variables and their
            interactions.  It can perform ANOVA-style tests of main
            effects, simple effects, interactions, and nested effects.
            It also decomposes these effects into comparisons against
            reference categories, comparisons of adjacent levels,
            comparisons against the grand mean, orthogonal
            polynomials, and such.

            In addition to predefined standard contrasts, user-defined
            contrasts are also supported.  Consider

                 . contrast ar.educ

            The -ar.- out front is one of the new, predefined contrast
            operators.  -ar.- stands for "adjacent, reversed", and
            -contrast ar.educ- compares adjacent levels of education,
            for instance, high school to some college, some college to
            college graduate, etc.

        2.  -pwcompare- performs all (or subsets) of the pairwise
            comparisons.  This can be done for all levels of a single
            factor variable or for interactions or interactions with
            continuous variables.

        3.  -margins- now allows the new contrast operators and has a
            -pwcompare- option to perform pairwise comparisons.

        4.  -marginsplot- graphs results from -margins-.
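
    A short sketch of the four commands together.  It uses the auto
    dataset that ships with Stata, with rep78 (repair record) standing
    in for a factor such as educ; the model itself is an arbitrary
    choice for illustration:

```stata
sysuse auto, clear
regress mpg i.rep78 c.weight

* Compare each level of rep78 with the previous level
contrast ar.rep78

* All pairwise comparisons of the rep78 levels, with tests and CIs
pwcompare rep78, effects

* Pairwise comparisons of the predictive margins
margins rep78, pwcompare

* Margins for each level of rep78, then graph them
margins rep78
marginsplot
```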

ROC adjusted for covariates

    New command -rocreg- is like regression for ROC.  You can model
    how sensitivity and specificity depend on covariates, and you 
    can draw graphs.
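
    Roughly, and with hypothetical variables disease (0/1 true
    status), marker (the test result), and age:

```stata
* Parametric (probit) ROC regression fit by maximum likelihood;
* age adjusts the control distribution and enters the ROC curve itself
rocreg disease marker, probit ml ctrlcov(age) roccov(age)

* Plot the fitted ROC curves at two illustrative ages
rocregplot, at1(age=40) at2(age=60)
```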

Contour plots

     You just have to see one.  Visit


    There's more.  For instance -rename- has a new syntax that allows
    you to rename groups of variables.

        . rename (vara varb varc) (varc varb vara)

    swaps the names around.  

        . rename jan* *1

    renames all variables starting with jan to instead end in 1.

        . rename v# stat#

    renames v1 to be stat1, v2 to be stat2, and so on. 

        . rename v# v(##)

    renames v1 to be v01, v2 to be v02, ...

        . rename (a b c) v#, addnumber

    renames a to be v1, b to be v2, and c to be v3.  

        . rename v# (a b c)

    does the reverse.

There really is a lot more.  See

-- Bill