Statalist The Stata Listserver



st: -datasignature-


From   [email protected] (William Gould, Stata)
To   [email protected]
Subject   st: -datasignature-
Date   Thu, 25 May 2006 08:23:18 -0500

The most recent update to Stata was released 17may2006, 8 days ago.

Included in that update is a new command that I suspect many users will
find useful, -datasignature-.  If you've installed the update, type 
-help datasignature-.  If you haven't, or are unsure, type -update query- 
to find out, and type -update all- to install.

Here's the result of running -datasignature- on auto.dta:

        . sysuse auto
        (1978 automobile data)

        . datasig
          74:12(71728):3831085005:1395876116

That's auto.dta's data signature.  If you change the data, even just a 
little bit, the last two numbers will change:

        . replace mpg = mpg+1 in 2
        (1 real change made)

        . datasig
          74:12(71728):1616229321:1400086868

If you change the name of a variable, the last two numbers stay the same 
and the number in parentheses changes:

        . rename mpg miles_per_gallon

        . datasig
          74:12(57876):1616229321:1400086868
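
The first two fields of the signature look like the number of observations
and the number of variables; auto.dta has 74 and 12, and neither -replace-
nor -rename- changes those counts.  A quick check, assuming nothing beyond
the built-in c(N) and c(k) values:

        . display c(N) ":" c(k)
        74:12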

-datasignature- is designed to help those of you who use data maintained by
others, and those of you who worry that you might have accidentally changed
your own data.

In the latter case, you could save the signature in the dataset, 

        . datasig
          74:12(57876):1616229321:1400086868

        . note: `r(datasignature)'

and check it later, 

        . notes

        _dta:
          1.  from Consumer Reports with permission
          2.  74:12(57876):1616229321:1400086868
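
If you would rather have Stata do the comparison than compare the strings
by eye, something like the following might work.  It is only a sketch: it
assumes the signature is left behind in r(datasignature), as in the -note-
example above, and the file name check.do and the local macro name sig0
are made up for illustration.

        ------------------------------------ check.do ---
        sysuse auto, clear

        * record the signature before doing any work
        datasig
        local sig0 "`r(datasignature)'"

        * ... commands that are not supposed to change the data ...

        * recompute; -assert- stops with an error if the strings differ
        datasig
        assert "`r(datasignature)'" == "`sig0'"
        ------------------------------------ check.do ---

Because sig0 is a local macro, the check works only within a single run
of the do-file; to compare across sessions, the -note- approach keeps
the signature with the dataset.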

Another idea is to include -datasignature- at the beginning of logs:

        ------------------------------------ myfile.do ---
        log using myfile, replace 

        sysuse auto, clear
        datasig

        ...
        log close
        ------------------------------------ myfile.do ---

You can also specify a varlist and -if- and -in- qualifiers, so if you have
a large dataset and just want to check part of it, you can write


        ------------------------------------ myfile.do ---
        log using myfile, replace 

        sysuse auto, clear
        datasig mpg weight price if foreign

        ...
        log close
        ------------------------------------ myfile.do ---
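
One possible use of the restricted form is to confirm that an edit left
a particular subset alone.  The sketch below assumes the varlist and -if-
syntax described above (as in the 17may2006 update) and that the result
is returned in r(datasignature); the file name subset.do and the macro
name sig_foreign are hypothetical.  The expectation, not verified here,
is that the two displayed signatures match, because the changed
observation is a domestic car and so lies outside the subset.

        ------------------------------------ subset.do ---
        sysuse auto, clear

        * signature of three variables, foreign cars only
        datasig mpg weight price if foreign
        local sig_foreign "`r(datasignature)'"

        * change a domestic car (observation 2 is domestic in auto.dta)
        replace mpg = mpg+1 in 2

        * recompute on the same subset and show both for comparison
        datasig mpg weight price if foreign
        display "before: `sig_foreign'"
        display "after:  `r(datasignature)'"
        ------------------------------------ subset.do ---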

-- Bill
[email protected]