Statalist The Stata Listserver



Re: st: hiding the contents


From   wgould@stata.com (William Gould, StataCorp LP)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: hiding the contents
Date   Wed, 16 May 2007 14:01:20 -0500

Onur Baser <baseronu@gmail.com> privately emailed me and wrote, 

> [...] It is just one simple do-file. How could I do this with Mata 
> so that source code can be hidden but can be still run [...]

Let's say our do-file is 

        ---------------------------------------------- myfile.do ---
	sysuse auto, clear
	regress mpg weight
        ------------------------------------------------------------

We will instead substitute

        ---------------------------------------------- myfile.do ---
	mata: myfunction()
        ------------------------------------------------------------

and give the client that plus the compiled version of myfunction(), 
the source of which reads

	void myfunction()
	{
		stata("sysuse auto, clear")
		stata("regress mpg weight")
	}

We will give the client the new myfile.do along with myfunction.mo (the
compiled source).
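
A minimal sketch of how myfunction.mo could be produced on the developer's machine, assuming the function is defined in a build do-file that is never shipped. The file name build.do is hypothetical; -mata mosave- writes the compiled function to myfunction.mo in the directory given by dir().

```stata
* build.do (hypothetical name): run by the developer, not given to the client
mata:
mata clear                      // start clean so myfunction() can be (re)defined

void myfunction()
{
	stata("sysuse auto, clear")
	stata("regress mpg weight")
}

// save the compiled function as myfunction.mo in the current directory
mata mosave myfunction(), dir(.) replace
end
```

The client would then need myfunction.mo somewhere Mata looks for .mo files, such as the working directory or another directory along the ado-path, for "mata: myfunction()" to run.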

Now in fact, smart computer persons wanting to know how this works might 
be tempted to extract the strings from binary file myfunction.mo, and if 
they did, they would find "sysuse auto, clear" and "regress mpg weight".
If we are worried about that, we will need to disguise our work.

In a translated ado-file, I wouldn't be worried about that because there 
will be enough of the code really translated into Mata -- Mata constructs 
substituted for Stata constructs -- that the few pieces they could see 
wouldn't provide much information.

In this case, however, our entire secret is "sysuse auto, clear" and 
"regress mpg weight".  There are lots of approaches we could use; I'll 
use encryption.  I'm just going to add 1 to each letter, so to write
"use", I'll write "vtf".  Later, it will be easy enough for me to subtract
one, so I'll see "vtf" and know it means "use".  That'll fool 'em.
Actually, I'm going to add 1 to each character and not worry about mapping 
ASCII character 254, so here are my encoder and decoder:

	string scalar encode(string scalar pt)
	{
		return(char(ascii(pt):+1))
	}

	string scalar decode(string scalar et)
	{
		return(char(ascii(et):-1))
	}

Depending on the value of the secret, Onur might want to consider substituting
better algorithms.
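
A quick way to check the encoder and decoder is a round trip: decoding an encoded string should give back the original. The sketch below repeats the two functions so it runs by itself in a fresh session; the variable s is only there for illustration.

```stata
mata:
mata clear                      // fresh session so the definitions below do not clash

string scalar encode(string scalar pt)
{
	return(char(ascii(pt):+1))      // shift every character code up by one
}

string scalar decode(string scalar et)
{
	return(char(ascii(et):-1))      // shift every character code back down
}

s = "sysuse auto, clear"
assert(decode(encode(s)) == s)      // round trip recovers the original
assert(encode("use") == "vtf")      // u, s, e are codes 117, 115, 101; plus one gives v, t, f
end
```

Only decode() has to go to the client; encode() can stay on the developer's machine.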

Anyway, now I'll use encode() to encode my strings, at home, where no one can
see me:

        : encode("sysuse auto, clear")
          tztvtf!bvup-!dmfbs

        : encode("regress mpg weight")
          sfhsftt!nqh!xfjhiu

With that, I can recode myfunction():

	void myfunction()
	{
		stata(decode("tztvtf!bvup-!dmfbs"))
		stata(decode("sfhsftt!nqh!xfjhiu"))
	}

We will give to our client myfile.do, myfunction.mo, and decode.mo.  To
further confuse the client, we might even put the two .mo files in a .mlib
library.
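
A sketch of the library route, assuming a library named lsecret (a hypothetical name; Mata library names must begin with the letter l). -mata mlib add- takes functions from memory, so both functions are defined in the same session before they are added.

```stata
mata:
mata clear

string scalar decode(string scalar et)
{
	return(char(ascii(et):-1))
}

void myfunction()
{
	stata(decode("tztvtf!bvup-!dmfbs"))
	stata(decode("sfhsftt!nqh!xfjhiu"))
}

// create lsecret.mlib in the current directory and store both functions in it
mata mlib create lsecret, dir(.) replace
mata mlib add lsecret myfunction() decode(), dir(.)

// rebuild the index of libraries Mata searches
mata mlib index
end
```

With this approach the client would receive myfile.do and lsecret.mlib instead of the two .mo files.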

-- Bill
wgould@stata.com