Statalist The Stata Listserver


Re: st: RE: RE: Encryption of data


From   "Rodrigo A. Alfaro" <[email protected]>
To   <[email protected]>
Subject   Re: st: RE: RE: Encryption of data
Date   Thu, 14 Jun 2007 09:52:06 -0400

///
I sent this twice yesterday... it seems that it wasn't
distributed. Sorry for the "duplication".

Rodrigo wrote:
Very interesting discussion; I really like Maarten's solution.
Over lunch I was thinking about how to deal with the chance
of ties in the uniform sequence (this could be important
for very large datasets). This is my small contribution.

(1) [Dealing with ties] Generate two or more uniform variables
and sort the main dataset on them. You can also add further
variables to the sort.

(2) [New key] Use Bill's idea of a key of type long. Save
the correspondence between id and key, or between (id time)
and key if you have a panel, in a small dataset (such as code).

(3) [Other datasets] Hendri can then impose the same keys
on other datasets (such as car).
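
A minimal sketch of how step (1) could be checked, assuming the auto data and two draws made after a single seed (the example further down sets a seed before each draw instead). -isid- exits with an error when the listed variables do not uniquely identify the observations, so wrapping it in -capture- gives a simple test for remaining ties. The messages are illustrative only.

```stata
*------------ begin sketch -----------
sysuse auto, clear
set seed 12345
gen double aux1 = uniform()
gen double aux2 = uniform()

* isid fails if (aux1, aux2) does not uniquely identify observations
capture isid aux1 aux2
if _rc {
    display as error "ties remain in (aux1, aux2); add another sort variable"
}
else {
    display "aux1 and aux2 order the observations without ties"
}
*------------ end sketch -----------
```

If the check fails, adding a further uniform variable or an existing variable to the sort, as step (1) suggests, is one way to break the remaining ties.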

Rodrigo.

/// Maarten's example (modified)
*------------ begin example -----------
sysuse auto, clear
set seed 12345 // customized choice
gen double aux1 = uniform()
set seed 23456 // customized choice
gen double aux2 = uniform()
sort aux1 aux2 price mpg // dealing with ties
gen long key = _n // Bill's idea.

* anonymised data: everything except make
preserve
sort key
drop make
save newauto // new dataset
restore

* mapping between make and key
keep make key
sort key
save code // codes

* check: the codes merge back onto the anonymised data
use newauto, clear
sort key
merge key using code

* impose the same keys on another dataset
use code, clear
sort make
merge make using car // car is sorted by make
drop make
save newcar
*----------- end example ------------
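
A hedged sketch of a round-trip check on the example above: it rebuilds the key, writes the anonymised data and the mapping to temporary files (the tempfile names anon and map are hypothetical stand-ins for newauto and code), merges them back, and asserts that the mapping is one-to-one. It keeps the same -merge- syntax as the example; newer Stata releases would use -merge 1:1-.

```stata
*------------ begin sketch -----------
sysuse auto, clear
set seed 12345
gen double aux1 = uniform()
set seed 23456
gen double aux2 = uniform()
sort aux1 aux2 price mpg
gen long key = _n

tempfile anon map

* mapping between make and key
preserve
keep make key
sort key
save `map'
restore

* anonymised data
drop make
sort key
save `anon'

* merge the names back and verify the mapping
use `anon', clear
merge key using `map'
* every key must be present in both files
assert _merge == 3
* no key is used twice, and no make maps to two keys
isid key
isid make
*------------ end sketch -----------
```

Any of the three checks failing stops the do-file, which is the intended behaviour: a mapping that is not one-to-one should not be used for later merges.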


----- Original Message ----- From: "Hendri Adriaens" <[email protected]>
To: <[email protected]>
Sent: Thursday, June 14, 2007 3:43 AM
Subject: RE: st: RE: RE: Encryption of data



Hi William,

Thanks for your information.

> There is no additional security to be gained by doing that.
> Ties do not matter in this case.

It might not matter for security, but for my application it does. The
information from the master data set (that will be anonymised) will have to
be merged into a new dataset (to be anonymised with the mapping). If the
mapping contains ties, -merge- wouldn't know which of the tied records to
insert in the new dataset.
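
The concern above can be illustrated with a small sketch. The three-row mapping below is made up for illustration (it is not data from this thread) and deliberately contains one make with two keys; -duplicates- and -isid- then flag the problem before any -merge- is attempted.

```stata
*------------ begin sketch -----------
* hypothetical mapping in which one make appears with two keys
clear
input str20 make long key
"AMC Concord" 1
"AMC Pacer"   2
"AMC Pacer"   3
end

* summarise and tag the tied records
duplicates report make
duplicates tag make, generate(dup)
list make key if dup > 0

* refuse to go on if make does not identify a single key
capture isid make
if _rc {
    display as error "make is not unique in the mapping; resolve ties before merging"
}
*------------ end sketch -----------
```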

But thanks, best,
-Hendri.


*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*