From: David Airey <firstname.lastname@example.org>
Subject: re: st: Corr2data questions
Date: Mon, 29 Dec 2003 21:22:00 -0600
1. The -corr2data- command is handy if, say, there is a published analysis that includes the means, correlations, and SDs, and you want to replicate or modify the work (e.g., add or drop variables). I do this in some classroom exercises. At the same time, you have to remember that these are not the original data, and you are very limited in what you can do; e.g., you can't analyze subsets of the data, compute interaction terms, etc. All you can do is basic correlational and regression analysis with no modification of the data (correct?). If I had to invent a term, I would call a data set created by -corr2data- a pseudo-replication of the original data, but is there a standard term already in use?

With respect to #1, I thought about using -corr2data- to create data sets that violated assumptions in a reliable way, but I did not get very far! I could not find examples to help me. I wanted small data sets with a variance-covariance structure of a certain type, and then to submit these to the wrong, okay, good, and best models to see what happened. Maybe the idea was wrongheaded.
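For what it's worth, the classroom use in point 1 might look something like the sketch below. The means, SDs, and correlation here are made up for illustration, standing in for values taken from a published table:

    . matrix C = (1, .5 \ .5, 1)
    . matrix m = (10, 20)
    . matrix sd = (2, 3)
    . corr2data x y, n(100) corr(C) means(m) sds(sd) clear
    . regress y x

The regression then reproduces what the published moments imply, even though the individual observations are artificial, which is why subsetting or forming interaction terms from these data would be meaningless.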
2. Is there any reason the n() for -corr2data- has to be the same as in the original data? I did a little experiment in which I created a data set with 200,000 cases and ran a regression. I then created a second data set with n(200) and ran the same regression specifying [fw=1000]. Results were virtually identical. Anyway, if the original data set were monstrous, this might be a way of saving disk space and computing time.
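The experiment in point 2 could be sketched along these lines (matrix C and the variable names are placeholders, as above):

    . corr2data x y, n(200000) corr(C) clear
    . regress y x
    . corr2data x y, n(200) corr(C) clear
    . regress y x [fw=1000]

Since frequency weights tell Stata to treat each observation as if it occurred that many times, the 200-case data set weighted by 1,000 carries the same moment information as the 200,000-case version, so point estimates should agree; only storage and computation differ.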