Statalist



Re: st: Re: Saving 1 observation


From   "Sergiy Radyakin" <serjradyakin@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Re: Saving 1 observation
Date   Wed, 28 May 2008 20:00:15 -0400

Dear Michael,

Thank you very much for your suggestions. Just as you wrote, the first
one (-restore, preserve-) saves about half the time needed and is a good
improvement. The second one (-post-) is a bit more complicated, since I
don't immediately see how the labels can be declared or saved with this
approach.
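
For reference, a minimal sketch of the -restore, preserve- pattern, assuming
the shipped auto dataset as a stand-in for the large file. Needed1 and Needed2
are the indicator variables from the original post; how they are generated
here, and the file names, are illustrative assumptions only.

```stata
* Sketch: save two subsets while preserving the full dataset only once.
sysuse auto, clear
generate byte Needed1 = foreign == 1      // hypothetical selection rule
generate byte Needed2 = price > 10000     // hypothetical selection rule

preserve                      // the full dataset is copied once, here
keep if Needed1
save "Portion1", replace
restore, preserve             // bring the full data back, keep the preserved copy

keep if Needed2
save "Portion2", replace
restore                       // final restore; the preserved copy is discarded
```

Each intermediate -restore, preserve- still reloads the full dataset, so only
the repeated -preserve- step is avoided.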

So, I am thinking about saving the labels first with -label save-,
then dumping the data into several files with -post-, and finally
opening those files one by one to apply the saved labels and resave
them. Would that be the fastest way to do it?
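
A rough sketch of that plan, under stated assumptions: the auto dataset stands
in for the large file, only two numeric variables are posted, and K, the
handle name, and the file names are all hypothetical. Note that -label save-
writes value labels only, so the variable label is carried over in a local
macro here.

```stata
* Sketch: post observation K to a small file, then reattach labels.
sysuse auto, clear
local K = 5                                   // hypothetical observation number

* 1. Save value labels to a do-file and remember a variable label
label save origin using "labels.do", replace
local pricelab : variable label price

* 2. Post observation K to a new file; the data in memory stay untouched
tempname h
postfile `h' double price byte foreign using "Portion1", replace
post `h' (price[`K']) (foreign[`K'])
postclose `h'

* 3. After all posting is done: open each small file, reapply labels, resave
use "Portion1", clear                         // replaces the data in memory
do "labels.do"                                // re-creates value label origin
label values foreign origin
label variable price `"`pricelab'"'
save "Portion1", replace
```

Step 3 replaces the data in memory, so it belongs at the end, once every
portion has been posted. String variables would need an explicit str# type in
the -postfile- declaration.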

Thank you,
   Sergiy Radyakin




On 5/28/08, Michael Blasnik <michael.blasnik@verizon.net> wrote:
> ...
>
> I have two suggestions that may be worth exploring:
>
> 1) use -restore, preserve- instead of -restore- and you will save the time
> required to preserve the dataset next time.
>
> 2) a little more tricky, but you could employ -post-  to post an observation
> to a dataset.  I'm not sure how much time this would save but it may be
> worth a try.
>
> Michael Blasnik
>
>
> ----- Original Message ----- From: "Sergiy Radyakin"
> <serjradyakin@gmail.com>
> To: <statalist@hsphsun2.harvard.edu>
> Sent: Wednesday, May 28, 2008 6:31 PM
> Subject: st: Saving 1 observation
>
>
> > Hello All!
> >
> > I have a large dataset (to be specific ~ 1mln observations, 600MB).
> >
> > I need to (repeatedly) save several small portions of it (small can be
> > as small as 1 observation) into separate files.
> >
> > So far it is done similarly to this
> >
> > preserve
> >  keep if Needed1
> >  save "Portion1"
> > restore
> >
> > preserve
> >  keep if Needed2
> >  save "Portion2"
> > restore
> >
> > ... etc ...
> >
> > where variables Needed1 and Needed2 are dummies generated earlier in the
> > code.
> >
> > This works. But it is painfully slow.
> >
> > The problem is that it will necessarily have to preserve/restore the
> > whole large dataset.
> > -save-  does not support -if- and -in- modifiers, otherwise my ideal
> > choice would be:
> >
> > save "Portion1" if Needed1
> > save "Portion2" if Needed2
> >
> > As an alternative I was thinking of saving the dataset directly (by
> > generating Stata file byte-by-byte), but since I need labels to be
> > preserved together with the data, this becomes more tricky, and
> > reinventing what is already [well] done, does not sound like a good
> > idea.
> >
> > To pose a specific question: how to save one observation 1<=K<=_N
> > (with labels) to a Stata file, without having to save the whole
> > dataset?
> >
> > Version of Stata: Stata 10/ Windows
> >
> > Thank you,
> >   Sergiy Radyakin
> >
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


