Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: st: How to reference results from a big dataset within a program


From   "Chen,Minxing" <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   RE: st: How to reference results from a big dataset within a program
Date   Wed, 28 Aug 2013 16:58:53 +0000

Thank you Phil and Christopher for the very valuable suggestions! 

-- Richard, I totally agree with you: we do learn new things every day, even about things we thought we already knew well. I didn't expect that my single email would generate so much help; this proves again that the Statalist community is an excellent and responsible one.

Minxing

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Richard Williams
Sent: Wednesday, August 28, 2013 8:26 AM
To: [email protected]; [email protected]
Subject: Re: st: How to reference results from a big dataset within a program

At 06:06 AM 8/28/2013, Phil Schumm wrote:
>On Aug 27, 2013, at 4:25 PM, "Chen,Minxing" <[email protected]> wrote:
> > Basically, in the program I submitted, I had to reference results
> > from a big pre-simulated dataset (four variables, but around
> > 400,000 observations). In my previous submission, I simply submitted
> > the pre-simulated dataset with my program, and within the program I
> > called up that simulated dataset by using code such as
> > "use c:\ado\personal\simudata". I was hoping that when people download
> > the program from SSC, the pre-simulated dataset would also be
> > downloaded to the directory "c:\ado\personal\".
> >
> > Now my reviewer indicated that I can't expect users to do that; I
> > can't even tell the user to place the file there, because such a
> > directory may not be creatable for the user (e.g. they might not have
> > a C: drive). The reviewer suggested that I find some other way to get
> > the information in my pre-simulated dataset, such as incorporating the
> > data into the program.
> >
> > I tried to copy the simulated data into my program by using
> > syntax such as "input x y z k"; however, since there are so many
> > observations (a little more than 400,000), and there is a system limit
> > on the maximum number of lines within a program (around 3500), I was
> > not able to do it this way. The reviewer also mentioned that I may use
> > a "Mata library" function, but I am pretty new to Stata's Mata. Is
> > there anyone who may be able to help with this issue?
>
>
>Basically you have two options.  The first would be to deliver the 
>dataset (i.e., .dta file) automatically along with the package.  See 
>-help usersite- or [R] net for the complete details, but essentially 
>you'll want to use "F mydata.dta" rather than "f mydata.dta" to force 
>the dataset to be installed in the system directories rather than the 
>user's current working directory.  You then call the dataset with
>
>     sysuse mydata
>
>This way, everything will "just work" regardless of the user's local 
>setup, and users don't need to know (or worry) about where the file is 
>located.  This also makes it easy for you to update the file at a later 
>date, if necessary.
>
>The alternative would be to place the dataset on the web somewhere, and 
>access it from within your code using the URL.  The downside to this is 
>that your command won't work unless the user has an internet 
>connection, which would be annoying.

You learn something new every day. I would add: (a) give the dataset a name that is somewhat esoteric and unlikely to be otherwise used, and (b) give it a name that associates it with the program so that people don't wonder where it came from, e.g. myprog_data. Of course, I would give the same advice for all the files that will be installed.
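
Phil's first option, combined with the naming advice above, might look roughly like the sketch below. The names myprog, myprog.pkg, and myprog_data.dta are hypothetical, as are the description line and the -version- number; the lookup step is left as a placeholder because the original program is not shown in this thread. The commented lines show the package file, and the program shows how the dataset could then be loaded without a hard-coded directory.

```stata
* myprog.pkg (hypothetical package file on the distribution site):
*   v 3
*   d myprog: hypothetical command that needs pre-simulated values
*   f myprog.ado
*   F myprog_data.dta     <- capital F: install along the ado-path,
*                            not in the current working directory

program define myprog
    version 12                      // assumed; use your own version
    preserve                        // keep the user's data in memory intact
    sysuse myprog_data, clear       // found along the ado-path, no fixed directory
    * ... look up the required pre-simulated values here ...
    restore                         // bring the user's data back
end
```

If the full path to the installed file is needed rather than the data itself, -findfile myprog_data.dta- should locate it along the ado-path and leave the path in r(fn).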

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


