Re: st: merging two datasets: companies with daily stock data

From   Mitch Abdon <[email protected]>
To   [email protected]
Subject   Re: st: merging two datasets: companies with daily stock data
Date   Thu, 11 Nov 2010 09:25:03 +0800


It is possible to load a subset of a dataset. The syntax is:

 use [varlist] [if] [in] using filename [, clear nolabel]

See -help use-. In your case, I think you will need the -if-
qualifier to impose the date conditions.
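
A minimal sketch of how this could look, assuming hypothetical file and
variable names (events.dta with id and eventdate; daily.dta with id, date,
and ret) and a window measured in calendar days rather than trading days.
Because the window differs for each event, a single -if- cannot express it
directly. The sketch therefore loads only the overall date range that any
event could need, then pairs events with daily rows using -joinby- (both
files repeat the company id) and trims to each event's own window:

```stata
* Assumed, hypothetical names: events.dta (id eventdate), daily.dta (id date ret)

* Find the overall date range that any event window can touch
use events, clear
summarize eventdate, meanonly
local lo = r(min) - 260
local hi = r(max)

* Load only the needed variables and dates from the large daily file
use id date ret if inrange(date, `lo', `hi') using daily, clear

* Pair every daily row with every event of the same company
joinby id using events

* Keep only rows inside each event's own 260-day window
keep if inrange(date, eventdate - 260, eventdate)
```

How much this saves depends on how spread out the event dates are: if they
span the whole sample period, the -if- on -use- removes little and most of
the trimming happens in the final -keep-.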

On Thu, Nov 11, 2010 at 1:59 AM, Nuno Soares <[email protected]> wrote:
> Hi everyone,
> I'm trying to implement a bit of code that I normally use in SAS, but I'm
> having trouble implementing it in Stata given the amount of data that
> it requires.
> Imagine you have two data files: one with a list of companies with a given
> date (say an event date) associated and the other with daily stock market
> data. The file with the companies can have the same company with multiple
> events. Problem: get the market data for a given period (say event date -
> 260 days).
> This problem seems to be easily solved by using an m:1 merge with the
> company ID as the merge variable. This would get all the daily data,
> irrespective of the event date, into the merge file and I could then delete
> those dates that are not needed.
> Now, imagine the company and event date file has 1000 observations all of
> which represent firms that have 10 years of daily market data, and I still
> just want event date - 260 days. This would mean that the merge process
> would lead to a file with 1000*10*260=2600000 observations, of which I only
> need 10%. As both files grow, the time needed to
> merge becomes longer and the memory requirements increase.
> In SAS I would just use a proc sql with the date restrictions, and it would
> get the data needed. In Stata, it seems that the whole daily data file (with
> the restriction of the ID companies) is loaded into memory and then we
> need to delete what we don't need. Is there a way of restricting the amount
> of daily data loaded into memory in Stata using the -merge- command,
> or a command that allows me to do that? The 2600000 is not that bad, but I
> normally encounter two or three times this number of observations...
> Best wishes,
> Nuno


Arnelyn Abdon