
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: Advise withdrawn -- data manipulation
Date: Thu, 10 Oct 2002 20:22:46 +0100

The original question

> > > > I have a dataset which has customer's payment amount by month
> > > > by year. Month ranges from 01 to 12 for years 2000 & 2001 and
> > > > from 01 to 09 for the 2002. But all customers don't have data
> > > > for each month. The dataset looks like the following.
> > > >
> > > > customer   month   year   amount
> > > > x1         01      2001   50.45
> > > > x1         03      2001   60.00
> > > > x2         04      2001   70.00
> > > > x2         06      2001   80.00
> > > >
> > > > I would like to create a data set where each customer will have
> > > > 12 observations for years 2000 & 2001 and 9 obs. for 2002, and
> > > > amount will be zero for the months they don't have any original
> > > > data. I tried with couple of different ways, but didn't work.
> > > > Could anyone please help me?

Nick Winter

> > Oops.
> > I misread the question. There are clearly better ways to do this,
> > than to use -reshape-.

Not so fast! I don't see it as that clear-cut.

This is a nice problem, and there are points about Stata technique
which make it of wider interest.

As I write, three solutions have been proposed, here summarised in
order of first posting, and with second thoughts written in.

1. Nick Winter
==============

-reshape wide- followed by -reshape long-.

This is a good general procedure. Other applications abound.

It is not going to fill in all gaps: a month that appears for no
customer at all will not be created. Empirically, my guess is that
this is not a problem. If it is, then a little preparation will fix
the problem. It is necessary and sufficient that all times be present
for at least one customer.

2. Nick Cox
===========

-fillin-.

-fillin- is optimised for this one problem. It does nothing else.

Perhaps you never heard of it. There is always a problem learning of
and remembering tools you use only once in a while. It is better to
learn about more general tools.

It is not going to fill in all gaps. Empirically, my guess is that
this is not a problem. Same comment as above (and one method was
proposed).

3. Tao Jiang
============

-merge- with a complete data set.

No code presented, but in principle this sounds elegant. You could
-contract- on -customer- and then -expand- and create a time variable.

-merge- is a very good general procedure. Other applications abound.

Naturally, getting one solution that works is enough. But there is
a lot of evidence that different Stata users find different tools
intuitive, so choose whatever appeals.

Nick
n.j.cox@durham.ac.uk

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
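
For concreteness, here is a rough sketch of how the three approaches
summarised above might look on the four example observations from the
question. It is not code from the thread. The monthly date variable
-mdate-, the tempfile names and the count of 33 months (12 + 12 + 9)
are my own assumptions, and -merge 1:1- is the syntax of later Stata
releases, not what was available when this was posted.

```stata
* Toy data copied from the question; mdate is an assumed helper variable
clear
input str2 customer byte month int year double amount
"x1" 1 2001 50.45
"x1" 3 2001 60.00
"x2" 4 2001 70.00
"x2" 6 2001 80.00
end
gen mdate = ym(year, month)
format mdate %tm
tempfile toy
save `toy'

* 1. -reshape wide- followed by -reshape long-
*    Only months present for at least one customer are created.
use `toy', clear
drop year month                      // these vary within customer
reshape wide amount, i(customer) j(mdate)
reshape long amount, i(customer) j(mdate)
replace amount = 0 if missing(amount)
format mdate %tm

* 2. -fillin-
*    Same caveat: only customer-month combinations of values
*    already in the data are created.
use `toy', clear
fillin customer mdate
replace amount = 0 if _fillin
replace year  = year(dofm(mdate))  if _fillin
replace month = month(dofm(mdate)) if _fillin

* 3. -merge- with a complete data set
*    One observation per customer, expanded to 33 months
*    (2000m1 to 2002m9), so every gap is filled.
use `toy', clear
contract customer
drop _freq
expand 33
bysort customer: gen mdate = ym(2000, 1) + _n - 1
tempfile full
save `full'

use `toy', clear
merge 1:1 customer mdate using `full'
replace amount = 0 if _merge == 2    // months with no original data
replace year  = year(dofm(mdate))  if _merge == 2
replace month = month(dofm(mdate)) if _merge == 2
drop _merge
sort customer mdate
```

On these toy data, sketches 1 and 2 would produce only the four months
that occur somewhere in the data, which is the "not going to fill in
all gaps" caveat; sketch 3 should give 33 observations per customer.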

**References**: **st: Advise withdrawn -- data manipulation**, *From:* "Nick Winter" <nwinter@policystudies.com>

