Hello, Martin and Nick,

Thank you so much. Your suggestion works very well. I am sorry for being unclear about my question; I will clarify it now.

I have a large dataset of around 1 million records with around 20 variables. I would like to run a clogit regression, which requires me to reshape the data into long form. Basically, it is a model of a person choosing a product from a choice set of 12. Thus, there would be 12 million records in total after reshaping.

In the beginning, I tried with a smaller sample, and reshape worked very slowly. Then I thought I should start with only 2 ID variables: person ID and choice ID (1-12). I created wide-form data with only the 2 ID variables, reshaped it to long form, and lastly merged in the other variables that are associated with the person and the choice respectively. However, even with only the 2 ID variables, reshape took a long time to finish on a small sample of the 1 million records.

That is why I am trying to find a faster way to create the empty dataset with the 2 ID variables first. My next step is then to merge in the information about the person and the choice.

I hope this is clear enough. If you or others have a better approach to preparing data like this, please let me know. There is a lot more for me to learn from all of you.

Thank you,
Anupit

----- "Nick Cox" <n.j.cox@durham.ac.uk> wrote:

> Martin's advice looks good.
>
> But Anupit's question doesn't hang together for me. The specific
> example, and even longer ones of the same form, don't strike me as
> -reshape- questions at all as they involve creating new data in
> structured form.
>
> By the way, for large datasets make sure to use -egen long- or -egen
> double- if you need to.
>
> But if you had a -reshape- question, strict sense, I doubt you could
> speed things up much by programming it yourself with -forvalues- or
> -foreach-. That would, broadly speaking, mean that you were a better
> Stata programmer than the Stata developers. There could well be
> exceptions, but I'd guess that this statement would be true much more
> often than its converse.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Martin Weiss
>
> -h egen, seq()-
>
> Supnithadnaporn, Anupit
>
> Would you please suggest how to create data in the long form
> by *not* using reshape? I would like to avoid reshape because reshape
> takes a very long time. In fact, the final total number of records
> that I have to create would be around 12,000,000.
>
> I think foreach and forvalues can do this work.
> But I am a novice in Stata programming and could not figure it out so
> far.
>
> In the beginning, I have only Obsid, which is created by
>
> gen Obsid = _n
>
> The desired data would look like this:
>
> Obsid  Vid  Imp
>     1    1    1
>     2    1    2
>     3    1    3
>     4    1    4
>     5    2    1
>     6    2    2
>     7    2    3
>     8    2    4
>   ...
>   100   25    4

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
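
For readers following the thread, here is a minimal sketch of how the two-ID skeleton described above might be built without -reshape-. It is not code posted by anyone in the thread. The person count of 1,000 and the variable names personid and choiceid are placeholders chosen for illustration. The first part uses -expand-, which is a different route from the -egen, seq()- function Martin pointed to. The second part applies -egen, seq()- to the Obsid/Vid/Imp example from the quoted post.

```stata
* Part 1: person-by-choice skeleton via -expand- (no -reshape-).
* Assumption: 1,000 persons and the names personid/choiceid are
* placeholders; the real data would have about 1 million persons.
clear
set obs 1000
generate long personid = _n

* -expand 12- replaces each observation with 12 copies of itself.
expand 12

* Number the copies 1..12 within each person.
bysort personid: generate byte choiceid = _n

* Part 2: the Obsid/Vid/Imp layout from the quoted post, using
* the seq() function of -egen-.
clear
set obs 100
generate long Obsid = _n

* Vid runs 1..25, each value repeated in blocks of 4 observations.
egen long Vid = seq(), from(1) to(25) block(4)

* Imp cycles 1, 2, 3, 4, 1, 2, ...
egen byte Imp = seq(), from(1) to(4)
```

Person-level and choice-level variables could then be attached to the Part 1 skeleton with -merge-, keyed on personid and choiceid respectively. The storage types (long, byte) are picked to keep the 12-million-row dataset small, in line with Nick's remark about -egen long-.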

