Andreas writes: "for some industryids there are no companies listed with a "length" over lets say 8, which leads to an error, and the do-file stops."

Andreas doesn't say which command is aborting with an error. I assume it is -save-. This Stata command has an option 'emptyok' which allows saving a dataset even if there are no observations in it. For details see the help for -save- here:
http://www.stata.com/help.cgi?save

Best, Sergiy

On Thu, May 2, 2013 at 5:10 AM, Andreas Dall Frøseth <Andreas.Froseth@stud.nhh.no> wrote:
> I did a test run with the -count if- command, and it seems to be just what I need.
>
> Thank you, Nick!
>
> -Andreas
> ________________________________________
> From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Nick Cox [njcoxstata@gmail.com]
> Sent: 2 May 2013 10:47
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Using values in an variable to save parts of an dataset
>
> -keep- and -drop- are just complementary. Use whatever is easier to
> think with. The complementarity means that
>
> keep if foo
>
> and
>
> drop if !foo
>
> are the same, and vice versa. Here -foo- could be a variable, or it
> could be pseudocode for a condition.
>
> So starting with
>
> sysuse auto
>
> (a) drop if foreign == 1
>
> (b) keep if !(foreign == 1)
>
> are the same. Now, why didn't I write there
>
> (b') keep if foreign == 0
>
> ? That _is_ equivalent _in this example_, but I want to emphasise that
> -- as you get more and more conditions -- it can be helpful to
> parenthesise a long compound condition using ( )
>
> so the pair
>
> drop if (<long complicated condition, possibly compound>)
>
> keep if !(<long complicated condition, possibly compound>)
>
> are the same, as are the pair
>
> drop if !(<long complicated condition, possibly compound>)
>
> keep if (<long complicated condition, possibly compound>)
>
> Negate the whole expression, once.
>
> Turning to your specific problem, it sounds as if any case with zero
> observations is stopping your code. (You don't show us your exact
> code!)
>
> You might try something like this
>
> count if <condition to be satisfied>
>
> if r(N) > 0 {
>     <actions if there are some data>
> }
>
> Alternatively, check out the -capture- command.
>
> Nick
> njcoxstata@gmail.com
>
>
> On 2 May 2013 08:58, Andreas Dall Frøseth <Andreas.Froseth@stud.nhh.no> wrote:
>
>> It looks like I need some help with this code again...
>>
>> After running this a couple of times, for different restrictions, I experience some difficulties. In addition to -keep-ing if industryid==`industryid', I wish to drop if the value in the variable "length" is less than a certain value.
>> But, for some industryids there are no companies listed with a "length" over lets say 8, which leads to an error, and the do-file stops.
>>
>> How can I make the code ignore those industries, and keep on splitting my dataset?
>
> Andreas Dall Frøseth [Andreas.Froseth@stud.nhh.no]
>
>> I tried the approach where you exploited a feature of -use-, and it seems to work just fine.
>> Thank you.
>
> Nick Cox
>
>> Your local references are the wrong way round.
>>
>> foreach industryid in `industryids' {
>>     keep if industryid==`industryid'
>>     save industry_`industryid'
>> }
>>
>> You want each statement inside the loop to take on each individual value.
>>
>> This is all assuming that -industryid- is numeric.
>>
>> However, even with this fixed your loop won't work. Second time round,
>> all you have in memory is the first subset resulting from -keep-.
>>
>> But then you can exploit a feature of -use-:
>>
>> levelsof industryid, local(industryids)
>>
>> foreach industryid in `industryids' {
>>     use mydata if industryid==`industryid', clear
>>     save industry_`industryid'
>> }
>
> Andreas Dall Frøseth
>
>>> I'm trying to divide my dataset into pieces based on the values in a variable. My set is a large panel dataset, containing a number of companies. Each company has a value in the variable "industryid", which allows me to identify what industry it operates in.
>>> I am now trying to divide this large dataset into smaller sets for each single industry.
>>>
>>> The reason why I'm struggling is that I wish to apply this separation for a number of different sets, which might contain a different number of industries, without having to identify the values in the industry-variable myself.
>>> I have tried to make a macro with the values using the "levelsof" command, and then apply it with:
>>>
>>> foreach industryid in `industryids' {
>>>     keep if industryid==`industryids'
>>>     save industry_`industryids'
>>> }
>>>
>>>
>>> But this runs back as "invalid '10'".
>>>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
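
For reference, the suggestions in this thread can be combined into a single loop. What follows is only a minimal sketch and not code posted by anyone above. It assumes the full data are in a file called mydata.dta, that -industryid- and -length- are numeric, and that 8 is the cutoff; the -replace- option and the missing-value check are also my additions. It uses the documented -use if ... using- form of reading a subset.

```stata
* Sketch only: mydata.dta, the cutoff 8 and -replace- are assumptions.
use mydata, clear
levelsof industryid, local(industryids)

foreach industryid of local industryids {
    * Read only the rows for this industry that pass the length cutoff.
    * Missing values count as larger than any number, so exclude them.
    use if industryid == `industryid' & length >= 8 & !missing(length) ///
        using mydata, clear

    * Skip industries with nothing left after the restriction.
    if _N > 0 {
        save industry_`industryid', replace
    }
}
```

Checking _N after the subset is read plays the same role as the -count if- test suggested above. If an empty file per industry is preferred to skipping it, the condition could be dropped and -emptyok- added to -save- instead, as in the first reply.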

