Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: Andreas Dall Frøseth <Andreas.Froseth@stud.nhh.no>

To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>

Subject: SV: st: Using values in an variable to save parts of an dataset

Date: Thu, 2 May 2013 09:10:10 +0000

I did a test run with the -count if- command, and it seems to be just what I need.

Thank you, Nick!

-Andreas

________________________________________

From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Nick Cox [njcoxstata@gmail.com]

Sent: 2 May 2013 10:47

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: Using values in an variable to save parts of an dataset

-keep- and -drop- are just complementary. Use whatever is easier to think with.

The complementarity means that

    keep if foo

and

    drop if !foo

are the same, and vice versa. Here -foo- could be a variable, or it could be pseudocode for a condition.

So starting with

    sysuse auto

    (a) drop if foreign == 1

    (b) keep if !(foreign == 1)

are the same. Now, why didn't I write there

    (b') keep if foreign == 0

? That _is_ equivalent _in this example_, but I want to emphasise that -- as you get more and more conditions -- it can be helpful to parenthesise a long compound condition using ( ) so the pair

    drop if (<long complicated condition, possibly compound>)

    keep if !(<long complicated condition, possibly compound>)

are the same, as are the pair

    drop if !(<long complicated condition, possibly compound>)

    keep if (<long complicated condition, possibly compound>)

Negate the whole expression, once.

Turning to your specific problem, it sounds as if any case with zero observations is stopping your code. (You don't show us your exact code!) You might try something like this

    count if <condition to be satisfied>
    if r(N) > 0 {
        <actions if there are some data>
    }

Alternatively, check out the -capture- command.

Nick
njcoxstata@gmail.com

On 2 May 2013 08:58, Andreas Dall Frøseth <Andreas.Froseth@stud.nhh.no> wrote:

> It looks like I need some help with this code again...
>
> After running this a couple of times, for different restrictions, I experience some difficulties. In addition to -keep-ing if industryid==`industryid', I wish to drop if the value in the variable "length" is less than a certain value.
> But, for some industryids there are no companies listed with a "length" over, let's say, 8, which leads to an error, and the do-file stops.
>
> How can I make the code ignore those industries, and keep on splitting my dataset?

Andreas Dall Frøseth [Andreas.Froseth@stud.nhh.no]

> I tried the approach where you exploited a feature of -use-, and it seems to work just fine.
> Thank you.

Nick Cox

> Your local references are the wrong way round.
>
>     foreach industryid in `industryids' {
>         keep if industryid==`industryid'
>         save industry_`industryid'
>     }
>
> You want each statement inside the loop to take on each individual value.
>
> This is all assuming that -industryid- is numeric.
>
> However, even with this fixed your loop won't work. Second time round,
> all you have in memory is the first subset resulting from -keep-.
>
> But then you can exploit a feature of -use-:
>
>     levelsof industryid, local(industryids)
>
>     foreach industryid in `industryids' {
>         use mydata if industryid==`industryid', clear
>         save industry_`industryid'
>     }

Andreas Dall Frøseth

>> I'm trying to divide my dataset into pieces based on the values in a variable. My set is a large panel dataset, containing a number of companies. Each company has a value in the variable "industryid", which allows me to identify what industry it operates in.
>> I am now trying to divide this large dataset into smaller sets for each single industry.
>>
>> The reason why I'm struggling is that I wish to apply this separation for a number of different sets, which might contain a different number of industries, without having to identify the values in the industry variable myself.
>> I have tried to make a macro with the values using the "levelsof" command, and then apply it with:
>>
>>     foreach industryid in `industryids' {
>>         keep if industryid==`industryids'
>>         save industry_`industryids'
>>     }
>>
>> But this runs back as "invalid '10'".

    *
    *   For searches and help try:
    *   http://www.stata.com/help.cgi?search
    *   http://www.stata.com/support/faqs/resources/statalist-faq/
    *   http://www.ats.ucla.edu/stat/stata/
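A minimal sketch of how the pieces discussed in this thread might fit together: the -levelsof- loop, reading one industry at a time with -use-, and the -count- guard that skips industries with nothing left to save. Several details are assumptions and not taken from the posts: the file name mydata.dta, the threshold of 8, the choice to treat missing values of length as failing the condition, and the -replace- option on -save-. The sketch uses the documented `use if ... using` form of -use-.

```stata
* Assumed: mydata.dta holds numeric variables industryid and length.
* Assumed: 8 is only an example threshold.
use mydata, clear
levelsof industryid, local(industryids)

foreach id of local industryids {
    * Read only the observations for this industry
    use if industryid == `id' using mydata, clear

    * Missing values compare as larger than any number in Stata,
    * so they are excluded explicitly here (an assumption)
    count if length >= 8 & !missing(length)

    * Save only when at least one observation meets the condition
    if r(N) > 0 {
        keep if length >= 8 & !missing(length)
        save industry_`id', replace
    }
}
```

Industries with no qualifying observations are simply skipped, so the do-file keeps running. If the goal were instead to keep going after any kind of error, prefixing a command with -capture- is the alternative Nick mentions.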

**Follow-Ups**:
- **Re: st: Using values in an variable to save parts of an dataset** *From:* Sergiy Radyakin <serjradyakin@gmail.com>

**References**:
- **SV: st: Using values in an variable to save parts of an dataset** *From:* Andreas Dall Frøseth <Andreas.Froseth@stud.nhh.no>
- **Re: st: Using values in an variable to save parts of an dataset** *From:* Nick Cox <njcoxstata@gmail.com>
