Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Sergiy Radyakin <serjradyakin@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: describe using (problem with abbrev )
Date: Wed, 4 Dec 2013 15:42:22 -0500

Nick, the inner workings of Stata are not known, but what Alan is asking about should be possible. We have a similar situation with the -use- command, which supports loading a subset of the data:

    use mpg using auto.dta, clear

Here the varlist (mpg) is the 'future' varlist, not the one in memory now, right? You may argue that what happens behind the scenes is that Stata:

1) clears the memory
2) loads the full dataset
3) unabbreviates the variable list
4) drops the variables that were not mentioned.

However, Stata seems to unabbreviate the list of variables without loading the whole dataset into memory:

    version 9.0
    clear
    * build a dataset that is too large for the 3m limit set further down
    set mem 10m
    set obs 200000
    forval i=1/99 {
        capture generate byte x`i'=`i'
    }
    describe
    tempfile t
    save `t'
    clear
    * shrink memory, then load only a range of variables from the saved file
    set mem 3m
    use x3-x5 using `t'
    describe

Obviously the test above only makes sense in Stata before version 12.0, which came out with an automatic memory manager. The idea is that it can load x3, x4 and x5 even though the full dataset does not fit into memory; hence we should conclude that the header information is processed separately, which is exactly what Alan is asking about in his question.

It seems that the variable list is treated specially for the -use- command specifically, and although the same code would be applicable to -describe-, it is simply not reused there. I have yet to see a command that is not built in and that takes a varlist (in the expected place after the command name) referring to the future state of the data (I am dying to see one). I would imagine that this could also be implemented with a few tricks with -anything- in the syntax.

I would go with a two-step solution: first get a full description of the dataset, then filter it for the variables of interest.

Ideally StataCorp would provide a way to delay expansion of the varlist until after parsing, and an unab(s1,s2) string function, where s1 is a string to be treated as an abbreviated varlist and s2 is a string holding the universe of variables. The result would be a string of full variable names from s2 that satisfy s1. This is of course possible to do yourself even now, but imho only if one dares to rewrite the -syntax- command.

Best, Sergiy Radyakin

On Tue, Dec 3, 2013 at 7:50 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Good catch by Daniel here.
>
> The reason that varlists with dashes are not allowed is presumably
> that Stata can't expand what it doesn't know about. That is, the
> dataset would have to be read in before Stata could expand a variable
> name range, and that's the point: the dataset is being accessed
> remotely.
>
> Nick
> njcoxstata@gmail.com
>
>
> On 3 December 2013 12:40, daniel klein <klein.daniel.81@gmail.com> wrote:
>> Alan,
>>
>> this behavior is documented in -help describe-.
>>
>> "The varlist in the describe using syntax differs from standard Stata
>> varlists in two ways. First, you cannot abbreviate variable names;
>> that is, you have to type displacement rather than displ. However, you
>> can use the wildcard character (~) to indicate abbreviations, for
>> example, displ~. Second, you may not refer to a range of variables;
>> specifying age-income is considered an error."
>>
>> Here is a sketch how you could allow the dash character
>>
>> *! version 1.0.0 03dec2013 Daniel Klein
>>
>> pr descdash
>>     vers 11.2
>>
>>     syntax anything using [, * ]
>>
>>     m : st_local("uservars", stritrim(st_local("anything")))
>>     loc uservars : subinstr loc uservars "- " "-" ,all
>>     loc uservars : subinstr loc uservars " -" "-" ,all
>>
>>     qui d `using' ,varl
>>     loc allvars `r(varlist)'
>>
>>     token `uservars'
>>     forv j = 1/`: word count `uservars'' {
>>         loc var : subinstr loc `j' "-" " " ,c(loc dsh)
>>         if (`dsh') {
>>             loc f : list posof "`: word 1 of `var''" in allvars
>>             loc t : list posof "`: word 2 of `var''" in allvars
>>             if (`t' < `f') {
>>                 di as err "variables out of order"
>>                 e 111
>>             }
>>             m : st_local("var", ///
>>                 invtokens(tokens(st_local("allvars"))[(`f'..`t')]))
>>         }
>>         loc varlist `varlist' `var'
>>     }
>>
>>     d `varlist' `using' ,`options'
>> end
>> e
>>
>> descdash y1-y2 using ajit_112213
>>
>> Best
>> Daniel
>>
>> --
>> Hi _ In Stata 13 (and also in Stata 12), it appears that the
>> abbreviation with a dash "-" does not work with -describe using
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
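The two-step approach suggested in the reply above (describe the whole file first, then filter for the variables of interest) could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: auto.dta (shipped with Stata) stands in for the file on disk, price-rep78 is an arbitrary example range, and the local macro names are made up for the example. It relies on the -varlist- option of -describe using-, which the descdash sketch quoted above also uses.

```stata
* Illustrative setup (assumption): save auto.dta to a temporary file so
* that there is a dataset on disk that is not in memory.
sysuse auto, clear
tempfile t
quietly save "`t'"
clear

* Step 1: read the full, ordered variable list from the file on disk.
quietly describe using "`t'", varlist
local allvars `r(varlist)'

* Step 2: expand the range against that list (hypothetical range).
local first price
local last  rep78
local lo : list posof "`first'" in allvars
local hi : list posof "`last'" in allvars

* posof returns 0 when a name is not in the list.
if (`lo' == 0 | `hi' == 0 | `hi' < `lo') {
    display as error "range `first'-`last' not found or out of order"
    exit 111
}

* Collect every variable name between the two positions.
local selected
forvalues j = `lo'/`hi' {
    local selected `selected' `: word `j' of `allvars''
}

* -describe using- accepts full names, so pass the expanded list.
describe `selected' using "`t'"
```

The same filtering step is where wildcard or abbreviation matching against `allvars' could be added, which is roughly what the hypothetical unab(s1,s2) function would do.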

**Follow-Ups**:
- **Re: st: describe using (problem with abbrev )** *From:* Nick Cox <njcoxstata@gmail.com>

**References**:
- **Re: st: describe using (problem with abbrev )** *From:* daniel klein <klein.daniel.81@gmail.com>
- **Re: st: describe using (problem with abbrev )** *From:* Nick Cox <njcoxstata@gmail.com>
