Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: describe using (problem with abbrev )


From   Nick Cox <njcoxstata@gmail.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: describe using (problem with abbrev )
Date   Wed, 4 Dec 2013 21:53:19 +0000

Quite so. I didn't explain my point well enough, or indeed accurately
enough. -describe using- as it now is doesn't include expansion of
varlists; that is not to say that it could not be extended by
StataCorp to do so. As a programmer you will be familiar with projects
of your own that only got so far.

I doubt very much that

1. Commands like -describe using- actually read in the data from the
other dataset into memory, even temporarily. I suspect that
purpose-built C code reads the file from afar.

2. Stata's commands based on C code need pay any attention to syntax
in the sense of -syntax-, just as Mata pays no attention to that.

but we are trading guesses.


Nick
njcoxstata@gmail.com


On 4 December 2013 20:42, Sergiy Radyakin <serjradyakin@gmail.com> wrote:
> Nick,
>
> the inner workings of Stata are not known, but what Alan is asking
> about should be possible. We have a similar situation with the -use-
> command, which supports loading a subset of the variables:
>
> use mpg using auto.dta, clear
>
> Here the varlist (mpg) is the 'future' varlist, not the one in
> memory now, right?
>
> You may argue that what happens behind the scenes is that Stata:
> 1) clears the memory
> 2) loads the full dataset
> 3) unabbreviates the variable list
> 4) drops the variables that were not mentioned.
>
> However Stata seems to unabbreviate the list of variables without
> loading the whole dataset into memory:
>
> version 9.0
> clear
> set mem 10m
>
> set obs 200000
> forval i=1/99 {
>   capture generate byte x`i'=`i'
> }
>
> describe
> tempfile t
> save `t'
> clear
> set mem 3m
> use x3-x5 using `t'
> describe
>
> Obviously the test above makes sense only in Stata before version 12,
> which introduced automatic memory management.
> The idea is that Stata can load x3, x4, and x5 even though the full
> dataset does not fit into memory; hence we should conclude that the
> header information is processed separately, which is exactly what
> Alan is asking about in his question.
>
> It seems that the -use- command treats its variable list specially;
> although the same code would be applicable to -describe-, it is simply
> not reused there. I have yet to see any command, other than a built-in
> one, that accepts a varlist (in the expected place after the command
> name) referring to the future state of the data (I am dying to see
> one). I would imagine that this could also be implemented with a few
> tricks with -anything- in the syntax.
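
Because -use- does expand ranges against the file on disk, one workaround is to let -use- do the expansion and then pass the full names to -describe using-. The lines below are a minimal sketch, not a tested general solution. They assume that auto.dta (shipped with Stata) stands in for the dataset of interest and that the file holds at least one observation.

```stata
* Assumption: auto.dta stands in for the dataset of interest.
sysuse auto, clear
tempfile t
quietly save `t'
clear

* Let -use- expand the range while loading a single observation.
* Assumes the file holds at least one observation.
quietly use mpg-weight in 1 using `"`t'"', clear
unab wanted : _all
clear

* -describe using- accepts the resulting list of full names.
describe `wanted' using `"`t'"'
```

This clears the data in memory, so in real use it would need to be wrapped in -preserve- and -restore-.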
>
> I would go with a two-step solution: first get a full description of
> the dataset, then filter it for the variables of interest. Ideally
> StataCorp would provide a way to delay expansion of the varlist until
> after parsing, together with an unab(s1,s2) string function, where s1
> is a string to be treated as an abbreviated varlist and s2 is a string
> holding the universe of variable names. The result would be a string
> of the full variable names from s2 that satisfy s1. This is of course
> possible to do yourself even now, but imho only if one dares to
> rewrite the -syntax- command.
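
The two-step idea can be approximated without rewriting -syntax-. The program below is a hypothetical helper: the name -unabusing- is an assumption, not an official or community-contributed command. It reads the variable names with -describe using, varlist-, builds an empty dataset in memory with the same names, and lets -unab- do the expansion. It is a sketch that assumes the full list of names fits in a local macro.

```stata
* Sketch only: -unabusing- is a hypothetical name, not an existing command.
program define unabusing, rclass
    version 11.2
    syntax anything(name=spec) using/

    // Step 1: get the variable names, in file order, without loading the data
    quietly describe using `"`using'"', varlist
    local allvars `r(varlist)'

    // Step 2: expand the specification against an empty dataset
    // that carries the same variable names in the same order
    preserve
    drop _all
    foreach v of local allvars {
        quietly generate byte `v' = .
    }
    unab expanded : `spec'
    restore

    return local varlist `expanded'
end
```

A possible use, with mydata standing for any dataset on disk:

```stata
unabusing x3-x5 using mydata
describe `r(varlist)' using mydata
```

If -unab- fails, the program exits with its error and the preserved data are restored automatically.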
>
> Best,
>   Sergiy Radyakin
>
>
> On Tue, Dec 3, 2013 at 7:50 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>> Good catch by Daniel here.
>>
>> The reason that varlists with dashes are not allowed is presumably
>> that Stata can't expand what it doesn't know about. That is, the
>> dataset would have to be read in before Stata could expand a variable
>> name range, and that's the point: the dataset is being accessed
>> remotely.
>>
>> Nick
>> njcoxstata@gmail.com
>>
>>
>> On 3 December 2013 12:40, daniel klein <klein.daniel.81@gmail.com> wrote:
>>> Alan,
>>>
>>> this behavior is documented in -help describe-.
>>>
>>> "The varlist in the describe using syntax differs from standard Stata
>>> varlists in two ways. First, you cannot abbreviate variable names;
>>> that is, you have to type displacement rather than displ. However, you
>>> can use the wildcard character (~) to indicate abbreviations, for
>>> example, displ~. Second, you may not refer to a range of variables;
>>> specifying age-income is considered an error."
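
The documented behavior can be checked with a few lines. This is a sketch that assumes auto.dta (shipped with Stata) as the example file; the comments restate what the help text quoted above says, not verified output.

```stata
* Assumption: auto.dta serves as the example file.
sysuse auto, clear
tempfile t
quietly save `t'
clear

describe displacement using `"`t'"'                 // full name: accepted
describe displ~ using `"`t'"'                       // ~ abbreviation: accepted
capture noisily describe displ using `"`t'"'        // plain abbreviation: error
capture noisily describe mpg-weight using `"`t'"'   // range: error
```

The -capture noisily- prefix shows the error message without stopping the do-file.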
>>>
>>> Here is a sketch of how you could allow the dash character:
>>>
>>> *! version 1.0.0 03dec2013 Daniel Klein
>>>
>>> pr descdash
>>>  vers 11.2
>>>
>>>  syntax anything using [, * ]
>>>
>>>  m : st_local("uservars", stritrim(st_local("anything")))
>>>  loc uservars : subinstr loc uservars "- " "-" ,all
>>>  loc uservars : subinstr loc uservars " -" "-" ,all
>>>
>>>  qui d `using' ,varl
>>>  loc allvars `r(varlist)'
>>>
>>>  token `uservars'
>>>  forv j = 1/`: word count `uservars'' {
>>>   loc var : subinstr loc `j' "-" " " ,c(loc dsh)
>>>   if (`dsh') {
>>>    loc f : list posof "`: word 1 of `var''" in allvars
>>>    loc t : list posof "`: word 2 of `var''" in allvars
>>>    if (!`f' | !`t') {
>>>     di as err "variable not found"
>>>     e 111
>>>    }
>>>    if (`t' < `f') {
>>>     di as err "variables out of order"
>>>     e 111
>>>    }
>>>    m : st_local("var", ///
>>>    invtokens(tokens(st_local("allvars"))[(`f'..`t')]))
>>>   }
>>>   loc varlist `varlist' `var'
>>>  }
>>>
>>>  d `varlist' `using' ,`options'
>>> end
>>> e
>>>
>>> descdash y1-y2 using ajit_112213
>>>
>>> Best
>>> Daniel
>>>
>>> --
>>> Hi _ In Stata 13 (and also in Stata 12), it appears that the
>>> abbreviation with a dash "-" does not work with -describe using
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC