



Re: st: comparing xtdes-like patterns for variables


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: comparing xtdes-like patterns for variables
Date   Thu, 1 Nov 2012 12:39:03 +0000

If you had several variables, you could try something like this:

local y = 1
gen long obsno = _n

foreach v of var <whatever> {

           gen y`y' = `y' if missing(`v')
           local which : var label `v'
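
The archived message stops partway through the loop. As a sketch only, one way the idea might be carried through is shown here. Everything after the -local which- line is an assumption rather than the original code, and bloomberg_price and reuters_price are just the example names from the question, standing in for <whatever>.

```stata
* Sketch only: every line after -local which- is an assumption.
local y = 1
gen long obsno = _n

foreach v of var bloomberg_price reuters_price {
    * y1, y2, ... hold a constant height wherever the variable is missing
    gen y`y' = `y' if missing(`v')
    local which : var label `v'
    * fall back on the variable name if there is no variable label
    if `"`which'"' == "" local which `v'
    * collect the new variables and their axis labels
    local yvars `yvars' y`y'
    local ylabels `ylabels' `y' `"`which'"'
    local ++y
}

* one row of markers per variable, marking the observations that are missing
scatter `yvars' obsno, ylabel(`ylabels', angle(horizontal)) ytitle("") legend(off)
```

With the data sorted by panel identifier and year before obsno is generated, each row of the plot shows where one variable is missing, so a series that starts late or runs in reverse order should stand out against the others. A row with no markers means that variable has no missing values.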


On Thu, Nov 1, 2012 at 1:10 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> You could create variables like
>
> gen yxmiss = missing(y) - missing(x)
> gen long obs = _n
>
> scatter yxmiss obs if missing(y, x)
>
> On Wed, Oct 31, 2012 at 7:39 PM, László Sándor <sandorl@gmail.com> wrote:
>> Thanks, Nick.
>>
>> The values definitely don't line up that neatly, but that's a worry
>> for another day.
>>
>> Basically my problem is, if I know I can expect differences between
>> the variables, is there a neat way to compare their missing patterns
>> (one always starting early, or one mistakenly having the years in
>> reverse order)?
>>
>> On Wed, Oct 31, 2012 at 3:15 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>> If # different versions of the same data should be the same, there
>>> will be # duplicates of everything in a combined dataset.
>>>
>>> This applies to missings too.
>>>
>>> -duplicates- is therefore something that springs to mind. Panels are
>>> no problem, as panel identifiers are just other variables.
>>>
>>> Naturally, if the combined dataset is extremely large, this won't be
>>> very practical.
>>>
>>> Nick
>>>
>>> On Wed, Oct 31, 2012 at 7:02 PM, László Sándor <sandorl@gmail.com> wrote:
>>>
>>>> I have a panel-data cleaning problem that probably has some neat
>>>> solution, probably already out there. I am happy to try any solutions
>>>> for Stata 12.1 MP.
>>>>
>>>> Background: I had to try to look up supposedly the same data from
>>>> multiple sources. (Financial data for the same securities, but
>>>> different data sources were expected to cover different subsets of my
>>>> universe, or for different time periods.)
>>>>
>>>> But now I have a panel where I would like to cross-check different
>>>> versions of the same data, and most crucially, I would like to verify
>>>> that I got the years correctly for each version. (FYI: financial data
>>>> sources can be opaque about how they handle missing data if you ask
>>>> for "end-of-year prices for the last 15 calendar years", and whether
>>>> they give years in ascending or descending order). For this, I would
>>>> like to compare what periods I have non-missing values for a family of
>>>> variables, say, bloomberg_price and reuters_price.
>>>>
>>>> Presumably, if I got the start and the end years right, I could hope
>>>> to -compare- those (e.g. -compare *_price_first-), and hope that the
>>>> patterns will be clear.
>>>>
>>>> That said, I'm afraid some more nuanced analysis of missing value
>>>> patterns might be justified. What are good tools for that? (How can I
>>>> "xtdes by variable"? Or "misstable pattern in a panel"?)
>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/


