Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: László Sándor <sandorl@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: upper limit on fweights? overflowing into missing values?
Date: Tue, 30 Jul 2013 08:34:06 -0400

Thanks, Richard.

Stata tech support got back to me and suggested something similar: some operations with fweights do overflow with such large weights, while others don't. I am not sure whether to call it a restriction hard-coded on some number somewhere or simply a consequence of the C implementation of -mf_quadcross- or something similar.

I think I tried to describe my use case: I wanted to calculate statistics on portfolios, and it makes sense to weight them by size. As -pwcorr- does not allow iweights, and pweights and aweights do something completely different, I thought I'd use fweights. It blows up unless I rescale the portfolios into thousands, millions or billions. Not a big deal, but Stata's (non-existent) error message, help and documentation were not exactly helpful in resolving this. StataCorp says they will address this.

I think what counts as an observation is a semantic issue here, and not a very helpful one. Is an entire portfolio "one observation", or a single share in each, or each dollar behind each? I am not sure this should matter either for us or for Stata.

Best,

Laszlo

On Mon, Jul 29, 2013 at 9:53 AM, Richard Williams <richardwilliams.ndu@gmail.com> wrote:
> Just to sum up my current thinking/guesses on this:
>
> * the maximum number of observations in Stata is 2,147,483,647
> * Nonetheless, fweighted data sets can have more observations than that
> * However, not all routines will work when the fweighted data has more than
> 2,147,483,647 cases. You can do some simple descriptive things, but you
> can't do more complicated things like regression or correlations.
> * As to why that is, I am guessing that some routines have the 2,147,483,647
> limit hardcoded in. Or, maybe there just isn't enough precision to handle
> calculations when the N is larger than that.
> * Given that most people don't have more than 2,147,483,647 cases (and even
> if they did, their computer memory couldn't handle them) StataCorp probably
> hasn't spent a lot of time worrying about this.
> * Still, an added sentence or two in the fweights documentation or elsewhere
> warning about limits might be a good idea.
>
> I am curious what the original author is doing that requires analyzing 4
> billion+ cases. Some sort of genetic research maybe? I've certainly never
> heard of any kind of Survey research having an N that large.
>
> At 06:53 PM 7/28/2013, Nick Cox wrote:
>>
>> This is interesting, but in principle I don't see that Stata's limit
>> on # of observations has any bearing on how big frequency weights can
>> be. I can imagine people wanting to use frequency weights to subvert
>> the limit on number of observations.
>>
>> A different point is that if there is a limit on how big weights can
>> be it should be documented e.g. at -help limits-.
>> Nick
>> njcoxstata@gmail.com
>>
>> On 29 July 2013 00:46, Richard Williams <richardwilliams.ndu@gmail.com>
>> wrote:
>> > According to -help limits-, the maximum number of observations is
>> > 2,147,483,647. Your weights give you more than 4 billion cases, well above
>> > that. Further, the help also says that this is a theoretical maximum; memory
>> > availability will certainly impose a smaller maximum.
>> >
>> > On my computer, I specified [fw = 1073741823] on the pwcorr command and
>> > it ran. Then I specified [fw = 1073741824] and it did not run. These numbers
>> > put you just below and just above the maximum number of cases that Stata
>> > allows.
>> >
>> > So in short, it appears that your fweighted cases can't exceed the 2
>> > billion+ that Stata allows, and memory restrictions may hold you to even
>> > less than that.
>> >
>> > Also, you probably need to specify that the fweight variable is type
>> > long, e.g.
>> >
>> > input y x long fw
>> >
>> > Sent from my iPad
>> >
>> > On Jul 27, 2013, at 12:36 PM, László Sándor <sandorl@gmail.com> wrote:
>> >
>> >> Hi,
>> >> If you care, here is an example that silently produces missing values.
>> >> I notified Stata Support.
>> >>
>> >> input y x fw
>> >> 2 1 2147483621
>> >> 1 2 2147483621
>> >> end
>> >> de
>> >> pwcorr y x [fw=fw]
>> >> exit
>> >>
>> >> Thanks,
>> >>
>> >> Laszlo
>> >>
>> >> On Sun, Jul 21, 2013 at 5:08 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> >>> I'd suggest documenting your problems with a reproducible example and
>> >>> sending Stata tech support.
>> >>>
>> >>> Nick
>> >>> njcoxstata@gmail.com
>> >>>
>> >>> On 21 July 2013 21:55, László Sándor <sandorl@gmail.com> wrote:
>> >>>> Hi,
>> >>>> in Stata/MP 12.1 I am getting missing values with using -pwcorr- with
>> >>>> -fweights- though the feature works fine with other data or if I
>> >>>> scale my weights down. Is it possible to simply have too large fweights,
>> >>>> e.g. if they cannot be of type -long- anymore?
>> >>>>
>> >>>> If so, why doesn't Stata warn me about this?
>> >>>>
>> >>>> I vaguely remember some Statalist or Stata blog discussion of this,
>> >>>> but I could not even Google it up, and Stata still did not warn me…
>> >>>>
>> >>>> Actually, why didn't Stata complain that I did not have integer
>> >>>> fweights if obviously the variable wasn't of type byte, int or long?
>> >>>>
>> >>>> Thanks,
>> >>>>
>> >>>> Laszlo
>> >>>>
>> >>>> *
>> >>>> * For searches and help try:
>> >>>> * http://www.stata.com/help.cgi?search
>> >>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> >>>> * http://www.ats.ucla.edu/stat/stata/
>
> -------------------------------------------
> Richard Williams, Notre Dame Dept of Sociology
> OFFICE: (574)631-6668, (574)631-6463
> HOME: (574)289-5227
> EMAIL: Richard.A.Williams.5@ND.Edu
> WWW: http://www.nd.edu/~rwilliam
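
The rescaling workaround mentioned in this message (expressing the weights in thousands, millions or billions) could be sketched in Stata roughly as follows. This is only an illustration under stated assumptions, not a fix confirmed by StataCorp: the data are the two-observation example from the thread, while the variable name `fw_k` and the divisor 1000 are hypothetical choices. The weight is read as `double` because 2147483621 is one above the largest value a `long` can hold (2,147,483,620) and would be rounded if stored as the default `float`.

```stata
* Sketch only: rescale very large frequency weights before -pwcorr-.
* fw_k and the divisor 1000 are illustrative assumptions, not from the thread.
clear
input y x double fw
2 1 2147483621
1 2 2147483621
end

* Total weighted N; here it is above the 2,147,483,647 observation limit.
quietly summarize fw
display %15.0fc r(sum)

* Express the weights in thousands and round, since fweights must be integers.
generate long fw_k = round(fw/1000)

* Correlation using the rescaled weights.
pwcorr y x [fw=fw_k]
```

Rescaling should leave the weighted correlation essentially unchanged, apart from the rounding of the weights, but it shrinks the implied number of cases, so anything that depends on N (for example significance levels) would no longer refer to the original counts.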

**Follow-Ups**:

- **Re: st: upper limit on fweights? overflowing into missing values?** *From:* Nick Cox <njcoxstata@gmail.com>

**References**:

- **st: upper limit on fweights? overflowing into missing values?** *From:* László Sándor <sandorl@gmail.com>
- **Re: st: upper limit on fweights? overflowing into missing values?** *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: upper limit on fweights? overflowing into missing values?** *From:* László Sándor <sandorl@gmail.com>
- **Re: st: upper limit on fweights? overflowing into missing values?** *From:* Richard Williams <richardwilliams.ndu@gmail.com>
- **Re: st: upper limit on fweights? overflowing into missing values?** *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: upper limit on fweights? overflowing into missing values?** *From:* Richard Williams <richardwilliams.ndu@gmail.com>
