# Re: st: how to check the reliability and validity of the data

From: Daniel Schneider
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: how to check the reliability and validity of the data
Date: Sun, 23 Mar 2008 01:45:30 -0700

You cannot assess representativeness in any such way. And it has nothing to do with reliability or validity (well, not a lot). What you COULD do is evaluate how likely it is that your two branches differ from all branches by mere chance, but for that you would need to define a hypothesis about all branches. For example, if you said 'I believe the average time is 10 minutes', you could (a) estimate the mean time in your two branches and (b) test how likely a difference of that size from 10 minutes would be by chance alone (that is, whether the difference is statistically significant). Now, with two out of 129 branches you will not have a lot of power to do this, and you should consider estimators that incorporate the fact that your population is finite (because 129 is a pretty small number). You could say that you have more than 2 observations because you have many observations within those two branches, but then you are actually looking at a multi-level model, or at least something that adjusts for the clustering.
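To make the power problem concrete, here is a minimal sketch in Python (not Stata) of the kind of test described above. It treats each branch as one observation, tests the branch-level mean against a hypothesized 10 minutes, and shrinks the standard error with the usual finite population correction, sqrt(1 - n/N). The branch means of 9 and 12 minutes are made up for illustration, the function name `cluster_mean_test` is hypothetical, and the whole thing assumes the two branches were a simple random sample of the 129, which may well not hold.

```python
import math
import statistics

def cluster_mean_test(branch_means, mu0, n_population):
    """One-sample t statistic on branch-level means, with a finite
    population correction (FPC) applied to the standard error."""
    n = len(branch_means)
    mean = statistics.fmean(branch_means)
    sd = statistics.stdev(branch_means)            # sample SD across branches
    fpc = math.sqrt(1 - n / n_population)          # FPC for sampling without replacement
    se = sd / math.sqrt(n) * fpc
    t = (mean - mu0) / se
    return mean, se, t, n - 1                      # last value: degrees of freedom

# Hypothetical data: mean activity time (minutes) in each sampled branch.
branch_means = [9.0, 12.0]
mean, se, t, df = cluster_mean_test(branch_means, mu0=10.0, n_population=129)

# Two-sided 5% critical value of the t distribution with 1 degree of freedom.
T_CRIT_DF1 = 12.706
significant = abs(t) > T_CRIT_DF1
print(f"mean={mean:.2f} se={se:.3f} t={t:.3f} df={df} significant={significant}")
```

With only two branches there is a single degree of freedom, so the critical value is about 12.7 and almost no realistic difference from 10 minutes would come out as significant. A multi-level model on the individual timings would use more of the data, but the number of branches would still be the binding constraint.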

All of this might sound like gibberish. I am sorry if it does, but what you are asking is much more a question about statistics in general than a question about Stata. You need to clearly define what your hypothesis is, and then some of us might be able to give you an indication of what command you should use. But you also first need to clearly understand the concepts of reliability, validity and representativeness.

George Huang wrote:

Dear Dave,

Thanks for your opinion. Actually, the 72,878 observations were measured with stopwatches in 2 branches (out of 129 branches). They represent 2,800 activity times of a bank. I would like to know whether the data are close to the mean (that is, whether these data are representative). However, I don't know how to use Stata to get the proper results.

Thank you,

George

----- Original Message ----- From: "David Airey" <david.airey@vanderbilt.edu>
To: <statalist@hsphsun2.harvard.edu>
Sent: Sunday, March 23, 2008 11:29 AM
Subject: Re: st: how to check the reliability and validity of the data

I will bet that others will also say you have not provided enough information to give a good answer. What do you mean by reliability and validity, exactly?

-Dave

On Mar 22, 2008, at 10:04 PM, George Huang wrote:

Dear all:

I have a database with 72,878 observations which can be grouped into 2,800 average times for different activities. Therefore, there are 2,800 means and 2,800 S.D.s for the activity times. How can I know the reliability and validity of each activity time? Should I use nonparametric statistics or a chi-square test?

Thanks a lot for your help!

George Huang
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/