Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: How to do cluster analysis

From	Alfonso Sanchez-Penalver <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: How to do cluster analysis
Date	Thu, 5 Dec 2013 19:21:57 -0500

Hi again,

Why exactly do you think that needs any fixing? Say you use the quintile boundaries I mentioned before, so that a country is in the top quintile one year and moves into the second quintile the next. There is nothing wrong with that in principle. It will let you estimate whether any factors make a country improve or worsen its ranking.
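
A minimal sketch of that idea, under stated assumptions: the data are simulated, the variable names country, year, return, rtncats, prevcat and move are illustrative only, and the built-in -xtile- is looped over periods as a stand-in for -egen, xtile()- from -egenmore-. It assigns quintiles within each period and then records how each country moves between quintiles from one period to the next.

```stata
* Hypothetical example data: 45 countries observed over 10 periods
clear
set seed 12345
set obs 45
generate country = _n
expand 10
bysort country: generate year = _n
generate return = rnormal(0.05, 0.2)

* Quintile of return within each period (1 = lowest, 5 = highest),
* using the built-in -xtile- command one period at a time
generate byte rtncats = .
levelsof year, local(years)
foreach y of local years {
    tempvar q
    xtile `q' = return if year == `y', nquantiles(5)
    replace rtncats = `q' if year == `y'
    drop `q'
}

* Declare the panel so that the lag operator L. works within country
xtset country year

* Previous period's quintile and the change since then
* (both are missing in each country's first period)
generate prevcat = L.rtncats
generate move = rtncats - prevcat

* Transition table: last period's quintile against this period's
tabulate prevcat rtncats
```

A positive value of move means the country climbed to a higher quintile, and a negative value means it dropped. The variable move (or an indicator built from it) could then serve as the outcome in a model of what makes a country improve or worsen its ranking.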

Alfonso Sanchez-Penalver

> On Dec 5, 2013, at 6:57 PM, Xixi Lin <[email protected]> wrote:
> 
> Hi Alfonso,
> You are absolutely right that I want different boundaries for different periods. The problem is that a country will then be in different groups in different periods, and I don't know how to solve this issue.
> 
> Best,
> Xixi Lin
> 
> Sent from my iPhone
> 
>> On Dec 5, 2013, at 4:00 PM, Alfonso Sánchez-Peñalver <[email protected]> wrote:
>> 
>> Hi Xixi,
>> 
>> let us be more specific so we can clarify this and find a solution, because I can think of several ways you might want to do this. The idea is that for each period you want to break the countries into 5 groups. Will the return boundaries for the 5 groups be the same for each period, or will they depend on the performance across countries in each period? If the boundaries are the same for each period, then
>> 
>> gen cat1 = return > 0.8
>> 
>> works. This sets cat1 equal to 1 for all observations that have a return greater than 0.8 and 0 otherwise. (Stata treats missing values as larger than any number, so add -if !missing(return)- if return has missing values.) This will most likely give a different number of countries in each category in different periods. If, on the other hand, you want the boundaries to vary across periods so that, for example, the returns are split into each year's quintiles, then I suggest you download -egenmore- from SSC, after which you can do
>> 
>> egen rtncats = xtile(return), by(year) nq(5)
>> 
>> This creates a variable rtncats that takes 5 values (one per quintile) within each year, but a country may be in one category one year and in another category the year after.
>> 
>> Best,
>> 
>> Alfonso.
>> 
>>> On Dec 5, 2013, at 3:17 PM, Xixi Lin <[email protected]> wrote:
>>> 
>>> Hi Alfonso,
>>> 
>>> I plan to evaluate it by period. However, if it is by period, then I
>>> have to group them period by period, which means I will have 500
>>> different group arrangements in 500 periods, and those group
>>> arrangements may contradict one another. This is a big headache. Hopefully
>>> I have explained my situation clearly. ^_^
>>> 
>>> I know it would be easier to just group them based on their average return.
>>> 
>>> Best,
>>> Xixi Lin
>>> 
>>> On Thu, Dec 5, 2013 at 3:04 PM, Alfonso Sánchez-Peñalver
>>> <[email protected]> wrote:
>>>> Hi Xixi,
>>>> 
>>>> would the basis of the grouping be per year, or overall performance? I mean, how would you evaluate return performance: by year or by some sort of average? If it’s by year then I suggest
>>>> 
>>>> gen cat1 = return > 0.80
>>>> 
>>>> where 80% is the benchmark you want to generate the category for. Now, if it’s by overall performance you can generate the mean returns by country
>>>> 
>>>> bysort country: egen meanreturns = mean(return)
>>>> 
>>>> and then create the dummy variables for the categories you want
>>>> 
>>>> gen cat1 = meanreturns > 0.80
>>>> 
>>>> Best regards,
>>>> 
>>>> Alfonso
>>>> 
>>>> 
>>>>> On Dec 5, 2013, at 2:28 PM, Xixi Lin <[email protected]> wrote:
>>>>> 
>>>>> Hi Nick,
>>>>> 
>>>>> I checked out the cluster help; however, I still don't know what to use.
>>>>> The thing here is that I have 45 countries, and each country has a time
>>>>> series of returns for 500 weeks. So basically it is a panel dataset with 45
>>>>> countries, each with 500 return observations. I want to break the 45
>>>>> countries into several groups, for example 5 groups, based on their
>>>>> time series return performance.
>>>>> 
>>>>> Thanks.
>>>>> 
>>>>> Best,
>>>>> Xixi Lin
>>>>> 
>>>>>> On Tue, Nov 26, 2013 at 7:01 PM, Nick Cox <[email protected]> wrote:
>>>>>> You can classify on one variable. In many ways it is easier just to
>>>>>> look at a graph of the distribution and check for gaps. If you want a
>>>>>> more formal method, check out -group1d- from SSC as well as -cluster-.
>>>>>> 
>>>>>> Nick
>>>>>> [email protected]
>>>>>> 
>>>>>> 
>>>>>>> On 26 November 2013 21:55, Xixi Lin <[email protected]> wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I am wondering how to break data down into clusters. For example, I have 45
>>>>>>> countries, and I want to break those countries into several clusters
>>>>>>> based on country returns. Does anyone know how to do that in Stata?
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Xixi Lin
>>>>>>> *
>>>>>>> *   For searches and help try:
>>>>>>> *   http://www.stata.com/help.cgi?search
>>>>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>>>>> *   http://www.ats.ucla.edu/stat/stata/
