



Re: st: match string variables


From   "Lukas Bösch" <L.Boesch@gmx.de>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: match string variables
Date   Thu, 14 Apr 2011 22:56:33 +0200

Thank you all.

Now that I have restricted the species list to the first two words, everything works fine. I really should have noticed that only the first two words are the taxonomic classification, but I am not a biologist...
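
For readers who land on this thread, here is a minimal sketch of that fix. It assumes the variable layout shown in the quoted question below; the example rows, the `key` variable, and the tempfile name are made up for illustration and are not the actual data or code used.

```stata
* Hypothetical species list for one country (names as copied from the CITES site)
clear
input str2 country str40 species
"AF" "Accipiter badius"
"AF" "Acipenser nudiventris Lovetzky"
"AF" "Falco cherrug Gray"
end
* Keep only the first two words (genus and species), in lower case
generate key = trim(lower(word(species, 1) + " " + word(species, 2)))
keep country key
duplicates drop
tempfile specieslist
save `specieslist'

* Hypothetical export data in long form
clear
input int year str40 taxon str2 country
1990 "Falco Cherrug" "AF"
1991 "Falco Cherrug" "AF"
1990 "Ara macao" "AF"
end
generate key = trim(lower(word(taxon, 1) + " " + word(taxon, 2)))

* Many export rows per taxon, one species-list row per country and taxon
merge m:1 country key using `specieslist', keep(master match)
generate byte indigenous = (_merge == 3)
drop _merge
list year taxon country indigenous
```

Lower-casing both sides is an extra assumption that handles the "Falco Cherrug" versus "Falco cherrug" difference; keep(master match) drops species that never appear in the export data.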


-------- Original Message --------
> Date: Thu, 14 Apr 2011 09:20:06 -0400
> From: "Dimitriy V. Masterov" <dvmaster@gmail.com>
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: match string variables

> Lukas,
> 
> After standardizing the strings as Nick suggested, I think you might
> try the user-written command -strgroup- from SSC. It does not
> work on 64-bit Windows, and you will still have to do a lot of manual
> checking, but it may make your project a bit easier. I would try
> merging the two datasets (or perhaps using -nearmrg-) and then using
> -strgroup- on the _merge!=3 group.
> 
> DVM
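
The merge-then-inspect step suggested above might be sketched as follows. This assumes both files have already been standardized to one row per country and name; the rows, the `name` variable, and the misspelling "accipiter badus" are hypothetical. Any fuzzy matching with -strgroup- would then be run on what remains, following its own help file.

```stata
* Hypothetical, already standardized species list
clear
input str2 country str40 name
"AF" "accipiter badius"
"AF" "falco cherrug"
end
tempfile specieslist
save `specieslist'

* Hypothetical, already standardized export taxa (note the misspelling)
clear
input str2 country str40 name
"AF" "accipiter badus"
"AF" "falco cherrug"
end

merge 1:1 country name using `specieslist'
* _merge == 1: export data only; _merge == 2: species list only; 3: matched
keep if _merge != 3
sort country name
list country name _merge, noobs
```

Sorting the unmatched names puts near-duplicates such as "accipiter badius" and "accipiter badus" next to each other, which makes the manual check easier.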
> 
> On Wed, Apr 13, 2011 at 5:18 PM, "Lukas Bösch" <L.Boesch@gmx.de> wrote:
> > Dear Stata Community.
> >
> > I am working with the CITES trade data, and my aim is to analyze the
> > exports of 130 countries from 1990 to 2009 with a logistic model. CITES
> > regulates the international trade in endangered species. The export data for
> > Afghanistan, for example, look like this:
> >
> > year   taxon           term   unit   country   value
> >
> > 1990   Falco Cherrug   live   -      AF        0
> > 1991   Falco Cherrug   live   -      AF        0
> > 1992   Falco Cherrug   live   -      AF        0
> > 1993   Falco Cherrug   live   -      AF        0
> >
> > In the case of Afghanistan, the data contain 180 rows with nine
> > different taxa. In some cases there are up to 8000 rows with 2000 taxa. I
> > know that I could also have shown the data in wide form with far fewer
> > rows...
> >
> > Now I want to create a variable “indigenous” equal to 1 if the exported
> > taxon exists in the country and 0 if not. To get this, I copied the
> > species lists for all 130 countries from the CITES homepage. The list looks
> > like this (again for Afghanistan):
> >
> > Accipiter badius (Gmelin, 1788)
> > Accipiter gentilis (Linnaeus, 1758)
> > Accipiter nisus (Linnaeus, 1758)
> > Acinonyx jubatus (Schreber, 1775)
> > Acipenser nudiventris Lovetzky, 1828
> >
> > This list contains 131 taxa, and I cleaned it up to get rid of
> > the years, the commas and so on.
> >
> > Accipiter badius
> > Accipiter gentilis
> > Accipiter nisus
> > Acinonyx jubatus
> > Acipenser nudiventris Lovetzky
> >
> > I have tried different variations of merge and joinby, and I looked at the
> > ado files _gsoundex, nearmrg, nmatch and reclink, but I haven’t been able
> > to create the “indigenous” variable so far.
> > There are two major problems. The first is that the taxon in the
> > species list doesn’t always match the taxon in the export data exactly.
> > For example, Falco cherrug in the export data is listed as Falco Cherrug
> > Gray in the species list. The second problem is that the species list has a
> > different number of observations than the export data, and the two don’t fit
> > together logically.
> > I need something like: if the taxon from the export data is on the
> > species list, then “indigenous” = 1; if the taxon from the export data is
> > not on the species list, then “indigenous” = 0.
> >
> > Maybe someone has an idea and can give me a hint on how to do this.
> >
> > Thank you very much
> >
> > Lukas
> >
> > *
> > *   For searches and help try:
> > *   http://www.stata.com/help.cgi?search
> > *   http://www.stata.com/support/statalist/faq
> > *   http://www.ats.ucla.edu/stat/stata/
> >
> 


