Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: "Martin Weiss" <firstname.lastname@example.org>
Subject: AW: st: Binary Variables
Date: Wed, 2 Jun 2010 18:53:14 +0200
This is an inefficient way to generate dummies. You could simply add -i.REGION- to your -regress- call, and Stata would do the work for you. Or expand via -xi-, as discussed in another thread today.
-generate SchleswigHolstein = REGION == 1- gets you a true dummy in one step, should you still wish to create them yourself.
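The two suggestions above might look roughly like this in practice. This is a sketch only: REGION, fertilisers_Var and ClimateVariables are taken from the commands quoted below, while SH_dummy and the stub reg_ are hypothetical names chosen so that nothing clashes with variables that already exist.

```stata
* Sketch only; SH_dummy and reg_ are hypothetical names.

* Option 1: factor-variable notation (Stata 11 or later).
* Stata builds the indicators itself and omits one base category.
regress fertilisers_Var i.REGION ClimateVariables

* To choose the base category explicitly, e.g. the region coded 1:
regress fertilisers_Var ib1.REGION ClimateVariables

* Option 2: a true 0/1 dummy in one step.
* The -if- qualifier keeps the dummy missing where REGION is missing,
* instead of silently coding those observations as 0.
generate byte SH_dummy = (REGION == 1) if !missing(REGION)

* Option 3: one indicator per category of REGION in a single command.
tabulate REGION, generate(reg_)
```

With Option 1 there are no hand-made dummies to keep in sync, so a forgotten -replace ... = 0- cannot leave missing values in an indicator.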
From: email@example.com [mailto:firstname.lastname@example.org] On Behalf Of Natalie Trapp
Sent: Wednesday, 2 June 2010 18:42
Subject: Re: st: Binary Variables
Thank you very much to all of you for helping me!!
Well, I will try to explain the problem again:
First I generated the dummy variables:
. encode region, generate(REGION)
. generate SchleswigHolstein= REGION if REGION == 1
(43657 missing values generated)
. replace SchleswigHolstein=0 if SchleswigHolstein== .
(43657 real changes made)
. generate Saarland= REGION if REGION == 2
(43918 missing values generated)
. replace Saarland=1 if Saarland== 2
(119 real changes made)
Then, I regressed the models:
. reg fertilisers_Var Saarland Brandenburg MeckPomm Sachsen
SachsenAnhalt Thueringen IleDeFrance Champagne Picardie HauteNormandie
Centre BasseNormandie Bourgogne NordPasDeCalais LorraineAlsace
FrancheComte PaysDeLaLoire Bretagne PoitouCharentes Aquitaine
MidiPyrenees Limousin RhonesAlpes Auvergne LanguedocRoussillon
ProvenceAlpesCote Corse ValleDAoste Piemonte Lombardia Trentino
AltoAdige Veneto FriuliVenezia Liguria EmiliaRomagna Toscana Marche
Umbria Lazio Abruzzo Niedersachsen Molise Campania Calabria Puglia
Basilicata Sicilia Sardegna Belgium Vlaanderen Wallonie Luxembourg
Netherlands Denmark Ireland EnglandNorth EnglandEast EnglandWest Wales
Scotland NorthernIreland MakedoniaThraki IpirosPeoponissos Thessalia
StereaEllas NRW Galicia Asturias Cantabria PaisVasco Navarra LaRioja
Aragon Cataluna Baleares CastillaLeon Madrid CastillaLaMancha Valencia
Murcia Extremadura Andalucia Canarias Hessen EntreDouroEMinho
TrasOsMontes RibatejoEOeste AlentejoEDoAlgarve Acores Austria EtelaSuomi
SisaSuomi Pohjanmaa RheinlandPfalz PohjoisSuomi Slattbygdslan
SkogsOchMellan LanINorra Czech Estonia KoezepMagyarorszag KoezepDunantul
NyugatDunantul DelDunantul EszakMagyarorszag EszakAlfoeld DelAlfoeld
Latvia Lithuiana PomorzeAndMazury WielkopolskaAndSlask
MazowszeAndPodlasie BadenWuerttemberg MalopolskaAndPogorze Slovakia
Slovenia Bayern ClimateVariables Oats Barley Rye Wine *(and many more
Crops and Variables)*
@ Steve: The omitted regions do change when I change the independent
variables. And I have the same problem with some other binary
variables, such as farm sizes and farm types.
There are no missing data for the regions and I can most certainly say
that each farm in this dataset has a different value for the fertiliser
input because it's measured in €.
I also thought the regions cannot be similar, because they have
different temperatures, soil qualities, precipitation rates and so on.
Therefore the South of France must be different from the North of Germany.
I also couldn't sort out how to make the "xtreg" command work. It gives
me the error "not sorted r(5);", even when I sorted the data and then
typed the "xtreg" command (maybe because there are too many variables or
maybe because I have cross sectional data?!).
Thank you very much once again for your patience and kind assistance,
On 6/2/2010 4:14 PM, Neil Shephard wrote:
> On Wed, Jun 2, 2010 at 2:03 PM, Natalie Trapp<email@example.com> wrote:
>> Hi Neil,
>> I use Stata 11 and do a normal OLS estimation (with the "reg" command):
>> y = dependent variable (agricultural inputs)
>> x = independent variables (climate variables, crops, etc.) and dummy
>> variables that represent the 150 regions within the EU
> This is _not_ showing what you are typing. If you are using -regress-
> then I would expect you to have included something along the lines of
> regress agricultural_inputs temperature crops i.region
>> The coefficients of the dependent variable within each region are very
>> diverse and significant for about 120 regions.
>> My problem is, for instance, that when "Schleswig Holstein" is my reference
>> group, Stata additionally omits Valle d'Aoste, Vlaanderen and Ile de France.
> Could it be that there is missing data within your data structure such
> that most observations for these regions are omitted and the few that
> remain all have the same value of "agricultural_inputs"?
>> Still, I don't quite understand why Stata does it, because the regions
>> (Germany, France, Netherlands) do not seem to be similar to me.
> "Seems" is a vague term and is based on your subjective interpretation
> of what you are expecting, and it need not be because you have the
> data, you can look at it. Check the patterns of missing data that
> exist and how these pan out within the regions, in particular those
> that are being omitted.
> Stata will be omitting them for a reason (and it will often indicate
> why a particular category has been dropped).
> So again, pasting the _exact_ command you are entering and the
> resulting output would be very informative to other list members. You
> can copy and paste from the Results window directly into an email.
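The checks Neil describes (patterns of missing data, and which observations from each region actually enter the model) could be run along the following lines. This is only a sketch under the assumption that the variable names match those in the commands quoted earlier in the thread; it is not a diagnosis of this particular dataset.

```stata
* Sketch only; variable names follow the thread.

* Which variables contain missing values, and how many?
misstable summarize

* Fit the model, then tabulate the regions among the observations
* that were actually used in estimation.
regress fertilisers_Var i.REGION ClimateVariables
tabulate REGION if e(sample)

* A dummy created with -generate ... if- is missing wherever the
* condition is false, unless it is later replaced with 0.
* Count any such leftover missing values.
count if missing(Saarland)
```

If -tabulate- shows a region with very few or no observations in e(sample), that region's indicator is a natural candidate for being omitted.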
* For searches and help try: