
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


AW: st: Binary Variables

From   "Martin Weiss" <>
To   <>
Subject   AW: st: Binary Variables
Date   Wed, 2 Jun 2010 18:53:14 +0200


This is an inefficient way to generate dummies. You could simply add -i.REGION- to your -regress- call, and Stata would create the indicators for you. Or expand via -xi-, as discussed in another thread today.

-generate SchleswigHolstein = REGION == 1- gets you a true 0/1 dummy in one step, should you still wish to create them yourself.
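
A minimal sketch of these options, using the shipped auto dataset as a stand-in: -rep78- plays the role of your REGION, and the names rep3 and repdum are made up for illustration. Note also that the Saarland steps quoted below, as posted, never recode the missing values to 0, and -regress- drops any observation with a missing regressor, so counting missings in each dummy may be worthwhile.

```stata
* Sketch only: auto.dta and rep78 stand in for your data and REGION
sysuse auto, clear

* One-step 0/1 dummy; the -if- leaves it missing only where rep78 is missing
generate byte rep3 = (rep78 == 3) if !missing(rep78)

* Check for leftover missings: -regress- silently drops those observations
count if missing(rep3)

* Full set of dummies in one call: creates repdum1, repdum2, ...
tabulate rep78, generate(repdum)

* Or let factor-variable notation build the indicators on the fly
regress price weight i.rep78

* Choose the base (reference) category explicitly, here category 3
regress price weight ib3.rep78
```

With factor variables Stata picks the base category itself and reports which levels it omits, so you need not maintain 150 hand-made variables.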


-----Original Message-----
From: [] On behalf of Natalie Trapp
Sent: Wednesday, 2 June 2010 18:42
Subject: Re: st: Binary Variables

Thank you very much to all of you for helping me!!

Well, I will try to explain the problem again:
First I generated the dummy variables:

. sort region
. encode region, generate(REGION)

. *SchleswigHolstein
. generate SchleswigHolstein= REGION if REGION == 1
(43657 missing values generated)
. replace SchleswigHolstein=0 if SchleswigHolstein== .
(43657 real changes made)

. *Saarland
. generate Saarland= REGION if REGION == 2
(43918 missing values generated)
. replace Saarland=1 if Saarland== 2
(119 real changes made)

Then, I regressed the models:

. reg fertilisers_Var Saarland Brandenburg MeckPomm Sachsen 
SachsenAnhalt Thueringen IleDeFrance Champagne Picardie HauteNormandie 
Centre BasseNormandie Bourgogne NordPasDeCalais LorraineAlsace 
FrancheComte PaysDeLaLoire Bretagne PoitouCharentes Aquitaine 
MidiPyrenees Limousin RhonesAlpes Auvergne LanguedocRoussillon 
ProvenceAlpesCote Corse ValleDAoste Piemonte Lombardia Trentino 
AltoAdige Veneto FriuliVenezia Liguria EmiliaRomagna Toscana Marche 
Umbria Lazio Abruzzo Niedersachsen Molise Campania Calabria Puglia 
Basilicata Sicilia Sardegna Belgium Vlaanderen Wallonie Luxembourg 
Netherlands Denmark Ireland EnglandNorth EnglandEast EnglandWest Wales 
Scotland NorthernIreland MakedoniaThraki IpirosPeoponissos Thessalia 
StereaEllas NRW Galicia Asturias Cantabria PaisVasco Navarra LaRioja 
Aragon Cataluna Baleares CastillaLeon Madrid CastillaLaMancha Valencia 
Murcia Extremadura Andalucia Canarias Hessen EntreDouroEMinho 
TrasOsMontes RibatejoEOeste AlentejoEDoAlgarve Acores Austria EtelaSuomi 
SisaSuomi PohjanmaaRheinlandPfalz PohjoisSuomi Slattbygdslan 
SkogsOchMellan LanINorra Czech Estonia KoezepMagyarorszag KoezepDunantul 
NyugatDunantul DelDunantul EszakMagyarorszag EszakAlfoeld DelAlfoeld 
Latvia Lithuiana PomorzeAndMazury WielkopolskaAndSlask 
MazowszeAndPodlasie BadenWuerttemberg MalopolskaAndPogorze Slovakia 
Slovenia Bayern ClimateVariables Oats Barley Rye Wine *(and many more 
Crops and Variables)*

@ Steve: The omitted regions do change when I change the independent 
variables. I also have the same problem with some other binary 
variables, such as farm sizes and farm types.

There are no missing data for the regions and I can most certainly say 
that each farm in this dataset has a different value for the fertiliser 
input because it's measured in €.

I also thought the regions cannot be similar, because they have 
different temperatures, soil qualities, precipitation rates and so on. 
Therefore the South of France must be different from the North of Germany.

I also couldn't sort out how to make the "xtreg" command work. It gives 
me the error "not sorted r(5);", even when I sorted the data and then 
typed the "xtreg" command (maybe because there are too many variables or 
maybe because I have cross sectional data?!).

Thank you very much once again for your patience and kind assistance,

On 6/2/2010 4:14 PM, Neil Shephard wrote:
> On Wed, Jun 2, 2010 at 2:03 PM, Natalie Trapp<>  wrote:
>> Hi Neil,
>> I use Stata 11 and do a normal OLS estimation (with the "reg" command):
>> y = dependent variable (agricultural inputs)
>> x = independent variables (climate variables, crops, etc.) and dummy
>> variables that represent the 150 regions within the EU
> This is _not_ showing what you are typing, if you are using -regress-
> then I would expect you to have included something along the lines
> of....
> regress agricultural_inputs temperature crops i.region
>> The coefficients of the dependent variable within each region are very
>> diverse and significant for about 120 regions.
>> My problem is, for instance, that when "Schleswig Holstein" is my reference
>> group, Stata additionally omits Valle d'Aoste, Vlaanderen and Ile de France.
> Could it be that there is missing data within your data structure such
> that most observations for these regions are omitted and the few that
> remain all have the same value of "agricultural_inputs"?
>> Still, I don't quite understand why Stata does it, because the regions
>> (Germany, France, Netherlands) do not seem to be similar to me.
> "Seems" is a vague term and is based on your subjective interpretation
> of what you are expecting, and it need not be because you have the
> data, you can look at it.  Check the patterns of missing data that
> exist and how these pan out within the regions, in particular those
> that are being omitted.
> Stata will be omitting them for a reason (and it will often indicate
> why a particular category has been dropped).
> So again, pasting the _exact_ command you are entering and the
> resulting output would be very informative to other list members.  You
> can copy and paste from the Results window directly into an email.
> Neil

*   For searches and help try:
