Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: Model Building

| From | To | Subject | Date |
|---|---|---|---|
| Stefan Nijssen | statalist@hsphsun2.harvard.edu | st: Model Building | Tue, 26 Apr 2011 14:07:38 +0200 |

```
Dear Statalist members,

I am trying to find the best indicators (independent variables,
corporate fundamental data) to predict the dependent variable (the
risk factor, 'oas'). I want to build a model that predicts 'oas' well
with as few variables as possible, so I need to select from all the
variables available (see below). However, the first step seems to be
finding the correct functional form relating each independent variable
to the dependent variable. Plotting each variable separately against
'oas', I find that some relationships look linear, some clearly
quadratic, and others fractional polynomial. It seems logical to me to
transform the variables into their correct form first, and then work
out which to include and which to leave out. I thought of using the
regress command to test the functional form. However, I do not see how
to create a quadratic variable (other than generating var*var) before
using it in the regression, and a fractional polynomial will be even
more difficult. Can anyone suggest what to do?

After doing so, I was thinking of using Multiple Discriminant Analysis
to reduce the number of variables.

Yours,

Stefan Nijssen

The dataset:

oas	ebit	sales	intcov	totass	totdeb	ltdeb	totcap	worcap	margin	operex	dtoe	equity	lev	rating2
85	5380643	28251654	13.19	43681540	14452051	10497732	30656052	6427401	14.15	24461264	71.32	20265092	0.52	2
303	281856.9	2665473	3.54	3796766.31	1508258.5	1315873.5	3223983	68471	7.31	2550578.3	79.66	1893310.5	0.7	6
463	1364563	7401812.5	3.08	22017880	7853000	6879250	14402250	291750	7.88	6122000	107.53	7302908.5	0.94	4
198	2647250	13573438	3.09	44021380	17441250	15316000	27260250	-1575500	8.91	11377000	148.59	11738033	1.3	4
269	3704500	14422250	2.07	35880445.63	18933500	17032000	25258250	2136250	3.23	11825250	448.16	4224695.5	4.03	4
427	173537.5	6206356	2.37	4271637.5	639125	635425	1476400	1042875	0.56	5841525	77.84	821048.94	0.77	5
135	1641008	19334660	12.26	21852596.13	3422023.5	1842737.8	9437302	838316.31	5.18	18850387	45.76	7479015.5	0.25	4
145	4551831	44197292	3.18	72634330.63	19065608	15252120	40214560	8678020	7.76	40345364	77.59	24572248	0.62	3
150	4194342	14754657	5.45	28717164	13835568	11898647	19494606	3414371.8	18.76	11092639	218.77	6324361.5	1.88	3
140	1438059	3687984.5	1.49	24024376.88	17841316	16468203	21458064		12.31	3140013.3	521.25	3422778.3	4.81	3
220	203325.6	676520.69	1.15	7101090.5	3507231	2940892	6296144		13.52	618380.5	134.85	2600887	1.13	4
256	4631625	12534625	4.86	50502630	13211750	12422000	31765500	979500	20.93	8783500	70.73	18680452	0.66	4
138	261468	829771.25	1.51	7099677.69	3731297	3351377	6441615		34.88	710694.5	122.2	3053434.5	1.1	4
148	261468	829771.25	1.51	7099677.69	3731297	3351377	6441615		34.88	710694.5	122.2	3053434.5	1.1	4
695	82975.38	904324.31	1.14	1665280.06	775164.75	741634.5	1455926.9	-8168.0601	0.28	899990	113.89	680625.81	1.09	6
94	216452.8	1611485.3	3.15	2646279.44	1288793.6	1276829.4	2134875.8	424008	4.87	1539896	151.24	852151.31	1.5	6
394	71108.94	470080.44	1.45	790061.69	497222.25	497222.25	691091.25	125072.25	3.06	405778.5	269.81	184284.36	2.7	6
121	3494500	15628314	6.26	35815069.38	9345333	8841667	20532666	-706000	11.92	12375667	79.9	11696775	0.76	4
340	205203.3	593503.31	9.72	1973646.38	934802.25	923577.25	1677409.8	-89585.25	18.62		122.96	760249.06	1.21	6
166	889048.6	1494797.3	2.82	11312311.38	6567463.5	5966816.5	10616108		38.11	444483.75	164.13	4001440	1.49	4
583	176160.1	308049.94	4.86	1076181.94	591622.5	349243.75	743215		27.91	1214527.3	159.18	371663.03	0.94	5
180	1447188	18371130	11.08	41088878.75	2391000	2151000	7259000		5.24	4664822	49.77	4803857.5	0.45	4
450	541709	6160482.5	2.85	8880104	4277432	3606781	6622840	1553109.5	2.34		223.4	1914718	1.88	5
298	3863553	77601392	2.44	85151294.19	31085222	18770020	34849728	6661573.5	1.52	17478500	208.22	14929207	1.26	5
128	1534585	9863095	4.59	10252978.56	4923419.5	4713170	6494518	777069.5	8.59	6020838.8	298.73	1648123.8	2.86	4
274	460957.8	4364300	10.08	14315121.25	1396855	883192.75	4599718		8.27	8585884.3	39.18	3564997.3	0.25	4
53	2605694	18664554	9.39	18346915	6482850	5699125	11169575	1040000	9.47	7014269.5	131.28	4938091.5	1.15	3
181	525481.3	2691643.8	38.87	3703062.5	800775	800775	3187250	629400	13.14	3851485.5	33.25	2408527	0.33	4
434	304058.6	8118761	7.02	3632280.88	1160131.8	1101497	2577599	1261795.5	1.47	16293825	79.96	1450958.1	0.76	4
286	531000	6554125	1.26	14792500	6698000	6041000	8780000	719666.69	1.59	2532000	280.79	2385441	2.53	5
122	1228381	9417844	7.81	12637412.5	4106456.3	3509531.3	8877925	611050	9.21	5949666.7	79.18	5186393	0.68	3
135	7207145	26375682	5	76487644	41771588	36765752	66292060	-5526529	11.74	8008593.8	151.55	27562908	1.33	4
153	7169938	29623442	2.66	132017312.5	71461000	45556248	58932752		10.66		548.05	13039079	3.49	3
138	2290250	10252062	16.78	15527253.13	1828187.5	1676500	11110500	3320000	14.48	21547524	19.38	9433068	0.18	3
357	1276806	15411878	1.81	24708428.75	11674496	5704248	12501244	3870498.5	2.23	25365250	173	6748263.5	0.85	5
198	175700.8	1007057.3	2.63	3038026.31	1233305.8	1148982.5	2275610.8	133983.5	13.72		108.62	1135431.5	1.01	4
148	302575	3533750	6.92	3164375	957000	817000	2513250	548750	4.61	8107937.5	56.22	1702165.5	0.48	4
475	99679.25	1465177	1.81	1858617.19	748078.25	742457.75	1568498.5	313015.75	-0.95	14234758	90.51	826491.63	0.9	6
289	2436011	27316448	4.6	28189748.88	8609682	7675119	20332920	3673255.8	5.42	865382.5	69.78	12338324	0.62	4
103	2452813	10048750	5.11	26310131.25	7940250	7482000	16065750	207500	13	1933380.3	92.79	8557226	0.87	4
135	476165.8	1923645	7.62	2265192.5	1438719.3	1311237.5	1466039.4	514408.94	18.31	25573162	297.72	483245.78	2.71	4
123	2574000	11116812	10.43	153419556.3	33804752	5405250	19493000		5.13	7645500	254.22	13297439	0.41	3
253	830500	11135625	3.18	15503938.75	3523000	3214500	9670000	1689500	7.75	79675488	55.41	6358632	0.51	4
210	5349000	20837010	4.72	60373010					16.07	34782286				3
90	1939410	4605238	25.42	14894076.5	2775585.3	2702085.3	11760652	240683	26.16	1486366.5	29.26	9486748	0.28	3
295	210577	880666.25	0.79	8760320.75	5247260	4437147.5	7444190		-2.57	8077500	225.08	2331312.8	1.9	5
220	1270822	26866490	4.22	13348663.75	3895161.5	3553098.3	9396833	213236	2.27	9792500	65.25	5969596	0.6	4
185	1270822	26866490	4.22	13348663.75	3895161.5	3553098.3	9396833	213236	2.27	9792500	65.25	5969596	0.6	4
237	1213502	7020566	8.06	8162819.31	3180375	3008315	6756995	342366.75	8.14	3416891.3	86.02	3697358	0.81	4
111	891058	3396809.8	4.04	9885560	3760772.8	3623771.3	8834517	526434.5	10.02	809219.5	83.96	4479243.5	0.81	4
135	2280688	12996000	2.96	40561443.13	11723500	10693500	21409250	1287500	8.76	26055367	122.16	9596644	1.11	3
394	764500	4659875	0.72	23815319.38	13790250	13193500	18387000	-646000	3.06	26055367	431.41	3196571.8	4.13	5
395	148101	1340106.8	4.86	1958709.31	505149.5	490077	1696372.8	532718.75	7.45	5944816	41.81	1208202.6	0.41	5
242	883022.4	7233008.5	4.46	9438240.25	6107244.5	6066794.5	10428866	238687.25	5.2	2475224.3	146.53	4167819.3	1.46	4
172	2340448	11540884	3.61	35344855	11808253	10971290	19517114	1249746.5	10.48	10630750	143.54	8226454.5	1.33	4
234	121671.3	1099781.5	3.15	3085560.06	803425.75	698507.63	2078448.5	992426.06	7.4	2785500	58.16	1381510	0.51	5
184	2841823	15281568	3.86	40004167.94	17852298	14944255	24606332	-4342231.5	13.36	1272444	226.75	7873097	1.9	4
151	7169938	29623442	2.66	132017312.5	71461000	45556248	58932752		10.66		548.05	13039079	3.49	3
397	657883.1	2219391.3	1.8	7770782.31	5632487	5559101	7325540	34978	9.15	9446279	752.49	748515.69	7.43	5
403	-149778.2	1765832.5	-3.06	779485.94	337975.5	315692.5	468990	200225.5	-14.17	943569.19	437.64	77226.828	4.09	5
111	1874938	5725187.5	20.17	16386063.13	1866250	1808250	16321750	4051000	51.21	15559530	13.25	14087564	0.13	4
254	1450000	18632000	2.37	22328000					3.01	25365250				5
495	167043.8	3405106.3	0.93	4236662.5	2610375	2593175	3129425	354350	-1.69	1880957.8	660.03	395491.88	6.56	6
282	489044.6	1045975.8	1.53	11641917.63	5937713	5483499.5	11472368		41.14	1852443.5	120.51	4927358	1.11	4
88	6313813	75159320	11.51	46344378.75	11193250	9788250	28615312	2874750	4.8	4830750	60.11	18622440	0.53	4
673	208767.2	1667810.8	3.63	4546824.94	1645275	1597527.8	3205995.3	223982	2.09		119.55	1376223.4	1.16	6
174	3437022	16215220	20.79	22440903.44	2870735.8	2700069.5	15749543	553035.44	15.19	3464900	22.07	13010359	0.21	4
291	620937.5	4919500	1.58	12033875	5722750	5328250	11385000		5.69	899246	99.49	5752085.5	0.93	5
251	635663.2	5151291.5	2.3	11796880.63	4114660	3877029	7108709	625312	4.83	66319563	128.3	3207061.5	1.21	3
167	2272879	4815855	3.23	23966016.5	15652777	13834359	19795184	-1222408.3	17.89	4711750	315.43	4962361.5	2.79	3
362	1003063	12248125	2.37	19027438.75	8657250	8311250	11229000	1662750	3.42	28858140	304.93	2839094.3	2.93	4
331	463412.5	6453806	2.42	11235206.25	2038125	1763075	4383075	655125	2.54	2585588.2	103.16	1975788.9	0.89	5
103	5324313	20861712	5.36	32989254	13284815	11314448	22144602	-1795919.3	14.67	11037750	125.72	10566881	1.07	3
141	1015455	2310940.5	3.55	10346050.81	5348817	4468707	6851732.5	-1116425	25.63	6050225	226.44	2362108.3	1.89	4
249	2698201	66511396	5.52	55476209.69	8321952	6954835	21678672	5420935	1.6		67.71	12290354	0.57	4
177	308608.8	4755793.5	5	7111222.56	2697911.3	2422615.5	4696201.5	155167.75	2.97	45529518	122.28	2206399.3	1.1	4
158	208231.3	1145226.5	2.61	5489118.69	3711619.5	2487834	4474727.5	290544.31	7.84	4836920.7	215.55	1721909.8	1.44	4
228	214846.5	1954261.5	14.83	1382212.69	303568.69	211785.5	919402.44	365864.19	7.69	24998013	42.97	706466.56	0.3	2
380	289061.9	2952063.8	13.45	41551293.13	1763195	1763195	4019890		6.47		65.41	2695604.8	0.65	4
62	1500929	6863926.5	33.75	8073281.06	1510279.3	1256916.8	6292660	2462186.5	15.58	45862000	29.81	5066563.5	0.25	3
124	4245063	64668880	8.07	55944753.75	10268500	9524250	12723500	-412500	4.1	67165934	141.78	7242559	1.32	3
90	934812.5	13388062	3.38	13214000	5620333.5	5001000	8263666.5	792000	3.54	79633947	295.58	1901437.9	2.63	3
110	1939410	4605238	25.42	14894076.5	2775585.3	2702085.3	11760652	240683	26.16	1486366.5	29.26	9486748	0.28	3
66	3574500	33506944	7.84	33455756.25	7570750	5821250	14858750	2200250	6.89	1747963.5	86.97	8705514	0.67	3
126	1519000	11223500	4.23	30125940.63	6946000	6210750	24405500	780000	14.75	2601187	38.49	18045074	0.34	3
139	249274.1	517294.5	2.53	3403218.81	2355742.8	2355742.8	3261016.3	46849.5	18.34	5501541.6	276.13	853120.44	2.76	4
135	1006275	10111125	9.08	9886887.5	2134675	922675	4769675	699100	7	60788000	57.17	3733744.3	0.25	4
158	1009592	8659594	11.25	5911334.31	2466590.8	2236040.8	3724760.8	1979503.1	6.41	12409000	177.08	1392895	1.61	4
190	285465.6	803813.44	1.39	9263513.5	4316608	3688907.8	8473394		34.31	3416891.3	117.32	3679423.5	1	4
161	1869978	43084524	3.59	20332604.38	5643430.5	4631572	12001716	1488398.4	4.53	29747000	89.03	6338796.5	0.73	4
138	797618.9	2740491.5	2.19	12481629.25	7343072	6740469.5	8567194	70079.938	10.97	9313000	414.8	1770257.4	3.81	4
378	460957.8	4364300	10.08	14315121.25	1396855	883192.75	4599718		8.27	8585884.3	39.18	3564997.3	0.25	4
382	117131.9	3833936.3	0.83	4487341.06	1745101.8	1701066.8	2285959.3	-563982.13	-0.03		142.1	1228074.6	1.39	5
151	208231.3	1145226.5	2.61	5489118.69	3711619.5	2487834	4474727.5	290544.31	7.84	4836920.7	215.55	1721909.8	1.44	4
116	4890063	42352820	3.52	56689880	31003250	20736750	29766500	6520750	6.26	36745457	382.07	8114494	2.56	3
82	3574500	33506944	7.84	33455756.25	7570750	5821250	14858750	2200250	6.89	1747963.5	86.97	8705514	0.67	3
524	24693.75	1917095.8	0.3	4842910.5	2551828.8	2541828.8	4099848	254309.75	-7.89	60745750	187.89	1358132.3	1.87	6
303	4118070	23985514	3.67	51368068.13	21624580	17601584	39221240	334194.25	7.86	3851485.5	112.13	19284846	0.91	4
228	192375	25146066	0.66	25793691.88	8960750	8366500	15105750	1661812.5	-2.61		152.57	5873326	1.42	5
184	737062.5	12185125	3.76	7940062.5	2796750	2672500	4059250	744000	2.48	61749079	202.87	1378575.3	1.94	4
423	87325	3822237.5	5.05	3747943.75	1763925	1663400	2597250	317500	-1.23	1315678.1	248.3	710393.56	2.34	6
379	466187.5	6489375	2.23	9104875	2264000	2219750	5570750	1026000	2.77		68.16	3321474.5	0.67	4
239	910430.2	4651478.5	11.83	10495100.56	4125771.5	3549718.8	8499786	943826.5	10.23	38088000	84.04	4909587	0.72	4
161	873406.3	7511375	2.17	18747916.25	7086275	5844600	10740275	-1059200	3.25	29747000	144.98	4887676	1.2	4
299	128830.6	211611	2.34	2496541.13	1058476.3	943312.5	2455482.3		62.8	2124596.3	74.81	1414886	0.67	4
132	3148250	30979816	3.47	58350691.88	13507750	12823500	40106752	7081500	6.43	19916489	51.82	26067302	0.49	4
401	249530.3	6599636	2.39	5629625	2183225	2073350	3227037.5	540962.5	-1.34	1606497	220.87	988477.25	2.1	6
188	618418.8	22678324	3.19	10654518.75	4798500	3664000	7591000	-94500	1.53	23114125	126.37	3797333	0.96	4
194	363687.5	1506875	2.52	4541187.5	2739250	2584250	4162500	229000	17.08	11248250	176.67	1550533.5	1.67	4
436	425768.3	2743437.3	2.18	6383088.81	3429510.5	3356849	5539840.5	734827.25	4.5	926003	157.96	2171160.3	1.55	6
109	1116125	12794875	7.67	13415937.5	3587250	2831750	6713750	2612500	4.7	3482675	97.8	3668132.3	0.77	4
199	1549438	7481125	3.38	22025568.13	9329000	8336250	14727500	480000	12.11	5877000	152.48	6118280	1.36	4
170	1077101	8174526.5	6.1	13861322.38	3615055.8	2985167.5	11307444	2278129	7.77	3665766.3	45.31	7978934.5	0.37	4
165	453263.9	5882251	3	6683572.69	2720900	2405980	3921994.5	-8847	2.89	6552875	182.63	1489822.5	1.61	4
334	258826.6	2780762.3	6.16	2541417.94	908823.25	870188.25	2027089.8	638870.25	4.53	148330	82.8	1097645.8	0.79	5
374	113606.7	1290720.5	1.78	2288817	940772.25	940772.25	2498224.8	-4847.25	3.39	27476250	83.56	1125864.4	0.84	6
256	4011163	9243569	12.94	29933565.63	13595250	10916250	28680500	564750	35.54	6614862.5	83.17	16346830	0.67	4
257	217241.4	3432849	3.89	3286244.25	972853.5	914663	2056043.9	804996.38	3.33	23099750	87.58	1110769.4	0.82	4
184	783656.3	5001044	3.98	12044120.63	5744125	5262175	11262750	-1042300	6.64	1095000	101	5687111.5	0.93	4
124	3056379	14357249	4.59	33342150.75	10234541	8194759	28248912	-2202470.3	12.42	11947750	53.22	19229048	0.43	4
216	1507000	4753312.5	2.42	22919816.88	10170750	8937750	16571000	-1147000	22.37	5857250	150.64	6751580.5	1.32	4
458	21857.75	593619.38	0.73	1671891.94	527680.25	527680.25	1276642	-20649.5	-10.31	7815249.8	72.47	728136.13	0.72	6
418	246312.5	2375625	1.69	3257500	1373000	1111250	1177500	256000	-7.26	5295897.3	150.4	912883.75	1.22	6
172	2724610	10471538	3.23	143644875	28758000	17949500	33301500		12.73	2519033.8	200.86	14317435	1.25	3
67	2368938	11843812	36.17	15418313.13	3845750	3457250	10297500	3752750	15.22	1126684.5	57.37	6703416.5	0.52	3
97	4194342	14754657	5.45	28717164	13835568	11898647	19494606	3414371.8	18.76	11092639	218.77	6324361.5	1.88	3
164	4685525	31993766	16.93	38542137.5	8315263.5	5365503	30877336	8984776	10.4	3126697.1	33.66	24701248	0.22	3
410	301443.8	3836112.5	7.63	3715100	818400	807975	2295050	1714600	5.18	4770100	53.81	1520906.9	0.53	5
316	341568.8	5010344	3.9	5824387.5	1349100	1158800	3171350	659925	3.69	35305000	67.53	1997778.8	0.58	5
404	578698.4	5225987	8.39	4608613.69	2322451	2096578.3	3923313.5	957641.25	5.96	12276334	130.45	1780372.1	1.18	5
103	280608.7	15248190	2.12	21072253.19	258442	41405.75	9153858	3116438.8	6.43	3336750	2.99	8643545	0	2
329	436017.8	2581652.5	2.03	7615322.38	3674574.5	3289975.5	5682647	-310964.25	7.84	2066750	162.23	2265040	1.45	4
93	2730482	6604610	20.19	19711947.31	6410276	4252670.5	15795402	-220168.5	23.81	9042250	55.98	11452034	0.37	4
187	256625	12630312	0.86	17435253.75	2786937.5	2722625	11590875	3395437.5	-2.54	9675750	32.06	8692372	0.31	4
310	420312.5	26868316	1.49	10527062.5	2925562.5	2717812.5	7616187.5	2085500	0.15	28906844	61.2	4780379.5	0.57	5
356	213760.8	5300887	0.91	6745382.88	1771588.3	1716921.3	4461061.5	1109570.6	-5.02	4636325	66.38	2668959	0.64	5
103	4000375	48595944	12.84	34264943.13	10314000	8615500	17293250	2912750	4.84		124.19	8305351	1.04	2
682	50437.5	4427812.5	-0.36	4294000	1836000	1784000	3107000	817250	-8.4	5169180	189.68	967933.25	1.84	6
516	2156274	7025522.5	10.23	16587407.25	6431164	5183735	15193346	4809816.5	8.81	13227272	153.69	4184486.8	1.24	5
232	1701713	10523356	2.53	25882452.5	8290750	8074750	16699000	817750	5.6		110.06	7533279	1.07	4
162	325877.5	1335745	3.8	3263892.19	2455750	2341250	3755373	-28002	21.52	2146270.5	72.58	3383391.3	0.69	4
503	-330050.9	846301.31	-0.4	4443735.75	1975863.8	1975863.8	3686029	161809	-17.04	4655089.4	164.26	1202906.3	1.64	6
600	468432.4	894199	18.1	3639089.63	1161216.8	939615.31	2960341		18.6	11051563	77.18	1504654	0.62	5
```
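
For readers with the same question, the quadratic and fractional polynomial steps asked about above might be sketched in Stata roughly as follows. This is only an illustration, not a recommended final model. It assumes the posted data are already in memory under the variable names in the header row. The choice of `intcov` for the quadratic term and of `sales`, `totass`, and `dtoe` for the fractional polynomial fits is arbitrary, and `intcov_sq` is a made-up name. The factor-variable notation needs Stata 11 or later, and the `fp` prefix needs Stata 13 or later.

```stata
* Sketch only: assumes the posted data are in memory with the variable
* names from the header row (oas, intcov, sales, totass, dtoe, ...).

* Quadratic term created by hand (intcov_sq is a hypothetical name)
generate double intcov_sq = intcov^2
regress oas intcov intcov_sq

* The same quadratic model using factor-variable notation, with no new
* variable needed (Stata 11 or later)
regress oas c.intcov##c.intcov

* Fractional polynomial in a single predictor (fp needs Stata 13 or later).
* Fractional polynomials are defined for positive values only, so this
* example uses dtoe, which is positive in every posted row. Variables
* with zero or negative values (intcov, margin, lev) would need shifting
* first; fp has a scale option for that.
fp <dtoe>: regress oas <dtoe>

* Multivariable fractional polynomials: chooses a functional form for
* each predictor and drops predictors by backward elimination at the
* 5% level. The three predictors are an arbitrary illustration.
mfp, select(0.05): regress oas sales totass dtoe
```

Because `mfp` combines the choice of functional form with variable selection, it may cover both steps described in the post. With a sample this small, any selected model should be treated as exploratory.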