I made an error with the bilateral data yesterday and underestimated the number of observations in my project. I have 150 countries, so in a bilateral dataset that's 150*150-150 = 22,350 country pairs. Each pair can theoretically export 772 goods, which means my bilateral trade dependent variable has about 17.3 million observations for one year. I have 5 years of data, which means I should have about 86.3 million observations. Wow…
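The count above is easy to double-check. A quick sketch of the arithmetic in Python (the variable names are just for illustration):

```python
# Back-of-the-envelope count of bilateral observations.
countries = 150
goods = 772
years = 5

# Ordered country pairs, excluding a country paired with itself.
pairs = countries * countries - countries
per_year = pairs * goods      # one observation per pair and good
total = per_year * years      # stacked over all years

print(f"{pairs:,} pairs, {per_year:,} per year, {total:,} in total")
# -> 22,350 pairs, 17,254,200 per year, 86,271,000 in total
```

So the exact figures are 17,254,200 per year and 86,271,000 over five years.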
Underestimated
June 26, 2008
Structural Changes
June 21, 2008
Last batch of regression replications. The magnitudes are a bit different from the big paper, but the signs are correct. Basically, the product space variables can predict what goods a country exports…
It's too big…
June 16, 2008
As part of my empirical prep, I tried to (roughly) replicate one regression in this WP. Trying to replicate should be part of every program, and I am saddened that we were not asked to replicate anything in any of our classes. At the very least, I was able to confirm the basic message of the paper, while at the same time questioning my own understanding and exercising some old-fashioned doubt.
The point: each good has a ‘value’ determined by the level of development of the countries that specialize in it. The ‘value’ of a country’s export basket is called ExpY. The theory says that if the value of a country’s export basket is high, the country will exhibit subsequent economic growth.
The first practical lesson is that I coded the log of ExpY incorrectly. The numbers were far too large compared to the paper; the correct range is from 7 to 10 on the natural log scale. The regression coefficient on lnExpY, controlling for log of initial income, is 0.4. Since a standard deviation of lnExpY is also around 0.4, a country moving 1 standard deviation up will experience an increase of 0.4 × 0.4 = 0.16 in its growth rate (simple growth rate).
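For concreteness, here is a rough sketch of how a basket value like ExpY can be built and logged. It assumes the usual two-step construction, which the post does not spell out: a per-good value that is an income-weighted average over the countries exporting the good, then a per-country weighted average over the goods in its basket. The three countries, two goods, incomes, and the function names `prody` and `expy` are all made up for illustration.

```python
import numpy as np

# Hypothetical toy data: rows are countries, columns are goods (export values).
exports = np.array([
    [80.0, 20.0],   # poorer country, mostly exports good 0
    [50.0, 50.0],   # middle-income country
    [10.0, 90.0],   # richer country, mostly exports good 1
])
gdp_per_capita = np.array([2000.0, 8000.0, 25000.0])  # made-up incomes


def prody(exports, income):
    """Per-good 'value': income-weighted average over exporting countries."""
    shares = exports / exports.sum(axis=1, keepdims=True)  # good's share in each basket
    weights = shares / shares.sum(axis=0, keepdims=True)   # normalise across countries
    return weights.T @ income


def expy(exports, income):
    """Per-country basket 'value': share-weighted average of the good values."""
    shares = exports / exports.sum(axis=1, keepdims=True)
    return shares @ prody(exports, income)


basket_value = expy(exports, gdp_per_capita)
ln_expy = np.log(basket_value)   # natural log, not log10

# Effect of a one-standard-deviation move, using the numbers quoted in the post.
coef, sd_ln_expy = 0.4, 0.4
effect = coef * sd_ln_expy

print(ln_expy, effect)
```

Because both steps are weighted averages of incomes, the basket value always lies between the lowest and highest income in the sample, which is a handy sanity check on the coding before taking logs.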
Looking at the graph, you may think this is making a mountain out of a molehill, but the y axis is squeezed: a value of 2 is basically a doubling of real GDP per person (source: PWT 6.2). To make a more attractive graph, I log-transformed growth, and the fitted line is more noticeably upward sloping:
Replication
June 13, 2008
Using sitc4, the exporter and importer fixed effects are strange, but using the shared data, the specification works…
With the original data, the signs of the regressors are sometimes wrong, if we don’t include exporter and importer dummies. This implies inherent heterogeneity in the data in terms of average trade volume. That makes total sense.
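A minimal simulated sketch of why leaving out the dummies can flip a sign. Everything here is made up for illustration: the 20 hypothetical countries, the generic pair-level regressor `x` with a true coefficient of +1, and the way `x` is built to be negatively correlated with the exporter and importer effects. It is not the specification or the data from this project.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20                                   # hypothetical number of countries
exp_fe = rng.normal(0.0, 2.0, n)         # exporter-specific average trade level
imp_fe = rng.normal(0.0, 2.0, n)         # importer-specific average trade level

# All ordered pairs (i, j) with i != j.
i, j = np.where(~np.eye(n, dtype=bool))

# A pair-level regressor that is correlated with the country effects.
x = -0.5 * exp_fe[i] - 0.5 * imp_fe[j] + rng.normal(0.0, 0.5, i.size)
# Outcome: country effects plus a true slope of +1 on x.
y = exp_fe[i] + imp_fe[j] + 1.0 * x + rng.normal(0.0, 0.5, i.size)


def ols(X, y):
    """Least-squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


# Pooled regression: intercept and x only.
pooled_slope = ols(np.column_stack([np.ones(i.size), x]), y)[1]

# Same regression with exporter dummies and importer dummies
# (one importer dummy dropped to avoid perfect collinearity).
d_exp = np.eye(n)[i]
d_imp = np.eye(n)[j][:, 1:]
fe_slope = ols(np.column_stack([x, d_exp, d_imp]), y)[0]

print(f"pooled: {pooled_slope:.3f}   with dummies: {fe_slope:.3f}")
```

In this setup the pooled slope picks up the country-level heterogeneity and comes out with the wrong sign, while the regression with dummies recovers something close to +1.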
Serial Correlation
January 27, 2008
The reason AR1 is particularly pernicious is the possibility of serial correlation in the disturbances. Given uncorrelated disturbances, least squares on an AR1 is consistent, though biased in small samples (Gauss Markov fails because the regressor is a lagged dependent variable). But with serially correlated disturbances, we lose consistency as well.
In the preceding post, we showed that the estimated correlation coefficient of the AR1 is pretty darn close to its true value of 0.9. Here, with serially correlated errors, we show how the estimate of the correlation coefficient changes when we increase the sample size.
n=100 B=0.9942
n=300 B=0.9935
n=500 B=0.9936
n=1000 B=0.9934
As we can see, increasing the sample size does nothing to bring the estimate closer to its true value. Here the persistence of the error is 0.9, and it can be shown that the limit of the estimated correlation coefficient approaches 1 as the persistence of the error approaches 1.
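The "it can be shown" step can be sketched as follows. The notation is mine, not from the original simulation: β is the true AR1 coefficient (0.9), ρ is the persistence of the error (0.9), and ε is white noise. Substituting the error process into the AR1 gives an AR(2) in y, and the least squares slope converges to the first autocorrelation of that process.

```latex
\begin{aligned}
y_t &= \beta\, y_{t-1} + u_t, \qquad u_t = \rho\, u_{t-1} + \varepsilon_t \\
\Rightarrow\quad y_t &= (\beta + \rho)\, y_{t-1} - \beta\rho\, y_{t-2} + \varepsilon_t \\
\operatorname{plim}\, \hat{\beta}
  &= \frac{\operatorname{Cov}(y_t,\, y_{t-1})}{\operatorname{Var}(y_{t-1})}
   = \frac{\beta + \rho}{1 + \beta\rho}
\end{aligned}
```

With β = ρ = 0.9 the limit is 1.8/1.81 ≈ 0.9945, which is where the estimates above are settling rather than at 0.9. As ρ goes to 1, the limit becomes (β + 1)/(1 + β) = 1.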
Without serial correlation, more data should give a better estimate:
B=0.797
B=0.9248
B=0.8951
B=0.9028
…which in fact is what happened.
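A rough re-creation of the two experiments in Python rather than Matlab. The seed, the burn-in length, and the function names are my own choices for this sketch, so the printed numbers will not match the ones listed above; the pattern is what matters.

```python
import numpy as np


def simulate_ar1(n, beta=0.9, rho=0.0, seed=0, burn=500):
    """y_t = beta*y_{t-1} + u_t, with u_t = rho*u_{t-1} + e_t and e_t ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    e = rng.normal(0.0, 1.0, n + burn)
    u = np.zeros(n + burn)
    y = np.zeros(n + burn)
    for t in range(1, n + burn):
        u[t] = rho * u[t - 1] + e[t]
        y[t] = beta * y[t - 1] + u[t]
    return y[burn:]          # drop the burn-in so the start-up values don't matter


def ols_slope(y):
    """Slope from regressing y_t on y_{t-1} (with an intercept)."""
    lag, cur = y[:-1], y[1:]
    lag_c = lag - lag.mean()
    return float(lag_c @ (cur - cur.mean()) / (lag_c @ lag_c))


for n in (100, 300, 500, 1000):
    b_corr = ols_slope(simulate_ar1(n, rho=0.9))   # serially correlated errors
    b_iid = ols_slope(simulate_ar1(n, rho=0.0))    # uncorrelated errors
    print(f"n={n:5d}  correlated errors: {b_corr:.4f}   uncorrelated errors: {b_iid:.4f}")
```

With uncorrelated errors the estimate wanders toward 0.9 as n grows; with an error persistence of 0.9 it stays stuck well above 0.9 no matter how large n gets.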
AR1
January 26, 2008
Back to basics: let's say you have a time series data set:
Above, we have 700 observations. You suspect that it is a simple autoregressive process. (In fact it is! This is how I generated it in Matlab; the errors are random normal around zero with a standard deviation of 1.) You graph y(t) against its one-period lag and see that it's highly correlated. In fact, a regression gives a correlation coefficient of 0.8536 [the true value is 0.9].
I transferred this ‘data’ to Stata to compute the autocorrelation and partial autocorrelation functions.
…and as expected, the autocorrelation function decays smoothly toward zero and the partial autocorrelation drops to zero after the first lag.
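The same check can be sketched without Stata. This is a plain numpy version under my own assumptions: a freshly simulated AR1 with coefficient 0.9 and 700 observations (not the original Matlab series), a sample autocorrelation computed directly, and the partial autocorrelation at lag k taken as the last coefficient from regressing y(t) on its first k lags. The function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta = 700, 0.9

# Simulate y_t = beta*y_{t-1} + e_t with standard normal errors.
y = np.zeros(n)
for t in range(1, n):
    y[t] = beta * y[t - 1] + rng.normal()


def sample_acf(y, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    d = y - y.mean()
    denom = d @ d
    return np.array([1.0] + [(d[k:] @ d[:-k]) / denom for k in range(1, max_lag + 1)])


def sample_pacf(y, max_lag):
    """Partial autocorrelations: last OLS coefficient on k lags, for k = 1..max_lag."""
    d = y - y.mean()
    out = [1.0]
    for k in range(1, max_lag + 1):
        # Column j holds the series lagged by j periods, aligned with d[k:].
        lags = np.column_stack([d[k - j:len(d) - j] for j in range(1, k + 1)])
        coef = np.linalg.lstsq(lags, d[k:], rcond=None)[0]
        out.append(coef[-1])
    return np.array(out)


acf = sample_acf(y, 10)
pacf = sample_pacf(y, 10)
for k in range(1, 11):
    print(f"lag {k:2d}  acf={acf[k]: .3f}  pacf={pacf[k]: .3f}")
```

For an AR1 the autocorrelations should fall off roughly like 0.9 raised to the lag, while the partial autocorrelations beyond lag 1 should be close to zero.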
Posted by outinfour