With interest rate in the UK so pathetically low, I thought I might take some chance by making use of a Stocks ISA account in the UK. The problem though, is that I have no knowledge about the London stock market nor do I have the time to follow it. So I wrote a program to pick some Exchange Traded Funds (ETFs) with a primary goal to minimise risk and opportunity costs.
Here are what I wanted to achieve:
- only a few trades at most a year
- less risk than FTSE100
- more yield than a laddered government bonds portfolio
- require less than an hour per month of maintenance
Basically, this is a computer-assisted passive investment portfolio.
The first step is to scrape all the ETF symbols from London Stock Exchange on these pages. I use
XML package in R to select the relevant data from the HTML page with XPath.
page <- getURL(url, curl=curl) tree <- htmlTreeParse(page, useInternalNodes=TRUE) xpath <- "//table[@class = 'table_dati']/tbody" node <- getNodeSet(tree, xpath)
This program only considers ETF because this portfolio is to diversify risk and not pick winning stocks. ETFs provide convenient exposure to various asset classes such as equities, bonds, and commodities at low costs.
Next is to scrape profile information for each symbol from Yahoo. We want data such as the fund's expense ratio and asset class category.
url <- paste("http://finance.yahoo.com/q/pr?s=", symbol, "+Profile", sep="") tree <- htmlTreeParse(url, useInternalNodes=TRUE) xpath <- "//table[contains(concat(' ', @class, ' '), ' yfnc_datamodoutline1 ')]/tr/td/table" node <- getNodeSet(tree, xpath) operation <- tryCatch(readHTMLTable(node[]), error = function(e) NA) overview <- tryCatch(readHTMLTable(node[]), error = function(e) NA)
Once the funds' fundamental data are fetched, we can do a preliminary screening. I am filtering for:
- Actively traded funds,
- Sufficient age (3 years), and
- Only the best 3 expense ratio efficiency from each class
That last point is particularly important as illustrated in this plot.
The above plot couldn't fit in this frame but it shows that expense ratios are all over the place. What matters is that the raw data is available for use.
The plot below is clearer. It shows expense ratio by the fund's issuer. You can see that Vanguard funds generally have the best expense ratio as is commonly known.
The initial ETF list has 667 funds in 104 categories. The screened list narrows it down to 20 funds in 18 categories. Most that were screened are niche funds such as Islamic Global Equity and regional real estate funds.
Out of that 20 funds, I apply the popular Modern Portfolio Theory to minimise risk using historical quotes data with
quantmod's Yahoo data fetcher. Given the expected returns of each asset,
er and their covariance matrix,
cov.mat, a long-only efficient portfolio weighting of those assets can be solved with quadratic programming like so.
Dmat <- 2*cov.mat dvec <- rep.int(0, N) Amat <- cbind(rep(1,N), er, diag(1,N)) bvec <- c(1, target.return, rep(0,N)) result <- solve.QP(Dmat=Dmat,dvec=dvec,Amat=Amat,bvec=bvec,meq=2)
To get these weightings,
IGLT.L MIDD.L INXG.L EQQQ.L SLXX.L IBGS.L IUKP.L LUK2.L 0.603023 0.122829 0.116879 0.084122 0.037906 0.017975 0.014051 0.003215
But here's the catch, this mean-variance optimisation approach which I'm using does not work in the real-world. The problem is that it optimises for historical data under simplistic assumptions. For potential improvements on this model, start with this Q&A on StackExchange but be warned that it's a rabbit hole to go down in.
Knowing that I shouldn't trust this model much, I do this a couple times under different scenarios on the efficient frontier and union the top weighted assets from each run as a compensation by sampling.
The result is a suggestion of six ETFs.
Symbol Name category BRIC.L ISHARESII 50 BRIC Equity EQQQ.L POWERSHS EQQQ US Large-Cap Growth Equity IGLS.L ISHARESIII 0-5£ GBP Government Bond INXG.L GBP IDX-LNK GLT GBP Inflation-Linked Bond MIDD.L ISHARESFTSE250 UK Mid-Cap Equity SLXX.L ISHSIII IBX £CB GBP Corporate Bond
Out of these I hand picked IGLS.L and MIDD.L for a conservative 80% bonds and 20% equity portfolio. This plot below shows the annualised return versus risk of ISF (FTSE100), an equal-weighted portfolio of the pre-screened 20 ETFs, and this final portfolio of two ETFs. Notice the historic risk of this final portfolio is a third of FTSE100.
Not surprisingly, what my program derived from scratch is similar to the commonly suggested portfolio balance of bonds, local equities, and emerging market blend. What this program offers is picking out the specific ETFs from the hundreds of ETFs traded on London Stock Exchange for a balanced asset allocation.
The complete R source code for this project is available on Github.