Finding under-valued suburbs in Cape Town
We’ve seen tremendous property price increases in South Africa over the past two decades, with many people in my generation feeling like it’s increasingly difficult to get onto the property ladder. However, with the continuing energy crisis, sluggish economy and political uncertainty in the country, I was intrigued to see how this might be impacting house prices.
I also wanted to find a way to create a visually impactful data science project that uses publicly available data and could have economic value to someone. So, I sat down and created a map of average house prices, coupled with the average change in house prices, for most of the suburbs in Cape Town over the period 2011–2019. The idea behind looking at the change in house prices is that one could potentially find an area that has traditionally seen strong growth in prices but is currently undervalued; this would show up as historic growth in prices followed by a decline over a recent period. I chose to focus on Cape Town because I wanted to use something from the City of Cape Town Open Data Portal and also because I recently moved back to Cape Town from Joburg to pursue a Master's in Data Science at UCT.
You can see the result of this project here.
The Process
When recently looking for a place to stay in Cape Town, I noticed that Property24 had a graph on the main page for each suburb indicating suburb price trends. I created a web scraper using BeautifulSoup in Python to scrape this data (see GitHub for the code of this web scraper). It took a few attempts to ensure that I got the majority of suburbs in Cape Town and it took a bit of searching to find the JavaScript-generated data points I was after, but the data was relatively easy to get.
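To give a feel for what pulling JavaScript-generated data points out of a page can look like, here is a minimal sketch. It is not the actual scraper (that one uses BeautifulSoup and lives on GitHub); it uses only the Python standard library so that it runs on its own, and the page fragment, the variable name trendData and the year/price fields are all made up for illustration.

```python
import json
import re

# Hypothetical page fragment: the real markup and variable names are not
# given in this post, so "trendData" and its fields are assumptions.
SAMPLE_HTML = """
<html><body>
<script>
  var trendData = [{"year": 2011, "price": 1200000}, {"year": 2012, "price": 1300000}];
</script>
</body></html>
"""


def extract_trend_points(html, var_name="trendData"):
    """Return the list assigned to `var_name` in an inline script, or [] if absent."""
    # Capture a flat JSON array literal that is terminated by a semicolon.
    pattern = re.compile(
        r"var\s+" + re.escape(var_name) + r"\s*=\s*(\[.*?\])\s*;",
        re.DOTALL,
    )
    match = pattern.search(html)
    if match is None:
        return []
    return json.loads(match.group(1))


points = extract_trend_points(SAMPLE_HTML)
# Re-shape into {year: average price} for later processing.
prices_by_year = {point["year"]: point["price"] for point in points}
```

In a real scraper the HTML would come from an HTTP request per suburb page, and the pattern would need adjusting to whatever the page actually embeds.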
I also downloaded the shapefiles for the boundaries of suburbs in Cape Town as well as the boundaries of larger sub-councils from the City of Cape Town (CoCT) Open Data Portal which I used to group the suburbs and display the data.
From here, I moved over to R to take advantage of the ease of plotting spatial data with the sf and leaflet packages and to make the data easy to navigate with a Shiny app. The scraped data was processed to remove NA values, after which the annualised property price growth (or decline) per suburb was calculated for each pair of years (e.g. 2011 & 2012, 2011 & 2013… 2018 & 2019). The average house price was calculated for the same periods. The scraped data and CoCT data sets were then joined for display in the Shiny app, with the names of some suburbs adjusted in the CoCT data set so that the two would match. The shapefile for the CoCT sub-councils was also used to group suburbs together; this made it possible to offer display options that load more quickly than the “all suburbs” option.
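The per-period calculation can be sketched as follows. The project did this step in R; the version below is in Python and assumes that "annualised growth" means the compound annual growth rate between the first and last year of a period, which is one common definition and may differ from the exact formula used in the app. The function names and the example prices are illustrative only.

```python
from itertools import combinations


def annualised_growth(start_price, end_price, years):
    """Compound annual growth rate between two prices `years` apart."""
    return (end_price / start_price) ** (1 / years) - 1


def growth_for_all_periods(prices_by_year):
    """Annualised growth and mean price for every (start, end) pair of years."""
    years = sorted(prices_by_year)
    results = {}
    for start, end in combinations(years, 2):
        growth = annualised_growth(
            prices_by_year[start], prices_by_year[end], end - start
        )
        # Mean of the yearly average prices that fall inside the period.
        window = [prices_by_year[y] for y in years if start <= y <= end]
        results[(start, end)] = {
            "growth": growth,
            "avg_price": sum(window) / len(window),
        }
    return results


# Made-up prices for one suburb, growing 10% a year.
example = {2011: 1_000_000, 2012: 1_100_000, 2013: 1_210_000}
periods = growth_for_all_periods(example)
```

Pre-computing every pair of years up front is what lets a slider switch between periods without recalculating anything.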
In terms of functionality, the Shiny app allows users to view the change in house prices (as an annualised %) or the average house prices over a time period specified by a slider. One is also able to select specific areas (groups of suburbs) or individual suburbs to compare.
Extracting insights from the data
A useful exercise to perform using the app is to look for the suburbs that have shown the greatest historic (2011–2017) price increase. As there are many suburbs in Cape Town that have shown an annual price increase of between 10% and 25%, we’ll pick a subset of these for this analysis.
Based on this, we can isolate suburbs which grew by more than 10% per year between 2011 and 2017 and take a look at their yearly price change from 2017 to 2019. For this example, I picked suburbs in the Southern Suburbs and Atlantic, CBD and Surrounds areas to start the search and then picked a number of these suburbs where prices had grown by more than 10% a year (which was most of them). I then adjusted the time slider to see which of these suburbs had experienced the greatest declines in prices over the past two years (2017 to 2019); Vredehoek, Gardens, Woodstock, Bo Kaap, Camps Bay, Cape Town City Centre, Three Anchor Bay and Bishopscourt had all experienced annual price declines over these two years.
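The same screen can be expressed in a few lines of code rather than by moving the slider. This is only a sketch: the suburb names and prices below are invented, the 10% threshold and the 2011/2017/2019 split come from the text above, and growth is again assumed to be a compound annual rate.

```python
def annualised_growth(prices, start, end):
    """Compound annual growth rate between two years of a {year: price} dict."""
    return (prices[end] / prices[start]) ** (1 / (end - start)) - 1


def find_candidates(suburb_prices, threshold=0.10,
                    first_year=2011, split_year=2017, last_year=2019):
    """Suburbs with strong historic growth followed by a recent decline."""
    candidates = []
    for suburb, prices in suburb_prices.items():
        # Skip suburbs that lack a price for any of the three reference years.
        if not all(year in prices for year in (first_year, split_year, last_year)):
            continue
        historic = annualised_growth(prices, first_year, split_year)
        recent = annualised_growth(prices, split_year, last_year)
        if historic > threshold and recent < 0:
            candidates.append((suburb, historic, recent))
    # Steepest recent decline first.
    return sorted(candidates, key=lambda item: item[2])


# Invented data, not real Cape Town figures.
suburb_prices = {
    "Suburb A": {2011: 1_000_000, 2017: 2_000_000, 2019: 1_800_000},
    "Suburb B": {2011: 1_000_000, 2017: 1_300_000, 2019: 1_200_000},
    "Suburb C": {2011: 1_000_000, 2017: 2_000_000, 2019: 2_200_000},
    "Suburb D": {2011: 1_000_000, 2017: 2_000_000},
}
candidates = find_candidates(suburb_prices)
candidate_names = [name for name, _, _ in candidates]
```

Here only Suburb A passes: B grew too slowly before 2017, C kept rising afterwards, and D has no 2019 price.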
One can see that Three Anchor Bay has seen an average 12% annualised decline in prices between 2017 and 2019, with Camps Bay showing an even steeper price reduction. Depending on your viewpoint, this either suggests that properties are currently undervalued in these suburbs and could potentially be good purchases, or that prices were historically overinflated.
If you’re looking for places that meet your budget, a similar exercise could also be carried out by looking at the average house price for the past two years on the map and seeing where you can afford to live. All else being equal, one could then look at the change in house prices to get a sense of which areas are overpriced and which areas might be experiencing a short-term dip in prices before looking for a property in that area.
Challenges with the data
The way Property24 organises suburbs and the way CoCT organises suburbs (i.e. their respective suburb hierarchies) isn’t always the same. This has led to some gaps on the map as well as some potential errors in average suburb prices. For example, Property24’s figure for “Noordhoek” is an average of the average house prices for a number of suburbs within that area, which when joined with the City of Cape Town data gives an inaccurate picture of house prices in the area; this is exacerbated when a single Property24 area spans several socioeconomic levels.
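One way to see where the gaps come from is to list the scraped suburbs that have no counterpart in the boundary data after basic name clean-up. The sketch below is an assumption about how such a check could look, not the code used in the project (where the join was done in R); the name spellings, the manual mapping and the helper names are all hypothetical.

```python
def normalise(name):
    """Lower-case, treat hyphens as spaces and collapse repeated whitespace."""
    return " ".join(name.lower().replace("-", " ").split())


def match_suburbs(scraped_names, boundary_names, manual_map=None):
    """Pair scraped names with boundary names; return (matched, unmatched)."""
    manual_map = manual_map or {}
    boundary_lookup = {normalise(name): name for name in boundary_names}
    matched, unmatched = {}, []
    for name in scraped_names:
        # A manual override wins over the automatic clean-up.
        key = normalise(manual_map.get(name, name))
        if key in boundary_lookup:
            matched[name] = boundary_lookup[key]
        else:
            unmatched.append(name)
    return matched, unmatched


# Hypothetical spellings on each side, for illustration only.
scraped = ["Bo Kaap", "Cape Town City Centre", "Noordhoek", "Example Estate"]
boundaries = ["BO-KAAP", "CAPE TOWN CBD", "NOORDHOEK"]
overrides = {"Cape Town City Centre": "Cape Town CBD"}

matched, unmatched = match_suburbs(scraped, boundaries, overrides)
```

A check like this only catches naming differences. It does not fix the deeper problem of one listing area covering several boundary polygons, which still needs a judgment call per area.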
Secondly, in a minority of cases the Property24 average house prices swing wildly from year to year. This may be due in some cases to actual wild fluctuations of house prices in an area, or perhaps more likely, to an outlier price (such as an apartment block going on sale) which skews the mean.
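A tiny example shows how much a single outlier can move a mean. The sale prices below are invented, and the sketch assumes access to individual sale prices, which the scraped data (yearly averages only) does not provide.

```python
from statistics import mean, median

# Invented sale prices for one suburb in one year: four ordinary homes and
# one apartment block sold as a single transaction.
sale_prices = [1_900_000, 2_000_000, 2_100_000, 2_200_000, 45_000_000]

average_price = mean(sale_prices)   # pulled far upwards by the one large sale
typical_price = median(sale_prices)  # stays close to the ordinary homes

print(f"mean:   R{average_price:,.0f}")
print(f"median: R{typical_price:,.0f}")
```

With only averages available, a practical fallback is to treat very large year-on-year swings in a suburb with caution rather than reading them as real price movements.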
Wrap up
This project showed how an exploratory data analysis approach can help to uncover high-level insights in a data set (for example, where prices have gone down recently). It also allowed me to use some of the great libraries across Python and R and to use publicly available data to answer a question I had for myself, namely, “Does it appear that the downturn in the economy is affecting house prices?”