Colour combination analysis with chained clustering

Original Graphic

For a recent project I needed a method to determine which colour palettes were present in a large dataset of artworks, ideally without any input from me! The task lends itself to the unsupervised technique of clustering and can be divided into two steps:

  1. Extracting commonly occurring colours.

Both of these can be accomplished with K-Means clustering, which simply clusters points by the distance between points and cluster centers.
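As a rough illustration of that idea, here is a minimal K-means sketch in plain Python. It is not the code used in the project: the `kmeans` helper, the raw RGB tuples, and the sample `pixels` list are all assumptions made for the example, and the centers are initialised from the first `k` points purely to keep the run deterministic (random or k-means++ initialisation is more usual).

```python
import math


def kmeans(points, k, iterations=20):
    """Plain K-means: assign each point to its nearest center, then move
    each center to the mean of its assigned points, and repeat."""
    # Assumption for this sketch: start from the first k points.
    centers = list(points[:k])
    for _ in range(iterations):
        # Assignment step: group points by nearest center (Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: each center becomes the mean of its cluster.
        new_centers = []
        for center, cluster in zip(centers, clusters):
            if cluster:
                mean = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                new_centers.append(mean)
            else:
                new_centers.append(center)  # leave an empty cluster's center alone
        if new_centers == centers:
            break  # assignments have stopped changing
        centers = new_centers
    return centers


# Hypothetical pixel data: one reddish group and one bluish group.
pixels = [(250, 10, 10), (240, 20, 15), (245, 5, 20),
          (10, 10, 250), (20, 15, 240), (5, 20, 245)]
palette = sorted(kmeans(pixels, k=2))
```

Each entry in `palette` is the mean colour of one cluster, so with this toy data one center settles on the reddish pixels and the other on the bluish ones.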

1. Extracting commonly used colours.

To use K-means effectively to extract colour combinations, it is important that the Euclidean distance between colour data points be interpretable as…


Exploring interacting agent modelling in Python

Pandemics, bushfires and economic crises. Not only the menu for the last year but a small subset of the phenomena that can be modelled under the framework of agent-based modelling (ABM).

Image from Unsplash

ABM is well suited to understanding how complex behavior emerges in systems based on simple interactions between the system’s individual participants. ABMs have shown particular strength in explaining general observations, referred to as ‘stylized facts’, such as observed neutron scattering patterns or the distribution of asset returns. These agents could be anything from people or companies to animals in an ecosystem or atoms in a gas. …


Access to groceries, banks and other basic services in Victoria varies with income, but not the way you may think.

Image by Jon Tyson via Unsplash.com

The range and accessibility of basic services such as supermarkets and banks varies along with average income between neighboring postcodes in Victoria, Australia.

However, unlike the US, where “banking deserts” are leaving low-income communities without access to financial services, and western Sydney, where “food deserts” are limiting access to groceries, Victoria seems to demonstrate the opposite pattern.

Lower income communities have significantly more schools, banks, supermarkets and health vendors within walking distance than their high income counterparts. Conversely, high income suburbs do have significantly more options within a larger driving distance.

  • A high income suburb has 40% fewer banking institutions…

This task was undertaken as part of a proof of concept for Look and Learn, an online library of high definition historical pictures.

In the previous story, I developed a model to encode the art style of an image into a vector in higher dimensional space, where the euclidean distance between vectors represents how visually similar the images are.

For an image recommendation system to use these embeddings, the vector for every image in the data set needs to be stored on disk, and the distance between two or more vectors will need to be calculated at some point. The more dimensions each vector has, the greater the storage and computational needs of the system, increasing costs and search times.

Due…


Keep it simple

Document Classification: The task of assigning labels to large bodies of text. In this case the task is to classify news articles into different labels, such as sport or politics. The data set used wasn’t ideally suited for deep learning, having only low thousands of examples, but this is far from an unrealistic case outside larger firms.

Now normally this type of technical article would run through a few models, before concluding with a comparison of results and an overall evaluation, but today I thought I’d save you a scroll and start off with the unexpected results.

Simple models worked…


This task was undertaken as part of a proof of concept for Look and Learn, an online library of high definition historical pictures.

The aim of this exercise is to find a function that transforms images into embedding vectors where the euclidean distance between vectors represents how visually similar the images are. This allows a nearest neighbors search on one image’s embedding to return images that are visually similar, empowering image recommendation and clustering.
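To make the nearest neighbors idea concrete, here is a small brute-force sketch in plain Python. The `nearest_neighbours` helper, the image ids, and the three-dimensional vectors are all made up for illustration; real embeddings would come from the model and have far more dimensions, and a production system would likely use an indexed search rather than a full scan.

```python
import math


def nearest_neighbours(query_id, embeddings, n=2):
    """Return the ids of the n images whose embeddings are closest
    (by Euclidean distance) to the query image's embedding."""
    query = embeddings[query_id]
    distances = [
        (math.dist(query, vector), image_id)
        for image_id, vector in embeddings.items()
        if image_id != query_id  # do not return the query image itself
    ]
    # Sort by distance and keep the ids of the n closest images.
    return [image_id for _, image_id in sorted(distances)[:n]]


# Hypothetical embeddings, invented for this example.
embeddings = {
    "etching_a": (0.9, 0.1, 0.0),
    "etching_b": (0.8, 0.2, 0.1),
    "watercolour_a": (0.1, 0.9, 0.3),
    "watercolour_b": (0.0, 0.8, 0.4),
}
similar = nearest_neighbours("etching_a", embeddings, n=1)
```

Here `similar` holds the id of the single image whose vector sits closest to `etching_a`, which is how a recommendation would be picked under this scheme.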


Sometimes Simplicity Wins

Document Classification: The task of assigning labels to large bodies of text. In this case the task is to classify BBC news articles to one of five different labels, such as sport or tech. The data set used wasn’t ideally suited for deep learning, having only low thousands of examples, but this is far from an unrealistic case outside large firms.

Now normally this type of technical article would run through a few models, before concluding with a comparison of results and an overall evaluation, but today I thought I’d save you a scroll and start off with the unexpected…


But it doesn’t need to stay that way.

Don’t get me wrong, I love Google Maps. It’s a borderline miraculous service that has given me superb driving directions, public transport timetables, and walking routes. All over the world. For free.

However, when I tried to get cycling directions to a store a short distance across town I encountered some issues. I plugged in my headphones, put away my phone, and was promptly directed to take a left onto a four lane highway.

Grant Holtes

AI and Analytics Consultant in Melbourne. www.grantholtes.com
