Building a mixed-data neural network in Keras to predict accident locations

When used in the right situation, neural networks can be an awesome solution to your learning problem. Neural networks allow you to feed in structured data (numerical and categorical data), wait for some magic to happen (note: not actual magic — it’s just maths), and out pops your answer — for example, maybe you’re trying to predict the result of an election from data from news sources. Convolutional neural networks allow you to do something similar but for images — for example, maybe you’re trying to predict whether an image is a hotdog or a not-hotdog.

But what if you have both structured data and image data? Maybe you have data from news sources and images of hotdogs and not-hotdogs, and you want to predict… the result of a hotdog-related vote?

Or, possibly more likely, you have structured data on a patient and their condition, and you have their X-ray results. So you want to combine the two data sources to make an even more accurate prediction than you could using only one type of data.

Predicting the location of traffic accidents

I’ve previously written about a project I created with Sabatino Chen to try and predict the location of traffic accidents in London. That post went into more detail of the project goals and how we collected and prepared our data, so feel free to have a quick read of that for some more background.

The TL;DR is that we used satellite imagery and other data about different areas in the UK to predict how safe a given area is. The full code can be found in my GitHub repo (the mixed-data model is in notebooks 7 and 8).

We built three types of models in total to predict the location of traffic accidents:

1. Using structured data on traffic accidents, population density, and traffic to try to predict the severity of an accident.
2. Using satellite imagery to try to predict whether an area was ‘safe’ (no traffic accidents) or ‘dangerous’ (traffic accidents).
3. A fancy mixed-data model (also known as a mixed-input model) that combines the outputs of a multi-layer perceptron trained on structured data, and a convolutional neural network trained on satellite imagery, into a final neural network layer head.

My previous post was about model 2. This post is about model 3, where we tried to beat the performance of our structured data-only (1) and satellite imagery data-only (2) models, by combining the two types of data.

Quick summary of the methodology

See my previous post for more details.

We gathered structured data on traffic accidents from the UK Department for Transport’s (DfT) Road Safety Data, traffic levels from the DfT’s Road Traffic Counts, and population density from the UK’s 2011 Census. Data from the Greater London Authority Datastore on administrative areas was used to match some of the datasets together.
We drew a big square around London that included a variety of road systems and urban/suburban/rural areas, and divided the city up into squares of 0.0005 latitude x 0.0005 longitude. This resulted in grid squares (technically rectangles) of 56m high (latitude) and 35m wide (longitude).
We labeled each square with whether it was safe (no traffic accidents in the last five years) or dangerous (one or more accidents).
We used the Google Static Maps API to get satellite images of a random sample of 10,000 ‘safe’ squares and 10,000 ‘dangerous’ squares.
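
As a rough sanity check on those grid dimensions, the sketch below converts 0.0005 degrees into metres. It assumes a latitude of about 51.5°N for London and roughly 111,320 m per degree of latitude. Both figures are approximations introduced here for illustration, not values taken from the project code.

```python
import math

METRES_PER_DEGREE = 111_320  # assumed approximate length of one degree of latitude
LONDON_LATITUDE = 51.5       # assumed representative latitude for London
CELL_SIZE_DEGREES = 0.0005   # grid step described in the post


def cell_size_in_metres(cell_degrees, latitude_degrees):
    """Return the approximate (height, width) of a grid cell in metres."""
    height = cell_degrees * METRES_PER_DEGREE
    # Lines of longitude converge towards the poles, so width shrinks with cos(latitude)
    width = height * math.cos(math.radians(latitude_degrees))
    return height, width


height_m, width_m = cell_size_in_metres(CELL_SIZE_DEGREES, LONDON_LATITUDE)
print(f"{height_m:.0f}m high x {width_m:.0f}m wide")
```

Under these assumptions the result rounds to the 56m by 35m quoted above, which is why the cells are rectangles rather than true squares.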

Keras Functional API

Most people’s first introduction to Keras is via its Sequential API — you’ll know it if you’ve ever used model = Sequential(). The Sequential class is used when you want to build a simple feedforward neural network, where data flow through the network in one direction (from inputs to hidden nodes to outputs).

Keras also has a Functional API, which allows you to build more complex non-sequential networks. For example, a network with skip (residual) connections, or with layers shared between branches, can’t be expressed as a single stack of layers, so it needs the Functional API, which lets you wire the output of any layer into any other layer.

The Functional API also allows us to build neural networks that can accept multiple inputs — and these inputs can be of different data types. Below, I’ll show you how I built a mixed-data neural network in Keras by building what is (essentially) a neural network of neural networks.

Preparing the data for a mixed-data neural network

A mixed-data neural network is built by creating a separate neural network for each data type that you have. You then treat these as input branches, and combine their outputs into a final glorious combined neural network. The output of this final chunk of neural network is then your answer — in this case, the probability of the area being safe.
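
To make the idea of combining branch outputs concrete, here is a deliberately simplified sketch in plain Python. It concatenates the outputs of two branches and passes them through a single sigmoid unit. The function name, the branch outputs, and the weights are all made up for illustration, and the real model in this post uses Keras layers with an extra dense layer in the head.

```python
import math


def combine_branches(mlp_output, cnn_output, head_weights, head_bias):
    """Concatenate two branch outputs and apply one sigmoid unit."""
    combined = list(mlp_output) + list(cnn_output)  # the concatenation step
    if len(combined) != len(head_weights):
        raise ValueError("need one weight per combined feature")
    z = sum(w * x for w, x in zip(head_weights, combined)) + head_bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes z into (0, 1)


# Hypothetical branch outputs for one grid square (four values each)
mlp_output = [0.2, 0.0, 1.3, 0.7]
cnn_output = [0.0, 0.9, 0.4, 0.0]
# Hypothetical head weights: one per concatenated feature
head_weights = [0.5, -0.2, 0.1, 0.3, 0.8, -0.4, 0.6, 0.0]

probability_safe = combine_branches(mlp_output, cnn_output, head_weights, head_bias=-0.1)
print(round(probability_safe, 3))
```

The point is only that the final layers see one vector per data point, made up of what each branch learned from its own data type.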

As is so often the case in data science, actually building the model is pretty quick, but getting the data into the right form is what takes the most time.

When you’re building a standard neural network, as long as you’ve prepared your data correctly, you can chuck it in your neural network in whatever order you fancy. For a mixed-data neural network, however, the order matters. The 7,834th row of the structured data has to describe the same data point (in this case, grid square) as the 7,834th image, so that the outputs of the structured data neural network and the convolutional neural network for that grid square are fed into the final combined neural network together. Here’s how I did that:

from keras.preprocessing.image import ImageDataGenerator
import pandas as pd

# Getting the images and rescaling
image_folder = 'model3_images/'
image_generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
        image_folder, shuffle=False, class_mode='binary',
        target_size=(128, 128), batch_size=20000)
images, labels = next(image_generator)
# Output: Found 20000 images belonging to 2 classes.

# Checking the labels
image_generator.class_indices
# Output: {'danger': 0, 'safe': 1}

# Getting the ordered list of filenames for the images
# (paths here use the Windows separator, e.g. 'danger\<name>.png'; split on "/" on macOS/Linux)
image_files = pd.Series(image_generator.filenames)
image_files = list(image_files.str.split("\\", expand=True)[1].str[:-4])

# Sorting the structured data into the same order as the images
df_sorted = df.reindex(image_files)
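
The alignment step can also be sketched without pandas, which makes the logic easier to see. The filenames, grid-square IDs, and values below are hypothetical. One reason to add an explicit check is that pandas' reindex fills unmatched labels with NaN rather than raising an error, so a mismatch could otherwise go unnoticed.

```python
from pathlib import PureWindowsPath

# Hypothetical generator output: class folder + file name, in generator order
image_filenames = ["danger\\sq_0003.png", "danger\\sq_0001.png", "safe\\sq_0002.png"]

# Hypothetical structured data keyed by grid-square ID, in a different order
structured_rows = {
    "sq_0001": {"population_per_hectare": 81.0},
    "sq_0002": {"population_per_hectare": 12.5},
    "sq_0003": {"population_per_hectare": 47.2},
}

# Strip the folder and extension to recover each grid-square ID
image_ids = [PureWindowsPath(name).stem for name in image_filenames]

# Fail loudly if any image has no matching structured row
missing = [square_id for square_id in image_ids if square_id not in structured_rows]
if missing:
    raise KeyError(f"no structured data for: {missing}")

# Reorder the structured rows so that row i belongs to image i
aligned_rows = [structured_rows[square_id] for square_id in image_ids]
print(image_ids)
```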

It’s then worth quickly double-checking, using plt.imshow(), that the image at a given position is the one you’re expecting for the matching row of the structured data.

Building a mixed-data neural network in Keras

To make it easier to iterate on model versions, it’s good practice to wrap each step in a function. I wrote the following functions to pre-process the structured data and create the mixed-data neural network architecture.

Pre-processing the structured data:

from sklearn.preprocessing import MinMaxScaler

def process_structured_data(df, train, test):
    """
    Pre-processes the given dataframe by minmaxscaling the continuous features
    (fit-transforming the training data and transforming the test data)
    """
    continuous = ["population_per_hectare", "bicycle_aadf", "motor_vehicle_aadf"]
    cs = MinMaxScaler()
    trainX = cs.fit_transform(train[continuous])
    testX = cs.transform(test[continuous])
    return (trainX, testX)
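
For anyone unfamiliar with min-max scaling, here is a minimal plain-Python sketch of what happens to a single column. The helper names and the density values are hypothetical. The key point is that the minimum and range are learned from the training data only and then reused for the test data.

```python
def fit_min_max(values):
    """Learn the minimum and range from the training values only."""
    low, high = min(values), max(values)
    return low, (high - low) or 1.0  # avoid dividing by zero for constant columns


def apply_min_max(values, low, span):
    """Scale values using statistics learned from the training set."""
    return [(v - low) / span for v in values]


# Hypothetical population densities (people per hectare)
train_density = [10.0, 60.0, 110.0]
test_density = [35.0, 130.0]

low, span = fit_min_max(train_density)
train_scaled = apply_min_max(train_density, low, span)
test_scaled = apply_min_max(test_density, low, span)
print(train_scaled, test_scaled)
```

Note that a test value larger than anything seen in training ends up above 1. That is expected, and it is the price of not letting information from the test set leak into the scaler.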

Using the Keras Sequential API to create a two-layer multi-layer perceptron (a simple feedforward neural network) for the structured data branch:

from keras.models import Sequential
from keras.layers import Dense

def create_mlp(dim, regularizer=None):
    """Creates a simple two-layer MLP with inputs of the given dimension"""
    model = Sequential()
    model.add(Dense(8, input_dim=dim, activation="relu", kernel_regularizer=regularizer))
    model.add(Dense(4, activation="relu", kernel_regularizer=regularizer))
    return model

Creating a convolutional neural network for the image data branch (adapted from here):

# Importing everything from keras.layers works across Keras versions
from keras.layers import (Activation, BatchNormalization, Conv2D, Dense, Dropout,
                          Flatten, Input, MaxPooling2D, concatenate)
from keras.models import Model

def create_cnn(width, height, depth, filters=(16, 32, 64), regularizer=None):
    """
    Creates a CNN with the given input dimension and filter numbers.
    """
    # Initialize the input shape and channel dimension, where the number of channels is the last dimension
    inputShape = (height, width, depth)
    chanDim = -1
 
    # Define the model input
    inputs = Input(shape=inputShape)
 
    # Loop over the number of filters 
    for (i, f) in enumerate(filters):
        # If this is the first CONV layer then set the input appropriately
        if i == 0:
            x = inputs
 
        # Create loops of CONV => RELU => BN => POOL layers
        x = Conv2D(f, (3, 3), padding="same")(x)
        x = Activation("relu")(x)
        x = BatchNormalization(axis=chanDim)(x)
        x = MaxPooling2D(pool_size=(2, 2))(x)
        
    # Final layers - flatten the volume, then Fully-Connected => RELU => BN => DROPOUT
    x = Flatten()(x)
    x = Dense(16, kernel_regularizer=regularizer)(x)
    x = Activation("relu")(x)
    x = BatchNormalization(axis=chanDim)(x)
    x = Dropout(0.5)(x)
 
    # Apply another fully-connected layer, this one to match the number of nodes coming out of the MLP
    x = Dense(4, kernel_regularizer=regularizer)(x)
    x = Activation("relu")(x)
 
    # Construct the CNN
    model = Model(inputs, x)
 
    # Return the CNN
    return model
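
It can help to trace how the image shrinks as it passes through the CNN. The sketch below is plain arithmetic rather than Keras code, and the helper name is made up. It relies on two facts about the layers above: padding="same" keeps the height and width unchanged, and each 2x2 max pooling layer halves them.

```python
def cnn_feature_shapes(width, height, filters=(16, 32, 64)):
    """Trace (height, width, channels) after each CONV => POOL block."""
    shapes = []
    for f in filters:
        # padding="same" keeps height and width; 2x2 max pooling halves them
        height, width = height // 2, width // 2
        shapes.append((height, width, f))
    return shapes


shapes = cnn_feature_shapes(128, 128)
final_height, final_width, final_channels = shapes[-1]
flattened = final_height * final_width * final_channels

# Dense layer parameters = inputs * units + one bias per unit
dense16_params = flattened * 16 + 16
print(shapes, flattened, dense16_params)
```

For 128x128 inputs, most of the CNN's trainable weights sit in the first dense layer after flattening, which is worth knowing when deciding where to add regularisation.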

After train-test splitting, labelling and pre-processing the data, the final steps are to build the branches, concatenate the outputs, and use them as inputs in the final neural network layer head:

from keras.layers import Dense, concatenate
from keras.models import Model
from keras.optimizers import Adam # Other optimisers are available

# Create the MLP and CNN models
mlp = create_mlp(trainAttrX.shape[1])
cnn = create_cnn(128, 128, 3)
 
# Create the input to the final set of layers as the output of both the MLP and CNN
combinedInput = concatenate([mlp.output, cnn.output])

# The final fully-connected layer head will have two dense layers (one relu and one sigmoid)
x = Dense(4, activation="relu")(combinedInput)
x = Dense(1, activation="sigmoid")(x)

# The final model accepts numerical data on the MLP input and images on the CNN input, outputting a single value
model1 = Model(inputs=[mlp.input, cnn.input], outputs=x)

# Compile the model 
opt = Adam(lr=1e-3, decay=1e-3 / 200)
model1.compile(loss="binary_crossentropy", metrics=['acc'], optimizer=opt)
 
# Train the model
model1_history = model1.fit(
  [trainAttrX, trainImagesX], 
  trainY, 
  validation_data=([testAttrX, testImagesX], testY), 
  epochs=5, 
  batch_size=10)

Iteration and evaluation

I experimented with a few different versions of this model, using different numbers and sizes of layers, types of optimisers and regularisation, and training lengths.

Behold, my final mixed-data neural network model architecture:

The best model I was able to produce had an average F1 score (the harmonic mean of precision and recall — chosen because both measures of accuracy are important in this case) of 0.8, and no overfitting.
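
For reference, here is a small sketch of how an average F1 score can be computed from confusion-matrix counts. The counts are invented for illustration and are not results from this project. I'm also assuming that "average" means the unweighted mean of the per-class F1 scores (the macro average).

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall for one class."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Made-up counts for a 100-square test set (50 safe, 50 dangerous)
safe_as_safe, safe_as_danger = 45, 5
danger_as_danger, danger_as_safe = 35, 15

# For each class, the other class's mistakes are its false positives
f1_safe = f1_score(safe_as_safe, danger_as_safe, safe_as_danger)
f1_danger = f1_score(danger_as_danger, safe_as_danger, danger_as_safe)
average_f1 = (f1_safe + f1_danger) / 2
print(round(f1_safe, 3), round(f1_danger, 3), round(average_f1, 3))
```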

This mixed-data model performed slightly better than the best version of the CNN-only model (0.77), which only used satellite images as inputs and did not include the extra structured data. Augmenting the satellite imagery with the structured data that was also available for each area did improve performance, although only by a small margin.

However, if a model had to be chosen to put into production, based on the current best versions, it might be preferable to use the CNN-only model, due to the extra data processing and computation power involved in the mixed-data model.

That said, there is plenty of room for improvement in the mixed-data model, including tuning both the CNN and MLP components (e.g. using pre-trained weights in the CNN) as well as the final fully-connected layers, so its performance could probably be pushed further.

Summary

Keras has some cool functionality in its Functional API for building neural networks that can take multiple different forms of data as inputs. I’ve shown an example here of combining both structured data and image data to predict the locations of traffic accidents.

There’s plenty to play around with in Keras beyond this — there’s no reason to limit yourself to only two branches, or to structured and image data. Feel free to fork my GitHub repo for this project and have a play around, or have a read through the Keras docs to learn more.

Although mixed-data neural networks are a bit more complicated to build — and are certainly a more niche area of general data science — they can be a great solution to your more complex neural network needs.

Thanks for reading — I hope you found this an interesting topic. If you’d like to read more about my data science projects, you can subscribe to my blog here on Medium.
