Neural Network in C++

For one of my modules at university I had an assignment which required me to design and build a neural network that could learn to replicate a three-input exclusive-or (XOR) gate. To develop my C++ skills further, I decided to rebuild the program (previously written in Fortran) in C++.

The truth table for the gate is shown below; the “output” column is the behaviour I want to train the network to reproduce for the given inputs.

I was free to choose the geometry of the neural network as I wished, with the knowledge that a standard feed-forward multi-layer perceptron with at least two hidden nodes can reproduce the table.

As for the learning algorithm, I was required to use the Metropolis algorithm rather than the more standard back-propagation rule.

Input 1   Input 2   Input 3   Output
0         0         0         0
0         0         1         1
0         1         0         1
0         1         1         0
1         0         0         1
1         0         1         0
1         1         0         0
1         1         1         1
The truth table for a three-input XOR gate.
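For the purposes of the code sketches that follow, the table might be stored as a pair of arrays. This is a minimal representation of my own; the original code may organise the data differently:

```cpp
// The eight input patterns and target outputs of the three-input XOR
// (odd-parity) gate, in the same order as the truth table above.
const double inputs[8][3] = {
    {0, 0, 0}, {0, 0, 1}, {0, 1, 0}, {0, 1, 1},
    {1, 0, 0}, {1, 0, 1}, {1, 1, 0}, {1, 1, 1}
};
const double targets[8] = {0, 1, 1, 0, 1, 0, 0, 1};
```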

The network design I settled on was as shown below:

The three-input, fully connected, feed-forward neural network layout. Input values (I1, I2, I3) are shown in white; hidden-layer nodes (H4, H5, H6) are shown in blue, green and orange respectively. Weights for connections into the hidden nodes are coloured to match their destination node. The output node O7 is shown in white.

I chose this design with three hidden nodes rather than the minimum of two because I found that a two-node hidden layer was unreliable at reproducing the behaviour.

All the weights of the system are stored in a single vector: first the bias for each node from 4 to 7, then the weights into the first, second and third hidden nodes, and finally the weights into the output node.

I chose this over restructuring the problem in an object-oriented way, with a node class in which each node holds its own weights and outputs, because the algorithm requires picking a random weight from all the network's weights (the vector laid out in the table below), which makes the abstraction into node objects essentially redundant.

Diagram Label   Vector Element
w40             [0]
w50             [1]
w60             [2]
w70             [3]
w41             [4]
w42             [5]
w43             [6]
w51             [7]
w52             [8]
w53             [9]
w61             [10]
w62             [11]
w63             [12]
w71             [13]
w72             [14]
w73             [15]
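To make the indexing concrete, here is a minimal sketch of a forward pass and a global error using this flat layout. The sigmoid activation and the sum-of-squared-errors measure are assumptions of mine; the post only refers to a "global error" and does not name an activation function.

```cpp
#include <cmath>
#include <vector>

// Sigmoid activation (an assumption; the post does not specify one).
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Feed the three inputs through the network using the flat layout
// above: biases in [0..3], hidden-node weights in [4..12], and
// output-node weights in [13..15].
double feedForward(const std::vector<double>& w, const double in[3])
{
    double h4 = sigmoid(w[0] + w[4]  * in[0] + w[5]  * in[1] + w[6]  * in[2]);
    double h5 = sigmoid(w[1] + w[7]  * in[0] + w[8]  * in[1] + w[9]  * in[2]);
    double h6 = sigmoid(w[2] + w[10] * in[0] + w[11] * in[1] + w[12] * in[2]);
    return      sigmoid(w[3] + w[13] * h4    + w[14] * h5    + w[15] * h6);
}

// Global error over all eight patterns, using the inputs/targets
// arrays from the truth-table snippet earlier.
double globalError(const std::vector<double>& w)
{
    double e = 0.0;
    for (int p = 0; p < 8; ++p) {
        const double diff = feedForward(w, inputs[p]) - targets[p];
        e += diff * diff;
    }
    return e;
}
```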

A Note on the Metropolis Algorithm

The method of altering the weights according to the Metropolis Algorithm is as follows:

A random integer between 1 and 16 is generated, selecting one of the sixteen elements of the weights vector.

Using this number we select the corresponding weight from the vector and assign it a new value equal to another random number between -10 and 10.
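In C++ this proposal step might look like the following sketch using the standard <random> facilities (the helper name proposeMove is mine; indices 0 to 15 address the sixteen weights):

```cpp
#include <random>
#include <vector>

// Propose a new configuration by overwriting one randomly chosen
// weight with a fresh uniform value in [-10, 10].
std::vector<double> proposeMove(const std::vector<double>& w, std::mt19937& rng)
{
    std::uniform_int_distribution<int> pickIndex(0, static_cast<int>(w.size()) - 1);
    std::uniform_real_distribution<double> newWeight(-10.0, 10.0);

    std::vector<double> trial = w;          // copy the current configuration
    trial[pickIndex(rng)] = newWeight(rng); // perturb one weight
    return trial;
}
```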

The Metropolis Algorithm is then implemented by comparing the global error of the network under the original weights and under the newly generated weights, Ecurrent and Enew respectively.

If the error of the new configuration is lower than the current, the new weights are kept.

If Enew is not smaller, we generate a random number between 0 and 1. If this value is less than the acceptance probability, p = exp((Ecurrent – Enew)β), we keep the new configuration anyway; this reduces the chances of becoming stuck in a local minimum. Otherwise we discard the newly generated weights and keep the original configuration.
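The accept/reject decision can then be expressed in a few lines (again a sketch; the function name accept is hypothetical):

```cpp
#include <cmath>
#include <random>

// Metropolis acceptance: always keep a lower-error configuration,
// otherwise keep it with probability exp((eCurrent - eNew) * beta).
bool accept(double eCurrent, double eNew, double beta, std::mt19937& rng)
{
    if (eNew < eCurrent) return true;
    std::uniform_real_distribution<double> uniform01(0.0, 1.0);
    return uniform01(rng) < std::exp((eCurrent - eNew) * beta);
}
```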

The value of β (representing a kind of inverse temperature, if we think of the error as an energy being minimised) is initially set to 0.1. This process of generating new weights and comparing errors is repeated 1,000,000 times, and on each iteration β is increased by 0.001 (in the energy analogy, the system is cooling).
Once this is complete, the final weights produced by the algorithm reproduce the correct behaviour of an XOR gate.
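Putting the pieces together, the whole training loop might look like this (the helpers proposeMove, accept and globalError are the sketches above; the schedule of β starting at 0.1 and growing by 0.001 over 1,000,000 iterations follows the text):

```cpp
#include <random>
#include <vector>

int main()
{
    std::mt19937 rng{std::random_device{}()};
    std::vector<double> w(16, 0.0);           // initial weight configuration
    double eCurrent = globalError(w);
    double beta = 0.1;                        // initial inverse temperature

    for (int step = 0; step < 1000000; ++step) {
        std::vector<double> trial = proposeMove(w, rng);
        const double eNew = globalError(trial);
        if (accept(eCurrent, eNew, beta, rng)) {
            w = trial;                        // adopt the new configuration
            eCurrent = eNew;
        }
        beta += 0.001;                        // cool the system
    }
}
```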

A downloadable sample of this code is available here:

https://github.com/Ezra-Mason/XOR-Network-CPP
