
# Perceptron and sigmoid neurons

In my ongoing quest to learn R and machine learning, I'm working my way through a most excellent book, Neural Networks and Deep Learning by Michael Nielsen, which sports a most intuitive URL ;) It's really well written, relatively easy to follow and, at least for the time being, I think it's within my bounds of understanding.

This post attempts to work through the first exercise Michael includes in the first chapter. It looks at how perceptrons and sigmoid neurons compare when every weight and bias in a network is multiplied by a constant $c$.

This is what my simulation network looks like. I'm hardcoding the values so it's easier to see what's going on.

The first thing I needed to do was model the behaviour of these neurons.

    # calculates a neuron's output
    # relies on the global flags `normalizing` and `sigmoidizing` (set in the configuration below)
    process.neuron <- function(x, w, b, constant = FALSE) { # x and w are vectors
      if (constant) { # any non-zero constant scales the weights and the bias
        w <- w * constant
        b <- b * constant
      }
      r <- sum(x * w) + b # weighted input
      if (normalizing) r <- normalize(r)
      if (sigmoidizing) r <- sigmoidize(r)
      r # return the result visibly
    }

    # normalizes the value of a perceptron
    normalize <- function(input) ifelse(input <= 0, 0, 1)

    # calculates the output of a sigmoid neuron
    # given the denormalized result of a perceptron
    sigmoidize <- function(z) 1 / (1 + exp(-z))


Because I'm playing with both kinds of neurons, the process.neuron() function can behave like perceptrons if we're normalizing, and sigmoids if we're ... sigmoidizing?
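To make the two behaviours concrete, here is a minimal, self-contained sketch for a single neuron. It uses the first neuron's values from the input layer shown later in this post. The `neuron.z()` helper is a made-up name for illustration and isn't part of the script.

```r
# Step activation of a perceptron (same rule as normalize() in this post)
normalize <- function(input) ifelse(input <= 0, 0, 1)

# Sigmoid activation (same formula as sigmoidize() in this post)
sigmoidize <- function(z) 1 / (1 + exp(-z))

# Hypothetical helper: weighted input z = sum(x * w) + b
neuron.z <- function(x, w, b) sum(x * w) + b

# First input neuron from this post: two inputs, two weights, one bias
z <- neuron.z(x = c(.4, .9), w = c(.7, .3), b = -.56)

perceptron.out <- normalize(z)   # z is slightly negative, so this is 0
sigmoid.out    <- sigmoidize(z)  # just under 0.5, because z is close to 0

print(c(z = z, perceptron = perceptron.out, sigmoid = sigmoid.out))
```

The perceptron only reports which side of zero the weighted input falls on, while the sigmoid also reflects how close to zero it is.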

I should also mention at this point that the code is optimized for running various scenarios while I'm playing. So if you're thinking, "why didn't he write it like ...", that's why ... and because the better way didn't dawn on me.

And with `normalizing` switched on, this works for perceptrons. No matter what constant $c \gt 0$ I use, the 3 input neurons always output 0, 1, 0.
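The reason is that multiplying the weights and the bias by $c \gt 0$ multiplies the whole weighted input $w \cdot x + b$ by $c$, which can't change its sign, and the sign is all a perceptron looks at. A quick way to check this is sketched below with the input-layer values hardcoded in this post. The `perceptron.layer()` name is made up for illustration.

```r
# Values of the three input neurons (one column per neuron)
x <- matrix(c(.4, .9,  .5, .7,  .5, .7), ncol = 3)
w <- matrix(c(.7, .3,  .4, .6,  .5, .5), ncol = 3)
b <- c(-.56, -.61, -.62)

# Hypothetical helper: perceptron outputs with weights and biases scaled by a constant
perceptron.layer <- function(constant) {
  z <- colSums(x * (constant * w)) + constant * b
  ifelse(z <= 0, 0, 1)
}

# Try a range of positive constants, one row of outputs per constant
outputs <- t(sapply(c(0.001, 1, 10, 400), perceptron.layer))
print(outputs)
```

Every row should come out the same, whatever positive constant is used.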

The next bit of the exercise aims to show the same thing for sigmoids. This time I wanted to see the sigmoid waveform, so I pass large negative and large positive constants to illustrate the behaviour as $c \to -\infty$ and as $c \to \infty$. Here's the bit of code I use to invoke the process.neuron() function with each constant:

    # processes the neurons in a level
    process.layer <- function(i, x, w, b, constant) {
      # a constant of 0 zeroes the weighted input: 0.5 for a sigmoid, 0 for a perceptron
      correction <- ifelse(sigmoidizing, 0.5, 0)
      # is.numeric() keeps the default constant = FALSE ("no scaling") out of this branch
      if (is.numeric(constant) && constant == 0) {
        correction
      } else {
        process.neuron(x[, i], w[, i], b[i], constant)
      }
    }

    # defines and processes the input layer
    input.layer <- function(constant = FALSE) {
      # 3 neurons, 2 inputs and 1 bias each, 1 weight per input
      x <- matrix(c(.4,.9, .5,.7, .5,.7), ncol = 3)
      w <- matrix(c(.7,.3, .4,.6, .5,.5), ncol = 3)
      b <- matrix(c(-.56, -.61, -.62), ncol = 3)
      i <- seq_along(b) # neuron iterator
      sapply(i, process.layer, x, w, b, constant) # results
    }
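Since every neuron in the layer does the same arithmetic, the per-neuron loop could also be written as one vectorized step. This is only a sketch of an alternative, not how the script does it, and `layer.z()` is a made-up name for illustration.

```r
# Values of the three input neurons (one column per neuron)
x <- matrix(c(.4, .9,  .5, .7,  .5, .7), ncol = 3)
w <- matrix(c(.7, .3,  .4, .6,  .5, .5), ncol = 3)
b <- c(-.56, -.61, -.62)

# Hypothetical helper: weighted input of every neuron at once.
# x * w multiplies element-wise, colSums() adds up each neuron's column.
layer.z <- function(x, w, b, constant = 1) {
  colSums(x * (w * constant)) + b * constant
}

# Same result as looping over the columns one neuron at a time
z.loop <- sapply(seq_along(b), function(i) sum(x[, i] * w[, i]) + b[i])
z.vec  <- layer.z(x, w, b)

print(z.vec)
print(all.equal(z.loop, z.vec))
```

With `constant = 1` meaning "no scaling", a constant of 0 needs no special case in this version: the weighted input is simply 0, which gives 0.5 for a sigmoid and 0 for a perceptron.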


There are similar hidden.layer() and output.layer() functions that I'll leave out for brevity (but you can see the whole script here if you're interested).

In order to process the layer multiple times with different constant values:

    apply.constant <- function(constant = FALSE) {
      # comment out the later layers you're not interested in;
      # the last assignment left uncommented is the value that gets returned
      il <- input.layer(constant)
      # hl <- hidden.layer(il, constant)
      # ol <- output.layer(hl, constant)
    }

    # configuration
    normalizing <- FALSE
    sigmoidizing <- TRUE

    # Scenarios for different constants
    constants <- seq(-400, 400, by = 10)

    # process layers - defined by layers in apply.constant() above
    layer <- sapply(constants, apply.constant)


And that creates neuron output values that look like this:
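For a rough idea of those values without the full script, here is a self-contained sketch of the same sweep over the input layer only. It assumes the values hardcoded in input.layer() above; `sweep.layer()` is a made-up name, and the plotting call is just one possible way to look at the result.

```r
# Values of the three input neurons (one column per neuron)
x <- matrix(c(.4, .9,  .5, .7,  .5, .7), ncol = 3)
w <- matrix(c(.7, .3,  .4, .6,  .5, .5), ncol = 3)
b <- c(-.56, -.61, -.62)

sigmoidize <- function(z) 1 / (1 + exp(-z))

# Hypothetical helper: sigmoid outputs of the three neurons for one constant
sweep.layer <- function(constant) {
  sigmoidize(colSums(x * (w * constant)) + b * constant)
}

constants <- seq(-400, 400, by = 10)
layer <- sapply(constants, sweep.layer)  # 3 rows (neurons) x 81 columns (constants)

# Outputs at the two extremes and at zero
print(round(layer[, constants %in% c(-400, 0, 400)], 3))

# One line per neuron; only draw when running interactively
if (interactive()) {
  matplot(constants, t(layer), type = "l", xlab = "constant", ylab = "output")
}
```

Because the weighted inputs here are tiny (about -0.01, 0.01 and -0.02), it takes constants in the hundreds to push the outputs close to 0 or 1, and a negative constant flips which way each neuron saturates.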

I have to say, for someone new to this stuff, I am lovin' it. So interesting! For extra credit ;) I wanted to see the impact this had on the result of the hidden layer. That was even more interesting.

Of course I tweaked the input values, weights and biases to get interesting waveforms. I figured that was fair since there isn't any learning going on ... yet.

#### Don Smith

Progressive. Global citizen. Seeker of balance and harmony. Patient. Intolerant of suffering, greed and exploitation. Educator. Imperfect. Forgiving. Software developer. Vulnerable. Loves people.