Feedforward Neural Networks (FNNs) in R

Last Updated : 29 Aug, 2024

Feedforward Neural Networks (FNNs) are a type of artificial neural network where connections between nodes do not form a cycle. This means that data moves in one direction—forward—from the input layer through the hidden layers to the output layer. These networks are often used for tasks such as classification and regression because of their ability to model complex relationships between inputs and outputs.

Structure of Feedforward Neural Networks

A Feedforward Neural Network consists of three main parts:

Input Layer: This layer receives the input data. Each neuron in this layer represents a feature or variable in the dataset.
Hidden Layers: These layers perform computations based on the input data. The number of hidden layers and the number of neurons in each layer can vary. Each neuron in a hidden layer applies a mathematical function to the inputs and passes the result to the next layer.
Output Layer: This layer produces the final result, which could be a category label (in classification tasks) or a numeric value (in regression tasks).
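
The flow of data through these three parts can be sketched in base R. This is a minimal illustration rather than how keras implements it: the layer sizes (3 inputs, 4 hidden neurons, 2 outputs), the random weights, and the helper names relu, softmax and forward are all assumptions made up for the example.

```r
# Hypothetical toy network: 3 inputs -> 4 hidden neurons -> 2 outputs
set.seed(42)
W1 <- matrix(rnorm(3 * 4), nrow = 3, ncol = 4)  # input-to-hidden weights
b1 <- rep(0, 4)                                 # hidden-layer biases
W2 <- matrix(rnorm(4 * 2), nrow = 4, ncol = 2)  # hidden-to-output weights
b2 <- rep(0, 2)                                 # output-layer biases

# Illustrative helper functions (not part of any package)
relu <- function(z) pmax(z, 0)
softmax <- function(z) {
  e <- exp(z - max(z))  # subtract the max for numerical stability
  e / sum(e)
}

# Data moves strictly forward: input -> hidden -> output
forward <- function(x) {
  h <- relu(x %*% W1 + b1)   # hidden layer
  softmax(h %*% W2 + b2)     # output layer: class probabilities
}

x <- matrix(c(0.5, -1.2, 3.0), nrow = 1)  # one sample with 3 features
probs <- forward(x)
print(probs)
```

The output is a row of two probabilities that sum to 1, one per output neuron.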

What are Activation Functions?

In FNNs, neurons in hidden layers use activation functions to introduce non-linearity into the model. This helps the network learn from complex data. Common activation functions include:

ReLU (Rectified Linear Unit): Outputs the input directly if it is positive; otherwise, it outputs zero.
Sigmoid: Converts the input into a value between 0 and 1, useful for binary classification.
Tanh: Similar to Sigmoid but outputs values between -1 and 1. Because its output is centered around zero, it is often preferred over Sigmoid in hidden layers.
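
As a rough illustration, the three functions above can be written in a few lines of base R. The names relu and sigmoid are defined here only for the example; tanh is already built into R.

```r
# Illustrative definitions (keras provides its own implementations)
relu <- function(x) pmax(x, 0)
sigmoid <- function(x) 1 / (1 + exp(-x))

x <- c(-2, -0.5, 0, 0.5, 2)

print(relu(x))     # negative inputs become 0, positive inputs pass through
print(sigmoid(x))  # every value lies strictly between 0 and 1
print(tanh(x))     # every value lies strictly between -1 and 1
```

Applying each function to the same input vector makes the different output ranges easy to compare.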

How to Train a Feedforward Neural Network

Training an FNN involves adjusting the weights of connections between neurons to minimize the error between the predicted and actual outputs. Each training iteration works as follows:

Forward Pass: The input data passes through the network, and the output is calculated.
Loss Calculation: The difference between the predicted output and the actual output is measured using a loss function. For example, mean squared error is commonly used for regression tasks.
Backpropagation: The error is propagated backward from the output layer toward the input layer to compute the gradient of the loss with respect to each weight.
Optimization: Algorithms like Stochastic Gradient Descent (SGD) use these gradients to update the weights iteratively, gradually improving the model’s performance.
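
The four steps above can be sketched by hand in base R for a tiny network with one hidden layer. This is a simplified illustration under stated assumptions, not what keras does internally: the toy data (fitting sin(x)), the 8 hidden neurons, the learning rate, and the number of steps are all made up, and the update rule is plain full-batch gradient descent rather than SGD.

```r
set.seed(1)

# Hypothetical toy regression data: learn y = sin(x) on [-3, 3]
X <- matrix(seq(-3, 3, length.out = 50), ncol = 1)
y <- sin(X)

# One hidden layer with 8 tanh neurons (sizes chosen for illustration)
n_hidden <- 8
W1 <- matrix(rnorm(n_hidden, sd = 0.5), nrow = 1)
b1 <- rep(0, n_hidden)
W2 <- matrix(rnorm(n_hidden, sd = 0.5), ncol = 1)
b2 <- 0

lr <- 0.01        # learning rate (assumed value)
n_steps <- 2000
losses <- numeric(n_steps)

for (step in seq_len(n_steps)) {
  # Forward pass
  H <- tanh(sweep(X %*% W1, 2, b1, "+"))  # 50 x 8 hidden activations
  y_hat <- H %*% W2 + b2                  # 50 x 1 predictions

  # Loss calculation (mean squared error)
  err <- y_hat - y
  losses[step] <- mean(err^2)

  # Backpropagation: gradient of the loss for each parameter
  d_yhat <- 2 * err / nrow(X)
  dW2 <- t(H) %*% d_yhat
  db2 <- sum(d_yhat)
  dZ1 <- (d_yhat %*% t(W2)) * (1 - H^2)   # tanh'(z) = 1 - tanh(z)^2
  dW1 <- t(X) %*% dZ1
  db1 <- colSums(dZ1)

  # Optimization: plain gradient descent update
  W1 <- W1 - lr * dW1
  b1 <- b1 - lr * db1
  W2 <- W2 - lr * dW2
  b2 <- b2 - lr * db2
}

cat("First loss:", losses[1], " Last loss:", losses[n_steps], "\n")
```

The loss at the last step should be lower than at the first, which is the behaviour the training loop is meant to produce. In practice, keras computes the gradients automatically and applies the chosen optimizer on mini-batches.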

We now implement a Feedforward Neural Network step by step in the R programming language.

Step 1: Install and Load Required Packages

First, we will install and load the required packages.

# Install necessary packages if you haven't already
install.packages("keras")
install.packages("tensorflow")

# Load the keras and tensorflow libraries
library(keras)
library(tensorflow)

# Install the TensorFlow backend (only needs to be run once per machine)
install_keras()

Step 2: Prepare the Data

Now we load the MNIST dataset, split it into training and test sets, flatten and normalize the images, and convert the labels to categorical (one-hot) format.

# Load the MNIST dataset
mnist <- dataset_mnist()

# Separate the dataset into training and test sets
X_train <- mnist$train$x
y_train <- mnist$train$y
X_test <- mnist$test$x
y_test <- mnist$test$y

# Reshape and normalize the input data
X_train <- array_reshape(X_train, c(nrow(X_train), 784)) / 255
X_test <- array_reshape(X_test, c(nrow(X_test), 784)) / 255

# Convert the labels to categorical format
y_train <- to_categorical(y_train, 10)
y_test <- to_categorical(y_test, 10)
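
To make the label conversion concrete, here is a small base-R sketch of one-hot encoding. The helper one_hot is hypothetical and written only for illustration; in the tutorial itself, to_categorical() from keras does this job.

```r
# Illustrative helper: one-hot encode labels in 0..(num_classes - 1)
one_hot <- function(labels, num_classes) {
  out <- matrix(0, nrow = length(labels), ncol = num_classes)
  # labels + 1 because R indexes from 1 while the digits start at 0
  out[cbind(seq_along(labels), labels + 1)] <- 1
  out
}

labels <- c(7, 2, 1)
encoded <- one_hot(labels, 10)
print(encoded[1, ])  # digit 7: a single 1 in the 8th position
```

Each row contains exactly one 1, placed in the column that corresponds to the digit, which is the format the categorical cross-entropy loss expects.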

Step 3: Building the FNN Model

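Next, build and compile the model. It has two hidden layers with ReLU activations, dropout after each hidden layer to reduce overfitting, and a 10-unit softmax output layer, one unit per digit.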

# Initialize a sequential model
model <- keras_model_sequential()

# Add layers to the model
model %>%
  layer_dense(units = 128, activation = 'relu', input_shape = c(784)) %>%
  layer_dropout(rate = 0.4) %>%
  layer_dense(units = 64, activation = 'relu') %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 10, activation = 'softmax')

# Compile the model
model %>% compile(
  optimizer = 'adam',
  loss = 'categorical_crossentropy',
  metrics = c('accuracy')
)
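
As a quick sanity check on the architecture, the number of trainable parameters can be worked out by hand. Each dense layer has (inputs x units) weights plus one bias per unit, and dropout layers add no parameters. The variable names below are chosen only for this example.

```r
# Layer widths of the model above: input, two hidden layers, output
layer_sizes <- c(784, 128, 64, 10)

n_in  <- head(layer_sizes, -1)  # inputs to each dense layer
n_out <- tail(layer_sizes, -1)  # units in each dense layer

# weights + biases for each dense layer
params_per_layer <- n_in * n_out + n_out
print(params_per_layer)       # 100480 8256 650
print(sum(params_per_layer))  # 109386
```

Calling summary(model) in keras should report the same per-layer counts.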

Step 4: Training the Model

Then train the model.

# Train the model
history <- model %>% fit(
  X_train, y_train,
  epochs = 20,
  batch_size = 128,
  validation_split = 0.2
)
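
The arithmetic behind these settings is easy to check. The sketch below assumes the standard MNIST training set of 60,000 images and that validation_split holds out that fraction of the training data; the variable names are made up for the example.

```r
n_samples <- 60000        # size of the MNIST training set
validation_split <- 0.2
batch_size <- 128
epochs <- 20

n_val <- round(n_samples * validation_split)  # samples held out for validation
n_train <- n_samples - n_val                  # samples used for weight updates
steps_per_epoch <- ceiling(n_train / batch_size)
total_updates <- steps_per_epoch * epochs

cat("Training samples:", n_train, "\n")          # 48000
cat("Validation samples:", n_val, "\n")          # 12000
cat("Steps per epoch:", steps_per_epoch, "\n")   # 375
cat("Total weight updates:", total_updates, "\n")  # 7500
```

So each epoch performs 375 weight updates on 48,000 images and is then scored on the 12,000 held-out images.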

Step 5: Visualize the Training History

Now plot the loss and accuracy recorded for the training and validation data at each epoch.

# Load ggplot2 for plotting (run install.packages("ggplot2") first if needed)
library(ggplot2)

# Prepare data for plotting: one row per epoch and series
n_epochs <- length(history$metrics$loss)
series_names <- c("Training Loss", "Validation Loss",
                  "Training Accuracy", "Validation Accuracy")
plot_data <- data.frame(
  epoch = rep(seq_len(n_epochs), times = 4),
  value = c(history$metrics$loss, history$metrics$val_loss,
            history$metrics$accuracy, history$metrics$val_accuracy),
  series = factor(rep(series_names, each = n_epochs), levels = series_names)
)

# Plot the data, using the same named mapping for color and shape
# so that each legend entry matches its series
ggplot(plot_data, aes(x = epoch, y = value, color = series, shape = series)) +
  geom_line() +
  geom_point() +
  labs(title = "Training and Validation Loss & Accuracy",
       x = "Epoch",
       y = "Value") +
  scale_color_manual(name = "Legend",
                     values = setNames(c("blue", "red", "green", "orange"),
                                       series_names)) +
  scale_shape_manual(name = "Legend",
                     values = setNames(c(16, 17, 18, 19), series_names)) +
  theme_minimal()

Output:

[Figure: plot of training and validation loss and accuracy per epoch for the trained model]

Step 6: Evaluating the Model

Now evaluate the model's performance on the test set, which it has not seen during training.

# Evaluate the model on the test data
score <- model %>% evaluate(X_test, y_test)

# Print test loss and accuracy
cat('Test loss:', score[[1]], '\n')
cat('Test accuracy:', score[[2]], '\n')

Output:

Test loss: 0.08210932 
Test accuracy: 0.9775
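
To see what the reported loss measures, categorical cross-entropy can be computed by hand on a small example. The probabilities and labels below are invented for illustration and are not taken from the trained model.

```r
# Hypothetical predicted probabilities for 3 samples and 3 classes
probs <- matrix(c(0.1, 0.7, 0.2,
                  0.8, 0.1, 0.1,
                  0.2, 0.3, 0.5), nrow = 3, byrow = TRUE)

# One-hot true labels: the true classes are 1, 0 and 2 (0-based)
y_true <- matrix(c(0, 1, 0,
                   1, 0, 0,
                   0, 0, 1), nrow = 3, byrow = TRUE)

# Categorical cross-entropy: mean negative log-probability of the true class
loss <- -mean(rowSums(y_true * log(probs)))
print(loss)  # approximately 0.424
```

The loss shrinks toward 0 as the probability assigned to the correct class approaches 1, which is why a low test loss goes together with a high test accuracy.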

Step 7: Making Predictions

Now we will make the predictions.

# Predict probabilities for the test set
predictions_prob <- model %>% predict(X_test)

# Convert probabilities to class labels
predictions <- apply(predictions_prob, 1, which.max) - 1

# Print the first 10 predictions
print(predictions[1:10])

# Print the corresponding true labels
print(mnist$test$y[1:10])

Output:

[1] 7 2 1 0 4 1 4 9 5 9
[1] 7 2 1 0 4 1 4 9 5 9

This shows that the model is performing well on these samples, successfully identifying the correct digit in each case.
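
The conversion from probabilities to labels, and a simple accuracy check, can be illustrated on a small made-up example. The probability matrix and true labels here are hypothetical, chosen so that one prediction is wrong.

```r
# Hypothetical probabilities for 3 samples and 3 classes (rows sum to 1)
probs <- matrix(c(0.1, 0.7, 0.2,
                  0.8, 0.1, 0.1,
                  0.2, 0.3, 0.5), nrow = 3, byrow = TRUE)

# which.max returns a 1-based column index, so subtract 1 for 0-based labels
pred <- apply(probs, 1, which.max) - 1
print(pred)  # 1 0 2

# Compare against made-up true labels
truth <- c(1, 0, 1)
accuracy <- mean(pred == truth)
print(accuracy)  # 2 of 3 correct

# Confusion matrix: rows are predicted classes, columns are true classes
print(table(Predicted = pred, Actual = truth))
```

The same idea extends to the full test set: mean(predictions == mnist$test$y) gives the overall accuracy, and table() shows which digits are confused with each other.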

Step 8: Saving and Loading the Model

Finally, save the trained model to a file and load it back.

# Save the model to a file
model %>% save_model_hdf5("mnist_fnn_model.h5")

# Load the model from the file
loaded_model <- load_model_hdf5("mnist_fnn_model.h5")

# Print a message confirming successful loading
cat("Model successfully loaded from 'mnist_fnn_model.h5'.\n")

Output:

Model successfully loaded from 'mnist_fnn_model.h5'.

Applications of FNNs

The main applications of Feedforward Neural Networks include:

Image Classification: FNNs can be used for tasks like recognizing objects in images. While more advanced networks like Convolutional Neural Networks (CNNs) are often preferred for image data, FNNs can still be effective for simpler image classification tasks.
Natural Language Processing: FNNs can be applied to text classification tasks, such as spam detection or sentiment analysis. Although Recurrent Neural Networks (RNNs) and Transformer-based models are more commonly used in NLP, FNNs can serve as a baseline model.
Predictive Analytics: FNNs are employed in predictive analytics to forecast trends or outcomes, such as predicting stock prices or customer behavior based on historical data.
Medical Diagnosis: FNNs can assist in diagnosing diseases by analyzing medical data, such as patient symptoms and test results, to predict the likelihood of a particular condition.

Conclusion

Feedforward Neural Networks (FNNs) are a foundational architecture in machine learning, widely used for tasks such as image classification, regression, and more. Implementing FNNs in R using the keras package allows for an accessible and flexible approach to building and training models. Through the use of layers, activation functions, and optimizers, FNNs can learn complex patterns in data and generalize well to unseen inputs.