[Learn about machine learning from the Keras] — 12. Custom Layer

Czxdas
Sep 22, 2023

Layers play a central role in a model: they are what actually carries out the tensor operations during training. Before a layer can be used, it has to go through its build step, which creates the initial weights that training will update. During training, each layer's call function performs the actual computation, such as the matrix product of the input tensor with the layer's weights.

Recall the earlier use of the built-in Dense layer (keras.layers.core.dense), for example in the training walkthrough of the Sequential model article:

If the model has not been built by the time training runs for the first time, Keras detects this and builds the model first, which in turn runs the build step on each layer, one by one.
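The deferred build described above can be observed on a single layer. The following is a minimal sketch, assuming TensorFlow 2.x with its bundled Keras; the layer size and the all-ones input are arbitrary choices for illustration, not taken from the article.

```python
import tensorflow as tf
from tensorflow.keras import layers

# A built-in Dense layer; no input shape is known at construction time.
dense = layers.Dense(4)

built_before = dense.built           # the layer has not been built yet
weights_before = len(dense.weights)  # so it holds no weights

# The first call supplies the input shape, which triggers build() before call().
outputs = dense(tf.ones((2, 3)))

built_after = dense.built
kernel_shape = tuple(dense.kernel.shape)  # (input_dim, units)
bias_shape = tuple(dense.bias.shape)      # (units,)

print(built_before, weights_before)
print(built_after, kernel_shape, bias_shape)
```

The weights only come into existence once the layer sees an input shape, which is why an unbuilt model has to be built before the first training step.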

With a general understanding of how initialization and training work for Keras's built-in layers, let's see how a custom layer behaves.
An example follows:

import tensorflow as tf
from tensorflow.keras import layers

class SimpleDense(layers.Layer):

    def __init__(self, units=32):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        # Weights are created here because the input size is only known at build time.
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='random_normal',
                                 trainable=True)

    def call(self, inputs):
        # Forward computation: inputs * w + b (no activation).
        return tf.matmul(inputs, self.w) + self.b

# Import from tensorflow.keras, consistent with the layers import above.
from tensorflow.keras.models import Sequential
model = Sequential([
    SimpleDense(512),
    layers.Dense(10, activation="softmax")
])

model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

from tensorflow.keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
# Flatten each 28x28 image into a 784-vector and scale pixel values to [0, 1].
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype("float32") / 255

model.fit(train_images, train_labels, epochs=1, batch_size=128)

The overall flow is much the same as when only the built-in Dense layer is used; the difference lies in the part highlighted in blue in the diagram.
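To make the similarity with the built-in Dense layer concrete, here is a small sketch that copies the weights of a SimpleDense into a Dense layer without activation and compares the outputs. It assumes TensorFlow 2.x; the sizes (8 inputs, 3 units, batch of 5) and the random input are arbitrary and used only for illustration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

class SimpleDense(layers.Layer):
    def __init__(self, units=32):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="random_normal", trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="random_normal", trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

x = tf.random.normal((5, 8))

# The first call builds the custom layer and creates w and b.
custom = SimpleDense(3)
custom_out = custom(x)

# Build a built-in Dense layer of the same size, then give it the same weights.
builtin = layers.Dense(3)   # no activation, so it computes x @ kernel + bias
builtin(x)                  # first call builds kernel and bias
builtin.set_weights(custom.get_weights())  # order: [w, b] -> [kernel, bias]
builtin_out = builtin(x)

same = np.allclose(custom_out.numpy(), builtin_out.numpy(), atol=1e-5)
print(same)
```

With identical weights the two layers compute the same affine transformation, so any difference between them is in how they are constructed, not in what they calculate.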

When each layer is about to be built, keras.engine.sequential._build_graph_network_for_inferred_shape checks whether the custom layer (SimpleDense) should be called in Functional construction mode. Because the check returns True, execution enters keras.engine.base_layer.Layer._functional_construction_call, then keras.engine.base_layer.Layer._keras_tensor_symbolic_call, and then keras.engine.base_layer.Layer._infer_output_signature, where keras.engine.base_layer.Layer._maybe_build -> SimpleDense.build is executed directly. The custom build function uses keras.engine.base_layer.Layer.add_weight to create and initialize SimpleDense.w and SimpleDense.b.
Later, when the layers' operations are actually executed, the model iterates over its layers. When it reaches the custom SimpleDense, it calls the layer's own call function, which executes
tf.matmul(inputs, SimpleDense.w) + SimpleDense.b
and the result is passed as the input to the next layer in the iteration.
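The build-then-call flow and the layer-by-layer iteration can be mimicked in a few lines of plain Python. This is a simplified toy, not Keras's actual implementation: ToyLayer, ToyDense, and run_sequential are hypothetical names, the weights are fixed constants rather than random values, and a single input vector stands in for a batch tensor.

```python
class ToyLayer:
    """Hypothetical stand-in for a Keras layer: builds lazily, then runs call()."""

    def __init__(self):
        self.built = False

    def build(self, input_dim):
        pass

    def call(self, inputs):
        raise NotImplementedError

    def __call__(self, inputs):
        # Rough analogue of _maybe_build: build once, on first use.
        if not self.built:
            self.build(len(inputs))
            self.built = True
        return self.call(inputs)


class ToyDense(ToyLayer):
    def __init__(self, units):
        super().__init__()
        self.units = units

    def build(self, input_dim):
        # Fixed values (all ones, bias 0.5) so the result is easy to check by hand.
        self.w = [[1.0] * self.units for _ in range(input_dim)]
        self.b = [0.5] * self.units

    def call(self, inputs):
        # inputs * w + b for a single input vector.
        return [sum(x * self.w[i][j] for i, x in enumerate(inputs)) + self.b[j]
                for j in range(self.units)]


def run_sequential(layer_list, inputs):
    # Each layer's output becomes the next layer's input.
    outputs = inputs
    for layer in layer_list:
        outputs = layer(outputs)
    return outputs


toy_layers = [ToyDense(3), ToyDense(2)]
result = run_sequential(toy_layers, [1.0, 2.0])
print(result)  # [11.0, 11.0]
```

The first layer maps [1.0, 2.0] to [3.5, 3.5, 3.5], and the second layer is only built once that three-element output reaches it, which mirrors how each layer's weights depend on the shape produced by the layer before it.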

The above is how a custom layer operates within a Sequential model.
