[Learn about machine learning from the Keras] — 14. optimizer and learning_rate

Czxdas · Sep 22, 2023

This section discusses how the optimizer and learning rate are initially set, how they operate internally, and their impact on training.

Example from the previous section:

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

class SimpleDense(layers.Layer):

    def __init__(self, units=32):
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        # Create the kernel and bias weights once the input shape is known
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='random_normal',
                                 trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

model = Sequential([
    SimpleDense(512),
    layers.Dense(10, activation="softmax")
])

import tensorflow.keras.optimizers as optimizers
model.compile(optimizer=optimizers.get({"class_name": "rmsprop",
                                        "config": {"learning_rate": 0.0001}}),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.build(input_shape=(None, 784))

from tensorflow.keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype("float32") / 255

model.fit(train_images, train_labels, epochs=1, batch_size=128)

There are three ways to set the optimizer:

(1)
Pass the optimizer name as a string in the model's compile() call, for example "rmsprop". Keras looks up the optimizer class by that name, but the instance is created with the default learning rate of 0.001.

Refer to the notes on this parameter in the official Keras documentation:
learning_rate: Initial value for the learning rate: either a floating point value, or a tf.keras.optimizers.schedules.LearningRateSchedule instance. Defaults to 0.001.
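
For example, compiling with just the name looks like this (a minimal sketch; the resulting RMSprop instance uses the default learning rate of 0.001):

model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])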

The process of obtaining an optimizer instance from its name relies mainly on the keras.engine.training.Model._get_optimizer function. Internally, the name is wrapped into a dict, the corresponding instance is obtained through keras.optimizers.deserialize, and that instance is returned to the model. The lookup is implemented as a mapping table, as mentioned in the Compiler chapter, with the following content:

"adadelta": keras.optimizers.adadelta.Adadelta
"adagrad": keras.optimizers.adagrad.Adagrad
"adam": keras.optimizers.adam.Adam
"adamax": keras.optimizers.adamax.Adamax
"experimentaladadelta": keras.optimizers.adadelta.Adadelta
"experimentaladagrad": keras.optimizers.adagrad.Adagrad
"experimentaladam": keras.optimizers.adam.Adam
"experimentalsgd": keras.optimizers.sgd.SGD
"nadam": keras.optimizers.nadam.Nadam
"rmsprop": keras.optimizers.rmsprop.RMSprop
"sgd": keras.optimizers.sgd.SGD
"ftrl": keras.optimizers.Ftrl
"lossscaleoptimizer": keras.mixed_precision.loss_scale_optimizer.LossScaleOptimizerV3
"lossscaleoptimizerv3": keras.mixed_precision.loss_scale_optimizer.LossScaleOptimizerV3
"lossscaleoptimizerv1": keras.mixed_precision.loss_scale_optimizer.LossScaleOptimizer
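
A quick way to observe this lookup (a minimal sketch; it assumes a TF2-era Keras where the optimizer exposes a learning_rate attribute):

import tensorflow.keras.optimizers as optimizers

# Name lookup: the table above maps "rmsprop" to the RMSprop class,
# instantiated with its default config (learning_rate=0.001).
opt = optimizers.get("rmsprop")
print(type(opt).__name__)        # RMSprop
print(float(opt.learning_rate))  # 0.001

# The dict form goes through keras.optimizers.deserialize, which
# applies the config and overrides the default learning rate.
opt2 = optimizers.get({"class_name": "rmsprop",
                       "config": {"learning_rate": 0.0001}})
print(float(opt2.learning_rate))  # 0.0001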

(2)
optimizers.get({"class_name": "rmsprop", "config": {"learning_rate": 0.0001}})
Parameters can be passed this way. The compile() call is modified to pass in learning_rate:

import tensorflow.keras.optimizers as optimizers
model.compile(optimizer=optimizers.get({"class_name": "rmsprop",
                                        "config": {"learning_rate": 0.0001}}),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

(3)
Instantiate the optimizer object directly and pass its parameters:

from tensorflow.keras.optimizers import RMSprop
model.compile(optimizer=RMSprop(learning_rate=0.0001),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
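
After compile(), the configured value can be verified on the model (a quick check; it assumes the TF2 behavior where the compiled optimizer exposes learning_rate):

print(float(model.optimizer.learning_rate))  # 0.0001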

In this example, the optimizer is RMSprop. The results below show that lowering the learning rate does not necessarily keep improving accuracy.

[Training output with learning rate set to 0.001 (default)]

[Training output with learning rate set to 0.0001]

[Training output with learning rate set to 0.00001]
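
A minimal sketch to reproduce this comparison, reusing the SimpleDense layer and the MNIST preprocessing from above (the build_model helper here is introduced only for illustration):

from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import RMSprop

def build_model():
    # Rebuild a fresh model for each run so earlier training does not carry over
    return Sequential([
        SimpleDense(512),
        layers.Dense(10, activation="softmax")
    ])

for lr in [0.001, 0.0001, 0.00001]:
    model = build_model()
    model.compile(optimizer=RMSprop(learning_rate=lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(train_images, train_labels,
                        epochs=1, batch_size=128, verbose=0)
    print(f"lr={lr}: accuracy={history.history['accuracy'][-1]:.4f}")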

The above are the different ways to set the optimizer, and their important relationship with learning_rate during training is recorded here.
