Learning Deep Learning Week 1

Week 12 of documenting my AI/ML learning journey (Dec 29 - Jan 4)

What was discussed last week…

  • I learned more about some tensor-modifying and image-editing functions, such as .unsqueeze_(), which adds a dimension to an image tensor in place (see the sketch after this list).

  • I talked about how I had to find additional training data (i.e. images, because it was a computer vision model) to improve a model’s accuracy and to help it correctly classify an image it had previously misclassified.
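As a quick refresher on what that looks like, here's a minimal sketch of .unsqueeze_() in PyTorch (the shapes and variable names are made up for illustration):

import torch

# A hypothetical 28x28 grayscale image tensor
image = torch.rand(28, 28)
print(image.shape)    # torch.Size([28, 28])

# Add a channel dimension in place at position 0
image.unsqueeze_(0)
print(image.shape)    # torch.Size([1, 28, 28])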

I love reflections, and this new year, I hope to keep on making these newsletters about AI; I truly believe that this field is the future for humanity, and this revolutionary movement is something that I want to be a part of!

Also, I want to stress that learning OOP, especially in the context of Python, is essential for building models with AI, ML, and DL these days, because of the sheer proportion of popular modules, frameworks, and libraries that are built on Python and OOP.

Wednesday, January 1st, 2025 (New Year’s Day)

So…on to the next course in my AI Engineering pathway!

Now, I’m going to dive into the world of Deep Learning with Keras, an open-source library (for Python of course) that is growing in popularity for creating ANNs because of its effective and intuitive design.

To clarify, I am working with Keras’ Functional API, not its Sequential API, because of how the Functional API is more flexible, while also being intuitive ;)

And when I say “intuitive”, I mean it:

from tensorflow.keras.layers import Dropout, Dense, Input
from tensorflow.keras.models import Model

# Define the input layer
input_layer = Input(shape=(20,))

# Add a hidden layer
hidden_layer = Dense(64, activation='relu')(input_layer)

# Add a Dropout layer
dropout_layer = Dropout(rate=0.5)(hidden_layer)

# Add another hidden layer after Dropout
hidden_layer2 = Dense(64, activation='relu')(dropout_layer)

# Define the output layer
output_layer = Dense(1, activation='sigmoid')(hidden_layer2)

# Create the model
model = Model(inputs=input_layer, outputs=output_layer)

# Summary of the model
model.summary()

The syntax here, as you could guess, is that the layer in the parentheses at the end of each layer’s declaration is the layer that feeds into the layer being defined (except for the first one, ofc). Also, the layer type can be clearly seen as the first word in a layer’s definition, along with its necessary parameters; see the pseudocode below for a simpler example:

hidden_layer1 = layer_type(necessary_parameters)(input_layer)

hidden_layer2 = layer_type(necessary_parameters)(hidden_layer1)

hidden_layer3 = layer_type(necessary_parameters)(hidden_layer2)

Layers, layers, and layers!
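For comparison, here's a minimal sketch of roughly the same stack written with the Sequential API (not from the course, just to show the difference):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, Dropout

# The same dense/dropout architecture, but as one linear stack
seq_model = Sequential([
    Input(shape=(20,)),
    Dense(64, activation='relu'),
    Dropout(rate=0.5),
    Dense(64, activation='relu'),
    Dense(1, activation='sigmoid')
])

seq_model.summary()

This linear form works fine for a simple stack, but it can't express the branching, shared-layer, and multi-input architectures coming up next.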

And given its flexibility, concepts like shared layers and multiple inputs can be easily programmed in Keras:

Shared Layers

inputs = Input(shape=(28, 28, 1))

# One Dense layer instance, created once and reused (shared) on the same input twice
shared_dense = Dense(64, activation='relu')

processed_1 = shared_dense(inputs)
processed_2 = shared_dense(inputs)

model = Model(inputs=inputs, outputs=[processed_1, processed_2])

Multiple Inputs

from tensorflow.keras.layers import concatenate

inputA = Input(shape=(64,))
inputB = Input(shape=(128,))

# First branch, operating on the first input
x = Dense(16, activation='relu')(inputA)
x = Dense(4, activation='relu')(x)
x = Model(inputs=inputA, outputs=x)

# Second branch, operating on the second input
y = Dense(32, activation='relu')(inputB)
y = Dense(4, activation='relu')(y)
y = Model(inputs=inputB, outputs=y)

# Combine the outputs of the two branches
combined = concatenate([x.output, y.output])

# Apply a couple of layers on the combined output
z = Dense(2, activation='relu')(combined)
z = Dense(1, activation='linear')(z)

final_model = Model(inputs=[x.input, y.input], outputs=z)

Note that when a Model object is created in Keras, its inputs can be referred to in two ways: directly by their original input tensors (e.g., inputA), or through the Model object using the .input attribute (e.g. my_model.input).
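In other words, for the model above, these two lines would build the same thing (just a small illustration):

final_model = Model(inputs=[x.input, y.input], outputs=z)    # via the branch models' .input attribute
final_model = Model(inputs=[inputA, inputB], outputs=z)      # via the original input tensors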

A Recipe for the Best “Layer Cycle”!

Conv/Dense → BatchNormalization → Activation (already included in most layer calls) → Dropout → Next Layer
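Here's a minimal sketch of one pass through that cycle with the Functional API (the layer sizes and dropout rate are placeholders I picked; note that to place the activation after BatchNormalization, it's passed as a separate Activation layer rather than inside the Dense call):

from tensorflow.keras.layers import Input, Dense, BatchNormalization, Activation, Dropout
from tensorflow.keras.models import Model

inputs = Input(shape=(20,))

# Dense -> BatchNormalization -> Activation -> Dropout -> next layer
x = Dense(64)(inputs)              # no activation here; it comes after BatchNorm
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = Dropout(rate=0.3)(x)

outputs = Dense(1, activation='sigmoid')(x)
model = Model(inputs=inputs, outputs=outputs)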

Thursday, January 2nd

Today, I learned how to implement and program custom layers in Keras; custom layers are exactly what they sound like, and they’re powerful and flexible tools when it comes to optimizing a model’s performance, enhancing flexibility, improving readability, and exploring new ideas. Here’s an example of how a custom layer would be made with Keras:

import tensorflow as tf
from tensorflow.keras.layers import Layer

class CustomDenseLayer(Layer):
    def __init__(self, units=32):
        super(CustomDenseLayer, self).__init__()
        self.units = units

    def build(self, input_shape):
        # Create the weight matrix and bias vector once the input shape is known
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='zeros',
                                 trainable=True)

    def call(self, inputs):
        # Forward pass: matrix multiply, add bias, apply ReLU
        return tf.nn.relu(tf.matmul(inputs, self.w) + self.b)

super().__init__()

super() is used when a class inherits from another class (in this case, TensorFlow’s Layer class); it initializes the inherited attributes and methods. It’s in the code because TF’s Layer class handles the bulk of the functionality for a layer (which is what makes it so amazing), and if the super() initialization weren’t there, the custom layer wouldn’t inherit that essential setup.

The units parameter here defines how many neurons the layer is going to have (i.e. units = neurons), but the number of units doesn’t define the shape of the inputs the layer receives; that is dynamically determined in the build() method.

def build()

This method is called the first time the layer is used on an input tensor, and it’s where self.w (the weights) and self.b (the biases) are dynamically created based on the input_shape parameter it’s given: self.w is a weight matrix with dimensions (input_features, units), initialized randomly, and self.b is a bias vector with dimensions (units,), initialized to zeros.
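For example (made-up numbers, just to see build() in action), if the layer is created with units=32 and then called on a batch of 8 samples with 20 features each:

layer = CustomDenseLayer(units=32)

x = tf.random.normal((8, 20))    # batch of 8 samples, 20 features each
out = layer(x)                   # build() runs here: self.w gets shape (20, 32), self.b gets shape (32,)

print(out.shape)                 # (8, 32)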

def call()

This method defines the forward pass of the layer in the return statement: tf.matmul(inputs, self.w) performs matrix multiplication between the input and the weight matrix, the bias self.b is then added (note that the bias can also be negative), and the ReLU activation function is applied using tf.nn.relu. Essentially, in pseudocode, the return line would be:

return ReLU((inputs * weights) + bias)
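Putting it all together, the custom layer can be dropped into a model just like any built-in layer; here’s a minimal sketch (the layer sizes and input shape are placeholders for illustration):

from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(20,))

x = CustomDenseLayer(units=64)(inputs)         # our custom layer, used like any other layer
outputs = Dense(1, activation='sigmoid')(x)

model = Model(inputs=inputs, outputs=outputs)
model.summary()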

I also reviewed how to use TensorFlow, and dove into the details of how it can integrate with Keras; it turns out TensorFlow has already integrated Keras as its high-level API. I mean, just look at the import statements used for Keras, and notice how they have “tensorflow” as the root module!

from tensorflow.keras.models import Model

Also, TensorFlow’s environment is so diverse and big that it has multiple ecosystems:

  • TensorFlow Lite: TF for mobile and embedded devices.

  • TensorFlow.js: Exactly what it sounds like, it’s TF for JS users.

  • TensorFlow Extended (TFX): Deploys production machine learning pipelines, so this ecosystem is like the “enterprise” or “company” version of TF.

  • TensorFlow Hub: TF’s take on GitHub, but the repositories are specifically about reusable ML models and modules.

  • TensorBoard: A visualization toolkit for TF, so it can prove useful if you don’t like matplotlib.

Lessons Learned

I mainly learned about Keras’ Functional API, and how it’s structured versus Keras’ Sequential API.

And within Keras’ Functional API, I learned how to implement different types of layers, including custom layers!

Resources

Course I followed: