Summer Days and NumPy Arrays
Lessons from my Science Fair Project, Part 14
What was discussed last week…
Using test data is absolutely necessary for confirming a model’s performance: it’s like how a unit test is a more accurate assessment of a student’s knowledge than a bunch of different, easier classwork assignments that only focus on the current topic.
Patch sizes can make machine learning volatile, at least in my case when dealing with segmentation models.
Patch Sizes in 2D and 3D Data
As I discussed in my last post, patch sizes are a crucial part of the training process: they affect how long it takes to train a machine learning model, and they’re an important factor to consider when managing resources effectively. In short, a patch size is the amount of data that a machine learning model processes at once within a file.
Specifically for machine learning models that take in 3D files, I thought of another analogy this week for explaining patch sizes that I want to share with you all: patch sizes can be thought of as the cube watermelon slices you’d have during the summer!
The 3D file is the whole watermelon, and the patches are the little “cube subdivisions” that the model slices the file into, which determine how much data the model processes at one time during a training run on such 3D files.

The watermelon “cubes” signify the different “patches” of the whole watermelon
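
To make the analogy a little more concrete, here’s a rough NumPy sketch (toy numbers, not my actual training code) of what “cubing” a 3D volume into non-overlapping patches looks like:

```python
import numpy as np

# A toy 3D "watermelon": a 64 x 64 x 64 volume of random voxel values.
# (Hypothetical size; real 3D scan files are usually much larger.)
volume = np.random.rand(64, 64, 64).astype(np.float32)

patch_size = 32  # edge length of each cube "slice"

# Carve the volume into non-overlapping cubes, like cubing a watermelon.
patches = []
for x in range(0, volume.shape[0], patch_size):
    for y in range(0, volume.shape[1], patch_size):
        for z in range(0, volume.shape[2], patch_size):
            patches.append(volume[x:x + patch_size,
                                   y:y + patch_size,
                                   z:z + patch_size])

print(len(patches), patches[0].shape)  # 8 cubes, each of shape (32, 32, 32)
```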
Increasing a model’s patch size, if you have enough computer resources and storage to do so, generally improves the model’s accuracy.
However, this doesn’t mean that the highest patch size possible will always be beneficial for a machine learning model. As I trained some more machine learning models for my science fair, I learned that lesson the hard way.
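
Part of the reason “as big as possible” backfires is that the data in a single patch grows with the cube of its edge length. Here’s a quick back-of-the-envelope sketch (assuming float32 voxels and a single channel; the real memory footprint is far larger once the model’s intermediate activations are counted):

```python
# Rough math only: bytes needed just to hold one patch of raw voxels.
BYTES_PER_VOXEL = 4  # float32

for edge in (32, 64, 96, 128):
    voxels = edge ** 3
    megabytes = voxels * BYTES_PER_VOXEL / 1024 ** 2
    print(f"patch {edge}^3 -> {voxels:,} voxels, ~{megabytes:.2f} MB per patch")
```

Doubling the edge length from 64 to 128 multiplies the voxel count by eight, which is how a “slightly bigger” patch can suddenly exhaust the resources Colab gives you.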
I used Google Colab, Google’s PaaS (Platform as a Service) that lets ordinary people like me use Google’s computing power to train AI and machine learning models. When experimenting with larger and larger patch sizes, I kept getting errors when trying to train models with certain patch sizes: the “working” patch sizes tended to all be even numbers, but beyond that there was no obvious pattern, like being perfect cubes or multiples of 4.
These errors were caused by PyTorch’s Dynamo compiler, which complained that it was unable to process the file (I no longer have the specific error message), so patch sizes aren’t as simple as I expected: to this day, I still don’t know why certain patch sizes don’t work.
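
If you run into the same thing, the only approach I can really suggest is trial and error. Below is a minimal sketch of that idea using a tiny stand-in model (not my actual segmentation network); whether a given size fails depends entirely on your model, your PyTorch version, and your hardware:

```python
import torch
import torch.nn as nn

# A tiny stand-in 3D model, just small enough to compile quickly
# and demonstrate the trial-and-error idea.
model = nn.Sequential(
    nn.Conv3d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv3d(8, 1, kernel_size=3, padding=1),
)
compiled = torch.compile(model)  # this is where Dynamo gets involved

# Try a handful of candidate patch sizes and see which survive a forward pass.
for size in (30, 31, 32, 48, 50):
    try:
        dummy = torch.randn(1, 1, size, size, size)  # (batch, channel, D, H, W)
        compiled(dummy)
        print(f"patch size {size}: OK")
    except Exception as err:
        print(f"patch size {size}: failed ({type(err).__name__})")
```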
Lessons Learned
Even though a greater patch size for training a machine learning model is likely to improve its performance, certain patch sizes may simply not work due to errors from PyTorch’s Dynamo compiler.