Learning Computer Vision Week 6

Week 6 of documenting my AI/ML learning journey (Oct 13 - Oct 19)

What was discussed last week…

  • Strategies CV models use to conduct object detection

  • The reliability of Haar-Cascade Classifiers

  • The trend that almost any difficult math concept ends up being used in CS “under the hood”

This week’s newsletter is a short one, sorry guys. 😓

Tuesday, October 15

For the last part of my Computer Vision learning, I experimented with Faster R-CNNs, in which the “R” stands for “region-based”. This type of CNN reduces the running time of object detection networks by pairing an RPN (region proposal network) with the detection CNN so that both share the same convolutional features, along with other components. GeeksForGeeks provides a great explanation of how Faster R-CNNs work, and here is the original paper that proposed the model.

To build a Faster R-CNN, the IBM lab I was in used torchvision, a library that is part of the PyTorch project.

model_ = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

This function builds a Faster R-CNN with a ResNet-50-FPN backbone, here loaded with weights pretrained on the COCO dataset. It has a few parameters, but the most notable ones are:

  • pretrained bool (as shown): whether to load the pretrained version of torchvision’s Faster R-CNN (which was trained on COCO). Newer torchvision releases (0.13 and later) deprecate this flag in favor of weights.

  • weights FasterRCNN_ResNet50_FPN_Weights: the function uses no pre-trained weights by default (weights=None), but torchvision ships a FasterRCNN_ResNet50_FPN_Weights enum whose values you can pass in if you want a set of pre-trained weights.

  • num_classes int: the number of output classes of the model, counting the background as one class.

Lessons Learned

  • I learned about the COCO dataset and about the fasterrcnn_resnet50_fpn() function, whose pretrained weights come from it

Resources

Course I followed: