TheresWaldo
There's Waldo contains a fully convolutional DenseNet approach to solving "Where's Waldo".
There's Waldo!
This repository contains an implementation of the fully convolutional Tiramisu network for semantic segmentation, applied to the task of playing Where's Waldo?.
Where’s Waldo (or Wally) is a series of children’s books that challenges the reader with finding the eponymous character and his compadres in densely illustrated images.
<figure> <img src="waldo_split.png" alt=".." /> <figcaption>Actual predicted segmentation indicated by transparency mask</figcaption> </figure>
Please see my recent blog post Where's Waldo: Terminator Edition for an overview of my approach.
I've also provided a complete end-to-end walk-through of my There's Waldo project in this notebook.
You can find all training and testing images and their corresponding predictions here.
Set Up
I've provided the original images and the bounding-box XML files needed to recreate my process.
First, unpack them:
$ unzip imgs_boxes.zip
To create the binary labels:
$ python make_targets.py
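As a rough illustration of what turning bounding boxes into binary labels involves, here is a minimal sketch. It assumes the XML files follow a Pascal VOC style layout (`size`, `object`, `bndbox`), which may not match the repository's files, and `boxes_to_mask` is a hypothetical helper rather than code from make_targets.py. Nested lists keep the sketch dependency-free; a real script would more likely use an array library.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC style; the repository's XML layout may differ.
EXAMPLE_XML = """
<annotation>
  <size><width>8</width><height>6</height></size>
  <object>
    <name>waldo</name>
    <bndbox><xmin>2</xmin><ymin>1</ymin><xmax>5</xmax><ymax>4</ymax></bndbox>
  </object>
</annotation>
"""

def boxes_to_mask(xml_text):
    """Return a height x width grid of 0/1 labels, with 1 inside every box."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    mask = [[0] * width for _ in range(height)]
    for box in root.iter("bndbox"):
        xmin, ymin, xmax, ymax = (
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # Treat xmax/ymax as exclusive and clamp the box to the image bounds.
        for y in range(max(ymin, 0), min(ymax, height)):
            for x in range(max(xmin, 0), min(xmax, width)):
                mask[y][x] = 1
    return mask

mask = boxes_to_mask(EXAMPLE_XML)
```

Every pixel inside a Waldo box becomes 1 and everything else stays 0, which is the per-pixel target a segmentation network is trained against.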
The next step is to create the Waldo sub-images.
$ python create_subimages.py
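One plausible way to cut a Waldo sub-image is to take a fixed-size window around each bounding box. The sketch below only computes the crop coordinates; `crop_window` and the 224-pixel window size are assumptions made for illustration and are not taken from create_subimages.py. It also assumes the box is smaller than the window.

```python
def crop_window(box, image_size, crop_size=224):
    """Return (left, top, right, bottom) of a crop_size square around the box.

    box is (xmin, ymin, xmax, ymax); image_size is (width, height).
    crop_size=224 is an assumed value, not taken from create_subimages.py.
    """
    xmin, ymin, xmax, ymax = box
    width, height = image_size
    if crop_size > width or crop_size > height:
        raise ValueError("crop is larger than the image")
    # Centre the window on the box, then clamp so it stays inside the image.
    cx, cy = (xmin + xmax) // 2, (ymin + ymax) // 2
    left = min(max(cx - crop_size // 2, 0), width - crop_size)
    top = min(max(cy - crop_size // 2, 0), height - crop_size)
    return left, top, left + crop_size, top + crop_size

# A box near the top edge and one near the bottom-right corner of a 1000 x 800 image.
near_top = crop_window((100, 50, 160, 130), (1000, 800))
near_corner = crop_window((950, 760, 990, 795), (1000, 800))
```

The returned tuple uses the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so the same coordinates could be applied to both the image and its binary label.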
Training
I didn't put together a script for training the model; training neural networks isn't straightforward and relies a lot on human judgement. If you'd like to train your own model, please take a look at the accompanying notebook.
Predicting
To make predictions on new images:
$ python predict.py image1.png image2.png --output_path /output/path
By default, this loads the weights from my model, makes predictions, and saves them as transparency masks overlaid on the original images. They'll be saved as output_{i}.png in the specified folder.
If you want to use your own model weights, specify them with the --model flag.
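The transparency-mask output described above can be sketched as a mapping from per-pixel Waldo probabilities to an alpha channel. This is an illustrative guess at the idea, not the code in predict.py: `overlay_alpha`, `apply_alpha`, the 0.5 threshold, and the dimming value of 64 are all assumptions.

```python
def overlay_alpha(probabilities, threshold=0.5, dim_alpha=64):
    """Map per-pixel Waldo probabilities to an 8-bit alpha channel.

    Pixels at or above the threshold stay fully opaque (255); the rest are
    dimmed to dim_alpha. Both defaults are illustrative assumptions.
    """
    return [
        [255 if p >= threshold else dim_alpha for p in row]
        for row in probabilities
    ]

def apply_alpha(rgb_rows, alpha_rows):
    """Combine RGB pixels with the alpha channel into RGBA tuples."""
    return [
        [(r, g, b, a) for (r, g, b), a in zip(rgb_row, alpha_row)]
        for rgb_row, alpha_row in zip(rgb_rows, alpha_rows)
    ]

# Tiny 2 x 2 example: hypothetical network output and matching image pixels.
probs = [[0.9, 0.1], [0.5, 0.0]]
pixels = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (10, 20, 30)]]
alpha = overlay_alpha(probs)
rgba = apply_alpha(pixels, alpha)
```

Regions the model considers unlikely to contain Waldo fade out, while likely regions keep their full opacity, so the prediction reads directly off the original illustration.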
Acknowledgments
This project is inspired by material studied while interning with Jeremy Howard and Rachel Thomas' fast.ai, in particular Lesson 14 of their course Cutting Edge Deep Learning for Coders, taught at USF’s Data Institute.
