Programmer/Data Scientist/Instructor・Mostly write Python & R・Big fan of OpenCV
Published Dec 28, 2017
In this post I’ll be sharing a computer vision model that was trained to detect the character Fox in the game Super Smash Bros. Melee for the Nintendo GameCube.
Here’s a sneak peek at the output if you aren’t too interested in reading more about the process.
Because a keras-frcnn framework was already coded up, making this Fox model came down to three steps: (1) gathering/tagging training data, (2) training the model, & (3) testing the model.
This will be a relatively high-level overview. If you have any specific questions about the process, please leave a comment and I’ll do my best to answer.
Bounding boxes were annotated in 348 frames of gameplay for training. The bounding boxes were drawn using imglab.
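For readers who want to see how annotations like these could be fed to the trainer, here is a minimal sketch. It assumes the imglab XML layout (an `image` element with a `file` attribute, containing `box` elements with `top`, `left`, `width`, and `height` attributes) and the comma-separated `filepath,x1,y1,x2,y2,class_name` layout described in the keras-frcnn readme. The function name, the sample file path, and the `fox` label are hypothetical, and this is not necessarily how the conversion was done for this project.

```python
import xml.etree.ElementTree as ET


def imglab_to_simple_rows(xml_text, default_label="fox"):
    """Convert imglab-style XML into 'filepath,x1,y1,x2,y2,class_name' rows.

    Assumption: boxes are stored as top/left/width/height attributes.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for image in root.iter("image"):
        path = image.get("file")
        for box in image.iter("box"):
            x1 = int(box.get("left"))
            y1 = int(box.get("top"))
            x2 = x1 + int(box.get("width"))
            y2 = y1 + int(box.get("height"))
            # Use the box's <label> if present, else fall back to the default.
            label_node = box.find("label")
            has_label = label_node is not None and label_node.text
            label = label_node.text.strip() if has_label else default_label
            rows.append(f"{path},{x1},{y1},{x2},{y2},{label}")
    return rows


# Hypothetical one-frame annotation file, inlined so the sketch is self-contained.
sample_xml = """<dataset>
  <images>
    <image file='frames/frame_0001.png'>
      <box top='120' left='200' width='60' height='80'>
        <label>fox</label>
      </box>
    </image>
  </images>
</dataset>"""

rows = imglab_to_simple_rows(sample_xml)
print("\n".join(rows))
```

Writing the resulting rows to a text file, one per line, would give a single annotation file covering all tagged frames.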
Thanks to the effort put in by the keras-frcnn authors and Adrian Rosebrock of PyImageSearch, the training step was relatively simple. I used the PyImageSearch pre-configured Amazon AMI as the environment for model training. Once the server was spun up, I followed the instructions listed in the readme of the keras-frcnn repo. Since I only had 348 training images, I supplemented the training data with horizontal flips, vertical flips, and 90-degree rotations. After a couple of days, I stopped the training process to try out the model.
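The key detail with these augmentations is that each bounding box has to be transformed along with its image. The NumPy sketch below illustrates that idea only; it is not the keras-frcnn augmentation code. The function names, the toy 4x6 frame, and the box values are made up for illustration, and boxes are assumed to be `(x1, y1, x2, y2)` pixel extents with the right and bottom edges exclusive.

```python
import numpy as np


def hflip(img, box):
    """Mirror left-to-right; x coordinates are reflected across the width."""
    w = img.shape[1]
    x1, y1, x2, y2 = box
    return img[:, ::-1], (w - x2, y1, w - x1, y2)


def vflip(img, box):
    """Mirror top-to-bottom; y coordinates are reflected across the height."""
    h = img.shape[0]
    x1, y1, x2, y2 = box
    return img[::-1, :], (x1, h - y2, x2, h - y1)


def rot90_ccw(img, box):
    """Rotate 90 degrees counterclockwise (np.rot90's default direction)."""
    w = img.shape[1]
    x1, y1, x2, y2 = box
    # A pixel at (x, y) moves to (y, w - 1 - x) in the rotated image.
    return np.rot90(img), (y1, w - x2, y2, w - x1)


# Toy frame: 4 rows by 6 columns, with a 2x2 "Fox" patch marked with ones.
frame = np.zeros((4, 6), dtype=np.uint8)
fox_box = (1, 0, 3, 2)
frame[0:2, 1:3] = 1

for transform in (hflip, vflip, rot90_ccw):
    new_frame, new_box = transform(frame, fox_box)
    print(transform.__name__, new_frame.shape, new_box)
```

Each transform returns the new image together with the new box, so the two never drift out of sync.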
Again, thanks to standing on the shoulders of giants, testing the model was a relatively simple task. To test, I followed the instructions in the readme of the keras-frcnn repo, and I was pleasantly surprised by the results. The model was correctly finding and annotating Fox in the majority of the test images! However, I did not tag the testing images, so there is no quantitative metric for the model’s accuracy.
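If the test frames were tagged, one common way to put a number on detections like these is intersection over union (IoU) between each predicted box and its hand-drawn box. The sketch below is a generic illustration of that metric, not something that was run for this project; the function name and the example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Made-up example: a predicted box shifted halfway off a tagged box.
tagged = (0, 0, 10, 10)
predicted = (5, 0, 15, 10)
print(round(iou(tagged, predicted), 3))
```

A common convention is to count a detection as correct when its IoU with the tagged box is at least 0.5, which would turn "a majority of the test images" into a measurable rate.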
The .gif at the top of this post shows a combination of man.
I forgot to link to the GitHub repo for this project. The repo has all the code, links to training data, & usage information.