[P] Playing RTS games with audio recognition instead of using hands for input

Written by torontoai on July 17, 2019. Posted in Reddit MachineLearning.

So about two years ago I started getting shoulder aches, but I still wanted to play RTS games. That’s when I started working on a project to allow me to play certain games without using my hands.

At first it used 100ms audio clips with a slow 80ms delay afterwards to respond to inputs. Right now I’ve brought it down to 50ms audio with a response time of 10ms.

I’m also using an eye tracker to move the mouse around, so that it’s completely hands-free.

A demo of me using the program to play StarCraft 2 can be found here, with all the controls explained in the video:

https://www.youtube.com/watch?v=lhQzrZ3PrtU

The project has the recording tools needed for data collection, using a sliding window over the microphone input to generate 50ms audio files every 25ms.
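As a rough illustration of that sliding window, here is a minimal sketch that cuts a buffer of samples into 50ms windows starting every 25ms. The function name `sliding_windows` and the 16 kHz sample rate are assumptions for the example, not taken from the project, which works on live microphone input and writes audio files.

```python
def sliding_windows(samples, sample_rate, window_ms=50, hop_ms=25):
    """Yield (start_index, window) pairs over a sequence of audio samples.

    Each window covers window_ms of audio and a new one starts every hop_ms,
    so consecutive windows overlap by half with the defaults.
    """
    window_len = sample_rate * window_ms // 1000
    hop_len = sample_rate * hop_ms // 1000
    for start in range(0, len(samples) - window_len + 1, hop_len):
        yield start, samples[start:start + window_len]


# One second of silence at an assumed 16 kHz sample rate.
sample_rate = 16000
audio = [0] * sample_rate
windows = list(sliding_windows(audio, sample_rate))
print(len(windows), "windows of", len(windows[0][1]), "samples each")
```

With these numbers each window holds 800 samples and a new one starts every 400 samples, so every sample after the first 25ms appears in two windows.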

I added some simple thresholding filters so that I can more easily get the right audio samples when I am recording them (sibilants can get by with just a pitch threshold, while others like finger snaps work best with high peak-to-peak thresholds).
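A threshold filter of this kind could look roughly like the sketch below. The peak-to-peak measure follows the description above. The zero-crossing rate is only a crude stand-in for a pitch measure, chosen here to keep the example dependency-free, and it is not necessarily what the project uses. All function and parameter names are hypothetical.

```python
def peak_to_peak(window):
    """Difference between the loudest positive and negative sample."""
    return max(window) - min(window)


def zero_crossing_rate(window):
    """Fraction of adjacent sample pairs that change sign.

    High values suggest high-frequency content such as sibilants.
    This is a rough proxy for pitch, used here as an assumption.
    """
    crossings = sum(1 for a, b in zip(window, window[1:]) if (a < 0) != (b < 0))
    return crossings / (len(window) - 1)


def keep_sample(window, min_peak_to_peak=0.0, min_zero_crossing_rate=0.0):
    """Return True when the window passes both thresholds."""
    return (peak_to_peak(window) >= min_peak_to_peak
            and zero_crossing_rate(window) >= min_zero_crossing_rate)


loud_hiss = [0.5, -0.5] * 4   # strong, rapidly alternating signal
near_silence = [0.01] * 8     # quiet, constant signal

print(keep_sample(loud_hiss, min_peak_to_peak=0.8))
print(keep_sample(near_silence, min_peak_to_peak=0.8))
```

Setting only one of the two thresholds mirrors the idea that some sounds need just a pitch-style check while others need just an amplitude check.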

I’m using an ensemble of four-layer neural nets to do the recognition part, and I do some post-processing to make sure keyboard inputs are done at the proper times with as few misclicks as possible.
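To make the ensemble and post-processing idea concrete, here is a minimal sketch assuming each model returns a dictionary of per-label probabilities. Averaging the outputs is one common way to combine an ensemble and may differ from what the project does. The names `ensemble_average` and `pick_action`, the labels, and the threshold values are all hypothetical.

```python
def ensemble_average(predictions):
    """Average per-label probabilities from several models."""
    labels = predictions[0].keys()
    return {
        label: sum(p[label] for p in predictions) / len(predictions)
        for label in labels
    }


def pick_action(probabilities, thresholds, default_threshold=0.9):
    """Return the most likely label if it clears its threshold, else None.

    Per-label thresholds make it possible to tone down a trigger-happy
    key without affecting the others.
    """
    label = max(probabilities, key=probabilities.get)
    if probabilities[label] >= thresholds.get(label, default_threshold):
        return label
    return None


# Outputs of two hypothetical models for the same 50ms window.
model_outputs = [
    {"snap": 0.75, "silence": 0.25},
    {"snap": 0.5, "silence": 0.5},
]
averaged = ensemble_average(model_outputs)
print(pick_action(averaged, {"snap": 0.6}))
print(pick_action(averaged, {"snap": 0.7}))
```

Returning `None` when no label clears its threshold means no key is pressed for that window, which is the simplest guard against misclicks.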

I validate out-of-sample performance by recording some more sounds and analysing the outputs of the model in a few graphs (https://github.com/chaosparrot/parrot.py/blob/master/docs/ANALYSING.md).

I tweak the post-processing after playing a match in a game. I analyse the model output of the match using the CSV output of the recognitions, and alter the thresholds for input activation based on my experience during it (maybe I felt the SHIFT key was pressed too late, or another key was way too trigger-happy).
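This kind of after-match tuning can be sketched as counting how often a label would have fired under different thresholds. The CSV layout below (a `time` column plus one probability column per label) and its values are made up for illustration, since the real format of the recognition output is not described here. The function name `count_activations` is hypothetical.

```python
import csv
import io


def count_activations(csv_text, label, threshold):
    """Count rows where the given label's probability reaches the threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(1 for row in reader if float(row[label]) >= threshold)


# Made-up example log, one row per 25ms step.
match_log = """time,shift,silence
0.000,0.40,0.60
0.025,0.72,0.28
0.050,0.91,0.09
0.075,0.88,0.12
"""

for threshold in (0.7, 0.8, 0.9):
    print(threshold, count_activations(match_log, "shift", threshold))
```

A key that felt late in the match would suggest lowering its threshold, while a trigger-happy key would suggest raising it, and counts like these show how much each change would alter the number of activations.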

The program is multithreaded to ensure that I don’t lose audio recordings during the feature-engineering/evaluation phase.
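One common way to get this behaviour is a producer-consumer setup, where one thread only records and hands windows to a queue while another thread does the slower processing. The sketch below assumes that design; the function names are hypothetical, the "recording" is a fixed list, and the "recognition" is a simple peak-to-peak calculation standing in for the real model.

```python
import queue
import threading


def recorder(audio_queue, windows):
    """Producer: push each recorded window onto the queue without waiting."""
    for window in windows:
        audio_queue.put(window)
    audio_queue.put(None)  # sentinel that tells the consumer to stop


def recogniser(audio_queue, results):
    """Consumer: take windows off the queue and process them in order."""
    while True:
        window = audio_queue.get()
        if window is None:
            break
        # Placeholder for feature engineering and model evaluation.
        results.append(max(window) - min(window))


audio_queue = queue.Queue()
results = []
fake_windows = [[0, 1, -1], [0, 3, -2], [0, 0, 0]]

consumer = threading.Thread(target=recogniser, args=(audio_queue, results))
producer = threading.Thread(target=recorder, args=(audio_queue, fake_windows))
consumer.start()
producer.start()
producer.join()
consumer.join()
print(results)
```

Because the queue buffers windows, a slow evaluation step delays the results but does not cause the recording thread to drop audio.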

A GitHub repository with all the code can be found here: https://github.com/chaosparrot/parrot.py

As for the future, I think I want to make it record 30ms sounds read at 60 Hz, and maybe fool around with some CNNs to see if they improve the recognition.

Considering I also control the data collection, I can just add a few thousand more samples of certain sounds, so I might try training with 5000 samples per label instead of 1500.

submitted by /u/chaosparrot