Detecting Gunshots at Different Distances in the African Savannah with Machine Learning Approaches
Rhino poaching has been a long-standing problem for conservationists in Africa. In 2019, 327 rhinos were killed by poachers in Kruger National Park in South Africa, and a total of 594 poaching incidents took place in South Africa. Gunshot detection may help rhinos survive. Because most poaching happens late at night, it can be hard for rangers to locate poachers even if they hear the gunshot, and by the time rangers find the dead rhino in the morning, it is too late to catch the poachers. I therefore worked with Professor Stephen Tarzia, Northwestern Engineering senior Saif Bhatti, and other students to detect gunshots at different distances using real recordings from Africa.
The gunshot recordings we used consist of two parts. The first part is 6 gunshot recordings made in African nature reserves. These were obtained by a collaborator who travelled to Africa in December 2019 with a ZOOM H4N handheld recorder; each recording contains 5 gunshots. The second part is 114 clean gunshot recordings from the Gunshot Audio Forensics Dataset. Unlike the African recordings, however, the dataset recordings do not contain any noise, and the types of guns and bullets used in the two sets differ. To make our analysis more precise, we also added branch-breaking noise recorded with the Voice Memos app on an iPhone 8. Other noise includes natural recordings, such as thunder and rain, from national parks in the US.
Each gunshot is separated into 1.8-second segments. A band-reject filter from 3000 Hz to 5000 Hz is then applied to the segments to remove noise in that band. A 300 Hz high-pass filter is also applied to reduce low-frequency noise.
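The preprocessing described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the 44.1 kHz sample rate, the 4th-order Butterworth design, the zero-phase filtering, and the helper names split_into_segments and filter_segment are all assumptions that do not come from the project. Only the 1.8-second segment length, the 3000 Hz to 5000 Hz band-reject range, and the 300 Hz high-pass cutoff are taken from the text.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

SAMPLE_RATE = 44_100     # Hz; assumed, the text does not state the sample rate
SEGMENT_SECONDS = 1.8    # segment length given in the text


def split_into_segments(audio, sample_rate=SAMPLE_RATE, seconds=SEGMENT_SECONDS):
    """Cut a mono signal into consecutive fixed-length segments.

    Samples left over at the end (less than one full segment) are dropped.
    """
    seg_len = int(round(seconds * sample_rate))
    n_segments = len(audio) // seg_len
    return audio[: n_segments * seg_len].reshape(n_segments, seg_len)


def filter_segment(segment, sample_rate=SAMPLE_RATE, order=4):
    """Apply a 3000-5000 Hz band-reject filter, then a 300 Hz high-pass filter."""
    # Butterworth filters of this order are an assumption made for the sketch.
    band_reject = butter(order, [3000, 5000], btype="bandstop",
                         fs=sample_rate, output="sos")
    high_pass = butter(order, 300, btype="highpass",
                       fs=sample_rate, output="sos")
    # sosfiltfilt runs each filter forward and backward, so no phase shift is added.
    rejected = sosfiltfilt(band_reject, segment)
    return sosfiltfilt(high_pass, rejected)
```

With these helpers, a loaded recording would first pass through split_into_segments, and each row of the result would then go through filter_segment before feature extraction.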
To identify the gunshots, we compared the performance of four models: k-nearest neighbors, support vector machine, long short-term memory (LSTM), and AdaCost. We evaluated the models using confusion matrices, ROC curves, and leave-one-out results. The results show that the LSTM model has a strong tendency to overfit because of the small size of the dataset. Among the other three models, AdaCost has the highest leave-one-out accuracy and the best specificity, sensitivity, and AUC.
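As a rough illustration of this kind of evaluation, the sketch below runs leave-one-out cross-validation and derives accuracy, sensitivity, specificity, and AUC from the pooled held-out scores. Several things here are assumptions rather than the project's setup: the features are synthetic toy data, the hyperparameters are arbitrary, and scikit-learn's AdaBoostClassifier is used only as a stand-in for the boosting model because scikit-learn does not provide a cost-sensitive variant. The LSTM is left out. The names make_toy_features, evaluate_leave_one_out, and MODELS are hypothetical.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_toy_features(n_per_class=30, n_features=8, seed=0):
    """Synthetic stand-in for audio features: label 0 = noise, 1 = gunshot."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, 1.0, size=(n_per_class, n_features))
    shots = rng.normal(2.0, 1.0, size=(n_per_class, n_features))
    X = np.vstack([noise, shots])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y


# Each entry pairs a model with the method used to obtain a continuous score.
MODELS = {
    "k-NN": (make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
             "predict_proba"),
    "SVM": (make_pipeline(StandardScaler(), SVC(kernel="rbf")),
            "decision_function"),
    # Plain AdaBoost, used here only as a stand-in for the boosting model.
    "AdaBoost (stand-in)": (AdaBoostClassifier(n_estimators=50, random_state=0),
                            "predict_proba"),
}


def evaluate_leave_one_out(model, X, y, score_method):
    """Score every sample with a model trained on all the other samples."""
    raw = cross_val_predict(model, X, y, cv=LeaveOneOut(), method=score_method)
    scores = raw[:, 1] if raw.ndim == 2 else raw
    # Probabilities are thresholded at 0.5, SVM margins at 0.
    threshold = 0.5 if score_method == "predict_proba" else 0.0
    predictions = (scores >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y, predictions, labels=[0, 1]).ravel()
    return {
        "accuracy": (tp + tn) / len(y),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "auc": roc_auc_score(y, scores),
    }


if __name__ == "__main__":
    X, y = make_toy_features()
    for name, (model, method) in MODELS.items():
        print(name, evaluate_leave_one_out(model, X, y, method))
```

Leave-one-out is a natural fit for a dataset this small because every recording serves as a test case exactly once, at the cost of one model fit per sample.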
In conclusion, we find that AdaCost is the best model for our current dataset. Because of the limited size of the dataset, most deep learning models are not applicable at this stage. If more recordings become available in the future, more deep learning models could be tested. If the distances of these gunshots are provided, we could also use a regression model instead of classification models and might find more features of gunshots at different distances.