OrcaDetector
A VGGish-based DNN trained on the Watkins Marine Mammal Sound Database, with transfer learning from Audioset, to detect multiple marine mammal species.
OrcaDetector Final Project Report
This UC Berkeley Master of Information in Data Science final course project was developed by Spyros Garyfallos, Ram Iyer, and Mike Winton for the W251 "Deep Learning in the Cloud and at the Edge" course (Summer 2019 term).
Abstract
This paper applies the previously published VGGish audio classification model to classify the species of marine mammals based on audio samples. We use a transfer learning approach, beginning with model weights that were pretrained on Google's published Audioset data. We then finish training with a strongly supervised dataset from the Watkins Marine Mammal Sound Database. We achieve an overall F1 score of 0.89 over 38 species, with 26 of the species achieving an F1 score >= 0.70. We then deploy the trained model to an NVIDIA Jetson TX2 edge computing device to perform inference locally, simulating a deployment connected to a hydrophone in the middle of the ocean without internet connectivity. Since we don't have access to our own hydrophone, for the purposes of simulation we connect to the live.orcasound.net live audio stream and perform inference on this stream. We also incorporate the ability for a person to "inject" an audio sample from a marine mammal species into the live audio stream, via our interactive notebook, to simulate an actual detection event.
Introduction
Marine Mammals
Marine mammals rely on the ocean and other marine ecosystems for their existence. They include animals such as seals, whales, manatees, sea otters and polar bears, and are unified by their reliance on the marine environment for feeding [Wikipedia].
Killer Whale

Source: Wikipedia
The Killer Whale, or Orca, is a toothed whale that is the largest member of the oceanic dolphin family. Some feed exclusively on fish, while others hunt marine mammals such as seals and other dolphins. They have even been known to attack whales. Killer whales are at the top of the food chain in the ocean, as no animal preys on them, and they can be found in all of the world's oceans, absent only from the Baltic and Black seas, as well as some areas of the Arctic Ocean [Wikipedia].
Killer Whale Types
Research off the west coast of Canada and the United States in the 1970s and 1980s identified the following three types [Wikia.org]:
- Resident: These are the most commonly sighted of the three populations in the coastal waters of the northeast Pacific.
- Transient: The diets of these whales consist almost exclusively of marine mammals.
- Offshore: A third population of killer whales in the northeast Pacific was discovered in 1988, when a humpback whale researcher observed them in open water.
Examples of their geographic ranges:

Source: Scientific Reports
Echolocation
Sound waves travel through water at a speed of about 1.5 km/sec (0.9 mi/sec), which is 4.5 times as fast as sound traveling through air. Marine mammals have developed adaptations to ensure effective communication, prey capture, and predator detection. Probably one of the most important adaptations is the development of echolocation in whales and dolphins [Wikipedia].
Source: Wikipedia
Killer whales are believed to use echolocation to navigate, communicate, and hunt in dark or murky waters [Seaworld.org].

Source: Wikipedia
Communication
Killer whales make a variety of different noises. They use whistles for close-range communication; the frequency of killer whale whistles ranges from about 0.5 to 40 kHz, with peak energy at 6 to 12 kHz. However, pulsed calls are their most common vocalization [Seaworld.org].

Source: Seaworld.org
The individuals of any particular pod share the same repertoire of calls, a vocalization system called a dialect, which is unique to that pod. Analysis of killer whale call patterns has demonstrated substantial differences between the dialects of different pods. Pods that associate with one another may share certain calls, but no two pods share the entire repertoire [Seaworld.org].
Audio samples
There are multiple samples online of different Killer Whale sounds. Here, we've assembled a few of them that we think are interesting:
Echolocation samples slowed down
In this sample, the audio is slowed down by a factor of 10 so that the clicks can be heard echoing back from underwater surfaces.
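One simple way to produce such a slowed-down clip is to keep the samples unchanged and lower the sample rate declared in the file header. The sketch below is only an illustration using Python's standard `wave` module; the function name `slow_down_wav` and the file paths are hypothetical, and this is not necessarily how the samples in this report were prepared. Note that this approach also lowers the pitch by the same factor.

```python
import wave


def slow_down_wav(src_path, dst_path, factor=10):
    """Write a copy of a WAV file that plays `factor` times slower.

    The audio samples are copied unchanged; only the sample rate stored in
    the header is divided by `factor`, so the pitch drops as well.
    Returns the new sample rate.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(params.nframes)

    # Hypothetical guard: never write a sample rate below 1 Hz.
    new_rate = max(1, params.framerate // factor)

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(params.nchannels)
        dst.setsampwidth(params.sampwidth)
        dst.setframerate(new_rate)
        dst.writeframes(frames)
    return new_rate
```

For example, `slow_down_wav("clicks.wav", "clicks_slow.wav")` (both file names are placeholders) would turn a 44.1 kHz recording into a file that plays back at 4.41 kHz, ten times slower.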
Underwater noise
Underwater noise from shipping, drilling, and other human activities is a significant concern in some Killer Whale habitats, including Johnstone Strait and Haro Strait. In the mid-1990s, loud underwater noises from salmon farms were used to deter seals. Killer whales also avoided the surrounding waters [Seattle Times]. High-intensity sonar used by the Navy also disturbs Killer Whales and other marine mammals [Scientific American]. Killer Whales are also extremely popular with whale watchers, whose ships may stress the whales and alter their behaviour if boats approach too closely or block their lines of travel.
OrcaSound Lab
Because it's within the summertime habitat of the endangered southern resident Killer Whales, Orcasound Lab is a good location for a hydrophone to listen for Killer Whales. On their live stream, one can also hear ships passing by. At other times of year, it's possible to hear Humpback Whales (fall) and Harbor Seals (summer). According to their site, hydrophones were first deployed in 2002 and are currently just beyond the kelp, about 30 m offshore at a depth of 8 m.
OrcaSound Lab Hydrophones
Here we see an image of one of their first-generation hydrophones from the Orcasound Lab website. It is connected to an array of hydrophones stretched for ~200 m along the shore at depths of 5-20 m.

Source: OrcaSound
And here's a view of their second-generation hydrophone, based on a Raspberry Pi 3 in a waterproof box. According to their website, it uses ffmpeg (as do we) and stores data on AWS.

Source: OrcaSound
Automating Detection on the Edge
In the research community there seems to be tremendous desire for an accurate, automated detection system that could process live streams from such hydrophones. Although attempts have been made to create one, today most detections still seem to be made by people, either experts or self-taught community enthusiasts, listening to live or recorded audio. Our project is developed in response to this need.
An accurate detection mechanism will allow marine scientists to collect more samples and better understand these species. An accurate real-time detector could also produce a timely warning to vessels near the detection, prompting them to reduce speed or pause extreme activities such as submarine military sonar exercises, in order to protect the marine mammals and offer them safe passage.
Additionally, because the upstream bandwidth and cost required to stream and record all audio from such hydrophones is prohibitive, there would be enormous value in an automated detector that could run at the edge and upload only the (relatively rare) positive samples.
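To make the idea of uploading only positive samples concrete, here is a minimal sketch of a filtering step that could sit between a classifier and an uploader. Everything in it is an assumption made for illustration: the function name `select_positive_segments`, the `threshold` value, the `"Noise"` background label, and the shape of the `predictions` input are hypothetical and are not taken from this project's code.

```python
def select_positive_segments(predictions, threshold=0.75, background_label="Noise"):
    """Yield (segment_id, label, score) for segments worth uploading.

    `predictions` is an iterable of (segment_id, scores) pairs, where
    `scores` maps each class label to a predicted probability.
    A segment is kept only if its top-scoring class is not the
    (hypothetical) background label and its score reaches `threshold`.
    """
    for segment_id, scores in predictions:
        # Pick the class with the highest predicted probability.
        label, score = max(scores.items(), key=lambda item: item[1])
        if label != background_label and score >= threshold:
            yield segment_id, label, score


# Usage with made-up scores for three audio segments.
example = [
    ("seg-001", {"Noise": 0.95, "KillerWhale": 0.05}),
    ("seg-002", {"Noise": 0.10, "KillerWhale": 0.90}),
    ("seg-003", {"Noise": 0.40, "KillerWhale": 0.60}),
]
to_upload = list(select_positive_segments(example))
```

With these made-up scores, only `seg-002` passes the filter, so a single segment would be uploaded instead of all three. In practice the threshold would trade missed detections against upload volume.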
Background for Audio Classification
Representing Audio Signals
Audio signals are typically represented by a 1D vector comprising a time series of signal amplitudes. The translation of the analog
