
OrcaDetector

A VGGish-based DNN trained on the Watkins Marine Mammal Sound Database, with transfer learning from Audioset, to detect multiple marine mammal species.


OrcaDetector Final Project Report

This UC Berkeley Master of Information and Data Science final course project was developed by Spyros Garyfallos, Ram Iyer, and Mike Winton for the W251 "Deep Learning in the Cloud and at the Edge" course (Summer 2019 term).

Abstract

This paper applies the previously published VGGish audio classification model to classify the species of marine mammals based on audio samples. We use a transfer learning approach, beginning with model weights that were pretrained on Google's published Audioset data. We then finish training with a strongly supervised dataset from the Watkins Marine Mammal Sound Database. We achieve an overall F1 score of 0.89 over 38 species, with 26 of the species achieving an F1 score >= 0.70. We then deploy the trained model to an NVIDIA Jetson TX2 edge computing device to perform inference locally, simulating a deployment connected to a hydrophone in the middle of the ocean without internet connectivity. Since we don't have access to our own hydrophone, for the purposes of simulation we connect to the live.orcasound.net live audio stream and perform inference on this stream. We also incorporate the ability for a person to "inject" an audio sample from a marine mammal species into the live audio stream via our interactive notebook, to simulate an actual detection event.

Introduction

Marine Mammals

Marine mammals rely on the ocean and other marine ecosystems for their existence. They include animals such as seals, whales, manatees, sea otters and polar bears, and are unified by their reliance on the marine environment for feeding [Wikipedia].

Killer Whale

Killer whales jumping

Source: Wikipedia

The Killer Whale, or Orca, is a toothed whale that is the largest member of the oceanic dolphin family. Some feed exclusively on fish, while others hunt marine mammals such as seals and other dolphins. They have even been known to attack whales. Killer whales are at the top of the food chain in the ocean, as no animal preys on them, and they can be found in all of the world's oceans, absent only from the Baltic and Black seas, as well as some areas of the Arctic Ocean [Wikipedia].

Killer Whale Types

Research off the west coast of Canada and the United States in the 1970s and 1980s identified the following three types [Wikia.org]:

  1. Resident: These are the most commonly sighted of the three populations in the coastal waters of the northeast Pacific.

  2. Transient: The diets of these whales consist almost exclusively of marine mammals.

  3. Offshore: A third population of killer whales in the northeast Pacific was discovered in 1988, when a humpback whale researcher observed them in open water.

Examples of their geographic ranges:

Toothed Whale Sound Production

Source: Scientific Reports

Echolocation

Sound waves travel through water at a speed of about 1.5 km/sec (0.9 mi/sec), which is 4.5 times as fast as sound traveling through air. Marine mammals have developed adaptations to ensure effective communication, prey capture, and predator detection. Probably one of the most important adaptations is the development of echolocation in whales and dolphins [Wikipedia].

Marine Mammals Echolocation

Source: Wikipedia

Killer whales are believed to rely on echolocation to navigate, communicate, and hunt in dark or murky waters [Seaworld.org].

Toothed Whale Sound Production

Source: Wikipedia

Communication

Killer whales make a variety of different noises. They use whistles for close-range communication; the frequency of killer whale whistles ranges from about 0.5 to 40 kHz, with peak energy at 6 to 12 kHz. However, pulsed calls are their most common vocalization [Seaworld.org].

Spectrograms of three characteristic killer whale sounds

Source: Seaworld.org

The individuals of any particular pod share the same repertoire of calls, a vocalization system called a dialect, which is unique to that pod. Analysis of killer whale call patterns has demonstrated substantial differences between the dialects of different pods. Pods that associate with one another may share certain calls, but no two pods share the entire repertoire [Seaworld.org].

Audio samples

There are multiple samples online of different Killer Whale sounds. Here, we've assembled a few of them that we think are interesting:

  1. Resident calls

  2. Transient whistles

  3. Transient calls

  4. Transient echolocation

Echolocation samples slowed down

Here, slowing down the audio by a factor of 10 lets us hear the clicks echoing off underwater surfaces:

  1. South Residents Killer Whale clicks normal speed

  2. South Residents Killer Whale clicks slowed down 10x
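One simple way to slow a recording down is to keep the samples unchanged and declare a lower sample rate in the file header. The sketch below illustrates that idea using only the Python standard library; it is not necessarily how the samples above were produced, and the helper names (`make_test_tone`, `slow_down_wav`, `duration_seconds`) are hypothetical. Note that this method also lowers every frequency by the same factor, so a 40 kHz component would be heard at 4 kHz after a 10x slowdown.

```python
import io
import math
import struct
import wave


def make_test_tone(rate_hz: int = 44100, seconds: float = 0.5,
                   freq_hz: float = 4000.0) -> bytes:
    """Build a mono 16-bit WAV in memory (a stand-in for a real recording)."""
    n_frames = int(rate_hz * seconds)
    samples = [
        int(20000 * math.sin(2 * math.pi * freq_hz * i / rate_hz))
        for i in range(n_frames)
    ]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        w.setframerate(rate_hz)
        w.writeframes(struct.pack("<%dh" % n_frames, *samples))
    return buf.getvalue()


def slow_down_wav(src_bytes: bytes, factor: int = 10) -> bytes:
    """Return WAV bytes that play `factor` times slower.

    The audio frames are copied untouched; only the declared frame rate
    is divided by `factor`, which stretches duration and lowers pitch.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    with wave.open(io.BytesIO(src_bytes), "rb") as src:
        params = src.getparams()
        frames = src.readframes(params.nframes)
    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setnchannels(params.nchannels)
        dst.setsampwidth(params.sampwidth)
        dst.setframerate(params.framerate // factor)
        dst.writeframes(frames)
    return out.getvalue()


def duration_seconds(wav_bytes: bytes) -> float:
    """Playback duration implied by the WAV header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.getnframes() / w.getframerate()


if __name__ == "__main__":
    original = make_test_tone()
    slowed = slow_down_wav(original, factor=10)
    print(duration_seconds(original), duration_seconds(slowed))
```

Resampling or time-stretching tools can slow audio without shifting pitch, but for echolocation clicks the pitch shift is useful because it moves high-frequency content into a more comfortable listening range.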

Underwater noise

Underwater noise from shipping, drilling, and other human activities is a significant concern in some Killer Whale habitats, including Johnstone Strait and Haro Strait. In the mid-1990s, loud underwater noises from salmon farms were used to deter seals. Killer whales also avoided the surrounding waters [Seattle Times]. High-intensity sonar used by the Navy also disturbs Killer Whales and other marine mammals [Scientific American]. Killer Whales are also extremely popular with whale watchers, whose ships may stress the whales and alter their behaviour if boats approach too closely or block their lines of travel.

OrcaSound Lab

Because it's within the summertime habitat of the endangered southern resident Killer Whales, Orcasound Lab is a good location for a hydrophone to listen for Killer Whales. On their live stream, one can also hear ships passing by. At other times of year, it's possible to hear Humpback Whales (fall) and Harbor Seals (summer). According to their site, hydrophones were first deployed in 2002 and are currently located just beyond the kelp, about 30 m offshore at a depth of 8 m.

OrcaSound Lab Hydrophones

Here we see an image of one of their first-generation hydrophones from the Orcasound Lab website. It is connected to an array of hydrophones stretched for ~200 m along the shore at depths of 5-20 m.

Orcasound 1.0 Hydrophone

Source: OrcaSound

And here's a view of their second-generation hydrophone, based on a Raspberry Pi 3 in a waterproof box. According to their website, it uses ffmpeg (as do we) and stores data on AWS.

Orcasound 2.0 Hydrophone

Source: OrcaSound

Automating Detection on the Edge

In the research community there seems to be tremendous desire for an accurate, automated detection system that could process live streams from such hydrophones. Although attempts have been made to create one, today most detections still seem to be made by people, either experts or self-taught community enthusiasts, listening to live or recorded audio. Our project was developed in response to this need.

An accurate detection mechanism would allow marine scientists to collect more samples and better understand these species. An accurate real-time detector could also issue a timely warning to vessels near the detection, asking them to reduce speed or pause extreme activities such as submarine military sonar exercises, in order to protect the marine mammals and offer them safe passage.

Additionally, because the upstream bandwidth and cost required to stream and record all audio from such hydrophones are prohibitive, there would be enormous value in an automated detector that could run at the edge and upload only the (relatively rare) positive samples.
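The "upload only positives" idea can be sketched as a simple filter over per-window predictions. This is an illustrative sketch only, not the project's actual code: the `Detection` record, the `select_for_upload` and `upload_fraction` helpers, the 0.75 threshold, and the `"Noise"` background label are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Detection:
    """One classified audio window (hypothetical record layout)."""
    start_s: float     # offset of the window within the stream, in seconds
    label: str         # predicted class name
    confidence: float  # model score in [0, 1]


def select_for_upload(detections: List[Detection],
                      threshold: float = 0.75,
                      background_label: str = "Noise") -> List[Detection]:
    """Keep only confident, non-background windows for upload."""
    return [
        d for d in detections
        if d.label != background_label and d.confidence >= threshold
    ]


def upload_fraction(detections: List[Detection],
                    selected: List[Detection]) -> float:
    """Share of windows that would actually be sent upstream."""
    return len(selected) / len(detections) if detections else 0.0


if __name__ == "__main__":
    windows = [
        Detection(0.0, "Noise", 0.99),
        Detection(1.0, "KillerWhale", 0.92),
        Detection(2.0, "KillerWhale", 0.40),
        Detection(3.0, "Noise", 0.80),
    ]
    keep = select_for_upload(windows)
    print(keep, upload_fraction(windows, keep))
```

The threshold trades missed detections against wasted bandwidth, so in practice it would need to be tuned against labeled recordings from the deployment site.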

Background for Audio Classification

Representing Audio Signals

Audio signals are typically represented by a 1D vector comprising a time series of signal amplitudes. The translation of the analog
