GitHub - GilGeva1/GilGeva1.g: Simple project webpage template. Originally used in Colorful Image Colorization. ECCV, 2016.

Introduction

This repository presents the research "Binaural sound source localization using a hybrid time and frequency domain model". The study was conducted as part of a thesis at Reichman University in collaboration with the IRCAM Institute, and it has been accepted for presentation at the IEEE ICASSP 2024 conference.

The repository includes an overview of the research, images illustrating data collection at Reichman University and the IRCAM Institute, and a description of the model architecture. It also provides HRIR files for each ear of every speaker at IRCAM, and a .py file containing the code along with other tools for data processing. Finally, it contains the thesis document, the presentation used for the thesis defense, the conference paper, and a citation for reference.

Abstract

Sound source localization plays a foundational role in auditory perception, enabling both humans and machines to determine the location of a sound source. Traditional sound localization methods often rely on manually crafted features and simplified conditions, which limit their applicability in real-world situations.

Accurate sound localization is vital across diverse applications, spanning robotics, virtual reality, human-computer interaction, and medical devices. It matters most for individuals with cochlear implants (CI), who face significant challenges in perceiving the direction of sound sources.

Previous research has focused on large microphone arrays in the frontal plane, and the resulting methods show limited accuracy and robustness when applied to small microphone arrays. These sound localization techniques are also impractical for CI users because of size and weight constraints and the need for full-sphere localization.

This research introduces a new approach to sound source localization that uses head-related transfer function (HRTF) characteristics, derived from raw data, in both the time and frequency domains. It also advances binaural sound localization by extending it from a 180-degree range to the full sphere.
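As an illustration of what "time and frequency domains" can mean for a binaural signal, the sketch below convolves a mono signal with a left and a right head-related impulse response (HRIR) and returns both the time-domain ear signals and their magnitude spectra. This is a minimal pure-Python sketch under stated assumptions, not the code from this repository: the function names (`convolve`, `magnitude_spectrum`, `binaural_views`) and the toy HRIR values are hypothetical, and a real pipeline would likely use measured HRIRs and a short-time spectrogram rather than a single DFT.

```python
import cmath
import math


def convolve(signal, impulse_response):
    """Full linear convolution of two sequences (direct form)."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def magnitude_spectrum(frame):
    """Magnitude of the DFT of `frame` for the non-negative frequency bins."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def binaural_views(mono, hrir_left, hrir_right):
    """Return time-domain ear signals and their magnitude spectra.

    Hypothetical helper: the two "views" stand in for the inputs of a
    model with parallel time-domain and frequency-domain branches.
    """
    left = convolve(mono, hrir_left)
    right = convolve(mono, hrir_right)
    return {
        "time": (left, right),
        "frequency": (magnitude_spectrum(left), magnitude_spectrum(right)),
    }


# Toy example (made-up values): the right ear hears the source
# two samples later and at half the amplitude of the left ear.
views = binaural_views(
    mono=[1.0, 0.0, 0.0, 0.0],
    hrir_left=[1.0, 0.0, 0.0],
    hrir_right=[0.0, 0.0, 0.5],
)
```

In the toy example the delay between the ears shows up in the time-domain signals, while the level difference shows up in the magnitude spectra, which is one reason to look at both representations.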

The proposed approach introduces an end-to-end deep-learning (DL) hybrid model that integrates spectrogram and temporal-domain information via parallel channels. The proposed hybrid model surpasses the current state-of-the-art results. Specifically, it achieves an average angular error of $0.24^\circ$ and an average Euclidean distance of $0.01$ meters, compared with an average angular error of $19.07^\circ$ and an average Euclidean distance of $1.08$ meters for the known state of the art.
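The two reported metrics can be made concrete with a short sketch. It assumes that true and predicted source positions are given as 3-D Cartesian coordinates in meters with the listener at the origin; this is an assumption for illustration, since the exact evaluation protocol is described in the thesis and paper rather than here. The function names are hypothetical.

```python
import math


def angular_error_deg(p_true, p_pred):
    """Angle in degrees between two 3-D position vectors seen from the origin."""
    dot = sum(a * b for a, b in zip(p_true, p_pred))
    norm_true = math.sqrt(sum(a * a for a in p_true))
    norm_pred = math.sqrt(sum(b * b for b in p_pred))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, dot / (norm_true * norm_pred)))
    return math.degrees(math.acos(cos_angle))


def euclidean_distance(p_true, p_pred):
    """Straight-line distance between true and predicted positions."""
    return math.dist(p_true, p_pred)


def mean_errors(true_positions, predicted_positions):
    """Average angular error (degrees) and average Euclidean distance."""
    pairs = list(zip(true_positions, predicted_positions))
    mean_angle = sum(angular_error_deg(t, p) for t, p in pairs) / len(pairs)
    mean_dist = sum(euclidean_distance(t, p) for t, p in pairs) / len(pairs)
    return mean_angle, mean_dist


# Made-up positions: one perfect prediction and one that is 90 degrees off.
mean_angle, mean_dist = mean_errors(
    true_positions=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    predicted_positions=[(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)],
)
```

The angular error depends only on direction, while the Euclidean distance also penalizes errors in estimated range, so the two numbers are complementary.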

This level of accuracy is of paramount importance for a wide range of applications, including robotics, virtual reality, and aiding individuals with CI.

In conclusion, as the field of sound source localization continues to progress, this research contributes to a deeper understanding of auditory perception and offers practical applications within healthcare scenarios.

Recording Methods

Architecture

Thesis paper

Thesis paper - remaining

Thesis presentation

Thesis - Gil Geva.pptx

ICASSP 2024

ICASSP_2024.pdf

arXiv paper link

@article{geva2024binaural,
  title={Binaural sound source localization using a hybrid time and frequency domain model},
  author={Geva, Gil and Warusfel, Olivier and Dubnov, Shlomo and Dubnov, Tammuz and Amedi, Amir and Hel-Or, Yacov},
  journal={arXiv preprint arXiv:2402.03867},
  year={2024}
}
