Visual onoma-to-wave

Accepted to ICASSP 2023

We propose a method for synthesizing environmental sounds from visual onomatopoeias and sound-source images (Ohnaka et al., 2023).

The external project page is here.

At the time, the DALL-E prototype had only just been released, and my impression was that methods such as automated image generation were difficult to use effectively. I believe a much higher-quality method could be built now.

References

2023

Sound synthesis

Peer-reviewed

First

Visual Onoma-to-Wave: Environmental Sound Synthesis from Visual Onomatopoeias and Sound-Source Images

Hien Ohnaka, Shinnosuke Takamichi, Keisuke Imoto, and 3 more authors

In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Jun 2023
