Visual onoma-to-wave

Accepted to ICASSP 2023

We propose a method for synthesizing environmental sounds from visual onomatopoeias and sound-source images (Ohnaka et al., 2023).

The external project page is here.

When we did this work, the DALL-E prototype had only just been released, and my impression was that methods like automated image generation were still difficult to use effectively. I believe a much higher-quality method could be built now.

References

2023

  1. Sound synthesis
    Peer-reviewed
    First
    Visual Onoma-to-Wave: Environmental Sound Synthesis from Visual Onomatopoeias and Sound-Source Images
    Hien Ohnaka, Shinnosuke Takamichi, Keisuke Imoto, and 3 more authors
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Jun 2023