Download Our Alexa Wake Word Voice Samples

Building an Alexa-enabled voice product? Make sure you’re ready for a global, multilingual customer base. Hear the difference that data variance makes with this sample of 24 Alexa wake word recordings in four languages.

Download Now

If you’re building an Alexa-enabled product or device, you’ll need high-quality speech data to train your voice recognition model across different accents, age groups, and genders.

This sample dataset was originally collected for Amazon’s Alexa wake word functionality and contains speech data from across the world. The sample contains English, Italian, Spanish, and French speech data from speakers of varying ages and genders.

Here are a few use cases for this speech data sample:

  • Identify the specific phrases and words used to wake up Amazon’s Alexa
  • Hear the accents and tonal differences that need to be considered by Alexa
  • Analyze metadata that gives your team and Alexa complete context

What’s included in the dataset?

This sample dataset contains 24 .WAV audio files that were collected and labeled by Summa Linguae Technologies. The metadata for each sample is provided as well. All samples were recorded, collected, and processed online to capture intricacies in speech data from various countries and regions.
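Once the files are downloaded, a quick check of each recording's sample rate, channel count, and duration can help confirm that the audio matches what your training pipeline expects. The sketch below is one possible way to do that with Python's standard `wave` module. It assumes the files are uncompressed PCM WAV, and the folder name `alexa_wake_word_samples` is a hypothetical placeholder, not the actual name of the download. It does not read the accompanying metadata, since its format is not described here.

```python
import wave
from pathlib import Path


def summarize_wav(path):
    """Return basic header properties of a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "file": Path(path).name,
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "sample_width_bytes": wav.getsampwidth(),
            # Duration is the frame count divided by frames per second.
            "duration_s": frames / rate if rate else 0.0,
        }


def summarize_folder(folder):
    """Summarize every .wav/.WAV file directly inside `folder`."""
    paths = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".wav"
    )
    return [summarize_wav(p) for p in paths]


if __name__ == "__main__":
    # Hypothetical folder name; point this at wherever the samples were unzipped.
    sample_dir = Path("alexa_wake_word_samples")
    if sample_dir.is_dir():
        for row in summarize_folder(sample_dir):
            print(row)
```

Comparing these values across the recordings is a simple way to spot files that need resampling or conversion before they are used for training.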

The provided audio files are free to use and test for educational and research purposes only. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Download the free wake word data set
