Remote Data Collection: Why It’s Ideal for Speech Recognition Projects

Last Updated March 17, 2022


For most speech recognition projects, remote data collection is the best option. Why? Because it’s affordable, scalable, and highly customizable to your needs.

If you’re developing machine learning technology, you need high-quality training and testing data.

However, you’re unlikely to find an existing dataset that encompasses all your training use cases. The solution? Collect your own.

Only you know exactly what you need to maximize the user experience for your application or device. The good news is you can tailor data collection according to your specific requirements. And there are many ways to do so.

If you need data recorded in-person and in a specifically chosen physical location or environment, field collection is the way to go.

For more basic voice commands or conversational data, though, remote data collection is an efficient option.

Let’s discuss why that is.

What is remote data collection?

What do you think of when you hear the word “remote?” Perhaps a beach tucked away on some far-off island. Or maybe your mind turns to your TV. In that sense, the remote gives you control from a distance, as opposed to when your grandparents had to physically turn a dial to change the channel.

It’s the distance that matters here. We collect speech data remotely through a mobile app from anywhere with an internet connection. All you need is a trusted crowd, which we’ll touch on later.

Participants are typically recruited online based on their language and demographic profile. They’re asked to record speech samples by reading prompts off their screen or by speaking through a variety of scripted scenarios.

Let’s say, for instance, you need recorded wake words in a specific language or dialect. Remote collection lets you gather short clips in any language from anywhere around the world. And we can collect the data on a small scale or from thousands of participants.

You can get exactly what you need quickly, cost-efficiently, and at scale.

It’s also well suited to automatic speech recognition (ASR) projects.

Here are some additional pros and cons of remote data collection.

Pros

  • Customizable – You structure the collection to your exact training data specifications.
  • Cost-efficient – Remote collection is more affordable than in-person collection.
  • Quick – We can turn it around within a few weeks.
  • Variety of speech data – Collect different types of speech data, including command-based, scenario-based, or unscripted speech.
  • Scalable and flexible – The infrastructure is in place to collect more data quickly and affordably if necessary.
  • Gather a ready crowd – Collecting from a trusted crowd allows for access to any language, dialect, accent, or demographic.
  • Post-processing options – As part of the collection project, you specify your exact transcription and labeling requirements.
  • Data ownership – Because you’ve collected this data yourself, it won’t be accessible to any of your competitors.

Cons

  • Limited audio options – Since data is collected via participants’ cellphones or headsets, you have fewer choices when it comes to audio or microphone specifications.
  • Limited acoustic scenarios – If you require a particular acoustic scenario, like certain types of background noise, you may need to opt for in-person collection. Setting it up yourself gives you more control over the outcome than trusting others to do it right the first time.

As you can see, the benefits of remote data collection outweigh the potential hiccups. To make it work, though, you need the right people and an efficient platform.

Crowd Management Platform: The Backbone of Remote Data Collection

While there are many benefits to this approach, it’s a big task to manage a remote data collection workflow and the substantial number of people involved.

A comprehensive technology platform is the backbone of any successful data collection project, from the planning and recruitment stages right through to transcription, annotation, and quality assurance. The more efficient the platform, the better the quality of the data and the bigger the cost savings.

Here at Summa Linguae, we built Robson as our remote data collection and crowd management platform. We went with a hybrid platform approach composed of a mobile app, desktop interface, and backend administration platform.

How Our Platform Works for Remote Data Collection

Robson is our in-house data platform and mobile app for Android and iOS that allows us to collect and annotate data from any end user that fits your data needs.

Upon downloading the app and filling out their profiles, Robson users are matched to simple data collection tasks based on their unique information, including:

  • Gender
  • Date of birth
  • Home city, region, and country
  • Current city, region, and country
  • Mother tongue
  • Education level
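As a rough illustration of how profile-based matching could work, here is a minimal sketch in Python. It is not Robson’s actual implementation: the `Profile` and `Task` classes, their field names, the age limits, and the rule that an empty criterion means "no restriction" are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Profile:
    """Hypothetical participant profile built from the fields listed above."""
    gender: str
    date_of_birth: date
    home_country: str
    current_country: str
    mother_tongue: str
    education_level: str


@dataclass
class Task:
    """Hypothetical task; an empty set means the criterion is not restricted."""
    name: str
    mother_tongues: set = field(default_factory=set)
    current_countries: set = field(default_factory=set)
    min_age: int = 18   # assumed default, not taken from the article
    max_age: int = 120  # assumed default, not taken from the article


def age_on(day, born):
    """Return the age in whole years on the given day."""
    had_birthday = (day.month, day.day) >= (born.month, born.day)
    return day.year - born.year - (0 if had_birthday else 1)


def is_eligible(profile, task, today):
    """Check one profile against one task's criteria."""
    if task.mother_tongues and profile.mother_tongue not in task.mother_tongues:
        return False
    if task.current_countries and profile.current_country not in task.current_countries:
        return False
    return task.min_age <= age_on(today, profile.date_of_birth) <= task.max_age


def eligible_tasks(profile, tasks, today):
    """Return the names of every task the profile qualifies for."""
    return [task.name for task in tasks if is_eligible(profile, task, today)]
```

A real platform would likely match on more fields (home region, education level, and so on), but the shape stays the same: each task declares its criteria, and each profile is filtered against them.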

The user can view and sign up for all the tasks they’re eligible for. Once assigned to a task, they’ll see instructions and the sentences to record. After making a recording, the user can play it back, re-record if necessary, or move on to the next utterance. This data then enters the pipeline, where submissions are reviewed for quality, processed, and then securely shipped over to you.
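The path a submission takes through that pipeline can be pictured as a small state machine. The sketch below is only an illustration under stated assumptions: the status names, the allowed transitions, and the `needs_rework` step are hypothetical and are not taken from Robson itself.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical stages of a recorded submission."""
    RECORDED = "recorded"
    IN_REVIEW = "in_review"
    NEEDS_REWORK = "needs_rework"  # assumed step for recordings that fail QA
    PROCESSED = "processed"
    DELIVERED = "delivered"


# Which stage may follow which; anything not listed here is rejected.
ALLOWED = {
    Status.RECORDED: {Status.IN_REVIEW},
    Status.IN_REVIEW: {Status.PROCESSED, Status.NEEDS_REWORK},
    Status.NEEDS_REWORK: {Status.RECORDED},
    Status.PROCESSED: {Status.DELIVERED},
    Status.DELIVERED: set(),
}


def advance(current, target):
    """Move a submission to the next stage, or raise if the step is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Making the transitions explicit means a recording cannot be delivered without first passing review, which is the guarantee the pipeline description implies.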

Did we mention participants are compensated for their contributions?

We’ve concentrated on ensuring the system is solid, robust, and oriented around making the user, QA, and project managers’ experience as simple as possible. By doing that, we ensure that the number of issues escalated to support will get lower and lower over time.

Keys to a Successful Remote Data Collection Platform

Here’s what we focused on when building our crowd platform to effectively collect, process and deliver large volumes of data.

Strong branding and recruitment strategies

People are naturally skeptical about exchanging their data for payment. We work to build trust with potential crowd members, assuring them their submissions are used for good, for instance to make voice recognition technology more inclusive. This requires strong branding and social proof on our website, social media, and app reviews.

For recruitment, Facebook ads are a useful channel to advertise to specific demographics at a relatively low cost. For example, we recently advertised a task for English speakers living in the United States.

Seems basic, right? When you consider there are roughly 30 major dialects in America, you get a sense of how important it is to catch them all.

User-friendly interface

Not only does the app have to be functional, it also must provide a smooth user experience to keep people coming back.

Robson’s mobile app is built to handle small tasks to be done anywhere, like voice recording or quick surveys. A light, seamless experience of completing small tasks on an app is less cumbersome than having to log into a desktop browser.

For bigger tasks that require more screen real estate like transcription, a secondary web browser interface is also needed. This allows us to match the right tasks to the right platforms to maximize efficiency and keep costs down.

Central administration system

This is the central part of the solution. It links all other components together to collect and deliver speech data at scale.

It controls the project, connects with finance where necessary, sends out the invites to our user base, controls the flow of submissions from the various sources into QA and back out again, and helps manage the payment process for all those users all over the world. Some aspects are automated, but much of it is managed manually, giving an extra level of care to make sure everything is running smoothly.

Our QA system is designed to make the process quick and easy so that, should rework be needed, the feedback is immediate. This allows us to collect and QA thousands of hours of data with small, core in-house teams.

Finally, we should add that data privacy is taken extremely seriously around here. As data stewards for our clients, we’re responsible for the safety and security of this resource.

You need to know it’s been protected at every step of the process—from collection, to storage, through delivery.

Your Remote Data is Our Business

Our clients recognize our data solutions team for its versatility and outside-the-box thinking.

With our crowd and our platform, we can offer custom speech data collection at scale.

To learn how we can create a speech collection program for your organization, book a consultation now.
