Globalme Does Chinese Speech Data Collection in Chengdu, China

Last Updated December 15, 2016

Globalme was acquired by Summa Linguae Technologies in 2019. We kept this blog to preserve the rich history of Globalme that made us who we are today. Enjoy!

Once again, team Globalme hit the road for another fun speech data collection project abroad. This time, our Chinese speech data collection project brought us all the way to China – a whopping 16 hours ahead of Vancouver. We were tasked with collecting natural language speech data in the Sichuanese dialect, so we needed to spend time in-country to hear from native speakers. Luckily, our team is dispersed all over the world; we just so happened to have a few members in Chengdu, an ideal location for the natural language utterances we needed to collect for the project. If you are wondering why we are collecting all this speech data, what the process is like, and how it helps tech to recognize speech, check out this post on how speech recognition technology works.

This was not our first project there (check out our article on “Cultural Differences You Should Know Before Doing Business in China”), so we were able to use our previous experience to ensure the project was smooth sailing the whole way. Just like that, two of our Vancouver-based Project Managers packed their bags full of equipment, and, after explaining to airport security why we needed six laptops and various other electronic gadgets, we headed off to meet our team in Chengdu.

Foreign companies visiting China need to be aware of the social and cultural differences. One can’t assume that everything works the same as in the Western world; you must switch “off” all the assumptions you have based on experience in your home country, especially when conducting business there. For example, unlike in our hometown of Vancouver, the middle class makes up a much smaller portion of the population in China. Thus, finding the sweet spot in the market, testers who would meet our requirements and yet have a financial need to work on a project such as this, was not an easy task.

Furthermore, many people think of China as a country where you can spend and manufacture at very low rates. While this is true for some goods and services, such as taxis, street food, and shopping at markets, prices are rising quickly year over year. Things that we are used to in the Western world, such as a casual latte, free internet in coffee shops, and parking spaces, are still considered luxuries in China, and you must be prepared to pay a hefty price for them. This is even more apparent in China’s big cities, where the wealthier population tends to flock.

Though the Internet worked much better than on previous trips, connecting with our favorite services such as Google Maps, or social media sites such as Facebook, was, as expected, not possible without a VPN service. Tip of the day: If you have T-Mobile, you can access Google Maps and all other Google-related services from your phone.

Completing the Chinese Speech Data Collection Project

Overall, our team worked phenomenally well together, and everyone was eager to put in the extra hours and weekends it took to get the project done in the time frame we needed. They were so phenomenal, in fact, that we finished earlier than expected and were able to put some time aside for sightseeing. We visited the Chengdu Research Base of Giant Panda Breeding (also known simply as Chengdu Panda Base):

We shopped at Jinli Pedestrian Street:

…and then again at Kuanzhai Ancient Street of the Qing Dynasty!

We also ate a whooole lot of delicious food along the way:

It was a successful two weeks, and we can’t wait to see where the next data collection project takes us. It’s always an adventure, and always a great time to meet up with our team members who work remotely around the world.  

Interested in reading more about our adventures overseas? Check out our speech data collection project in Rome, Italy.

Until next time!
