Lingo—A Streamlined Video Translation Workflow Tool

Increasing the number of videos published internationally was a company-wide priority for 2017. Translating videos was a key element of the strategy for meeting this objective. My team was given six months to improve the video translation workflow.


  • Product Design
  • Product Strategy

Lingo logo designed in collaboration with Celine Chang.

Defining the MVP

Tasty was our most popular video brand at the time, and the most lucrative opportunity for international expansion. Tasty recipe videos follow a strict format: they are shot from overhead, feature the recipe as on-screen text, and have no dialogue. Our MVP would focus on translating Tasty videos.

User Research

We spent the first month learning as much as we could about the translation workflow. We started by interviewing our stakeholders: the international Social Media Editors (SMEs) who requested video translations for their markets, and the Los Angeles-based Video Adaptation Editors (VAEs) who produced the translated videos. We then audited the tools used at each step in the process.

The translation workflow at the time involved two teams and spanned several tools. The process began with the international SMEs, who would request video translations for their markets. The on-screen text in the requested video would be transcribed by an Amazon Mechanical Turk worker. The SMEs would then translate the transcript. Once the translation was complete, the VAEs in Los Angeles would use the translated transcript and the Premiere files for the original video to produce the translated video. The SMEs would then review the translated video, after which it was ready for them to publish.

Translation Workflow Overview

We identified the following as the key pain points in the workflow:

  • The process involved too many tools, making it difficult to track the progress of a translation project
  • The transcript quality from the Mechanical Turk workers was inconsistent
  • SMEs often translated transcripts without referencing the original videos because switching between the translation spreadsheet and video player was too cumbersome. As a result, SMEs didn’t see their translations in context until the review stage. Due to the time zone differences between the SMEs and VAEs, addressing copyedits could take anywhere from a day to two weeks.

MVP Objectives

  • Increase translation output and speed
  • Improve transcript translation accuracy and, in turn, reduce the number of copyedits during the translated video review
  • Improve transcript quality
  • Provide visibility and accountability by tracking workflow status
  • Reduce the number of tools used

Iterating to a Solution

A major benefit of working on internal tools is easy access to users. I made it a priority to involve our stakeholders throughout the design process. We checked in with them regularly, sharing wireframes, prototypes, and early production builds. This tight feedback loop ensured we were on the right track and our users were invested in the tool.

Optimizing the Workflow

The goal here was to consolidate transcript requests, transcript translation, and translation request fulfillment into a single tool. Video editing and video review would still be handled in their existing tools (Premiere, in the case of editing) because replicating their rich feature sets would be impossible in the allotted time.

Translation Candidates

To increase the number of videos translated, we needed to grow the pool of translation candidates. The requests from SMEs weren’t enough; we needed another source, ideally an automated one. We put a data scientist on the case. After investigating the performance of past video translations, he created a data model that analyzes published videos and recommends the two markets in which each video is likely to perform best when translated. These recommendations were then added to the translation candidate queue, marked with a LingoBot identifier, for selection by SMEs.
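As a rough illustration of the recommendation step, here is a minimal sketch that assumes the model outputs a predicted performance score per market. The score values, market codes, field names, and function names are all hypothetical and are not taken from the production system.

```python
from typing import Dict, List


def recommend_markets(predicted_scores: Dict[str, float], top_n: int = 2) -> List[str]:
    """Return the top_n markets ranked by predicted performance, best first."""
    ranked = sorted(predicted_scores.items(), key=lambda item: item[1], reverse=True)
    return [market for market, _ in ranked[:top_n]]


def build_candidates(video_id: str, predicted_scores: Dict[str, float]) -> List[dict]:
    """Create queue entries for the recommended markets, tagged as LingoBot suggestions."""
    return [
        {"video_id": video_id, "market": market, "source": "LingoBot"}
        for market in recommend_markets(predicted_scores)
    ]


# Hypothetical scores for one published video.
candidates = build_candidates("video-123", {"BR": 0.8, "DE": 0.5, "JP": 0.9})
```

Tagging each entry with its source is what would let SMEs tell automated suggestions apart from their own requests in the queue.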

Early Wireframes

Translation Candidate Queue

Translation Edit Request Queue

Automating Transcription

To increase transcript quality, we replaced Amazon Mechanical Turk workers with a third-party transcription service. In addition to improved and consistent transcript quality, we also gained the following:

  • Significantly faster turnaround
  • Capturing start and end time for every occurrence of on-screen text
  • Capturing text with line breaks as they appear on screen
  • Ability to optimize the OCR model for our lexicon

Improving Transcript Translation Accuracy

Our primary goal was to make it easy to reference the original video while translating. We explored a few options for placing the video, and ultimately located it to the right of the transcript, where it follows along as the user scrolls. When a transcript translation input field is in focus, the video jumps to the point at which that text appears. Other notable features include adding video editing notes, marking transcriptions that should be excluded from the translated video, and basic text formatting.
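The jump-to-text behavior relies on the start and end times captured for each occurrence of on-screen text. The sketch below shows one way that lookup could work; the `Segment` structure, its field names, and `seek_time_for_focus` are illustrative assumptions, not the tool's actual code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """One occurrence of on-screen text (hypothetical structure)."""
    start: float            # seconds into the video when the text appears
    end: float              # seconds into the video when the text disappears
    text: str               # original text, line breaks preserved
    translation: str = ""   # filled in by the SME
    excluded: bool = False  # True if the text should be left out of the translated video


def seek_time_for_focus(segments: List[Segment], focused_index: int) -> Optional[float]:
    """Return the time the video should jump to when a translation field gains focus."""
    if 0 <= focused_index < len(segments):
        return segments[focused_index].start
    return None  # no valid field in focus, so leave the video where it is


transcript = [
    Segment(start=1.5, end=4.0, text="2 cups flour"),
    Segment(start=4.0, end=7.25, text="Mix until\nsmooth"),
]
```

Keeping the timing on the segment itself means the same data can serve both the translation view and the reference view used by VAEs.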

Early Wireframes

Final Designs

The final view has two modes: translation and reference. The former is primarily used by SMEs for transcript translation; the latter is used by VAEs when creating the translated video.

Translation Mode

Video jumps to the start time for the in-focus transcription.

Transcription actions menu.

Reference Mode


The response from SMEs and VAEs was effusive. As we predicted, having the original video as reference led to more accurate translations and far fewer edits during the review process. And having a single tool that handled the bulk of the process provided the visibility both parties sought. Average translation time was dramatically reduced from two weeks to a couple of business days.

The tool now supports all on-screen text video translation requests, and is set up to support dialogue translation in the future.