In May 2021, Google unveiled a new search technology called Multitask Unified Model (MUM) at the Google I/O virtual event. This coincided with an article published on The Keyword, written by Vice President of Search, Pandu Nayak, detailing Google’s latest AI breakthrough.
In essence, MUM is an evolution of the same technology behind BERT but Google says the new model is 1,000 times more powerful than its predecessor. According to Pandu Nayak, MUM is designed to solve one of the biggest problems users face with search: “having to type out many queries and perform many searches to get the answer you need.”
What is MUM and how does it work?
The Multitask Unified Model (MUM) is built using the same transformer architecture Google used to create BERT, an architecture that replaced recurrent neural networks (RNNs) as the go-to approach for tasks including language modelling, machine translation and, most importantly, question answering.
MUM takes the same technology further with more processing power and data feeding into the model.
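To make the question-answering task more concrete, here's a minimal sketch using the open-source Hugging Face transformers library and a publicly available BERT-style model. This is not MUM or Google's production system, just an illustration of the kind of extractive question answering transformer models are commonly fine-tuned for; the model name, question and context are our own examples.

```python
# Illustrative only: a publicly available transformer model fine-tuned for
# extractive question answering, via the Hugging Face "transformers" library.
# This is not MUM; it simply shows the question-answering task in practice.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="distilbert-base-cased-distilled-squad",
)

context = (
    "Mount Fuji is the highest mountain in Japan, with a summit elevation "
    "of 3,776 metres. The official climbing season runs from early July to "
    "early September."
)

result = qa(question="How high is Mount Fuji?", context=context)
print(result["answer"], result["score"])
# Prints the extracted answer span (e.g. "3,776 metres") and a confidence score.
```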
Here’s how Pandu Nayak describes MUM in his announcement:
“Like BERT, MUM is built on a Transformer architecture, but it’s 1,000 times more powerful. MUM not only understands language, but also generates it. It’s trained across 75 different languages and many different tasks at once, allowing it to develop a more comprehensive understanding of information and world knowledge than previous models.”
There’s an important detail in that quote about how this technology works.
All of Google’s interpretational technologies (Hummingbird, BERT, MUM, etc.) enhance the search engine’s ability to understand user queries more accurately, including more complex queries.
However, a key component of MUM is its ability to “generate” language by incorporating the latest machine translation technology into its question-answering operation. In practical terms, this allows Google to find information in one language and translate it more effectively for audiences in another. For example, someone planning to travel to Germany later in the year may want to find venues playing the specific types of music they’re into. There’s a certain amount of English content on this subject, of course, but the depth and quality of German-language content is likely to be far greater.
MUM allows Google to take information from German content and translate it for users in English – or any of the 75 languages MUM is trained across – with greater accuracy, removing a major language barrier from the search experience.
This multilingual capability opens up Google’s search algorithm to a new depth of information about the world, people and cultures.
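Google hasn’t published MUM’s internals, but publicly available neural machine translation models give a feel for the German-to-English scenario above. The sketch below uses an open-source MarianMT checkpoint via the Hugging Face transformers library; the German example sentence is invented for illustration.

```python
# A hedged illustration of neural machine translation using an open-source
# MarianMT model (not MUM). The German example sentence is hypothetical.
from transformers import pipeline

# Helsinki-NLP publishes MarianMT checkpoints for many language pairs,
# including German to English.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-de-en")

german_text = "Die besten Konzerte für elektronische Musik finden im Herbst in Berlin statt."
result = translator(german_text)
print(result[0]["translation_text"])
# e.g. "The best concerts for electronic music take place in Berlin in autumn."
```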
Another interesting point Pandu Nayak raises about MUM is that it’s “multimodal, so it understands information across text and images and, in the future, can expand to more modalities like video and audio.”
Examples of MUM in action
The best way to explain MUM is to look at some real-world examples and Pandu Nayak offered up a few in his statement on The Keyword. He leads with a scenario where someone has hiked Mt. Adams in the US and plans to hike Mt. Fuji in Japan.
As Nayak explains, “Google [today] could help you with this, but it would take many thoughtfully considered searches — you’d have to search for the elevation of each mountain, the average temperature in the fall, difficulty of the hiking trails, the right gear to use, and more.”
As touched on earlier, one of the biggest problems users face is having to conduct multiple searches to answer one question.
Tackling complex questions with single answers
The issue Nayak raises is that you could ask a hiking expert one question and get a “thoughtful answer that takes into account the nuances of your task at hand and guides you through the many things to consider.”
This is the kind of response Google is aiming for with MUM.
“Take the question about hiking Mt. Fuji: MUM could understand you’re comparing two mountains, so elevation and trail information may be relevant. It could also understand that, in the context of hiking, to “prepare” could include things like fitness training as well as finding the right gear.”
MUM can detect that both mountains are roughly the same elevation, meaning someone with experience climbing Mt. Adams should have the required skills to tackle Mt. Fuji. However, it could also highlight that the user’s trip falls in Japan’s rainy season and recommend advice or products, such as a lightweight weatherproof jacket, to cope with Japan’s often humid autumn rainfall.
MUM can also surface useful subtopics for deeper exploration, such as top-rated gear or training exercises, pulling information from articles, videos and images around the web – not only in English but also translated from Japanese and other languages.
Breaking down language barriers in search
The multilingual aspect of MUM is a real breakthrough for Google’s search experience, allowing it to find and translate information for users in 75 different languages. As Nayak explains, “language can be a significant barrier to accessing information”.
“MUM has the potential to break down these boundaries by transferring knowledge across languages. It can learn from sources that aren’t written in the language you wrote your search in, and help bring that information to you.”
If a user is searching for information about Mt. Fuji, some of the most useful content will be written in Japanese, especially when it comes to tourist information aimed at locals. As things stand, Google is unlikely to return this content if the user’s query isn’t written in Japanese and, if you can’t type in Japanese, you probably can’t read content written in the language anyway.
Thanks to MUM’s ability to transfer knowledge across sources in multiple languages, it can draw on the best content written in Japanese, English and dozens of other languages and compile that information into results translated into the language of the original query.
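As a rough analogy for this kind of cross-lingual knowledge transfer, openly available multilingual embedding models can already match a query in one language against content written in another. The sketch below uses the open-source sentence-transformers library with an invented English query and invented Japanese passages; it illustrates the principle, not Google’s implementation.

```python
# Cross-lingual retrieval sketch using the open-source "sentence-transformers"
# library: an English query is compared with Japanese passages in a shared
# multilingual embedding space. Query and passages are invented examples.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

query = "What should I pack for climbing Mt. Fuji in autumn?"
japanese_passages = [
    "秋の富士登山では防水ジャケットと防寒着が必要です。",  # you need a waterproof jacket and warm layers
    "富士山の山小屋は事前に予約する必要があります。",      # mountain huts must be booked in advance
]

query_emb = model.encode(query, convert_to_tensor=True)
passage_embs = model.encode(japanese_passages, convert_to_tensor=True)

# Rank the Japanese passages by cosine similarity to the English query.
scores = util.cos_sim(query_emb, passage_embs)[0]
ranked = sorted(zip(japanese_passages, scores), key=lambda pair: float(pair[1]), reverse=True)
for passage, score in ranked:
    print(f"{float(score):.2f}  {passage}")
```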
Interpreting information across content formats
Aside from being multilingual, MUM is also multimodal, which means the model is capable of understanding information from multiple content formats. Google says the technology is already capable of understanding information in text and images, and it expects MUM to interpret information from video and audio content in the future.
At this point, we don’t know how much information MUM is able to interpret from images, but Nayak suggests users could, “eventually,” take a photo of their hiking boots and ask Google whether they’re suitable for hiking Mt. Fuji.
For now, it seems ambitious to think Google will answer questions like these with any reliability. Presumably, it would need to match the image of (probably used) boots to a product listing or number, then combine this with information from reviews, forums and other online sources to answer the question.
Matching this information up is probably the easy part, too. Accurately identifying a pair of used hiking boots from a dark image taken in someone’s hallway could be the bigger challenge. The obvious fallback would be asking users to provide the product name or number manually, but then you have to question how much this improves the experience over simply searching with the product info in the first place.
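For a sense of where openly available models already are on this front, the sketch below uses OpenAI’s publicly released CLIP model (via the Hugging Face transformers library) to score how well a photo matches a few text descriptions. The image URL and labels are hypothetical, and this says nothing about how Google would actually build such a feature.

```python
# Zero-shot image-to-text matching with the openly released CLIP model.
# The image URL is a hypothetical placeholder; CLIP is not MUM, it simply
# scores how well an image matches each candidate text description.
import requests
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical photo of a pair of hiking boots.
image = Image.open(requests.get("https://example.com/my-boots.jpg", stream=True).raw)

candidate_labels = [
    "a pair of waterproof hiking boots",
    "a pair of running shoes",
    "a pair of sandals",
]

inputs = processor(text=candidate_labels, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# Softmax over the image-text similarity logits gives a probability per label.
probs = outputs.logits_per_image.softmax(dim=1)[0]
for label, prob in zip(candidate_labels, probs):
    print(f"{float(prob):.2%}  {label}")
```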
Time will tell.
When will MUM start affecting search results?
Google says it will start rolling out “MUM-powered features and improvements” to search over the coming months and years. As it has done with BERT, Google will extensively test MUM as it applies these features to search, and it says it’ll be paying particular attention to any “patterns that may indicate bias in machine learning to avoid introducing bias into our systems.”
In his announcement, Nayak also references Google’s search quality raters – the human assessors, guided by the Search Quality Rater Guidelines, who give the search giant feedback on the quality of the results it provides for queries. This emphasises the importance of this team to Google, as well as the role it plays in rolling out new features – something content creators have needed to pay closer attention to since the E-A-T guideline updates in 2019.
In the meantime, there isn’t anything specific to optimise for, but it is worth paying attention to your keywords and the results they return so you can start mapping out the impact of MUM as the technology gradually rolls out.