When you use a Pixel 8 Pro or Pixel 8 phone, you’re using technology that merges the best silicon architecture with the latest Google AI research.
This episode of the Made by Google podcast looks at how those two come together in the Google Tensor G3 chip. Tensor G3’s AI powers several of the experiences discussed below, from on-device speech recognition to Super Res Zoom and Magic Eraser.
Tune in to the episode, or read the conversation in the transcript below.
Transcript
Rachid (00:01.611)
Prasad, welcome to the Made by Google podcast. It's great to have you. You are a director of product management for Silicon. So what is it you do all day?
Prasad Modali (00:26.061) First of all, thank you, Rachid, for having me. My job in two words is research prefetch. What this means is that I have a small team focused on understanding what AI research is happening at Google in the areas that are relevant for Pixel users, and then working with the research teams to prefetch, prototype, and land it in meaningful ways and meaningful use cases on the Pixel phone.
It's highly cross-functional and very collaborative across silicon, software, Pixel teams, research teams, and other product management teams across Google. It's also thoroughly satisfying taking these kinds of ideas all the way from initial research to ultimately landing as a product feature. That, to me, is incredibly satisfying. We've been doing this for the past five years, pushing the boundaries of on-device machine learning and bringing the latest in Google AI research directly to our newest Pixel devices.
Rachid (01:37.685) Now, long-time listeners of the Made by Google podcast know that we sometimes go into our internal directory and look at the personal mission statement that all Googlers have there. Yours is "land Google AI innovations into Tensor silicon," which obviously makes sense after what you just said. So that means you work on one hand with the silicon folks and on the other with the research folks. Are you sort of a bridge between the two? Is that fair to say?
Prasad Modali (02:05.597) Absolutely, that's a great way to say it. In fact, in my previous life, I was a chip designer and a chip architect. Designing chips without knowing exactly where and how they will be used is very different from being part of Google. It gives me this unique opportunity to understand the user journeys much better while we define the future silicon. This really allows us to optimize for more meaningful experiences rather than very specific speeds and feeds.
Prasad Modali (02:57.553) This has been a journey of betting on AI and machine learning; that's at the center of everything we do on the Tensor processor. Today, our partnership with research is stronger than ever, collaborating far in advance on each Tensor deployment to source new optimizations.
I really credit this relationship with thousands of researchers across Google Research and Google DeepMind for pulling these innovative experiences onto devices. Doing a lot of these things on a server or in the cloud is one thing, because there we have an effectively infinite amount of compute resources, not to mention the amount of power it takes to do these things.
Bringing these into low-power handheld devices is incredibly challenging. That is something we co-design and co-optimize together with the research teams, in the context of the user journeys: what kinds of use cases we are bringing, and how we are improving the silicon overall for those use cases in a power-efficient manner. That's really the trick in all this.
Google Research actually helps us place bets on forward-looking AI technologies and workloads. Together with them, we are tirelessly refining the AI models so that they run seamlessly on Pixel devices, ensuring that users experience the benefits of AI without sacrificing battery life or performance, for example.
This is a highly collaborative effort, and it allows us to bring the latest AI advancements, including generative AI capabilities.
Rachid (29:07.873) Now Prasad, the latest and greatest is Tensor G3 in Pixel 8 Pro and Pixel 8. What is better with Tensor G3 compared to its predecessor, Tensor G2?
Prasad Modali (29:30.741) Basically, we have upgraded all the major subsystems on Tensor G3 to help push the boundaries of what's possible on a mobile device with machine learning and AI. We've got the latest Arm CPUs. We upgraded the GPU. We upgraded the camera hardware. We upgraded the DSP. And we, of course, advanced to the next-generation TPU for machine learning acceleration.
Rachid (05:28.929) And does that mean that a few years ago you already needed some sort of insight that generative AI might be a thing in 2023, just to make sure that the Tensor G3 chip that came out this year was suitable for the task?
Prasad Modali (05:45.865) I think the way we thought about on-device ML compute is super important. So we invested heavily in on-device compute. And then increasing the pipe to external memory is very important; we made sure that we have enough memory on the device and access to the memory in the system.
These are the sorts of things we need to do in order to make sure that large generative AI models will run well. One example that pushed the envelope for us previously is speech research. We've been working a lot with speech researchers on bringing larger and larger speech models onto the device, and a lot of the learnings we got from landing speech recognition technologies on device were really helpful preparation for landing generative AI.
I'll give you a couple of interesting data points. Compared to our first-generation Tensor that came out in Pixel 6, our latest phone runs more than twice the number of machine learning models on the chip. That's pretty incredible, given that it was literally two years ago we launched the first-generation chip. Also, to talk a bit about on-device generative AI: this is, by some measures, roughly 150 times more complex than the most complex machine learning model that we deployed in last year's Pixel 7. This speaks to the incredible pace of advances happening in AI and how we're able to bring them to users in a meaningful way on Pixel 8.
Rachid (09:12.005) And I guess there's more on the way when it comes to large language models, as we saw in the keynote in October, right?
Prasad Modali (09:17.713) Absolutely. You'll see feature drops coming to Pixel in the future. You'll hear about things like Summarize in Recorder and Smart Reply in Gboard. You'll also see some imaging capabilities, like Zoom Enhance.
Rachid (09:41.133) Yeah, everyone talks about the "enhance" meme now coming to life on Pixel 8 and Pixel 8 Pro, so we should definitely talk about that. Now, you mentioned earlier that benchmarking is important, but maybe not in a synthetic way; rather, in a practical way. So what are some of the results you've seen with Tensor G3, for example when it comes to speech, photos, and videos, where it makes a difference?
Prasad Modali (10:02.953) Yeah, I'll give you a few examples. We have been seeing a lot of good results everywhere, from speech and language to photos and videos to audio and face unlock on the phone, and so on. I'll start with TTS, which is text-to-speech on device. For on-device text-to-speech, we are using the same model as we use in our data centers.
That is just incredible, bringing these models on-device. I actually spend a few hours a week either running or biking in my free time, and I find the read-aloud feature that's available on the phone super useful. It's part of the Assistant, so I can just turn it on and go for a run or a hike.
And I can listen to long-form articles really nicely. The second example I'd like to give: last year on a hike I broke my right hand, and I was having trouble typing with a cast. It took me a bit, but then I realized, hey, we worked with our speech team and landed the best speech recognition on device. So I started voice typing.
Now I use voice typing for literally everything I do: chat messages, documents, presentations, and so on. I still pretty much use voice typing on my phone, starting with that experience. I think that's pretty incredible compared to where we were before. In terms of video, we've been investing heavily in HDR+ and hard-coding some of those algorithms for video. Google introduced Super Res Zoom on Pixel phones with Pixel 3.
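The system voice typing Prasad describes lives in Gboard, but the same on-device recognition stack is exposed to any Android app through the platform SpeechRecognizer API. Here is a minimal sketch of a single dictation session, assuming the RECORD_AUDIO permission has already been granted and the call happens on the main thread:

```kotlin
import android.content.Context
import android.content.Intent
import android.os.Bundle
import android.speech.RecognitionListener
import android.speech.RecognizerIntent
import android.speech.SpeechRecognizer

// Minimal sketch: run one on-device dictation session and print the best
// transcription. Assumes RECORD_AUDIO is granted and we're on the main thread.
fun startVoiceTyping(context: Context) {
    val recognizer = SpeechRecognizer.createSpeechRecognizer(context)

    recognizer.setRecognitionListener(object : RecognitionListener {
        override fun onResults(results: Bundle?) {
            val candidates =
                results?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
            println("Transcription: ${candidates?.firstOrNull()}")
            recognizer.destroy()
        }
        override fun onError(error: Int) { recognizer.destroy() }
        // Remaining callbacks are not needed for this sketch.
        override fun onReadyForSpeech(params: Bundle?) {}
        override fun onBeginningOfSpeech() {}
        override fun onRmsChanged(rmsdB: Float) {}
        override fun onBufferReceived(buffer: ByteArray?) {}
        override fun onEndOfSpeech() {}
        override fun onPartialResults(partialResults: Bundle?) {}
        override fun onEvent(eventType: Int, params: Bundle?) {}
    })

    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
        putExtra(
            RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
        )
        // Prefer the on-device model so dictation keeps working offline.
        putExtra(RecognizerIntent.EXTRA_PREFER_OFFLINE, true)
    }
    recognizer.startListening(intent)
}
```

Setting EXTRA_PREFER_OFFLINE keeps recognition on the device, which is the scenario Prasad describes.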
Rachid (11:38.725) Amazing.
Prasad Modali (11:58.813) That was many years ago, and over time it's only gotten better. For example, last year in Pixel 7 we introduced enhancements to Super Res Zoom that go all the way beyond 20x zoom. You can almost think of it like an orchestra of multiple machine learning algorithms triggered at different zoom ranges. For example, once you hit 20x zoom, we turn on a new ML upscaler that includes a neural network to enhance the details of your photos. The more you zoom in, the more the telephoto camera leans into AI. Now, beyond photos, we are taking all these cool things that Google has done in photography and moving them to video with Tensor G3. We bring great-quality ML-based super resolution to zoomed video, letting you zoom in beyond the physical limits of the camera while capturing photos or videos in real time. I'm really looking forward to using this feature with my latest Pixel camera next time I go see Coldplay.
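The "orchestra of machine learning algorithms triggered at different zoom ranges" can be pictured as a simple dispatcher that picks a processing path based on the zoom factor. The sketch below is purely illustrative: the 20x threshold comes from the conversation, while the frame type and the two processing functions are hypothetical stand-ins, not Google's actual camera pipeline.

```kotlin
// Purely illustrative sketch of picking a processing path by zoom factor.
// The 20x threshold is from the conversation; everything else (the frame
// type and the two processing functions) is hypothetical.
data class Frame(val pixels: ByteArray, val zoomFactor: Float)

const val NEURAL_UPSCALER_THRESHOLD = 20.0f

fun processZoomedFrame(frame: Frame): Frame =
    if (frame.zoomFactor >= NEURAL_UPSCALER_THRESHOLD) {
        // Beyond roughly 20x, hand the crop to an ML upscaler that recovers
        // fine detail a plain digital crop can no longer resolve.
        runNeuralUpscaler(frame)      // hypothetical ML path
    } else {
        // At lower zoom levels, multi-frame merging is enough.
        runMultiFrameMerge(frame)     // hypothetical classical path
    }

// Hypothetical stand-ins so the sketch compiles.
fun runNeuralUpscaler(frame: Frame): Frame = frame
fun runMultiFrameMerge(frame: Frame): Frame = frame
```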
Rachid (13:06.933) Amazing. That should definitely be a great help for people filming in a concert hall for sure.
Prasad Modali (13:09.885) Right? Yeah. Imagine you don't have to worry so much about where you are sitting, and you can really zoom in on Chris Martin, for example. That's pretty amazing. And then we also have Audio Magic Eraser, which takes a video and cleans up unwanted sounds and noise. It's similar in concept to Magic Eraser, except you remove audio instead of objects, and it works beautifully.
We have also made significant enhancements with G3 to bring a much stronger face unlock capability. And with this, we can support things like Google Wallet and banking app sign-ins on Pixel 8.
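The reason wallet and banking sign-ins can rely on face unlock is that it now meets Android's strongest biometric class (Class 3). Apps opt into that class through the Jetpack biometric library; a minimal sketch of the availability check, assuming the androidx.biometric dependency is on the classpath:

```kotlin
import android.content.Context
import androidx.biometric.BiometricManager
import androidx.biometric.BiometricManager.Authenticators.BIOMETRIC_STRONG

// Minimal sketch: an app asks whether a Class 3 ("strong") biometric, such as
// face unlock on Pixel 8, is available before offering biometric sign-in.
fun strongBiometricsAvailable(context: Context): Boolean {
    val manager = BiometricManager.from(context)
    return manager.canAuthenticate(BIOMETRIC_STRONG) ==
        BiometricManager.BIOMETRIC_SUCCESS
}
```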
Rachid (13:57.401) Amazing. And for people who are curious about Audio Magic Eraser, we of course had an episode about that specific topic earlier this season of the Made by Google podcast. So go check that out if you want to know more about how Audio Magic Eraser works on Pixel 8 Pro and Pixel 8. Now, Prasad, going back to Tensor G3: what would you say it brings to the latest Pixel phones? Of course, it brings a lot of AI capabilities, but there are probably more examples of what it can do.
Prasad Modali (14:28.573) Absolutely, so much. I'll give you a few examples and talk them through with you. The Tensor chip powers everything the Pixel does; it's not just about one experience. There are many optimizations in the camera pipeline, where we built machine learning algorithms right into the silicon. One example is the new Live HDR, which lets us deliver the highest dynamic range for photos and videos, alongside the new ML-based video zoom capability. We are also introducing the first hardware capability for AV1 encode on Pixel. This is a standards-based encoder and decoder capability that will make things like video calls even better on Pixel. And we are just getting started.
Rachid (15:21.313) Yeah, and just for people who don't really know what a codec does and what AV1 is: if I understand it correctly, and you're the expert here, so correct me if I say anything weird, AV1 would help you get great video and audio quality even when you have low bandwidth, for example. You can still have a decent video.
Prasad Modali (15:39.413) Absolutely. You have to think of it like this: if you send the raw bits of video across the internet, you're choking up the pipeline with the sheer number of bits you are sending. So you generally compress the video, and there are several standards for compressing and decompressing video. AV1 is one such standard that has come up recently.
Generally speaking, devices first get a decode function for these codecs and only later get an encode function. Having the encode function on the device significantly helps with things like video calls.
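Whether a given device actually exposes a hardware AV1 encoder can be checked through Android's MediaCodecList. A minimal sketch, assuming API level 29 or later for the AV1 MIME type constant and the isHardwareAccelerated flag:

```kotlin
import android.media.MediaCodecList
import android.media.MediaFormat

// Minimal sketch: scan the platform codec list for a hardware-accelerated
// AV1 encoder. Assumes API level 29+ for MIMETYPE_VIDEO_AV1 and
// isHardwareAccelerated.
fun hasHardwareAv1Encoder(): Boolean =
    MediaCodecList(MediaCodecList.REGULAR_CODECS).codecInfos.any { info ->
        info.isEncoder &&
            info.isHardwareAccelerated &&
            info.supportedTypes.any {
                it.equals(MediaFormat.MIMETYPE_VIDEO_AV1, ignoreCase = true)
            }
    }
```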
Rachid (16:30.309) That's amazing. Now, there are some features that you might find on other phones, but Pixel just does it better. And I'm wondering, does Tensor G3 play a role in that?
Prasad Modali (16:40.285) Absolutely. One example I want to give you is the Magic Eraser experience on Pixel. One of the things we have done in the last year or so, optimizing on Tensor G3, is that we have upped the ante on the complexity of the model that can erase large areas of a photograph. With this more complex model running on Tensor G3, you're able to infill, if I may put it that way: even if you move a person or erase something, these algorithms help you cover much larger surface areas.
Yeah, so the end user experience is incredible with the ability to delete larger areas in the photograph with Magic Eraser.
Rachid (17:41.013) And that's something possible because Tensor G3 is capable of running much more complex AI and machine learning compared to other chips out there.
Prasad Modali (17:49.657) Absolutely. The way to think about this is that we work closely with the research teams to simplify the server-class algorithms and machine learning models, without compromising quality, so that they can run in mobile conditions: battery-operated, thermally constrained conditions. That allows us to optimize the algorithms in a way that delivers meaningfully amazing quality on the device.
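Google's internal pipeline for shrinking these server-class models isn't public, but the general pattern, a compact model handed to an on-device accelerator, is what the TensorFlow Lite runtime exposes to Android apps. A minimal sketch, assuming the tensorflow-lite dependency is present; the model file and the 1x128 float input/output shapes are hypothetical:

```kotlin
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.nnapi.NnApiDelegate
import java.io.File

// Minimal sketch: load a compact on-device model and route execution through
// NNAPI, which is how third-party Android apps reach hardware ML accelerators.
// The model file and the 1x128 float input/output shapes are hypothetical.
fun runOnDeviceModel(modelFile: File, input: Array<FloatArray>): Array<FloatArray> {
    val delegate = NnApiDelegate()
    val options = Interpreter.Options().addDelegate(delegate)
    val interpreter = Interpreter(modelFile, options)

    // Hypothetical 1x128 output buffer; real shapes depend on the model.
    val output = Array(1) { FloatArray(128) }
    interpreter.run(input, output)

    interpreter.close()
    delegate.close()
    return output
}
```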
Rachid (18:48.121) And you mentioned that when it comes to speech, we run the same text-to-speech model on Tensor G3 as in the cloud, which is simple to say but much harder to pull off, of course. So I'm wondering, are there other things that come into play with the Assistant, beyond text-to-speech, that help the Assistant be better on a device with Tensor G3?
Prasad Modali (19:12.469) I'm glad you asked that. The Assistant has an improved conversational experience compared to where we were on previous Tensor chips. With Tensor G3, the Assistant can wait for you patiently as you pause or use filler words. This is all powered by Google speech research ML models. What this means is that you can have pauses when you're talking to the Assistant, and it can still understand how you naturally speak. This combines Google's state-of-the-art speech recognition with natural language capabilities, and that's what allows us to bring these experiences. Many of these features will eventually come to other Android phones, but you either get them first or get a better experience on Tensor.
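Google hasn't published how the Assistant decides to keep listening, but the idea of waiting through pauses and filler words can be pictured as an endpointer that refuses to close the microphone when the last thing heard suggests the speaker isn't finished. A purely illustrative sketch; the filler list and thresholds are invented for this example, and the real system uses learned speech and language models rather than a word list:

```kotlin
// Purely illustrative endpointing sketch: keep listening through a pause if
// the last word looks like a filler, otherwise end the utterance after a
// normal silence window. The filler list and thresholds are invented here.
val FILLER_WORDS = setOf("um", "uh", "hmm", "like", "so")

fun shouldKeepListening(transcriptSoFar: String, silenceMillis: Long): Boolean {
    val lastWord = transcriptSoFar.trim()
        .split(Regex("\\s+"))
        .lastOrNull()
        ?.lowercase()
        ?: return silenceMillis < 700      // nothing said yet: short grace period

    return if (lastWord in FILLER_WORDS) {
        silenceMillis < 3000               // filler word: wait much longer
    } else {
        silenceMillis < 700                // normal case: standard pause window
    }
}
```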
Rachid (20:05.957) I think when people think about a smartphone being more personal and helpful, they don't often think about the chip powering that phone. They maybe think about the software, but not about the chip that enables it all. So how exactly does Tensor G3 help with that?
Prasad Modali (20:24.233) Yeah, we see that Pixel is the only phone with AI at the center. Looking back to when Pixel first launched, it seemed like a radical concept to have AI-centric mobile computing. But with Pixel and Tensor, we found that it's the cleanest path to powering smartphone experiences that are helpful, simple, and personal. Tensor G3 brings more of Google's cutting-edge ML and AI to Pixel.
Amazing features like Audio Magic Eraser, which I just talked about. You will see Best Take, next-generation call screening, and generative AI features like Summarize. Tensor G3 helps bring Google AI to nearly every experience on Pixel, going beyond photos to video quality, audio, and security. We just talked about speaking at your own pace, and about text-to-speech using the same model that we use in the data centers, resulting in the Assistant read-aloud feature. We are also bringing the on-device generative AI features that we announced at the Made by Google event recently: Smart Reply in Gboard, the Summarize capability in Recorder, and Zoom Enhance, coming to Pixel phones in future feature drops.
Rachid (21:47.301) Prasad, this is a question I've been dying to ask someone, and I think you might have the answer. I know many, many people are excited that we announced seven years of updates for Pixel 8 and Pixel 8 Pro. And again, that might be something where people think this is a software thing. But I'm wondering, does Tensor play a part in that decision and the ability to keep software updates so long in the future? Until 2030, actually.
Prasad Modali (22:14.825) Yeah, this was a huge commitment that conveys how dedicated Google is to building hardware that lasts. No other major smartphone brand currently offers this level of support. This comprehensive commitment was only possible with in-house silicon, with the Tensor G3 chip, where we can chart our own path that far into the future. Like our commitment to secure hardware with Tensor and Titan, this recent news helps demonstrate how serious we are about supporting users with sustainable and secure devices.
Rachid (23:03.321) It's one of the greatest announcements, even though there were so many, of course, during the Made by Google event. Now, Prasad, we always like to close with a top tip for our listeners on what to try with Tensor. There are a gazillion things they could do. I'm just wondering, what is perhaps your favorite feature where Tensor G3 makes the difference? What is something that our listeners should try out as well?
Prasad Modali (23:28.329) Yeah, Tensor helps in many of the areas I just mentioned. But one thing I hear about from a lot of users is the video quality on Pixel phones, plus amazingly clear pictures in low-light conditions. I'm personally very happy to say that I'm the official photographer on all our hikes; we have many opportunities in the neighborhood here in Northern California. On Mount Whitney, for example, we catch the sunrise around the trail camp area, and all my friends line up to take pictures with my latest Pixel using Best Take. My friends love this feature, and it has become a bit of a game. We take multiple pictures, swap the faces and heads in the pictures, and create multiple output pictures, not just one. We started this feature thinking that people would use it to make sure everyone is looking into the camera and smiling, for example, but it turns out we are kind of gamifying it in some sense. I find that really interesting, and I love that aspect of it.
Rachid (24:44.13) Yeah.
Prasad Modali (24:50.529) With generative AI, Magic Eraser can now help you remove even larger distractions, as I talked about, including shadows and objects attached to those distractions, resulting in a much higher-quality photo. So look out for this feature. Finally, a couple of other important points on video quality. We have put a lot of work into Tensor G3…
Rachid (24:57.487) Mm-hmm.
Prasad Modali (25:19.929) On Pixel camera, we take the most advanced machine learning from Google Research and build it right into the Google Tensor G3 chip. With G3's cutting-edge computational photography and advanced image processing, your phone can now render more dynamic images, capture more detail during zoom, and process sharper, higher-quality photos and videos.
Rachid (25:43.433) And I bet, Prasad, you are also looking forward to the feature drop that will bring that "enhance" meme to life, where Tensor G3 will help us zoom and zoom and zoom.
Prasad Modali (25:54.097) Absolutely. This is, in my mind, a magical feature. You should check out this fantastic Zoom Enhance feature that's coming out; it's completely generative AI based.
Rachid (26:05.314) Yeah.
Rachid (26:21.349) Amazing. Prasad, thank you so much for teaching us what's new with Tensor G3. And if you don't have any experience with Tensor G3 yet, you can of course only get it right now in Pixel 8 Pro and Pixel 8, so check out the Google Store if you want to get your own.
Prasad Modali (26:37.097) Yeah, thanks for having me on the podcast. Check out the latest Google devices powered by Tensor G3 on the Google Store, Pixel 8 and Pixel 8 Pro.
Rachid (26:45.305) Thank you, Prasad. Talk to you soon.
Prasad Modali (26:47.626) See you soon.
Requires Google Photos app. May not work on all audio elements. Available on Pixel 8 and Pixel 8 Pro.