The Lex Fridman podcast recently featured a conversation with Mark Zuckerberg inside the metaverse. Despite being hundreds of miles apart in physical space, they appeared to each other as photorealistic Codec avatars in 3D with spatial audio. This technology is believed to be the future of how human beings connect to each other in a deeply meaningful way on the internet.

During the conversation, Zuckerberg explained that these avatars can capture many of the nuances of facial expressions that humans use to communicate emotion to each other. The avatars are able to capture everything from the subtleties of the human face to the flaws, making the experience much more immersive. The goal is to deliver a sense of presence as if individuals are together no matter where they are in the world.


The Metaverse Experience

In a recent conversation with Lex Fridman, Mark Zuckerberg discussed the incredible technology behind the photorealistic Codec avatars in the metaverse. Despite being hundreds of miles apart, they appeared to each other as lifelike avatars in 3D with spatial audio, creating a deeply meaningful connection on the internet.

The avatars capture the nuances of facial expressions that humans use to communicate emotion, allowing for a more immersive experience. The scanning process for the Codec avatars involves building a computer model of each person's face and body, including the different expressions they make, and collapsing that into a codec. Instead of video, the codec sends an encoded version of what the person is supposed to look like over the wire, which is much more bandwidth efficient than transmitting a full video or a 3D immersive video of a scene.
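The idea of sending a compact encoding instead of video can be sketched with a toy example. This is only an illustration under stated assumptions, not Meta's method: Codec avatars rely on a learned model, whereas the sketch below uses a simple linear blend of expression offsets. The mesh values, the names `NEUTRAL`, `BASIS`, `encode`, and `decode`, and the two expressions are all hypothetical.

```python
import struct

# Hypothetical shared face model, known to both sender and receiver.
# Real Codec avatars use a learned model; this linear blend is a stand-in.
NEUTRAL = [0.0, 0.0, 1.0, 0.0, 0.5, 1.0]  # 3 vertices x 2 coordinates, flattened
BASIS = {
    "smile": [0.0, 0.1, 0.0, 0.1, 0.0, 0.0],
    "brow_raise": [0.0, 0.0, 0.0, 0.0, 0.0, 0.2],
}
BASIS_ORDER = ["smile", "brow_raise"]  # fixed order so both sides agree


def encode(weights):
    """Pack the expression weights into bytes; this is all that is transmitted."""
    values = [weights[name] for name in BASIS_ORDER]
    return struct.pack(f"<{len(BASIS_ORDER)}f", *values)


def decode(payload):
    """Rebuild the full mesh on the receiver from the shared model and the weights."""
    values = struct.unpack(f"<{len(BASIS_ORDER)}f", payload)
    mesh = list(NEUTRAL)
    for name, weight in zip(BASIS_ORDER, values):
        for i, delta in enumerate(BASIS[name]):
            mesh[i] += weight * delta
    return mesh


payload = encode({"smile": 1.0, "brow_raise": 0.5})
mesh = decode(payload)

# Size of sending every mesh coordinate directly, for comparison.
raw_size = len(struct.pack(f"<{len(mesh)}f", *mesh))
print(len(payload), "bytes sent vs", raw_size, "bytes for the raw mesh")
```

In this toy the saving is small because the mesh has only three vertices. The point is that the payload grows with the number of expression parameters, not with the size of the face model that both sides already hold.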

The realism of the avatars captures all the subtleties of the human face, including flaws, variations in color, wrinkles, and asymmetry. This attention to detail creates a more immersive experience, allowing for a sense of presence as if you're there together, no matter where you are in the world.

The scanning process for the Codec avatars is currently a lengthy procedure, but the goal is to create a very quick scan with a cell phone that produces the same quality as the current scans. The ability to efficiently produce these scans is one of the last pieces that need to be nailed down.

The vision for the future of the metaverse is to create experiences where people can physically be together, participate in activities, and have meetings in a photorealistic form. Mixed reality and augmented reality will allow for codec avatars to be superimposed on the physical environment, creating a powerful tool for communication and collaboration.

Photorealistic Avatars

The Metaverse is a virtual space where people can interact with each other through photorealistic avatars. These avatars capture many of the nuances of facial expressions that humans use to communicate emotions to each other. This technology is incredible and it is believed to be the future of how human beings connect to each other in a deeply meaningful way on the internet.

Codec avatars are photorealistic 3D avatars paired with spatial audio. They are designed to capture many of the subtleties of facial expressions that humans use to communicate emotions to each other. The avatars are created by scanning individuals in a lot of different expressions and building a computer model of their faces and bodies.

The Codec avatars are much more bandwidth efficient than transmitting a full video or 3D immersive video of a whole scene. They capture everything, including the flaws and subtleties of the human face, such as freckles, variations in color, wrinkles, asymmetry, and the corners of the eyes.

The goal is to make this technology more accessible to people. Currently, the scanning procedure is a lengthy process that requires hours of sitting. However, the goal is to do a very quick scan with a cell phone that takes only a few minutes and produces a quality photorealistic avatar.

The vision for the future of photorealistic avatars is to have people physically participate in things together, such as playing games or having meetings in the Metaverse. This technology is expected to revolutionize how people connect with each other on the internet and create a sense of presence as if they are together in the same room.

Expression Capture Technology

Expression capture technology allows individuals to appear as photorealistic 3D avatars with spatial audio. It is part of a research project at Meta called Codec avatars. The idea behind the technology is to capture the nuances of facial expressions that humans use to communicate emotions to each other.

The Codec avatars are built by scanning individuals in a lot of different expressions and building a computer model of their faces and bodies. The different expressions that individuals make are collapsed into a codec. When the headset is worn, the codec sends an encoded version of what the person is supposed to look like over the wire. The avatars are photorealistic and capture everything from the flaws to the subtleties of the human face, including freckles, variations in color, wrinkles, asymmetry, and the corners of the eyes.

The expressive Avatar system is designed to capture the subtleties of facial expressions and is much more bandwidth efficient than transmitting a full video or a 3D immersive video of a whole scene. The technology captures the nuances of facial expressions that we use to communicate emotions to each other, making the experience more immersive.

The technology captures the core vision of virtual and augmented reality, which is to deliver a sense of presence as if individuals are together in the same room, no matter where they are in the world. The Codec avatars are the embodiment of this vision, enabling individuals to have conversations with loved ones that feel like they are in the same room.

The current process of creating Codec avatars is lengthy and requires individuals to undergo a scanning procedure that involves a lot of incredible technology, software, and hardware. However, the Meta team is working on making the technology more accessible to people by doing a quick scan with their cell phones. The goal is to produce an avatar of the same quality as the current ones in just two to three minutes.

In the future, the technology will be used for more than just video calls. Individuals will be able to take part in activities and games together and hold meetings as if they were physically present. With mixed reality and augmented reality, codec avatars could be superimposed on the physical environment, so that some participants are physically in the room while others appear in photorealistic form. The technology is still being developed, and the Meta team is working on building applications and use cases around it.

The Magic of Presence

Mark Zuckerberg and Lex Fridman recently had a conversation inside the metaverse, where they appeared to each other as photorealistic Codec avatars in 3D with spatial audio. This technology is believed to be the future of how human beings connect to each other in a deeply meaningful way on the internet.

These avatars can capture many of the nuances of facial expressions that humans use to communicate emotion to each other. The photorealistic avatars are created by scanning individuals in a lot of different expressions and building a computer model of their faces and bodies. This computer model is then collapsed into a codec that sends an encoded version of the person's appearance over the wire, making it much more bandwidth efficient than transmitting a full video or a 3D immersive video of a whole scene.

The photorealistic avatars capture everything, including the subtleties of the human face like freckles, variations in color, wrinkles, asymmetry, and the different corners of the eyes. Eyes are a huge part of it, and most of the communication even when people are speaking is not actually the words that they're saying but the expression.

The goal is to make these photorealistic avatars accessible to everyone. Currently, a small number of people are doing these very detailed scans, but the vision is to do a very quick scan with a cell phone where the whole process just takes two to three minutes. The challenge is to produce something that's of the quality of what they have right now.

The photorealistic avatars are a good embodiment of the core vision around virtual and augmented reality, which is to deliver a sense of presence as if you're there together no matter where you actually are in the world. It's not just having this be like a video call, but doing stuff where you're physically there together and participating in things together. The possibilities are endless, from playing games to having meetings in the future. Once mixed reality and augmented reality are added, people could have codec avatars like this and go into a meeting and have some people physically there and have some people show up in this photorealistic form superimposed on the physical environment.

The photorealistic avatars are truly incredible and make people feel like they're in the same room. It's a fundamentally new experience that could change everything, especially for having conversations with loved ones. The photorealistic avatars are a glimpse into how incredible technology can be, and it's going to be a pretty wild next few years around this.

Emotional Impact

The use of photorealistic Codec avatars in 3D with spatial audio in the Metaverse has an emotional impact on the users. The technology captures many of the nuances of facial expressions that humans use to communicate emotion to each other. The avatars are able to capture even the subtle flaws of the human face, such as freckles, wrinkles, asymmetry, and variations in color, which makes the experience more immersive. The realism of the avatars makes it feel like the users are in the same room, even if they are hundreds of miles apart from each other in physical space.

The emotional impact of the technology is evident in the conversation between Mark Zuckerberg and Lex Fridman. Fridman describes the experience as "truly incredible" and "magical", and admits to almost getting emotional. The sense of presence that the technology delivers is a fundamentally new experience, which could change the way people have conversations with loved ones.

The future of the technology is promising, as the goal is to make the scanning process more accessible and efficient. The ability to do a quick scan with a cell phone and produce a quality avatar is one of the challenges that needs to be overcome. Once this is achieved, the technology can be used for more than just video calls. Users can participate in activities together, such as playing games or attending meetings, as if they were physically present. This sense of presence is expected to have a significant effect on the way humans connect with each other on the internet.

YouTube: Mark Zuckerberg: First Interview in the Metaverse | Lex Fridman Podcast #398

Scanning Procedure

In a recent episode of the Lex Fridman podcast, Mark Zuckerberg discussed the scanning procedure used for creating the photorealistic Codec avatars in the metaverse. The scanning process involves capturing many different expressions and building a computer model of the face and body. This model is then collapsed into a codec that sends an encoded version of the person's appearance over the internet.

The scanning process is currently a lengthy procedure that requires hours of sitting still. However, the goal is to create a quick scan that can be done with a cell phone in just a few minutes. The challenge is to reduce the amount of data collected during the scan while still maintaining the quality of the avatar.

The photorealistic avatars capture the nuances of facial expressions and make communication more immersive. The realism of the avatars is achieved by capturing the subtle flaws and variations in the human face, such as freckles, wrinkles, and asymmetry. The avatars also capture the nuances of the eyes, which are a significant part of communication.

The ultimate goal of the technology behind the scans is to create a sense of presence as if the participants are in the same room together, no matter where they are in the world. This has the potential to change the way people connect with each other on the internet. Because only an encoded version of each person is sent, the avatars are also much more bandwidth-efficient than transmitting a full video or a 3D immersive video of a scene.

Beyond video-call-style conversations, the Codec avatars can be used for participating in activities together, such as playing games or attending meetings. The next steps are to make the scans more efficient and to build applications and use cases around the technology.

Vision for the Future

Mark Zuckerberg and Lex Fridman recently had a conversation inside the metaverse, where they appeared to each other as photorealistic Codec avatars in 3D with spatial audio. This technology is believed to be the future of how human beings connect to each other in a deeply meaningful way on the internet. These avatars can capture many of the nuances of facial expressions that humans use to communicate emotion to each other.

According to Zuckerberg, they both did scans for a research project at Meta called Codec avatars. The idea is that instead of transmitting a video, they have scanned themselves in a lot of different expressions and built a computer model of each of their faces and bodies and the different expressions that they make and collapsed that into a codec. This codec can then send an encoded version of what they are supposed to look like over the wire.

In addition to being photorealistic, this technology is also much more bandwidth efficient than transmitting a full video or especially a 3D immersive video of a whole scene like this. It captures everything, including the flaws and subtleties of the human face.
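The bandwidth argument can be made concrete with some back-of-the-envelope arithmetic. The numbers below are assumptions chosen purely for illustration and do not come from the interview or from Meta: a hypothetical 256-value expression code sent 30 times per second, compared with uncompressed 1080p video at the same frame rate. Real video calls use compressed video, so the actual gap would be smaller than this sketch suggests.

```python
def bitrate_bps(values_per_frame, bytes_per_value, fps):
    """Bits per second for a stream that sends a fixed number of values per frame."""
    return values_per_frame * bytes_per_value * 8 * fps


# Assumed: a 256-value float32 expression code, sent 30 times per second.
code_bps = bitrate_bps(256, 4, 30)

# Assumed: uncompressed 1920x1080 RGB video, one byte per channel, 30 fps.
raw_bps = bitrate_bps(1920 * 1080 * 3, 1, 30)

ratio = raw_bps / code_bps
print(f"encoded stream: {code_bps} bit/s")
print(f"raw video:      {raw_bps} bit/s")
print(f"ratio:          {ratio:.0f}x")
```

The sketch only shows the shape of the comparison: the encoded stream scales with the size of the expression code, while video scales with the number of pixels in the scene.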

The vision for the future is to make this technology more accessible to people. Currently, the scanning procedure is a lengthy process, but the goal is to do a very quick scan with a cell phone where one can wave it in front of their face for a couple of minutes, say a few sentences, and make a bunch of expressions. The whole process should be two to three minutes long and produce something of the quality of what they have right now.

The next step is to build out all the applications and use cases around this technology. The goal is to have codec avatars like this and go into a meeting and have some people physically there and have some people show up in this photorealistic form superimposed on the physical environment. This technology will be useful for playing games, having meetings, and participating in activities together. The future looks bright for this technology, and it will be interesting to see how it develops in the coming years.

Challenges and Future Development

The technology used to create photorealistic Codec avatars in 3D with spatial audio is incredible and has the potential to revolutionize how human beings connect with each other on the internet. However, there are still some challenges that need to be addressed to make this technology more accessible and efficient.

One of the major challenges is the lengthy process of scanning individuals to create such avatars. The current process requires hours of scanning and collecting expressions, which is not feasible for most people. To make this technology more accessible, the goal is to develop a quick and efficient scanning process that can be done with a cell phone. The process should take only a few minutes and produce avatars of the same quality as those created through the current scanning process.

Another challenge is to develop more use cases and applications for this technology. While the current use case is for video calls, the potential for physical interactions and participation in activities is immense. The future of this technology lies in mixed reality and augmented reality, where people can physically interact with each other and participate in activities together. The development of applications and use cases for this technology is crucial to its success.

Overall, the development of photorealistic Codec avatars in 3D with spatial audio is a major breakthrough in the field of virtual and augmented reality. While there are still challenges to be addressed, the potential for this technology is immense, and it has the power to transform how people connect with each other on the internet.

Potential Applications

The photorealistic 3D Codec avatars with spatial audio that Mark Zuckerberg and Lex Fridman used in the Metaverse have the potential to revolutionize how human beings connect with each other on the internet. The avatars can capture many of the nuances of facial expressions that humans use to communicate emotion to each other, making interactions more meaningful.

The technology captures everything, including the subtleties of the human face, such as freckles, variations in color, wrinkles, asymmetry, and the corners of the eyes. It is also much more bandwidth efficient than transmitting a full video or 3D immersive video of a whole scene.

The vision for virtual and augmented reality is to deliver a sense of presence as if people are there together, no matter where they actually are in the world. This technology is a good embodiment of that vision, where people can communicate as if they are sitting in the same room, even if they are hundreds of miles apart.

In the future, the technology could be used for more than just video calls. People could participate in activities together, such as playing games or attending meetings. With mixed reality and augmented reality, codec avatars could be superimposed on the physical environment, so that some people are physically present in a meeting while others appear in photorealistic form.

However, the challenge in making this technology more accessible to people is to develop a more efficient scanning process. Currently, the scanning process is lengthy and requires detailed scans of expressions. The goal is to produce photorealistic avatars using a quick scan with a cell phone that takes only two to three minutes.

In conclusion, the potential applications of photorealistic 3D Codec avatars with spatial audio are vast and could change the way people interact with each other on the internet.

Conclusion

The Lex Fridman podcast recently featured an interview with Mark Zuckerberg inside the metaverse. The two were hundreds of miles apart in physical space, but thanks to photorealistic Codec avatars in 3D with spatial audio, they appeared to each other as if they were in the same room. This technology is believed to be the future of how human beings connect to each other in a deeply meaningful way on the internet.

The Codec avatars can capture many of the nuances of facial expressions that humans use to communicate emotion to each other. The avatars are created by scanning individuals in a lot of different expressions and building a computer model of their faces and bodies. The different expressions are then collapsed into a codec, which sends an encoded version of what the individual is supposed to look like over the wire. This technology is not only photorealistic, but also more bandwidth efficient than transmitting a full video or 3D immersive video of a whole scene.

The photorealistic experience is an embodiment of the core vision around virtual and augmented reality, which is to deliver a sense of presence as if people are there together no matter where they actually are in the world. The goal is to make this technology accessible to everyone, and there is a project at Meta working on a very quick scan with a cell phone that would take only two to three minutes. The production of these scans in a very efficient way is one of the last pieces that need to be nailed down.

The vision for this technology is not just to have it be like a video call, but to have people physically there together and participating in things together. The possibilities include playing games, having meetings, and having codec avatars superimposed on the physical environment. The next few years are expected to be wild around this technology, and the potential is incredible.