How to Scale Content Creation with AI Agents and NVIDIA’s Ecosystem


Written by
Monks


As businesses scramble to keep up with the pressure to deliver innovative content at scale, traditional production methods are fading, giving way to the convergence of AI, digital twins and open standards like OpenUSD. These tools are accelerating workflows, enhancing precision and enabling scalability like never before. But what does it take to harness these advances in a practical, business-ready way?

As part of NVIDIA’s OpenUSD Insiders livestream series, our SVP of Innovation, Susan Foley; our VP, Global Head of Technology, Peter Altamirano; and our VP, Computational Creativity & Innovation, Emrah Gonulkirmaz, dove into AI-driven marketing, content creation and the future of agentic workflows. In conversation with NVIDIA’s Jamie Allan, Director of AdTech & Digital Marketing Industries, and host Edmar Mendizabal, they explored practical use cases for digital twins and NVIDIA Omniverse—complete with hands-on tips, real-world client examples and advice for organizations eager to embrace these innovations.

If you missed it, you can watch the full session below or keep reading for the key takeaways.

Digital twins are the new foundation for creative scale.

Unlocking business value today means taking control of your creative assets and processes. Digital twins—the hyper-accurate virtual models of products, characters and spaces—are quickly becoming the bedrock of that approach. Allan set the stage for the session by explaining, “A big part of what we’ve been doing is [figuring out] how to evolve the content supply chain for marketing content and ads. A lot of that is founded in creating digital twins of products, whether it’s a car or a shampoo bottle. That’s where the power of OpenUSD comes into play.”

Digital twins are built using open standards like OpenUSD—an open-source framework and file format for describing, composing and interchanging 3D scenes and assets—and applications that are developed with platforms like Omniverse. They serve as the single source of truth for everything from product imagery to complex industrial simulations, allowing businesses to rapidly iterate, test changes virtually and deliver new products or updated assets in a fraction of the traditional time. As Altamirano said, “You can optimize layout, workflows, asset creation and test and simulate your processes far faster than in the real world—no matter if you’re updating a retail shelf, visualizing packaging or piloting new robotics workflows.”
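
To make the OpenUSD idea concrete, here is a minimal sketch using the open-source pxr Python API (the usd-core package). It composes a small product stage with a referenced geometry file and a packaging variant set; the file names and variant names are hypothetical, not assets from any client project.

```python
from pxr import Usd, UsdGeom

# Create a stage that acts as the product's single source of truth.
stage = Usd.Stage.CreateNew("product_twin.usda")
UsdGeom.SetStageUpAxis(stage, UsdGeom.Tokens.y)

# Root prim for the product.
root = UsdGeom.Xform.Define(stage, "/Product")
stage.SetDefaultPrim(root.GetPrim())

# Reference in detailed geometry authored elsewhere (hypothetical file name).
geo = stage.DefinePrim("/Product/Geometry")
geo.GetReferences().AddReference("./assets/product_geo.usd")

# Layer packaging variants on top without touching the source asset.
variants = root.GetPrim().GetVariantSets().AddVariantSet("packaging")
for name in ("retail", "limited_edition"):
    variants.AddVariant(name)
variants.SetVariantSelection("retail")

stage.GetRootLayer().Save()
```

Because downstream tools read the same composed stage, a packaging or layout change made here propagates to every render, simulation or AR experience that references it.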

Monk Thoughts Precise digital twins accelerate decision-making, cut time-to-market and create a space for experimentation.

That said, what usually holds organizations back isn’t a lack of understanding of the benefits of digital twins, but simply not knowing where to start. Foley advised, “Start small, scale up as you prove value, and we’ll help you migrate services to compute so you can build your own moat with owned intelligence.” Pro tip: Don’t let complexity slow you down. Today’s SDKs, libraries, templates and demo projects make it easier than ever to get started quickly with no need to build everything from scratch.

Modular, agentic workflows mean AI is now your creative partner.

The future of creative production isn’t simply generating more and more images and text with AI. It’s about orchestrating a system where specialized AI agents collaborate across the full pipeline. A standout example is our experimental AI-generated campaign for PUMA, where every stage—from initial script and storyboard to animation and editing—was orchestrated by AI agents using Monks.Flow, our professional managed service powered by AI. 

Thanks to NVIDIA NIM microservices and node-based orchestration enabled by Monks.Flow’s Pathways framework, AI agents can swap out models or creative roles as needed. Pathways uses self-learning AI to autonomously manage, optimize and adjust workflows in real time. For example, we can switch from one generative model for texturing to another for background imagery without disrupting the flow. 
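
Pathways itself is proprietary, so the snippet below is only a generic illustration of the node-based pattern being described: each pipeline step wraps an interchangeable model, so one node's backend can be swapped without rewiring the rest of the workflow. All names and model stubs are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Node:
    name: str
    model: Callable[[dict], dict]   # any callable mapping pipeline context to outputs

class Pipeline:
    def __init__(self, nodes: Dict[str, Node]):
        self.nodes = nodes          # insertion order defines execution order

    def swap_model(self, node_name: str, new_model: Callable[[dict], dict]) -> None:
        # Swapping a backend touches exactly one node; downstream nodes are untouched.
        self.nodes[node_name].model = new_model

    def run(self, context: dict) -> dict:
        for node in self.nodes.values():
            context.update(node.model(context))
        return context

# Example: replace the texturing model while the background node keeps working.
pipeline = Pipeline({
    "texturing": Node("texturing", lambda ctx: {"texture": f"tex_v1({ctx['asset']})"}),
    "background": Node("background", lambda ctx: {"plate": f"bg({ctx['texture']})"}),
})
pipeline.swap_model("texturing", lambda ctx: {"texture": f"tex_v2({ctx['asset']})"})
print(pipeline.run({"asset": "sneaker_twin.usd"}))
```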

Crucially, this entire process was anchored to a high-fidelity digital twin of the PUMA product, built using NVIDIA Omniverse libraries. “We started by importing a precise 3D model of the sneaker created in the OpenUSD format into NVIDIA Omniverse USD Composer,” explained Altamirano. This virtual product served as the foundational source of correctness for every subsequent creative step. 

The process didn’t stop there: synthetic data generated from the USD-based model was used to train and guide AI agents tasked with upholding the correct cinematic style throughout the film. That way, the team could control vital aspects required to meet brand guidelines, such as camera angles, lens choices and product accuracy.
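
As a rough illustration of how a USD-based twin can drive synthetic data, the sketch below (again with the pxr API, reusing the hypothetical stage from the earlier example) authors a ring of cameras with a fixed focal length around the product; an offline renderer could then produce consistent views for training or checking the style agents.

```python
import math
from pxr import Usd, UsdGeom

stage = Usd.Stage.Open("product_twin.usda")   # hypothetical stage from the sketch above

for i in range(8):
    angle = i * 360.0 / 8
    cam = UsdGeom.Camera.Define(stage, f"/Cameras/angle_{i:02d}")
    cam.CreateFocalLengthAttr(50.0)            # pin the lens choice for brand consistency
    xform = UsdGeom.XformCommonAPI(cam.GetPrim())
    xform.SetTranslate((2.0 * math.cos(math.radians(angle)), 0.3,
                        2.0 * math.sin(math.radians(angle))))
    xform.SetRotate((0.0, 90.0 - angle, 0.0))  # aim each camera back toward the product

stage.GetRootLayer().Save()
```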

As Gonulkirmaz put it, “After bringing the product to the previsualization stage in Omniverse, we gained full control over the generation process.” Then, using Pathways, a network of specialized AI agents orchestrated animation, editing and scene composition, maintaining brand consistency throughout.

Monk Thoughts The future of AI won’t fit into containers of the past. Workflows need to be modular, interoperable, and ready to scale.

How to get started and scale fast.

Whether you’re a marketer, a manufacturer looking to modernize, or a developer curious about AI-driven workflows, the new ecosystem emphasizes accessibility. “Get your first Omniverse setup done and you’ll see how reusable and scalable it really is,” said Altamirano. “Start with a prototype, prove the value, then expand. This approach works for retail, hospitality, even real estate.”

For organizations not sure where to begin, the advice is clear:

  • Start small and show early results. Quick wins build stakeholder buy-in and reveal practical value.
  • Invest in training not just for engineers, but for your whole team. NVIDIA Omniverse Blueprints, free Deep Learning Institute courses, and a vibrant developer community enable rapid learning and onboarding.
  • Embrace open, modular platforms. This lets you change direction, upgrade AI models, and keep workflows on the cutting edge without locking yourself into monolithic systems.

Ultimately, modern creative innovation isn’t about one-off experiments; it’s about embedding intelligence, agility and modularity into the heart of your business. Digital twins anchor accuracy and scale. Agentic AI workflows make creativity collaborative and customizable. The path to scalable, AI-driven content creation has never been clearer. 


CUBE: Fashion Takes Shape • Driving Art Installations with Data

  • Client

    Google

  • Solutions

    Experience, Experiential Strategy & Production, AI & Emerging Technology Consulting, Data


Visualizing fashion brands’ digital footprint and face value.

In this hypercompetitive and hyperconnected world, brands face the daily challenge of how to stand out from the crowd and remain meaningful and memorable to consumers. The secret lies in knowing what sticks with your audiences—a tricky task, as audience perception isn't always obvious to brands. So, we teamed up with Google to solve this issue by creating the first-ever AI-powered interactive tool that provides a visual representation of a brand’s digital presence: CUBE.


Transforming complex data into key insights for marketers.

Google’s goal was to help brands in the fashion industry use their data to understand how they’re perceived by the outside world. Together with data artist Dr. Kirell Benzi, we used the latest machine learning techniques in natural language processing to create CUBE, which is both a physical art installation and an online platform. Connecting fashion with art and technology, we used state-of-the-art AI to translate massive volumes of data into seven prime topics for the fashion world with the aim to deliver accessible and meaningful insights for marketers.
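
CUBE's actual models aren't public, so the sketch below is only a hedged illustration of the general technique: clustering a corpus into seven topics with TF-IDF and non-negative matrix factorization in scikit-learn. The documents are placeholders standing in for the large volumes of brand-related text.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF

# Placeholder corpus standing in for brand-related text at scale.
docs = [
    "sustainable materials and recycled fabrics in the new collection",
    "runway show streetwear sneakers and bold colorways",
    "heritage craftsmanship leather goods and atelier tailoring",
    "influencer collaboration drops and limited edition capsules",
    "e-commerce returns sizing and customer service experience",
    "red carpet couture gowns and celebrity styling",
    "second-hand resale circular fashion and repair services",
]

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(docs)

# Factorize into seven latent topics, mirroring the "seven prime topics" idea.
nmf = NMF(n_components=7, random_state=0, max_iter=500)
nmf.fit(tfidf)

terms = vectorizer.get_feature_names_out()
for topic_idx, component in enumerate(nmf.components_):
    top_terms = [terms[i] for i in component.argsort()[-3:][::-1]]
    print(f"Topic {topic_idx + 1}: {', '.join(top_terms)}")
```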

Presenting the fashion industry with a striking AI-powered tool.

During Google’s hybrid fashion event in Milan, we demonstrated the impact of the 200m² CUBE art installation to 300 C-level executives from across the globe, as we invited them to interact with the artwork and discover how consumers perceive their unique brand based on its online presence. Shining a bright spotlight on brand storytelling, the purpose was to show fashion brands how they can use data, digital media and Google AI tools to understand what consumers think of them and ultimately communicate better with their target audiences.

Our Craft

Bringing brand storytelling to the next level


Results

  • 500K fashion professionals reached online
  • 1,257 brands from 23 countries
  • 50K interactions with the artwork
  • 10K research downloads
  • 94% of guests recognize Google as the top tech company
  • +11 points in Google perception as strategic partner
  • 1x FWA
  • 1x Eventex Awards


Unity Appoints Media.Monks Media Agency of Record


Written by
Monks


September 8, 2022

Unity (NYSE: U), the world’s leading platform for creating and operating interactive, real-time 3D (RT3D) content, has selected Media.Monks as its media agency of record following a competitive RFP process. The move will unify top of funnel awareness by consolidating media services under one roof at Media.Monks, which were previously split among various agencies.

Media.Monks will take on media strategy, planning and buying, and measurement for Unity globally. With subject matter expertise in gaming, VR, Web3 and the metaverse, Media.Monks’ integrated team will scale up media to engage Unity’s core gaming business and its B2B audience.

“Media.Monks is the right fit for our business given our shared expertise and belief in how RT3D, the metaverse and the next phase of the internet are changing not only gaming but many other industries,” said Carol Carpenter, CMO, Unity. “We are excited to partner with them to unify our media efforts globally, and work together to deliver unique solutions for customers.”

Monk Thoughts We’re so excited to partner up in a deeper way with such a similarly minded, cutting-edge company. As avid fans of Unity, we’re looking forward to helping them chart their next path toward growth as they tackle new verticals and push the boundaries of this technology.
Melissa Wisehart

In addition to the media AOR assignment, the Media.Monks creative development teams use Unity software to deliver real-time 3D solutions for clients across a wide range of industries. Recently, Unity technology powered Media.Monks’ development of an award-winning AR experience, ‘Anne Frank House: The Bookcase for Tolerance,’ honored at the Cannes Lions Festival of Creativity in the Digital Craft category as well as at The Webby Awards, The One Show, ADC Global and D&AD.

Monk Thoughts Real-time 3D is now a foundational part of our digital toolset. We’re using real-time 3D technology on countless projects across a wide range of verticals––it’s our go-to for creating interactive experiences, new ad formats, and yes, the metaverse.
Tim Dillon

Learn more about the work Unity and Media.Monks are doing to build successful B2C brands in the metaverse by tuning in to an on-demand discussion between Unity’s VP of Accelerate Solutions, Ryan Peterson, and Media.Monks' SVP, Tim Dillon. Tim discusses insider lessons and insights gained from working with major consumer brands, from getting started in the metaverse to leveraging a real-time 3D game engine to make a genuine impact, and more. Listen now.

This review was led by Tenx4, an agency search consultancy that specializes in helping global B2B brands identify the right agency partner. “We’re on a mission to fix the broken agency RFP process to be about ‘the fit’ rather than ‘the win,’ and it is clear that the partnership between Unity and Media.Monks is the perfect fit,” said Ashley Cohen Chandler, Partner, Tenx4.


Scrap the Manual: Virtualization of Real World Objects into Live Shows


Written by
Labs.Monks


What if you could scan an object in your environment and bring it into a live show? In this “How Do We Do this” episode of the Scrap the Manual podcast, we respond to an audience-provided question from one of you! Tune in to learn more about the differences between 3D scanning and photogrammetry, content moderation woes, and what it could take to make this idea real.

You can read the discussion below, or listen to the episode on your preferred podcast platform.


Angelica: Hey everyone! Welcome to Scrap The Manual, a podcast where we prompt “aha” moments through discussions of technology, creativity, experimentation, and how all those work together to address cultural and business challenges. My name is Angelica.

Rushali: And my name is Rushali. We are both creative technologists with Labs.Monks, which is an innovation group within Media.Monks with a goal to steer and drive global solutions focused on technology and design evolution.

Angelica: Today, we have a new segment called “How Do We Do This?” where we give our audience a sneak peek into everyday life at Labs and open up the opportunity for our listeners to submit ideas or projects. And we figure out how we can make it real. We'll start by exploring the idea itself, the components that make it unique and how it could be further developed, followed by doing a feasibility check. And if it isn't currently feasible, how can we get it there? 

Which leads us to today's idea submitted by Maria Biryukova, who works with us in our Experiential department. So Maria, what idea do you have for us today?

Maria: So I was recently playing around with this app that allows you to scan in 3D any objects in your environment. And I thought: what if I could scan anything that I have around me—let's say this mic—upload it live and see it coming to life on the XR stage during the live show?

Angelica: Awesome, thanks Maria for submitting this idea. This is a really amazing creative challenge and there's a lot of really good elements here. What I like most about this idea, and is something that I am personally passionate about, is the blending of the physical world with the digital world, because that's where a lot of magic happens.

That's where, when AR first came out, people were like, “Whoa, what's this thing that's showing up right in front of me?” Or in VR, when they brought these scans of the real world into the very initial versions of Google Cardboard headsets, that was like the, “Whoa! I can't believe that this is here.”

So this one's touching upon the physicality of a real object that exists in the real world…someone being able to scan it and then bring it into a virtual scene. So there's this transcending of where those lines are, which are already pretty blurred to begin with. And this idea continues to blur them, but I think in a good way, and in a way that guests and those who are part of the experience go beyond being a passive observer to being an active participant.

We see this a little bit with WAVE, where they have a virtual space and people are able to go to this virtual concert and essentially pay like $5 to have a little heart get thrown to Justin Bieber's head. Lovingly, of course, but you get the point. Where this one, it takes it another step further in saying, “Okay, what if there's something in my environment?”

So maybe there is an object that maybe pertains to the show in a particular way. Let's say that it's like a teddy bear. And all the people who have teddy bears around the world can scan their teddy bear and put it into this environment. So they're like, “Oh, that's my teddy bear.” Similar to when people are on the jumbotron during sports events and they're like, “Hey, that's my face up there.” And then they go crazy with that. So it allows more of a two way interaction, which is really nice here. 

Rushali: Yeah. That's the part that seems interesting to me. As we grow into this world where user-generated content is extremely useful and we start walking into the world of the metaverse, scanning and getting 3D objects that personally belong to you—or a ceramic clay thing, or a pot that you made yourself—and being able to bring it into the virtual world is going to be one of the most important things. Because right now, in Instagram, TikTok, or with any of the other social platforms, we are mostly generating content that is 2D, or generating content that is textual, or generating audio, but we haven't explored extremely fast 3D content generation and exchange the way that we do with pictures and videos on Instagram. So we discussed the “why.” It's clearly an interesting topic, and it's clearly an interesting idea. Let's get into the “What.”

Angelica: Yeah. So from what we're hearing on this idea, we have scanning the object, which will connect to 3D scanning and photogrammetry, which we can get a little bit into the differences between the two different types of technologies. And then when the scan is actually added into the environment, is it cleaned up? Is it something where it acts as its own 3D model without any artifacts from the environment that it was originally scanned in? And a part of that is also the compositing. So making sure that the object doesn't look like a large ray of sunlight when the event is very moody and dark. It needs to fit within the scene that it's within.

And we're hearing content moderation, in terms of making sure that when guests of any kind become a little bit more immature than the occasion requires, that it filters out those situations to make sure that the object that needs to be scanned in the environment is a correct one.

Rushali: Absolutely. What was interesting while you were walking through all the different components was the way that this idea comes together: it's just so instantaneous and real time that we need to break down how to do this dynamically. 

Angelica: Yeah. And I think that's arguably the most challenging part, aside from the content moderation aspect of it. Let's use photogrammetry as an example. Photogrammetry is the process of taking multiple pictures of an object from as many sides as you can. An example of this is with Apple's Object Capture API. You just take a bunch of photos. It does its thing, it processes it, it thinks about it. And then after a certain amount of time (sometimes it's quick, sometimes it's not…depends on how high quality it needs to be), it'll output a 3D model that it has put together based on those photos.
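
Under the hood, photogrammetry pipelines repeat a matching-and-triangulation step like the heavily simplified two-view sketch below (using OpenCV); tools such as Object Capture run this across hundreds of photos and then build a dense, textured mesh. The image paths and the intrinsics matrix K are placeholders you would replace with real data.

```python
import cv2
import numpy as np

# Placeholder camera intrinsics (focal lengths and principal point in pixels).
K = np.array([[1200.0, 0.0, 960.0],
              [0.0, 1200.0, 540.0],
              [0.0, 0.0, 1.0]])

img1 = cv2.imread("shot_01.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("shot_02.jpg", cv2.IMREAD_GRAYSCALE)

# Detect and match features that appear in both photos.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(des1, des2)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Recover the relative camera pose, then triangulate matched points into 3D.
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
points_4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
points_3d = (points_4d[:3] / points_4d[3]).T   # sparse point cloud from two views only

print(f"Triangulated {len(points_3d)} sparse 3D points from two photos")
```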

Rushali: Yeah. So the thing that I wanted to add about photogrammetry, that you described very well, was that in the last just five years, photogrammetry has progressed from something very basic to something outrageously beautiful very quickly. And that one of the big reasons for that is how the depth sensing capability came in and became super accessible.

Imagine someone standing on a turntable and taking pictures from each and every angle; we turn the turntable really slowly and you end up with some 48,000 pictures to stitch together to create this 3D object. But a big missing piece in this puzzle is the idea of depth. Like, is this person's arm further away or closer? When that depth information comes in, the resulting 3D object suddenly becomes a lot more refined. So iPhones gaining depth-sensing cameras over the last year or two has really enhanced these capabilities.

Angelica: Yeah, that's a good point. There is an app that had been doing this time-intensive and custom process for a very long time. But then when Apple released the Object Capture API, they said, “Hey, actually, we're going to revamp our entire app using this API.” And they even say that it's a better experience for iPhone users because of the combination of the Object Capture API and leveraging the super enhanced cameras that are now coming out of just a phone.

Android users, you're not left out here. Some Samsung phones, like the Galaxy S20 and up, have a feature embedded right in the phone software where you can do the same process I was just mentioning earlier about a teddy bear.

There's a test online where someone has a teddy bear to the left of a room, they scan it and then they're able to do a point cloud of the area around it. So they could say, “Okay, this is the room. This is where the walls are. This is where the floor is.” And then paste that particular object that they just scanned into the other corner of the room and pin it or save it. So then if they leave the app, come back in, they can load it, and that virtual object is still in the same place because of the point cloud scanning their room, their physical room, and they put it right back where it was. So it's like you got a physical counterpart and a digital counterpart. It's more than just iPhones that are having these enhanced cameras. That experience is made possible because Samsung cameras are also getting better and better over time. 

The process I was just explaining about the teddy bear, the point cloud, and placing it into an environment…that is a great example of 3D scanning, where you can move around the object but it's not necessarily taking a bunch of photos and stitching them together to create a 3D model. The 3D scanning part is a little bit more dynamic. But it is quite light sensitive. So, for example, if you're in a very sunny area, it'll be harder to get a higher quality 3D model from that. So keeping in mind, the environment is really key there. Photogrammetry as well, but 3D scanning is especially sensitive to this.
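
The teddy-bear example boils down to registering an object scan against a room point cloud. A minimal sketch of that re-anchoring step, assuming the Open3D library and placeholder .ply files, might look like this:

```python
import open3d as o3d

# Placeholder files standing in for a room scan and an object scan from a phone.
room = o3d.io.read_point_cloud("room_scan.ply")
toy = o3d.io.read_point_cloud("teddy_bear_scan.ply")

# Downsample both clouds so registration stays fast on phone-sized scans.
room_ds = room.voxel_down_sample(voxel_size=0.01)
toy_ds = toy.voxel_down_sample(voxel_size=0.01)

# Refine the object's pose against the room with ICP; a coarse initial guess
# would normally come from the app's last known anchor.
result = o3d.pipelines.registration.registration_icp(
    toy_ds, room_ds, max_correspondence_distance=0.05,
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint())

toy.transform(result.transformation)   # place the object back where it was scanned
print("Registration fitness:", result.fitness)
```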

Rushali: So children, no overexposure kindly… 

Angelica: …of photos. Yes. [laughter]

Though, that scanning process can take some time and it can vary in terms of fidelity. And then also that 3D model may be pretty hefty. It may be a pretty large file size. So then that's when we're getting into the conversation of having this be uploaded to the cloud and offload some of that storage there. Not beefing up the user's phone, but it goes somewhere else and then the show could actually funnel that into the experience, after the content moderation part of course. 

Rushali: You've brought up a great point over here as well, because a big, big, big chunk of this is also fast internet at the end of the day because 3D files are heavy files. They are files that have a lot of information about the textures.

The more polygons there are, the heavier the files. All of that dramatic graphics part comes into play and you are going to get stuck if you do not have extremely fast internet. And I *wink, wink* think 5G is involved in this situation. 

Angelica: Yeah, for sure. 5G is definitely a point of contention in the United States right now, because they're converting to that process, which is affecting aviation and the FAA and other things like that. So it's like, yeah, the possibilities with 5G are huge, but there's some things to work out still.

Rushali: So that's the lay of the land of 3D scanning and photogrammetry. And we do have apps right now that in almost real time can give you a 3D scan of an object in their app. But the next part is the integration of this particular feature with a live show or a virtual ecosystem or putting it into a metaverse. What do you think that's going to look like?

Angelica: This will involve a few different components. One: being the storage onto the cloud, or a server of some kind that can store, not just one person’s scan, but multiple people's scans. And I could easily see an overload situation where if you say to an audience of Beliebers, “Hey, I want you to scan something.”

They're like, “Okay!” And you got 20,000 scans that now you dynamically have to sift through and have those uploaded into the cloud to then be able to put into the experience. I can anticipate quite an overload there.

Rushali: You're absolutely on point. You're in a concert: 20,000 to 50,000 people are in the audience. And they are all scanning something that they have either already scanned or will be scanning live. You better have a bunch of servers there to process all of this data that's getting thrown your way. Imagine doing this activity, scanning an object and pulling it up in a live show. I can 100% imagine someone's going to scan something inappropriate. And since this is real time, it's gonna get broadcasted on a live show. Which brings into the picture the idea of curation and the idea of moderation.
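
The scaling worry here is mostly a back-pressure problem: thousands of uploads hitting a fixed pool of processing workers. A toy sketch of that buffering pattern, with illustrative function names rather than any real service API:

```python
import asyncio

async def process_scan(scan_id: str) -> None:
    # Stand-in for download, cleanup, quality control and moderation.
    await asyncio.sleep(0.01)

async def worker(queue: asyncio.Queue) -> None:
    while True:
        scan_id = await queue.get()
        try:
            await process_scan(scan_id)
        finally:
            queue.task_done()

async def main(num_scans: int = 500, num_workers: int = 8) -> None:
    queue: asyncio.Queue = asyncio.Queue(maxsize=100)   # back-pressure limit
    workers = [asyncio.create_task(worker(queue)) for _ in range(num_workers)]
    for i in range(num_scans):
        await queue.put(f"audience-{i:05d}")            # blocks when the queue is full
    await queue.join()                                  # wait until every scan is handled
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    print(f"processed {num_scans} scans with {num_workers} workers")

asyncio.run(main())
```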

Angelica: Because adults can be children too.

Rushali: Yeah, absolutely. If there's no moderation…turns out there's a big [adult product] in the middle of your concert now. And what are you going to do about it? 

Angelica: Yeah exactly. Okay, so we've talked about how there are a lot of different platforms out there that allow for 3D scanning or the photogrammetry aspect of scanning an object and creating a virtual version of it, along with a few other considerations as well.

Now we get into…how in the world do we do this? This is where we explore ways that we can bring the idea to life, tech that we can dive a bit deeper into, and then just some things to consider moving forward. One thing that comes up immediately (we've been talking a lot about scanning) is how do they scan it? There's a lot of applications that are open source that allow a custom app to enable the object capture aspect of it. We talked about Apple, but there's also a little bit that has been implemented within ARCore, and this is brought to life with the LiDAR cameras. It's something that would require a lot of custom work to be able to make it from scratch. We would have to rely on some open source APIs to at least get us the infrastructure, so that way we could save a lot of time and make sure that the app that's created is done within a short period of time. Because that's what tends to happen with a lot of these cool ideas is people say, “I want this really awesome idea, but in like three months, or I want this awesome idea yesterday.” 

Rushali: I do want to point out that a lot of these technologies have come in within the last few years. If you had to do this idea just five years ago, you would probably not have access to object capture APIs, which are extremely advanced right now because they can leverage the capacity of the cameras and the depth sensing. So doing this in today's day and age is actually much more doable, surprisingly.

And if I had to think about how to do this, the first half of it is basically replicating an app like Qlone. And what it's doing is, it's using one of the object capture APIs, but also leveraging certain depth sensing libraries and creating that 3D object. 

The other part of this system would then be: now that I have this object, I need to put it into an environment. And that is the bigger unknown. Are we creating our own environment or is this getting integrated into a platform like Roblox or Decentraland? Like what is the ecosystem we are living within? That needs to be defined.

Angelica: Right, because each of those platforms have their own affordances to be able to even allow for this way of sourcing those 3D models dynamically and live. The easy answer, and I say “easy” with the lightest grain of salt, is to do it custom because there's more that you can control within that environment versus having to work within a platform that has its own set of rules.

We learned this for ourselves during the Roblox prototype for the metaverse, where there are certain things that we wanted to include for features, but based on the restrictions of the platform, we could only do so much.

So that would be a really key factor in determining: are we using a pre-existing platform or creating a bespoke environment that we can control a lot more of those factors?

Rushali: Yeah. And while you were talking about the ecosystem parts of things, it sort of hit me. We're talking about 3D scanning objects, like, on the fly as quickly as possible. And they may not come out beautifully. They may not be accurate. People might not have the best lighting. People might not have the steadiest hands because you do need steady hands when 3D scanning objects. And another aspect that I think I would bring in over here when it comes to how to do this is pulling in a little bit of machine learning so that we can predict which parts of the 3D scan have been scanned correctly versus scanned incorrectly to improve the quality of the 3D scan.

So in my head, this is a multi-step process: figuring out how to object capture and get that information through the APIs available by ARCore or ARKit (whichever ones), bring the object and run it through a machine learning algorithm to see if it’s the best quality, and then bring it into the ecosystem. Not to complicate it, but I feel like this is the sort of thing where machine learning can be used. 
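
As a purely illustrative sketch of this quality-check idea, the snippet below scores patches of a scan with a small classifier trained on hand-labeled examples; the features and labels are synthetic placeholders, not a real dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Describe each patch of a scan with simple features, e.g.
# [point density, normal consistency, hole ratio]; here they are random stand-ins.
rng = np.random.default_rng(0)
features = rng.random((500, 3))
labels = (features[:, 0] > 0.4) & (features[:, 2] < 0.5)   # toy labeling rule

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(features, labels)

# Score the patches of a new scan and keep only the ones predicted usable.
new_patches = rng.random((20, 3))
keep = clf.predict(new_patches)
print(f"{keep.sum()} of {len(keep)} patches pass the quality gate")
```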

Angelica: Yeah, definitely. And one thing that would be interesting to consider is that the dynamic aspect of scanning something and then bringing it live is the key part in all this. But it also has the most complications and is the most technology dependent, because there's a short amount of time to do a lot of different processes.

One thing that I would recommend is: does it have to be real time? Could it be something that's done maybe a few hours in advance? Let's say that there's a really awesome Coachella event where we have a combination of a digital avatar influencer of some kind sharing the stage with the live performer. And for VIP members, if they scan an object before the show, they will be able to have those objects actually rendered into the scene.

So that does a few different things. One: it decreases the amount of processing power that's needed because it's only available for a smaller group of people. So it becomes more manageable. Two: it allows for more time to process those models at a higher quality. And three: content moderation. Making sure that what was scanned is something that would actually fit within the show.

And there is a little bit more of a back and forth. Because it's a VIP experience, you could say: “Hey, so the scan didn't come out quite as well.” I do agree with you, Rushali, that having implementation of machine learning would help in assisting this process. So maybe having it a little bit of time before the actual experience itself would alleviate some of the heaviest processing and the heaviest usage that can cause some concerns when doing it live. 

Rushali: And to add to that, I would say this experience (if we had to do it today in 2022) would probably be something on the lines of this: you take the input at the start of a live show, and use the output towards the end of it. So you have those two hours to do the whole process of moderation, to do the whole process of passing it through quality control. All of these steps that need to happen in the middle. 

Also, there's a large amount of data transfers happening as well. You're also rendering things at the same time and this is a tricky thing to do in real time as of today. You need to do it with creative solutions, with respect to how you do it. And not with respect to the technologies you use, because the technologies currently have certain constraints. 

Angelica: Yeah, and technology changes. That's why the idea is key because maybe it's not perfectly doable today, but it could be perfectly doable within the next few years. Or even sooner, we don't know what's going on behind the curtain of a lot of the FAANG companies. New solutions could be coming out any day now that enable some of these pain points within the process to be alleviated much more. 

So we've talked about the dynamic aspect of it. We've talked about the scanning itself, but there are some things to keep in mind either for those scanning an object. What are some things that would help with getting a clean scan? 

There's the basics, which is avoid direct lighting. So don't do the theater spotlight on it because then that’ll blow out the picture. Being uniformly lit is a really important thing here, making sure to avoid shiny objects. While they're pretty pretty, they are not super great at being translated into reliable models because the light will reflect off of them.

Those are just a few, and there's definitely others, but those are some of the things that during this process would be a part of the instructions when the users are actually scanning this. After the scan is done, like I mentioned, there are some artifacts that could be within the scan itself. So an auto clean process would be really helpful here, or it has to be done manually. The manual part would take a lot more time, which would hurt the feasibility aspect of it. And that's also where maybe the machine learning aspect could help with that. 

And then in addition to cleaning it up would be the compositing, making sure that it looks natural within the environment. So all those things would have to be done as some combination of an automated process and a manual process. I could see the final models that are chosen to be put into the show going through a more manual pass to make sure the lighting suits the occasion. And if we go with the route that you mentioned, which is doing it at the very beginning of the show, then we have a bunch of time (and I say a bunch, it's really two hours optimistically) to do all of these time-intensive processes and make sure that it's relevant by the end of the show.
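
For the automated cleanup pass mentioned above, a rough sketch using the trimesh library (with a placeholder file name) could be as simple as dropping stray geometry, filling small holes and lightly smoothing the result before compositing:

```python
import trimesh

# "raw_scan.glb" is a placeholder for an uploaded scan, not a real asset.
mesh = trimesh.load("raw_scan.glb", force="mesh")

mesh.remove_unreferenced_vertices()                       # drop floating artifact vertices
mesh.fill_holes()                                         # patch small gaps left by the scan
trimesh.smoothing.filter_laplacian(mesh, iterations=5)    # soften scanning noise

mesh.export("clean_scan.glb")
```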

Moderation is something we've also talked about quite a bit here as well. There are a lot of different ways for moderation to happen, but it's primarily focused on image, text and video. There is a paper out of Texas A&M University that does explore moderation of 3D objects, mainly to prevent NSFW (not safe for work) 3D models from showing up when teachers just want their students to look up models for 3D printing. That's really the origin of that paper. And they suggested different ways that the learning process of moderation could be done, which they describe as human-in-the-loop augmented learning. But it's not always reliable. This is an exploratory space without a lot of concrete solutions yet. So this would be one of the heavier things to implement when you look at the entire ecosystem of the concept.
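
There is no off-the-shelf 3D moderation API, so the sketch below only illustrates the human-in-the-loop routing idea: an automated screen over pre-rendered thumbnails auto-approves or auto-rejects the obvious cases and queues everything in between for a person. The classifier is a stub; in practice it would be a real image model run over several rendered angles of each scan.

```python
from pathlib import Path
from typing import Dict, List

def automated_score(thumbnail: Path) -> float:
    """Stub: return a probability that the rendered view is safe."""
    return 0.5   # placeholder value; a real model would score the image

def moderate(scan_thumbnails: Dict[str, List[Path]],
             approve_above: float = 0.9, reject_below: float = 0.2):
    approved, rejected, needs_human = [], [], []
    for scan_id, thumbs in scan_thumbnails.items():
        score = min(automated_score(t) for t in thumbs)   # the worst angle decides
        if score >= approve_above:
            approved.append(scan_id)
        elif score <= reject_below:
            rejected.append(scan_id)
        else:
            needs_human.append(scan_id)                   # human-in-the-loop review
    return approved, rejected, needs_human

approved, rejected, review = moderate({"scan-001": [Path("scan-001_front.png")]})
print(f"{len(review)} scans queued for human review")
```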

Rushali: Yeah, if you had to add a more sustainable way. And when I say sustainable, I mean, not with respect to the planet, because this project is not at all sustainable, considering there’s large amounts of data being transferred. But coming back to making the moderation process more sustainable, you can always open it up to the community. So the people who are attending the concert decide what goes in. Like maybe there's a voting system, or maybe there is an automated AI that can detect whether someone has uploaded something inappropriate. There's different approaches within moderation that you could take. But for the prototype, let's just say: no moderation because we are discussing, “How do we do this?” And simplifying it is one way of reaching a prototype.

Angelica: Right, or it could be a manual moderation.

Rushali: Yes, yes.

Angelica: Which would help out, but you would need to have the team ready for the moderation process of it. And it could be for a smaller group of people.

So it could be for an audience of, let's say 50 people. That's a lot smaller of an audience to have to sift through the scans that are done versus a crowd of 20,000 people. That would definitely need to be an automated process if it has to be done within a short amount of time.

So in conclusion, what we've learned is that this idea is feasible…but with some caveats. Caveats pertaining to how dynamic the scan needs to be. Does it need to be truly real time or could it be something that can take place over the course of a few hours, or maybe even a few days or a few weeks? It makes it more or less feasible depending upon what the requirements are there.

The other one is thinking about the cleanup, making sure that the scan fits with the environment, that it looks good, all those types of things. And then there's the moderation aspect, to make sure that the objects that are uploaded are suited to what needs to be implemented. So if we say, “Hey, we want teddy bears in the experience,” but someone uploads an orange, we probably don't want the orange, so there is a little bit of object detection there.

Okay, that's about it. Thanks everybody for listening to Scrap The Manual and thank you, Maria, for submitting the question that we answered here today. Be sure to check out our show notes for more information and references of things that we mentioned here. And if you like what you hear, please subscribe and share. You can find us on Spotify, Apple Podcasts, and wherever you get your podcasts. 

Rushali: And if you want to suggest topics, segment ideas, or general feedback, feel free to email us at scrapthemanual@mediamonks.com. If you want to partner with Media.Monks Labs, feel free to reach out to us over there as well. 

Angelica: Until next time…

Rushali: Thank you!


Oreo Virtual Production • A Mouthwatering Approach to Tabletop Production

  • Client

    Mondelēz

  • Solutions

    Studio, Content Adaptation and Transcreation

A tabletop approach you can dunk on.

Twist, dunk or eat it in one mouthful—there are several ways to enjoy an Oreo, milk’s favorite cookie. With so many preferences and Oreo fans around the world, Mondelēz needed a way to whet the appetites of consumers everywhere with mouthwatering, locally relevant creative captured at scale.

A delicious blend of creativity and technology.

For global brands, tabletop production can be a costly endeavor—for both time and budget. Every market differs in package design and legal requirements, and legacy processes make it difficult to produce relevant creative at the speed and scale needed for today’s consumer packaged goods brands. So we changed the game using Unreal Engine, developed by Epic Games, to bake up and automate tabletop production.

Using the real-time engine enables local teams to switch out packs in just a few clicks—and a few seconds—rather than wait through the long rendering times that are typical in the traditional CGI process. This makes it easy to iterate and scale up, satisfying cravings everywhere.
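
As a hypothetical illustration (not the actual project setup), pack swaps like this can be scripted with Unreal's editor Python API: load a market-specific material and re-point every pack actor in the level at it. The asset path and the "Pack" label are placeholders, and this only runs inside the Unreal Editor.

```python
import unreal

# Placeholder asset path for a market-specific packaging material.
market_material = unreal.EditorAssetLibrary.load_asset("/Game/Packs/Materials/MI_Pack_DE")

# Re-point every labeled pack actor in the open level at the new material.
for actor in unreal.EditorLevelLibrary.get_all_level_actors():
    if isinstance(actor, unreal.StaticMeshActor) and "Pack" in actor.get_actor_label():
        actor.static_mesh_component.set_material(0, market_material)
```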


In partnership with

  • Mondelēz
Client Words Our customers work at a pace that demands innovation, collaboration and iteration—all at high quality. [Monks] share this vision, and I’m honored to invite them to the Epic MegaGrant community as they work to reinvent production processes with Unreal Engine.

John Buzzell

Enterprise Licensing Lead, Unreal Engine (Americas)

Providing meaningful work for the future—in real time.

Oreo lovers weren’t the only ones left drooling; our innovative approach to using Unreal Engine in tabletop advertising also inspired a course in Unreal Futures, a learning series that prepares tomorrow’s developers and creatives for success in 3D careers across different industries. Collaborating with Epic Games, our employees walk students step-by-step through our process and challenge them to develop a 3D advertisement of their own using Unreal Engine.

So whether inspiring a purchase or inspiring the next generation of 3D creatives, we took a bite out of tired, traditional processes—maximizing quality, speed and efficiency to connect with consumers of all tastes.

Monk Thoughts From an artistic standpoint, I see so many opportunities to up-level our creative outputs in Unreal Engine, blending creativity and technology to shape better content and stronger stories that propel the industry forward.
John Paite


Results

  • 1x CSS Site of the Day


Forrester Spotlights How to Power Up Production with Game Engine


Written by
Monks


With many consumers still at home around the world, brands’ need to show up for audiences digitally has never felt more urgent, particularly through immersive content and experiences that recapture some of what’s lost in interacting with a brand, loved ones or product in-person.

And with this need, a perhaps unlikely tool for marketers has emerged: game engines. The role of gaming in the marketing mix has also risen, providing new, engaging environments to meet consumers. Glu Mobile’s apartment decorating game Design Home, for example, lets players customize a home using real furniture from brands like West Elm and Pottery Barn—and is now even offering its own series of real-world products through its Design Home Inspired brand, effectively turning the game into a virtual retail showroom.

But game engines aren’t just for making content consumed in games. They also provide an environment for brands to build and develop 3D assets to use both internally and to power a variety of virtualized experiences. We recently announced receiving an Epic MegaGrant to automate virtual tabletop production using Unreal Engine, and additionally we use Unity Engine—which powers 53% of the top 1,000 mobile games within the App Store and Google Play Store—to build mobile and WebGL experiences with consumer audiences in mind.

3D Adds a New Dimension to Content Production

In addition to serving increasingly digital user behavior, the use of 3D content has the potential to help brands build efficiency across the enterprise and customer decision journey. A new report from Forrester, “Scale Your Content Creation With 3D Modeling” by Ryan Skinner and Nick Barber, details how 3D content solves a critical challenge that brands face today: the need to keep up with demand for content while faced with dwindling budgets. “Particularly in e-commerce, marketers have learned they need product images to reflect every angle, every variation, and a multitude of contexts; without updated and detailed imagery, engagement and sales suffer,” the authors write.


Pick your flavor: game engines let brands tweak and change assets with ease and speed.

One solution mentioned in the report is building a CGI-powered production line. MediaMonks Founder Wesley ter Haar notes in the report that “This is very top of mind for us right now,” a sentiment that is reinforced by how we’re powering creative production at scale using tools like Unreal Engine and Unity, as mentioned above.

“What’s exciting about real-time 3D is that it ticks a bunch of boxes for a brand,” says Tim Dillon, SVP Growth at MediaMonks, whose primary focus is on our game engine-related work. “You can use it in product design, in your marketing, in retail innovation–it’s touching so many different end use cases for brands.” A CAD model used internally for product design, for example, could also be used in virtual tabletop photography, in a retail AR experience, in 3D display ads and more—reconfigured and recontextualized to accomplish several of a brand’s goals in producing content and building experiences.

When it comes to content, game engines make it easier for teams to create assets at scale through variations in lighting, environment or color—especially when augmented by machine learning and artificial intelligence. “When creating content in real time, brands not only make content faster but can react and adapt to consumer needs faster, too,” says Dillon. “Things like 3D variations, camera animation and pre-visualization become much faster to achieve—and in some cases more democratic too, by putting new 3D tools in our client teams’ hands to make these choices together,” he says.


The Genesis car configurator lets users view their customizations in real time.

A case in point is the car configurator we developed for Genesis, built in Unity and covered in Unity’s recent report, “25 Ways to Extend Real Time 3D Across Your Enterprise.” The web-based configurator offers a car customization experience as detailed and fluid as you’d find in a video game, letting consumers not only see what their custom model would look like with different features, but also within different environmental factors like time of day—all in real time.

Making a Lasting Impact Through Immersive Moments

Through greater adoption of immersive storytelling technologies and ultra-fast 5G connection, we are entering a virtualized era capable of placing a persistent 3D layer across real-world environments—already made possible through Unity technology and cloud anchors by Google, which anchor augmented reality content to specific locations that people can interact with over time. Consider, for example, a virtual retail environment that never closes and provides personalized service to each customer.

These experiences have become all the more relevant with the pandemic. In the Forrester report mentioned above, ter Haar says: “With COVID, we’re seeing greater interest to demo in 3D. The tactile and physical nature of seeing something makes it easier to buy.”

But perhaps more important for brands is that immersive experiences have the power to create real, lasting memories—a focus of a recent talk by Quentin de la Martinière (Executive Producer, Extended Realities at MediaMonks) and Ron Lee (Technical Director at MediaMonks) at China Unity Tech Week.


Spacebuzz takes students on an out-of-this-world journey through VR.

“We start from the strategy of who you want to engage with,” says Lee, delving into the storytelling potential of 3D content. “From there, we try to understand the vision we want to build to grab the user’s attention and put them in the world.” This includes deciding on the best venue for a 3D experience: augmented reality, virtual reality, mixed reality or on the web? By making the right selection, brands can build experiences that explain while they entertain.

Lee and de la Martinière showed Spacebuzz as an example of how immersive experiences can have lasting impact. Through a 15-minute VR experience built in Unity, school children are taken to space where they experience the “overview effect”: a humbling shift in awareness of the earth after viewing it from a distance.

“The technology and the story bring together the vision and the message of the experience,” says Lee. “Building that immersive environment in Unity and translating this information on a deeper level creates real memories for the kids who engage with it.” Likewise, brands can leave a memorable mark on consumers through 3D content. “These extremely personalized experiences allow the brand to leave a deep impression on audiences and intensify brand favorability,” Lee says.

From streamlining production to powering experiences across a range of consumer touchpoints, the value of 3D content is building for brands. Working closely with the developers of leading game engines that enable these experiences, like Unity and Unreal Engine, we’re helping brands add an entirely new dimension to their content and storytelling for virtualized audiences.


3D Content Adds a New Dimension to ROI


Written by
Monks


While online platforms have traditionally merely catalogued inventory and product descriptions, today’s technology enables consumers to get up close and engage with products—a useful feature in a time when physical touch has become discouraged.

This fast-changing development has challenged brands to think in terms of new digital formats and channels, identifying untapped opportunities to strike a connection with consumers. One exciting example of this is Google’s Swirl ad format, which transforms banner ads into spaces to engage directly with 3D product models.

Google recently released a case study detailing a Swirl campaign co-developed by MediaMonks for French fragrance brand Guerlain. The ads invite users to explore the brand’s perfume, turning a digital bottle to reveal floral ingredients that visually evoke its scent in an almost synesthetic fashion. While achieving the “wow” factor of an appealing interactive experience, the ad drove results, too: Google notes a threefold increase in engagement compared to other rich media formats, a 34% increase in exposure time and a 17-point increase in customer purchase intent. The ad’s success showcases how technical innovation and creative storytelling come together to drive unique engagement opportunities.

Content That Goes Beyond the Bounds of Possibility

Swirl ads function in two ways: first, there’s the initial banner view, whose animation is triggered by the user’s scrolling down a page. Within this mode, users can rotate the product and zoom in to explore its details more closely. If they like what they see, there’s a prompt to open the experience in a full-screen view, enabling greater detail and additional features.


Swirl ads let users dive deep into product features in an engaging way.

Tommy Lacoste, who is a Senior Project Manager at MediaMonks and worked on Guerlain and other Swirl campaigns, noted that “The most compelling thing about the format is having a beautiful, 3D object with real time reflection and shadows,” mentioning the creative goal of achieving visual fidelity. Another unique aspect of the format compared to other interactive banners, he says, is that it doesn’t immediately redirect you somewhere else. Exploration and engagement are critical. “With the Swirl format, we can really dress up and contextualize the object,” says Lacoste.

Showcasing the Guerlain perfume’s ingredients digitally as a beautiful bouquet within the bottle is just one example of how brands can use 3D content to creatively build new contexts for learning about or enjoying a brand. This applies to other content like AR filters as well; for example, MediaMonks worked with Unilever to develop a Facebook Messenger-connected AR game that turns the daily habit of brushing one’s teeth into playtime, helping establish healthy habits by tapping into children’s imagination.

Striking Personalized Emotional Resonance

As shoppers increasingly turn to digital channels to research, discover and make purchases online, 3D content also offers an immersive opportunity to strike a personalized connection. While this need has ramped up after the rise of COVID-19’s spread, Swirl ads were already live well before then, demonstrating how the appetite for such content has already existed. The format serves as an effective vehicle for building emotional resonance, which is increasingly critical to differentiating the brand as consumers turn their attention toward experiences.

In the Forrester report “Navigate Four Waypoints To Build Brand Resonance,” Forrester VP and Principal Analyst Dipanjan Chatterjee notes the importance of driving emotional connections between a brand and its audience. “Brands do not just satisfy our material needs; they also speak to our subconscious,” writes Chatterjee. “The best ones connect to us emotionally in ways that secure them an unassailable position. It is much easier for competitors and entrants to innovate and replicate features and functionality than it is to displace an emotionally rooted bond.”

Monk Thoughts The challenge is to deliver on the original intent of digital.
Wesley ter Haar

In search of emotional resonance in content, MediaMonks Founder Wesley ter Haar laments that many brands and advertisers have privileged linear storytelling formats for too long, overlooking a key benefit of digital formats: interactivity. “The challenge is to deliver on the original intent of digital,” says ter Haar. “Interactive, tactile and personalized moments of magic that create conversation, conversion and commercial opportunities.”

Brands that seek new yet meaningful ways to connect with consumers digitally require a more innovative approach to the standard toolkits they’ve been working with. By rethinking how consumers can interact with physical products digitally, Swirl ads and other 3D content like AR filters encourage brands to adopt a channel-specific mindset that identifies opportunities to meet consumers in unique, but increasingly relevant, ways.

Brands Are Best Served with a 3D Strategy in Place

3D content replicates the physical experience of engaging with a product, but has the opportunity to go even further because it’s unbounded by physical constraint—aside from file sizes and loading times, anyway. But conceptually, 3D creative content offers brands a way to immerse users within the brand story at a low barrier of entry. In this respect, Lacoste recommends brands use 3D content purposefully: “In many instances, a video suffices. 3D content must be used with purpose, and made interactive for the full effect.”


The "Little Brush Big Brush" AR game for Unilever demonstrates how 3D content can offer new contexts for consumers to engage digitally.

Also look for opportunities to maximize value and efficiency. “Let’s say I’ve made a 3D model of a perfume bottle to use in a banner,” says Lacoste. “We can reuse that in an AR lens or in a marketing video.” While most brands still consider 3D content a “nice-to-have” rather than a “must-have,” it’s worth understanding the versatility of the assets.

In fact, in a webinar hosted by the In-House Agency Forum, ter Haar advised brands to “Try to make the 3D element part of your production workflow. One of the big challenges we see is that brands don’t have the assets.” By reusing pre-existing CAD designs, for example, much of the development work is already taken care of.

Whether watching linear video advertising in the form of product unboxings or engaging in new formats, like trying on makeup using AR filters, consumers are eager to replicate tactile, tangible shopping experiences in virtual environments. As brands face a reckoning moment to support this ever-increasing desire, they must do so strategically and efficiently. Simple 3D experiences like those delivered in Swirl ads offer an accessible way for brands to upgrade their storytelling and increase engagement that converts.

