Senior AI Inference Engineer

Please note that we will never request payment or bank account information at any stage of the recruitment process. As we continue to grow our teams, we urge you to be cautious of fraudulent job postings or recruitment activities that misuse our company name and information. Please protect your personal information during any recruitment process. While Monks may contact potential candidates via LinkedIn, all applications must be submitted through our official website (monks.com/careers).

About the Role

As a Senior AI Inference Engineer, you’ll play a pivotal role in designing and delivering advanced agentic and visual AI systems for leading organizations across Media, Entertainment, Gaming, and Sport. You’ll partner directly with major professional sports leagues and global media brands, transforming ambiguous business problems into scalable, high-performance AI architectures that can “see” and reason about video in real time. From early-stage discovery and pre-sales through architecture, implementation, and optimization on modern GPU and cloud infrastructure, you’ll own the full lifecycle of complex AI inference solutions.

 

Responsibilities

  • Architect, implement, and optimize end-to-end AI inference services and agentic pipelines in Python.
  • Design autonomous agents that can interpret, reason about, and act on video and multi-modal content.
  • Integrate Vision Language Models (e.g., GPT-4o, Gemini Pro Vision, LLaVA) into robust, production-grade workflows.
  • Leverage LLM/agent orchestration frameworks (e.g., LangGraph, AutoGen, Semantic Kernel or similar) to coordinate complex visual AI tasks.
  • Deploy and operate services on Kubernetes (and potentially OpenShift or NVIDIA Holoscan), ensuring reliability and scalability under heavy media workloads.
  • Architect distributed systems on AWS, making informed trade-offs across performance, cost, and resilience.
  • Optimize workloads for modern NVIDIA GPU architectures (Ampere, Hopper, Blackwell), focusing on real-time and high-throughput media use cases.
  • Collaborate directly with clients in Media, Entertainment, Gaming, and Sport (MEGS), including participating in pre-sales discussions to validate feasibility, shape solutions, and clarify the “why” behind requirements.
  • Create clear architecture diagrams and technical documentation that align both technical and non-technical stakeholders.
  • Provide technical leadership to project teams, guiding implementation to stay true to the intended architecture and product value.
  • (Nice to have) Work with video tooling such as FFmpeg, GStreamer, NVENC/NVDEC, and modern codecs (H.264/H.265), and explore emerging tools such as Mojo or NVIDIA Holoscan for Media.
  • (Nice to have) Design and deploy AI solutions to edge devices and on-premise or hybrid clusters.

About You

Qualifications & Skills

  • Significant professional experience (senior level) building and shipping AI/ML systems in production, with strong Python skills and hands-on experience with a modern data/ML stack.
  • Proven track record taking models from notebooks or prototypes into robust, low-latency inference services.
  • Extensive hands-on experience building agentic systems, especially those involving computer vision or multi-modal inputs.
  • Demonstrated experience architecting autonomous agents that can “see” and reason about video content.
  • Practical experience integrating Vision Language Models (e.g., GPT-4o, Gemini Pro Vision, LLaVA) into complex workflows.
  • Familiarity with LLM/agent orchestration frameworks (e.g., LangGraph, AutoGen, Semantic Kernel or equivalents) applied to visual or multi-modal tasks.
  • Strong practical experience with Kubernetes in production.
  • Experience architecting distributed systems on AWS beyond simply provisioning basic instances.
  • Understanding of modern NVIDIA GPU architectures (e.g., Ampere, Hopper, Blackwell) and how to optimize workloads for them.
  • Product-minded and value-driven: able to align technical decisions with business outcomes and ROI.
  • Excellent communication skills, with the ability to explain complex architectures to both CTO-level and non-technical stakeholders and to participate comfortably in client and pre-sales conversations.
  • Self-starter who thrives in ambiguity, enjoys reading source code, and is motivated by solving problems that lack clear existing patterns or documentation.
  • Nice to have: Experience with FFmpeg, GStreamer, NVENC/NVDEC, and modern video codecs; OpenShift or NVIDIA Holoscan for Media; Mojo language; and/or deploying AI systems on edge devices or hybrid/on-prem environments.
  • Location: Remote within North or South America.

 

We are committed to fostering an environment where a diversity of perspectives can thrive. We design our hiring practices to promote equity and inclusion and to mitigate bias. We welcome applications from candidates of all backgrounds who are excited to build cutting-edge AI systems for the Media, Entertainment, Gaming, and Sport industries.

 

About Monks

Monks is the global, purely digital, unitary operating brand of S4Capital plc. With a legacy of innovation and specialized expertise, Monks combines an extraordinary range of global marketing and technology services to accelerate business possibilities and redefine how brands and businesses interact with the world. Its integration of systems and workflows delivers unfettered content production, scaled experiences, enterprise-grade technology and data science fueled by AI—managed by the industry’s best and most diverse digital talent—to help the world’s trailblazing companies outmaneuver and outpace their competition.

Monks was named a Contender in The Forrester Wave™: Global Marketing Services. It has remained a constant presence on Adweek’s Fastest Growing lists (2019-23), ranks among Cannes Lions' Top 10 Creative Companies (2022-23) and is the only partner to have been placed in AdExchanger’s Programmatic Power Players list every year (2020-24). In addition to being named Adweek’s first AI Agency of the Year (2023), Monks has been recognized by Business Intelligence in its 2024 Excellence in Artificial Intelligence Awards program in three categories: the Individual category, Organizational Winner in AI Strategic Planning and AI Product for its service Monks.Flow. Monks has also garnered the title of Webby Production Company of the Year (2021-24), won a record number of FWAs and has earned a spot on Newsweek’s Top 100 Global Most Loved Workplaces 2023.

 

We are an equal-opportunity employer committed to building a respectful and empowering work environment for all people to freely express themselves amongst colleagues who embrace diversity in all respects. Including fresh voices and unique points of view in all aspects of our business not only creates an environment where we can all grow and thrive but also increases our potential to produce work that better represents—and resonates with—the world around us. 

Interested?
Apply for this job!

At Monks, we are committed to protecting your personal information. As part of our recruitment process, we collect and process personal data to evaluate your application and communicate with you. To understand how we handle your information, including the types of data we collect, how we use it, and your rights, please read our Monks Candidate Privacy Notice. We encourage you to review this notice to ensure you are fully informed about how your data will be managed during your application process.
