Building the Future of Open Finance
Payward, the parent organization behind Kraken, NinjaTrader, Breakout, xStocks, Payward Services and CF Benchmarks, has spent the past 15 years developing one of the most modern and globally accessible financial infrastructure platforms in the industry, designed to support and advance an open, global financial system.
Prior to applying, candidates are encouraged to review the organization’s culture page to gain a clearer understanding of its values, working principles, and what drives the team.
Kraken Culture
The Team
Founded in 2011, Kraken is one of the world’s longest-established crypto platforms, trusted by over 10 million individuals and institutions globally. The platform provides a wide range of services, including spot trading, margin trading, futures, staking, and over-the-counter (OTC) services, catering to both retail and institutional clients.
The AI Infrastructure team is responsible for building and operating the production systems that support intelligent agents at scale. This team forms a core layer of the agent platform, ensuring that model inference, orchestration, and execution systems remain reliable, observable, and highly performant under real-world production workloads.
In close collaboration with the Agent Systems team and broader infrastructure groups, the team owns the foundational components that enable agents to operate safely and effectively across internal systems. The environment is large-scale and mission-critical, with systems serving millions of users while maintaining strict requirements for reliability, latency, and correctness.
This is a highly production-focused engineering team, where engineers combine strong systems expertise with applied machine learning infrastructure experience. The work is primarily in Rust, on distributed services where performance characteristics and failure handling are critical considerations.
Core Responsibilities
- Design and develop the infrastructure layer powering AI agent systems in production environments
- Build high-performance Rust services for model inference, orchestration, and execution at scale
- Architect scalable distributed systems capable of supporting millions of users and high request throughput
- Develop robust ML infrastructure and MLOps practices for model deployment, evaluation, and monitoring
- Define and implement guardrails, observability, and failure-handling mechanisms for agent-driven workflows
- Optimize system performance, including latency, throughput, and cost across inference and orchestration layers
- Collaborate closely with the Agent Systems team to transition experimental prototypes into production-ready systems
- Contribute to key foundational infrastructure decisions in a high-scale, high-impact engineering environment
Skills & Requirements
- The candidate is expected to have 5+ years of experience designing, building, and operating large-scale production systems
- Strong proficiency in Rust and systems-level programming is required
- A deep understanding of distributed systems, reliability engineering, and performance optimization is essential
- Proven experience operating services that support millions of users or high-throughput workloads is required
- Familiarity with machine learning infrastructure, model serving, or MLOps within production environments is highly desirable
- Experience designing observability frameworks, monitoring systems, and failure recovery mechanisms is required
- Strong collaboration skills across infrastructure, platform, and applied engineering teams are essential
- A strong ownership mindset is expected, particularly within high-stakes production environments
Nice to have
- Experience building infrastructure for agent-based systems or LLM-powered platforms
- Background in high-performance networking, asynchronous systems, or low-latency architectures
- Experience with container orchestration and cloud-native infrastructure technologies
- Familiarity with evaluation frameworks and large-scale model performance monitoring
- Experience working in fast-moving 0→1 environments or platform-building teams
Important Information
- Unless a specific application deadline is stated in the job posting, applications will be accepted on an ongoing basis.
- Applicants are permitted to redact or remove personal information from their CVs, including age, date of birth, and dates of attendance at or graduation from educational institutions.
- Qualified candidates with criminal histories are considered for employment, with assessments conducted in accordance with applicable regulations, including the San Francisco Fair Chance Ordinance.

