Senior AI Engineer

Engineering Remote (US)

About Herd Security

Herd Security is an agentic AI creative platform built for continuous security training and simulation. Founded in 2025, Herd helps organizations move beyond once-a-year compliance checkboxes, replacing static programs with dynamic curricula that evolve alongside emerging threats without adding operational overhead. Security and GRC teams use the platform as a creative partner to translate practitioner expertise into compelling content and deploy it across the channels employees already use, including Slack, Teams, and their learning management system (LMS). When a new threat surfaces, organizations can deliver iterative microlessons the same day instead of waiting on IT tickets or vendor requests. This puts the people behind security back at the center: it amplifies practitioner expertise, makes awareness more habitual, and uses every interaction to fuel a feedback loop that strengthens the human layer of defense.

Position Overview

We are hiring a Senior AI Engineer to own the AI systems at the heart of Herd's platform. The agentic, generative side of our product is not a feature—it's the entire thesis. You will design and build the retrieval pipelines, chatbot experiences, content generation workflows, and evaluation systems that let security and GRC teams collaborate with Herd as a creative partner. This is a senior role with real technical authority: you will make foundational decisions about how we structure knowledge, how we ground model outputs, how we evaluate quality, and how we scale AI reliably in front of enterprise customers. You will work in the same stack as the rest of engineering—TypeScript and PostgreSQL—and you will bring deep AI expertise to a team that needs a senior voice in the room.

What You'll Work On

You will build and maintain the RAG systems that ground Herd's content generation in real security expertise, customer-specific context, and emerging threat intelligence. You will design the chatbot experiences learners and administrators interact with inside Slack, Teams, and the Herd web app—experiences that have to feel helpful rather than gimmicky, and that have to behave reliably in front of enterprise security buyers. You will make the architectural calls on retrieval, embedding strategy, vector storage, reranking, orchestration frameworks, model selection, and evaluation. You will also build the evaluation harnesses—automated and human-in-the-loop—that tell us whether our AI is getting better, worse, or differently bad with each release. And you will do this in TypeScript, on top of PostgreSQL (with pgvector or equivalent), in the same codebase as the rest of the team—not as a siloed ML stack.

Responsibilities

  • Design and maintain RAG pipelines, including ingestion, chunking, embedding, retrieval, reranking, and grounding strategies
  • Build chatbot and conversational AI experiences across Slack, Teams, and web, with attention to latency, reliability, and graceful handling of edge cases
  • Own model selection and orchestration decisions across foundation model providers (OpenAI, Anthropic, open-source alternatives) based on cost, quality, and latency tradeoffs
  • Build evaluation systems—offline benchmarks, online quality metrics, human review workflows—that make AI quality measurable rather than anecdotal
  • Design prompt architectures, agent workflows, and tool-use patterns that produce consistent, grounded outputs at scale
  • Partner with product and content teams to translate practitioner expertise into retrieval corpora, prompts, and generation templates
  • Set the technical direction for AI at Herd and mentor other engineers as the team grows
  • Stay close to the rapidly evolving AI landscape and bring promising capabilities into production without chasing every shiny new thing

Required Qualifications

  • 6+ years of software engineering experience, with at least 2 years focused on production AI/LLM systems
  • Strong proficiency in TypeScript and experience working in strict type-safe codebases
  • Solid PostgreSQL experience, including schema design, query performance, and ideally experience with pgvector or comparable vector storage
  • Deep, hands-on experience building RAG systems in production—not demos, not prototypes, but RAG that real users depend on
  • Experience designing and shipping chatbot or conversational AI experiences, including handling multi-turn context, tool use, and failure modes
  • Fluency with LLM orchestration patterns, prompt engineering for structured outputs, and the realities of working with foundation model APIs (rate limits, latency, cost, non-determinism)
  • Experience building evaluation systems for AI outputs—you believe evals are infrastructure, not an afterthought
  • Ability to make senior-level architectural decisions and defend them, while staying pragmatic about what to build versus buy

Preferred Qualifications

  • Experience building AI features on top of Slack Bolt, Microsoft Bot Framework, or similar messaging platforms
  • Background with agentic frameworks or multi-step reasoning systems in production
  • Familiarity with fine-tuning, distillation, or other techniques beyond pure prompt engineering
  • Prior experience at early-stage startups where the AI roadmap was being defined in real time
  • Background in security, GRC, or other domains where output accuracy has real consequences
  • Experience mentoring engineers or setting technical direction for a team

Compensation & Benefits

  • Salary: $140,000–$200,000 OTE
  • Meaningful equity
  • Health, dental, vision, and life benefits
  • Unlimited PTO
  • Work-from-home (WFH) stipend
  • Opportunity to own the AI foundation of an AI-native company
  • Direct impact on a small, focused team with minimal process overhead

Logistics

  • Location: Remote (US required, California preferred)
  • Type: Full-time
  • Reports to: CTO