Herd Security is an agentic AI creative platform built for continuous security training and simulation. Founded in 2025, Herd helps organizations move beyond once-a-year compliance checkboxes, replacing static programs with dynamic curricula that evolve alongside emerging threats without adding operational overhead. Security and GRC teams use the platform as a creative partner to translate practitioner expertise into compelling content and deploy it across the channels employees already use, including Slack, Teams, and the LMS. When a new threat surfaces, organizations can shift from filing IT tickets or waiting on vendors to shipping iterative microlessons the same day. This puts the people behind security back at the center: empowering practitioner expertise, making awareness habitual, and turning every interaction into a feedback loop that strengthens the human layer of defense.
Position Overview
We are hiring a Senior AI Engineer to own the AI systems at the heart of Herd's platform. The agentic, generative side of our product is not a feature—it's the entire thesis. You will design and build the retrieval pipelines, chatbot experiences, content generation workflows, and evaluation systems that let security and GRC teams collaborate with Herd as a creative partner. This is a senior role with real technical authority: you will make foundational decisions about how we structure knowledge, how we ground model outputs, how we evaluate quality, and how we scale AI reliably in front of enterprise customers. You will work in the same stack as the rest of engineering—TypeScript and PostgreSQL—and you will bring deep AI expertise to a team that needs a senior voice in the room.
What You'll Work On
You will build and maintain the RAG systems that ground Herd's content generation in real security expertise, customer-specific context, and emerging threat intelligence. You will design the chatbot experiences learners and administrators interact with inside Slack, Teams, and the Herd web app—experiences that have to feel helpful rather than gimmicky, and that have to behave reliably in front of enterprise security buyers. You will make the architectural calls on retrieval, embedding strategy, vector storage, reranking, orchestration frameworks, model selection, and evaluation. You will also build the evaluation harnesses—automated and human-in-the-loop—that tell us whether our AI is getting better, worse, or differently bad with each release. And you will do this in TypeScript, on top of PostgreSQL (with pgvector or equivalent), in the same codebase as the rest of the team—not as a siloed ML stack.
Responsibilities
Design and maintain RAG pipelines, including ingestion, chunking, embedding, retrieval, reranking, and grounding strategies
Build chatbot and conversational AI experiences across Slack, Teams, and web, with attention to latency, reliability, and graceful handling of edge cases
Own model selection and orchestration decisions across foundation model providers (OpenAI, Anthropic, open-source alternatives) based on cost, quality, and latency tradeoffs
Build evaluation systems—offline benchmarks, online quality metrics, human review workflows—that make AI quality measurable rather than anecdotal
Design prompt architectures, agent workflows, and tool-use patterns that produce consistent, grounded outputs at scale
Partner with product and content teams to translate practitioner expertise into retrieval corpora, prompts, and generation templates
Set the technical direction for AI at Herd and mentor other engineers as the team grows
Stay close to the rapidly evolving AI landscape and bring promising capabilities into production without chasing every shiny new thing
Required Qualifications
6+ years of software engineering experience, with at least 2 years focused on production AI/LLM systems
Strong proficiency in TypeScript and experience working in strict type-safe codebases
Solid PostgreSQL experience, including schema design, query performance, and ideally experience with pgvector or comparable vector storage
Deep, hands-on experience building RAG systems in production—not demos, not prototypes, but RAG that real users depend on
Experience designing and shipping chatbot or conversational AI experiences, including handling multi-turn context, tool use, and failure modes
Fluency with LLM orchestration patterns, prompt engineering for structured outputs, and the realities of working with foundation model APIs (rate limits, latency, cost, non-determinism)
Experience building evaluation systems for AI outputs—you believe evals are infrastructure, not an afterthought
Ability to make senior-level architectural decisions and defend them, while staying pragmatic about what to build versus buy
Preferred Qualifications
Experience building AI features on top of Slack Bolt, Microsoft Bot Framework, or similar messaging platforms
Background with agentic frameworks or multi-step reasoning systems in production
Familiarity with fine-tuning, distillation, or other techniques beyond pure prompt engineering
Prior experience at early-stage startups where the AI roadmap was being defined in real time
Background in security, GRC, or other domains where output accuracy has real consequences
Experience mentoring engineers or setting technical direction for a team
Compensation & Benefits
Salary: $140,000–$200,000 OTE
Meaningful equity
Health, dental, vision, and life insurance
Unlimited PTO
WFH stipend
Opportunity to own the AI foundation of an AI-native company
Direct impact on a small, focused team with minimal process overhead
Logistics
Location: Remote (US required; California preferred)