ML + Software Engineer

Who We Are

Oction Labs is an AI-first infrastructure company building the operating system for the agentic economy. We run a 12-agent production hive that handles research, content creation, data intelligence, and client delivery — and we are deploying this same infrastructure for hospitals, government agencies, cultural communities, and enterprise clients.

This is not a job at a company "exploring AI." We are already shipping agents into production. Our stack includes LangChain, LangGraph, LiteLLM, OpenClaw, Cloudflare, local GPU nodes, and a 6-layer memory architecture. We have agents that know who they are, remember what they did last week, and route tasks to specialists based on capability.

We are also doing something no one else is doing at our scale: building AI systems for cultural preservation by embedding corpora in rare Indigenous languages, Punjabi, Hindi, and other underrepresented languages into production retrieval systems. This work will outlast the software.

The Role

We need an ML + Software Engineer who can operate across the full stack: from fine-tuning embedding models on low-resource language data, to hardening our de-identification pipeline, to shipping agent code that runs at the edge. You will work directly in production from week one. The codebase is Python-first with TypeScript at the edge layer.

Core Responsibilities

Agentic Infrastructure

Extend and maintain the LangChain + LangGraph agent graph powering our 12-agent hive
Build new agent nodes (personas, specialist agents, tool-call schemas) as Oction expands into new verticals
Design agent memory flows: RAG retrieval, Redis session memory, vault escalation
Optimize LiteLLM proxy routing: cost routing, model fallback chains, per-tenant key management
Monitor agent behavior in production, diagnose hallucination patterns, implement grounding fixes

Cultural Preservation and Low-Resource Language ML

Fine-tune multilingual embedding models (mBERT, XLM-R) on Punjabi (Gurmukhi script), Hindi, and Indigenous Canadian language corpora
Build retrieval systems that work in code-switched contexts (English + Punjabi in the same query)
Work with community linguists and cultural advisors to ensure accuracy and cultural appropriateness
Design training pipelines that run on our local GPU node, so that not every experiment requires commercial cloud GPUs

Data Intelligence Pipeline

Maintain and extend Oction's edge-first de-identification pipeline with configurable k-anonymity
Build differential privacy layers for aggregate statistics and federated learning gradient sharing
Build federated learning infrastructure: local training at client nodes, gradient aggregation, model distribution
Build synthetic data generation pipelines for medical NLP training corpora
Audit pipeline integrity against adversarial re-identification attempts

Cloud and Edge Infrastructure

Deploy and maintain agent workloads on Cloudflare Workers (TypeScript) and Python services on Oscar (our local GPU node)
Manage per-tenant isolation: separate LiteLLM keys, isolated vector databases, separate agent configurations
Implement observability: structured logging, agent trace capture, cost tracking per tenant
Optimize GPU utilization: model quantization, batching strategies, VRAM management

Must-Have Skills

Machine Learning

Hands-on experience fine-tuning transformer models (BERT-class or better) for specific tasks
Solid understanding of embedding models and dense retrieval (bi-encoders, cross-encoders, FAISS)
Experience building and evaluating RAG pipelines in production — not just tutorials
Familiarity with multilingual NLP: non-Latin script tokenization, code-switching, multilingual evaluation
Ability to train models on limited data: few-shot techniques, transfer learning, data augmentation
Working knowledge of differential privacy (epsilon, sensitivity, noise mechanisms)

Software Engineering

Python proficiency: async, type hints, packaging, pytest
Experience shipping software that other people depend on — production, not just notebooks
API development (FastAPI, Flask, or equivalent)
Git and basic DevOps: you can deploy your own changes

Agentic Systems (at least one)

LangChain or LangGraph experience
Experience building tool-use / function-calling systems with any major LLM provider
Production-level prompt engineering (not just chat completions)

Nice to Have

Cloudflare Workers / Pages / D1 / KV (TypeScript)
Punjabi, Hindi, or any Indigenous Canadian language knowledge — even conversational level is valuable
Healthcare data, EHR systems, or clinical NLP background
Federated learning frameworks (Flower, PySyft)
Audio transcription pipelines (Whisper-class, forced alignment)
OCR for non-Latin scripts (Tesseract, PaddleOCR)
Graph databases or knowledge graph construction
Prior work in cultural institutions, archives, or heritage organizations

Compensation

Junior to Mid: $80K–$110K (2–4 years relevant experience)
Senior: $110K–$145K (5–9 years relevant experience)
Staff: $145K–$180K (10+ years or exceptional depth)

Equity in Oction Labs — pre-Series A; amount reflects role level, cap table shared during process
Remote-first, async-friendly, no mandatory video-call culture
$2,000 home office budget at onboarding + $500/year ongoing
$1,500/year for courses, conferences, and compute credits
Working hours must overlap with Eastern Time for at least 4 hours daily

Interview Process

Application

Submit your LinkedIn profile below and tell us what you have built that is most relevant. Skip the template cover letter; write it the way you would write to a colleague.

Technical Screen (60 min)

We talk through your experience and go deep on one area you claim expertise in. No trick questions.

Paid Take-Home Build (4–6 hours)

A real task from our actual backlog. You will be compensated for your time. If it is good, we may actually use it.

Final Conversation (60 min)

With Brandon (CEO). Culture, direction, equity, and what you want to build. Offer within 1–2 weeks of first contact.

Apply via LinkedIn

Submit your profile and we will review it within 5 business days.

Oction Labs is committed to hiring people from all backgrounds. We particularly encourage applications from Indigenous Canadians, South Asian diaspora, and other communities whose languages and cultures are represented in our preservation work.