Location: Remote/Hybrid/On-site — Conshohocken, PA
Employment: Full-time
Reports to: Director of AI
About ZeroEyes, Inc.
ZeroEyes was founded by former Navy SEALs, self-starters and elite technologists with a mission to reduce the threat and impact of mass shootings and gun-related violence using our best-in-class artificial intelligence (AI) platform that detects visible firearms before there’s a threat. As a member of the ZeroEyes team, you’ll have the unique opportunity to join a forward-facing, purpose-driven company, and your perseverance and individual skill set will become crucial to our mission’s success.
About the role
We’re hiring a Senior AI Engineer to help lead applied research and productionization of video search, from natural-language queries to fast, scalable retrieval across archives and live streams. You’ll develop models, pipelines, and high-performance APIs. We value people who care more about truth than winning arguments, mentor generously, and take personal responsibility for the organization’s success.
What you’ll do
- End-to-end ownership: Contribute to the full video search stack, including dataset curation, model training/fine-tuning, indexing, retrieval APIs, latency/throughput optimization, and real-world evaluation.
- Applied research → production: Evaluate and integrate V-JEPA2-style representations for video understanding and retrieval; compare or combine them with CLIP/SigLIP/TimeSformer/ViViT/Video-LLMs for natural-language-to-video (NL→video) retrieval.
- Text–video alignment: Build query encoders for natural-language search (prompting, adapters, contrastive losses, distillation) and robust negative mining; support multilingual queries.
- Temporal grounding: Deliver moment-localization and highlight detection (segment-level embeddings, token-aligned pooling, temporal R@K / mAP).
- Indexing at scale: Stand up vector/search infra (FAISS, Milvus, pgvector, Pinecone) with sharding, HNSW/IVF/ScaNN, hybrid signals (text + metadata + structure).
- Latency & cost: Optimize preprocessing (frame sampling, shot detection), feature caching, batch inference, and low-latency serving (ONNX Runtime/TensorRT or ROCm paths).
- Cross-GPU strategies: Design and implement multi-GPU training and serving, including FSDP/ZeRO, tensor & pipeline parallelism, sharded/streamed decoding, NCCL/RCCL communication tuning, mixed precision/quantization, and elastic autoscaling.
- Quality & evaluation: Define task-specific metrics (R@K, nDCG, mAP, temporal mAP); build dashboards and A/B tests; run bias/robustness checks and failure-mode analyses.
- Security & compliance awareness: Design for privacy, auditability, and clean separation of controlled data; collaborate with platform/DevOps on IaC, CI/CD, and observability.
- Mentor & collaborate: Level up adjacent teams (ML Ops, backend, product). Write clear design docs and ADRs; lead design reviews.
What you’ll bring
- 6–10+ years of total professional experience, including 4+ years applying deep learning to video, vision, or multimodal retrieval with shipped features or products.
- Hands-on with PyTorch (preferred) and modern video backbones; practical experimentation with V-JEPA/V-JEPA2 (or JEPA-style self-supervised video objectives).
- Strong with text–image/video retrieval (CLIP-family, BLIP/BLIP-2, SigLIP, Q-Former/adapters) and contrastive training at scale.
- GPU performance & serving: mixed precision, ONNX Runtime/TensorRT (NVIDIA) or ROCm paths; profiling (nsys/nvprof/rocprof), post-training quantization, distillation.
- Cross-GPU & distributed training: FSDP/ZeRO, DDP, tensor/pipeline parallelism, NCCL/RCCL, model sharding/checkpointing, and cluster scheduling (Kubernetes + GPU operators).
- ROCm/MIGraphX experience (preferred): building/optimizing models on AMD GPUs; familiarity with MIOpen, MIGraphX backends, and ROCm toolchain.
- Search infrastructure: FAISS/Milvus/Pinecone/pgvector, ANN indexes (HNSW/IVF), re-ranking (cross-encoders), and caching strategies.
- Data & MLOps: scalable curation, labeling/weak supervision, feature stores, experiment tracking (Weights & Biases/MLflow), CI for ML, and reproducible training.
- Solid software engineering: Python (prod-grade), plus a systems language (Go/C++/Rust) or strong willingness to learn; API design; testing; code reviews.
- Clear communicator with a bias to measure, publish results, and change direction quickly when the data says so.
Nice-to-haves
- Temporal detection/segmentation, tracking, re-ID, and multi-camera association.
- Video-RAG and structured retrieval (combining embeddings with metadata/knowledge graphs).
- On-device or edge inference; WebRTC/RTSP ingest; FFmpeg/GStreamer pipelines.
- Experience in regulated or high-assurance environments (FedRAMP/HIPAA/CJIS) and privacy-preserving ML.
Values
- No jerks
- Be authentic
- Be effective
- Attention to detail
- All in, all the time
Eligibility
- Must be authorized to work in the U.S. Ability to obtain and maintain a Public Trust or other clearance may be required.
APPLY NOW
THANK YOU FOR YOUR DESIRE TO BECOME A MEMBER OF THE ZEROEYES TEAM!
Please create an application account by filling out our application form. We look forward to reviewing your application.
Submit all documents/screenshots in PDF format.
- Résumé/CV
- Personal Statement: Please provide a personal statement (maximum 200 words) that explains why you’re a great fit for our mission and the position you’re applying for. Try to use specific examples.
Please use this format when naming your files:
- For your Resume: LastName_FirstName_Resume
- For your Personal Statement: LastName_FirstName_PersonalStatement
- Examples: Doe_John_Resume, Doe_John_PersonalStatement
If you have any questions or problems during the application process, please contact recruiting@zeroeyes.com.