Founding ML Researcher - Document AI / Vision-Language Models
Location: San Francisco, CA (In-Person)
Compensation: $200K-$300K Base + Equity + Benefits
Stack: Python, machine learning, vision-language models (VLMs), document AI, multimodal systems, model training, evaluation frameworks, inference optimization, and production ML infrastructure.
TLDR
- Our client is building AI systems that transform unstructured documents into structured, actionable information, tackling one of the largest and most valuable data problems in enterprise software.
- This is a true founding-team opportunity where one of the earliest ML hires will directly shape the company’s research direction, engineering culture, and long-term technical strategy.
- Engineers own the entire machine learning lifecycle, from research and experimentation through deployment, evaluation, and production iteration, rather than focusing solely on model development.
- The company operates in one of the fastest-moving areas of AI, applying multimodal models and document understanding systems to real-world business problems with immediate customer impact.
- The role comes with significant ownership, direct access to the founders, and the opportunity to help define the company’s technical foundation from its earliest stages.
Requirements
- Experience building and deploying production machine learning systems, with ownership across research, experimentation, evaluation, and deployment.
- Strong background in machine learning, deep learning, computer vision, multimodal AI, document understanding, or closely related fields.
- Experience training, fine-tuning, evaluating, or deploying state-of-the-art models that operate on unstructured data.
- Strong Python skills and experience building scalable ML pipelines, experimentation frameworks, and production systems.
- Ability to operate independently in highly autonomous startup environments while working closely with founders to shape technical direction.
Bonus Skills
- Experience with vision-language models, document AI, OCR, information extraction, or multimodal reasoning systems.
- PhD or equivalent research experience in machine learning, computer vision, multimodal AI, or a related discipline.
- Publications, open-source contributions, or demonstrated technical leadership within advanced ML domains.
- Experience with model serving, inference optimization, distributed training, or large-scale ML infrastructure.
- Experience building AI systems that bridge research innovation and production deployment.
Responsibilities
- Own the end-to-end machine learning lifecycle, from research and experimentation through deployment, evaluation, and continuous improvement.
- Develop and improve models that extract, understand, and reason over complex unstructured information.
- Design evaluation frameworks and experimentation systems that drive measurable improvements in model quality and performance.
- Partner closely with engineering leadership to translate research breakthroughs into scalable production systems.
- Influence the company’s long-term ML strategy, technical roadmap, and research direction.
- Help establish engineering and research best practices that will scale with the organization as it grows.
About
- Our client is building advanced AI systems focused on understanding and extracting value from complex unstructured information, helping organizations unlock insights that were previously difficult to access or automate.
- Founding ML Researchers sit at the center of the company’s technical strategy, combining cutting-edge machine learning research with real-world product development and customer impact.
- This is a highly autonomous role with significant ownership across research, engineering, and product direction, offering the opportunity to shape both the technology and the future team.
- The role provides deep exposure to multimodal AI, vision-language models, document understanding, production ML systems, and the challenges of bringing frontier AI capabilities into real-world applications.