Together AI operates a purpose-built GPU cloud platform for training, fine-tuning, and deploying generative AI models. The infrastructure is designed to avoid vendor lock-in, serving developers and organizations that need to run open-source models at scale. The engineering work centers on distributed systems, model optimization, and AI infrastructure - areas where trade-offs between throughput, latency, and operational complexity define production viability.
The company maintains active contributions to open-source projects including FlashAttention, Mamba, and RedPajama. Engineers and researchers work in close proximity, and new hires take ownership of substantial technical challenges from the start. The tech stack spans PyTorch, CUDA, TensorRT, TensorRT-LLM, vLLM, SGLang, and TGI, reflecting the need to support multiple inference backends and optimization paths. The work involves designing distributed inference engines and developing model architectures where performance characteristics - memory bandwidth utilization, kernel fusion opportunities, multi-GPU coordination overhead - directly determine which models can run economically in production.
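To see why memory bandwidth utilization is so central to serving economics, consider that autoregressive decode must stream the full weight set from GPU memory for each generated token, so bandwidth sets a hard throughput ceiling. The sketch below is a back-of-the-envelope roofline estimate; the model size, precision, and bandwidth figures are illustrative assumptions, not Together AI numbers:

```python
def decode_tokens_per_sec_upper_bound(
    param_count: float, bytes_per_param: float, hbm_bandwidth_gb_s: float
) -> float:
    """Roofline-style ceiling for single-sequence decode throughput.

    Assumes each decoded token reads every weight from HBM once
    (ignores KV-cache traffic, activations, and kernel overheads).
    """
    weight_bytes = param_count * bytes_per_param
    return hbm_bandwidth_gb_s * 1e9 / weight_bytes

# Hypothetical example: a 7B-parameter model in FP16 (2 bytes/param)
# on a GPU with ~2000 GB/s of HBM bandwidth.
ceiling = decode_tokens_per_sec_upper_bound(7e9, 2, 2000)
print(round(ceiling))  # → 143 tokens/s per sequence, at best
```

This is also why batching and multi-GPU coordination matter: amortizing the same weight reads across many sequences is the main lever for pushing effective throughput past this per-sequence ceiling.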
Technical problems include optimizing inference for various model architectures across heterogeneous GPU clusters, managing the reliability and cost trade-offs in serving large language models, and building tooling that makes open-source AI accessible without sacrificing control over deployment parameters. The platform must handle the operational complexity of supporting diverse workloads: training runs with different parallelization strategies, fine-tuning jobs with varying dataset sizes, and inference deployments where tail latency matters.
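The point about tail latency can be made concrete: for interactive inference, the mean hides exactly the requests that users notice, so serving systems are judged on high percentiles. A minimal sketch using a nearest-rank percentile (the latency values are made-up sample data):

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile for p in (0, 100]."""
    ordered = sorted(samples)
    rank = max(0, math.ceil(p / 100 * len(ordered)) - 1)
    return ordered[rank]

# Hypothetical per-request latencies in milliseconds; two slow outliers.
latencies_ms = [12, 15, 14, 13, 250, 16, 14, 13, 15, 400]

mean = sum(latencies_ms) / len(latencies_ms)
print(f"mean={mean}ms p50={percentile(latencies_ms, 50)}ms "
      f"p99={percentile(latencies_ms, 99)}ms")
# The median looks healthy while p99 exposes the tail -
# the metric an inference deployment is actually held to.
```

In production one would compute this over sliding windows from serving telemetry rather than a static list, but the asymmetry between median and p99 is the reason tail latency is called out as a first-class constraint.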