Suno operates a generative AI platform for text-to-music synthesis, producing complete songs with vocals and instrumentation from natural language prompts. The technical work centers on training models to understand melody, a non-trivial challenge that requires cross-domain expertise in audio engineering, machine learning, and music production. The company maintains offices in three locations (Cambridge, New York, and Los Angeles), staffed by teams that combine musicians, audio engineers, and ML practitioners.
The core infrastructure involves model training pipelines purpose-built for musical understanding rather than general audio synthesis. This requires handling the latency and quality trade-offs specific to generative audio: coherence of the output over time, harmonic consistency, and production values that meet consumer expectations for listenable music. The team's composition, technical staff with musical backgrounds, addresses the evaluation problem directly: musical quality metrics remain difficult to automate, so human judgment from domain experts is operationally necessary.
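One common way to turn expert listening judgments into a number that can be tracked is a mean opinion score with an uncertainty estimate. The sketch below is illustrative only and does not describe Suno's actual evaluation process: the 1-to-5 rating scale, the clip names, and the function `mean_opinion_score` are all assumptions made for the example.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (mean, half-width) for a list of listener ratings.

    The half-width uses a normal approximation (1.96 * standard error),
    which is only a rough guide for small rating panels.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings to estimate spread")
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return mean(ratings), half_width


# Hypothetical ratings on an assumed 1-5 scale, one list per generated clip.
ratings_by_clip = {
    "clip_a": [4, 5, 4, 3, 4],
    "clip_b": [2, 3, 2, 3, 2],
}

scores = {clip: mean_opinion_score(r) for clip, r in ratings_by_clip.items()}

for clip, (score, half_width) in scores.items():
    print(f"{clip}: {score:.2f} +/- {half_width:.2f}")
```

Reporting the interval alongside the mean makes it harder to over-read a small difference between two clips when only a handful of listeners rated each one.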
Suno's stated technical approach blends musical intuition with systems work, which suggests decision-making that weighs subjective quality alongside standard ML metrics. This creates a tension between data-driven optimization and aesthetic judgment, one common in creative ML applications where human preference does not reduce cleanly to a loss function. The platform targets democratized music creation, which implies the scale requirements and cost constraints typical of consumer-facing generative AI: balancing inference cost against output quality while keeping latency acceptable for interactive use.
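The cost, quality, and latency balance can be framed as a constrained choice: among serving configurations that fit a latency and cost budget, pick the one with the best quality score. The sketch below shows that framing under stated assumptions. The `ServingConfig` type, the candidate names, and every number in it are hypothetical and are not drawn from Suno's systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServingConfig:
    name: str
    quality: float     # hypothetical listener-rated score, higher is better
    latency_s: float   # assumed seconds to generate one song
    cost_usd: float    # assumed inference cost per song


def pick_config(configs, max_latency_s, max_cost_usd):
    """Return the highest-quality config within both budgets, or None."""
    feasible = [
        c for c in configs
        if c.latency_s <= max_latency_s and c.cost_usd <= max_cost_usd
    ]
    return max(feasible, key=lambda c: c.quality) if feasible else None


# Made-up candidates for illustration only.
candidates = [
    ServingConfig("small", quality=3.4, latency_s=8.0, cost_usd=0.01),
    ServingConfig("medium", quality=4.0, latency_s=20.0, cost_usd=0.03),
    ServingConfig("large", quality=4.4, latency_s=45.0, cost_usd=0.08),
]

chosen = pick_config(candidates, max_latency_s=30.0, max_cost_usd=0.05)
print(chosen.name if chosen else "no feasible config")
```

Treating latency and cost as hard limits and quality as the objective keeps the aesthetic judgment in a single place, the quality score, rather than spreading it across an ad hoc weighted sum.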