Software Engineering / AI Infrastructure

Baseten (San Francisco, CA or New York, NY)
Posted 1 week, 3 days ago
Deadline: Not specified
Full Time, Mid-Level, Software Engineering

Baseten powers AI inference for some of the world’s leading AI-driven companies, including Sourcegraph, Writer, Gamma, and OpenEvidence. Backed by top investors and a recent $150M Series D, Baseten is scaling its engineering team to meet growing demand for high-performance model deployment infrastructure.

The company seeks a Software Engineer - Model APIs to join the Model Performance team. In this role, you will design, build, and optimize the core infrastructure that enables scalable and efficient model serving across distributed systems. You’ll work on performance-critical components, from TensorRT-LLM kernels to API reliability, and collaborate across teams to deliver developer-friendly solutions that set new standards for inference performance.

This is a rare opportunity to shape how the world’s most dynamic AI organizations deploy and scale large language models in production. If you’re a systems-oriented engineer with a passion for performance, distributed systems, and real-world AI infrastructure, Baseten wants to hear from you.

Applications accepted until the position is filled

Requirements

1. 3+ years of experience building or maintaining distributed systems or large-scale APIs.
2. Proven track record in operating low-latency backend services (auth, quotas, metering, rate limiting).
3. Strong skills in profiling, tracing, and optimizing system performance.
4. Ability to debug complex systems across runtime and GPU layers.
5. Excellent written and verbal communication with clear documentation skills.
6. (Preferred) Experience with LLM runtimes (vLLM, SGLang, TensorRT-LLM).
7. Familiarity with Kubernetes, service meshes, or distributed scheduling systems.
8. Background in developer-facing infrastructure or open-source API systems.

Benefits

1. $150K–$230K salary with generous equity grants.
2. Inclusive, growth-focused hybrid work environment.
3. Work with top AI startups and cutting-edge model infrastructure.
4. Comprehensive compensation and performance-based rewards.
5. Opportunities for professional development and learning in AI systems engineering.

Employment Type
Full Time
Work Mode
On-site (San Francisco, CA or New York, NY)
