
AI AgentsNews
SageMaker adds OpenAI-compatible APIs for self-hosted inference
Amazon SageMaker AI now supports OpenAI-compatible APIs for real-time inference endpoints, allowing developers to invoke models by simply changing the endpoint URL without custom clients or code rewrites. The feature exposes a /openai/v1 path that accepts Chat Completions requests and works with OpenAI SDK, LangChain, and Strands Agents. SageMaker routes requests based on endpoint name and supports time-limited bearer tokens, enabling multi-model hosting, agentic workflows on owned infrastructure, and deployment of fine-tuned models without application changes.
by Marc Karpยท AWS Machine Learning Blog
Source