OpenAI has yet to publicly announce its new developer product, Foundry, but screenshots shared on social media reveal a product brief from the company that confirms its launch, CMS Ware reports.
According to Travis Fischer, who tweeted the news, Foundry allows customers to run OpenAI model inference at scale with dedicated capacity.
OpenAI has privately announced a new developer product called Foundry, which enables customers to run OpenAI model inference at scale w/ dedicated capacity.
It also reveals that DV (Davinci; likely GPT-4) will have up to 32k max context length in the public version. 🔥 pic.twitter.com/5KEsWLqPdc
— Travis Fischer (@transitive_bs) February 21, 2023
Foundry is designed for advanced users with heavier workloads. It offers dedicated capacity for running OpenAI models, enabling large-scale inference while giving customers control over model configuration and performance profile.
Some of Foundry’s capabilities include:
– A static allocation of capacity dedicated to the user, providing a predictable environment that the user can control.
– Ability to monitor specific instances with the same tools and dashboards that OpenAI uses to build and optimize its shared-capacity models.
– Ability to realize all of the throughput, latency, and cost benefits resulting from workload optimization, including tradeoffs with caching and latency reduction.
OpenAI is expected to offer “more robust fine-tuning options for its latest models very soon, and Foundry will be the platform to service those models.”
Foundry will also offer service-level agreements (SLAs), including a “guaranteed” 99.5% uptime and on-call engineering support for customers.
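To put the 99.5% figure in perspective, the short Python sketch below converts an uptime percentage into the downtime it would still permit. The brief as reported does not say over what window uptime is measured, so the yearly and 30-day periods here are assumptions, and the function name is purely illustrative.

```python
def downtime_budget_hours(uptime_pct: float, period_hours: float) -> float:
    """Hours of downtime permitted over a period at the given uptime percentage."""
    return period_hours * (1.0 - uptime_pct / 100.0)


# Assumed measurement windows (the reported brief does not specify one).
HOURS_PER_YEAR = 365 * 24          # 8760, ignoring leap years
HOURS_PER_30_DAY_MONTH = 30 * 24   # 720

yearly = downtime_budget_hours(99.5, HOURS_PER_YEAR)
monthly = downtime_budget_hours(99.5, HOURS_PER_30_DAY_MONTH)

print(f"{yearly:.1f} hours per year, {monthly:.1f} hours per 30-day month")
```

Under those assumptions, a 99.5% commitment leaves room for roughly 43.8 hours of downtime a year, or about 3.6 hours in a 30-day month.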