Flex Processing: OpenAI Launches Beta with Half-Price AI Models to Compete with Google’s Budget Gemini

San Francisco, California – OpenAI is unveiling a new competitive strategy to keep pace with rivals like Google. The company is introducing Flex processing, an API option that offers lower prices for AI model usage in exchange for slower response times and occasional resource unavailability.

This new Flex processing feature is currently in beta for OpenAI’s o3 and o4-mini reasoning models. It is primarily designed for tasks that are less critical or for non-production purposes, such as model evaluations, data enrichment, and asynchronous workloads, according to OpenAI.
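For developers, opting in is a matter of setting the service tier on an API request. The sketch below, which assumes the official `openai` Python SDK and its documented `service_tier` parameter, shows how such a call might be structured; the generous timeout reflects the slower responses Flex trades for its discount, and the network call itself is left commented out.

```python
# Hedged sketch: building a Flex-tier request for OpenAI's API.
# Assumes the official `openai` Python SDK; the model names and the
# sample prompt are illustrative only.

def flex_request_params(model: str, prompt: str) -> dict:
    """Build keyword arguments for a Flex-tier chat completion.

    Flex jobs may queue longer and can hit resource-unavailable errors,
    so a generous per-request timeout is sensible for async workloads.
    """
    return {
        "model": model,                  # e.g. "o3" or "o4-mini"
        "service_tier": "flex",          # opt in to half-price Flex processing
        "timeout": 900.0,                # allow up to 15 minutes for a response
        "messages": [{"role": "user", "content": prompt}],
    }

params = flex_request_params("o4-mini", "Summarize this quarterly report: ...")

# Actual call (requires OPENAI_API_KEY and network access):
# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(**params)
```

Because Flex requests can be rejected when capacity is tight, production-adjacent uses would typically pair this with retry logic or fall back to the default tier.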

Notably, Flex processing cuts API costs exactly in half. For the o3 model, Flex pricing is $5 per million input tokens (roughly 750,000 words) and $20 per million output tokens, versus standard rates of $10 and $40 per million. For the o4-mini model, Flex drops the cost to $0.55 per million input tokens and $2.20 per million output tokens, down from $1.10 and $4.40.
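The savings are easy to quantify from the rates quoted above. The snippet below is a simple worked example using those published per-million-token prices; the sample token counts are illustrative.

```python
# Per-million-token rates (USD) quoted in the article: (input, output).
RATES = {
    "o3":      {"standard": (10.00, 40.00), "flex": (5.00, 20.00)},
    "o4-mini": {"standard": (1.10,  4.40),  "flex": (0.55, 2.20)},
}

def job_cost(model: str, tier: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a job with the given input/output token counts."""
    in_rate, out_rate = RATES[model][tier]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Hypothetical batch data-enrichment job: 2M input tokens, 0.5M output tokens on o3.
standard = job_cost("o3", "standard", 2_000_000, 500_000)  # $40.00
flex = job_cost("o3", "flex", 2_000_000, 500_000)          # $20.00
```

Since Flex halves both the input and output rates, the total bill for any job is exactly half the standard price, regardless of the input/output mix.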

The introduction of Flex processing comes as the cost of advanced AI technologies continues to rise and competitors release more budget-friendly, efficient models. Recently, Google launched Gemini 2.5 Flash, a reasoning model that rivals DeepSeek’s R1 on performance benchmarks while charging less per input token.

OpenAI also told customers in an email that developers in tiers 1-3 of its usage hierarchy must complete a new ID verification process to access the o3 model. The requirement is intended to prevent misuse of OpenAI’s services by unauthorized or malicious users. Reasoning summaries and streaming API support for o3 are gated behind the same verification.