Multiverse launches CompactifAI portal for LLM ops
Thu, 19th Mar 2026
Multiverse Computing has launched a self-serve API portal that gives developers direct access to its compressed large language models, along with tools to manage usage and access.
Called the CompactifAI API Public Portal, it is aimed at development and operations teams looking to integrate compressed models into production systems with clearer oversight of consumption and permissions.
Multiverse Computing is known for "quantum-inspired" techniques that compress large language models. It positions compression as a way to cut compute requirements while maintaining model performance.
Until now, most customers accessed Multiverse Computing's models through cloud marketplaces, including the AWS Marketplace. The new portal shifts teams to direct access and changes how they manage deployments.
Direct integration
The portal supports direct API integration for Multiverse Computing's compressed models. It includes secure authentication, token management, and real-time usage monitoring, according to the company.
Developers can generate and manage API tokens in the portal. Organisations can track real-time usage metrics and model consumption, including model-level activity and indicators of scaling requirements. Account administration and access permissions are managed in the same interface.
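The announcement does not describe the portal's endpoint paths or request schema, but a token-authenticated call to a chat-style model API typically takes the shape sketched below. The URL, model name, and token here are hypothetical placeholders, not documented CompactifAI values.

```python
import json
import urllib.request

# Hypothetical endpoint and model identifier -- the real CompactifAI API
# paths and model names are not disclosed in the announcement.
API_URL = "https://api.example.com/v1/chat/completions"


def build_request(token: str, model: str, prompt: str) -> tuple[dict, bytes]:
    """Build auth headers and a JSON body for a chat-style completion call.

    `token` stands in for an API token generated in the portal.
    """
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return headers, body


headers, body = build_request("sk-demo-token", "compressed-llm-demo", "Hello")

# To actually send the request (requires a live endpoint and valid token):
# req = urllib.request.Request(API_URL, data=body, headers=headers)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Keeping token generation and revocation in the portal means the application code only ever handles a bearer token, which is the part usage monitoring and access permissions attach to.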
Single Sign-On is also part of the authentication stack, which the company describes as "enterprise-grade" and aimed at organisations managing access to AI tools across multiple teams.
The portal is designed for deployments across cloud and on-premise environments. Multiverse Computing has not specified which infrastructure providers it expects customers to use, but says the product is compatible with different hosting choices.
Operational focus
Organisations experimenting with large language models often run into challenges beyond model accuracy, including access control, usage tracking, budgeting, and governance. Deployment also raises questions about who can call an API, how consumption is monitored, and what happens when demand spikes.
Multiverse Computing links the portal to these operational needs, arguing that compressed models alone do not remove friction if the surrounding tooling does not match how modern engineering teams work.
"The biggest barrier to deploying advanced AI models is often operational complexity," said Enrique Lizaso, CEO of Multiverse Computing. "The CompactifAI API portal gives developers direct access to compressed models with the transparency and control needed to run them in production."
Compression approach
CompactifAI is Multiverse Computing's branding for its model compression work. The company says the approach reduces compute requirements while maintaining performance, and links compression to lower costs and a smaller infrastructure footprint.
Compressed models have become a prominent theme in the AI market as enterprises try to balance capability with inference costs. Smaller models can also simplify deployment in environments with strict latency, power, or hardware constraints. Multiverse Computing is positioning its compressed models as an option for teams that want to run large language model workloads with less compute.
Multiverse Computing has not published technical benchmarks as part of this announcement or disclosed portal pricing. It also did not specify which third-party large language models are available in compressed form through CompactifAI.
European context
The launch comes amid a broader push by European companies to reassess how and where AI workloads run. Many firms have expressed interest in reducing reliance on US hyperscalers for parts of their data and AI stacks. Procurement and compliance teams have also raised questions about data control, vendor concentration risk, and operational resilience.
Multiverse Computing presents direct API access as a more modular route for teams that want flexibility over hosting. It also points to lower compute needs, which could expand the environments where large language model inference is feasible, including on-premise deployments.
Target sectors
Multiverse Computing expects the portal to make its compressed models easier to deploy across sectors including finance, energy, and defence. These industries often have stricter governance requirements for access controls and monitoring, and typically run a mix of cloud and on-premise systems.
The company has not named launch customers or said whether the portal will support integrations beyond API access, such as observability platforms or identity providers beyond Single Sign-On.
The CompactifAI API Public Portal is available from today, giving developers and enterprises direct, self-serve access to Multiverse Computing's compressed models.