Google DeepMind | FACTS Benchmark Suite: Systematically evaluating the factuality of large language models
The FACTS Benchmark Suite offers a comprehensive evaluation framework with multiple benchmarks to systematically measure and improve the factual accuracy of large language models across parametric, search-based, multimodal, and grounded contexts.
Databricks | Introducing OfficeQA: A Benchmark for End-to-End Grounded Reasoning
OfficeQA is a new benchmark that uses datasets resembling proprietary enterprise data to evaluate AI agents on end-to-end grounded reasoning tasks that demand high precision: complex document parsing, information retrieval, and analytical question answering.
Databricks | Introducing Databricks GenAI Partner Accelerators for Data Engineering & Migration
Databricks and its partner ecosystem introduce GenAI Partner Accelerators, which use agentic AI and automation to streamline data engineering and migration, speeding up modernization, improving data quality, and reducing manual effort on the way to scalable, AI-ready cloud data platforms.
OpenAI | How Virgin Atlantic uses AI to enhance every step of travel
Virgin Atlantic leverages enterprise AI with OpenAI-powered tools and custom GPTs to accelerate delivery, power a brand-aligned digital travel concierge, and govern AI adoption through an outcomes-first, ROI-driven framework.
OpenAI | How Virgin Atlantic uses AI to enhance every step of travel
Virgin Atlantic leverages AI technologies like ChatGPT Enterprise and OpenAI’s voice API to enhance operational efficiency, customer experience, and strategic innovation while balancing governance and brand authenticity.
OpenAI | The state of enterprise AI
Enterprise AI adoption is accelerating across industries, raising productivity, enabling new work capabilities, and integrating more deeply into workflows, according to OpenAI's data analysis.
OpenAI | Instacart and OpenAI partner on AI shopping experiences
Instacart integrates its grocery shopping and Instant Checkout app directly into ChatGPT, using OpenAI's models and the Agentic Commerce Protocol to deliver conversational shopping experiences from meal planning to doorstep delivery.
Pinterest | How Pinterest Built a Real‑Time Radar for Violative Content using AI
Pinterest developed an AI-powered real-time radar system to measure and monitor the prevalence of policy-violating content via scalable, multimodal LLM-assisted sampling and labeling, enabling proactive trust and safety interventions.
MIT AI | MIT affiliates named 2025 Schmidt Sciences AI2050 Fellows
MIT affiliates, including postdocs and faculty, have been named 2025 AI2050 Fellows to advance cutting-edge AI research addressing society’s future challenges.
Salesforce | How AI-Powered Testing Enabled Sub-Second Latency for Agentforce Voice
Leveraging AI-driven synthetic testing and semantic end-pointing algorithms, the Flash Reasoning Engine achieved sub-second latency and high accuracy for seamless real-time voice interactions in Agentforce Voice.
Lambda Labs | Lambda Appoints Heather Planishek as Chief Financial Officer
Lambda appoints Heather Planishek as CFO to lead financial strategy amid expanding AI cloud infrastructure serving hyperscalers, enterprises, and research labs.
Cloudflare | Python Workers redux: fast cold starts, packages, and a uv-first workflow
Explore how Cloudflare Python Workers deliver rapid cold starts, comprehensive package support via Pyodide, and an optimized uv-first development workflow for serverless Python at the edge.