Google Cloud | How we cut Vertex AI latency by 35% with GKE Inference Gateway
How Vertex AI slashes latency by 35% with GKE Inference Gateway, using load- and content-aware routing, prefix-cache optimization, and admission-controlled queueing to handle context-heavy and bursty workloads at production scale.
OpenAI | Korea privacy policy
A concise technical overview of South Korea's privacy policy landscape, outlining key regulations, data protection requirements, and implications for compliance in digital services.
Google Cloud | Starfish Space uses Google Cloud to accelerate satellite servicing in orbit
Starfish Space leverages Google Cloud with Compute Engine and GKE to run millions of Monte Carlo simulations for an autonomous, software-first satellite servicing vehicle, accelerating development and orbital validation.
Google Cloud | Delivering a secure, open, and sovereign digital world
An in-depth look at building a secure, open, and sovereign digital world with Google Cloud's Sovereign Cloud portfolio—emphasizing data residency, air-gapped/dedicated deployments, open source, and rigorous regulatory controls.
AWS ML | Evaluate generative AI models with an Amazon Nova rubric-based LLM judge on Amazon SageMaker AI (Part 2)
Assess generative AI with a dynamic, rubric-based Amazon Nova LLM judge on SageMaker AI, auto-generating task-specific rubrics, calculating per-criterion scores, and delivering calibrated model comparisons across outputs (Part 2).
MIT AI | Helping AI agents search to get the best results out of large language models
EnCompass enables AI agents powered by LLMs to backtrack, clone runtimes, and explore multiple execution paths through configurable search strategies to maximize task outcomes.
Google Cloud | Ship Production Ready AI and Survive the Multimodal Frontier This February
A roadmap to production-ready AI and real-time multimodal agents, drawn from Google Cloud's two-day roadshow covering enterprise-grade security, scalable architecture, and Graph RAG-powered memory across sessions.
AWS ML | A practical guide to Amazon Nova Multimodal Embeddings
A practical guide to configuring Amazon Nova Multimodal Embeddings for cross-modal search, semantic retrieval, and scalable multimodal AI workflows across text, image, video, and audio.
AWS ML | How Associa transforms document classification with the GenAI IDP Accelerator and Amazon Bedrock
Associa leverages the GenAI IDP Accelerator on Amazon Bedrock to automatically classify millions of documents with high accuracy and low cost, using first-page OCR+image prompts and model tuning to optimize throughput and workflow integration.
Modular AI | Modular: The Five Eras of KVCache
Traces the evolution of KVCache in LLM serving—from early naive designs to distributed, unified memory systems—within Modular's MAX platform and Mojo-based stack.
Jane Street | I design with Claude more than Figma now
How Claude-powered prototyping transforms a designer's workflow, moving from laborious Figma mockups to live AI-driven prototypes that validate ideas quickly at Jane Street.
OpenAI | Introducing GPT-5.3-Codex
A concise technical overview of GPT-5.3-Codex, highlighting its architecture, capabilities, and potential impact on AI-assisted coding.