Apple ML | KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache Generation
KV-Runahead is an efficient parallelization scheme that accelerates the prompt phase of large language model inference by generating the key-value cache in parallel.
Apple ML | pfl-research: Simulation Framework for Accelerating Research in Private Federated Learning
pfl-research is a simulation framework for accelerating research in private federated learning (PFL).
Databricks | Research Survey: Productivity benefits from Databricks Assistant
Discover the productivity benefits of Databricks Assistant through a comprehensive research survey.
Modular AI | Modular: MAX Graph API Tutorial
A step-by-step tutorial on using the MAX Graph API to create symbolic graphs, compile them, and execute them.
Google Cloud | Vertex AI at I/O: Bringing new Gemini and Gemma models to Google Cloud customers
Introducing new Gemini and Gemma models to Google Cloud customers via Vertex AI at the I/O event.
Google Cloud | Announcing Trillium, the sixth generation of Google Cloud TPU
Introducing Trillium, the latest Google Cloud TPU offering with enhanced performance and sustainability for advanced AI models.
AWS ML | RAG architecture with Voyage AI embedding models on Amazon SageMaker JumpStart and Anthropic Claude 3 models
Using a RAG architecture with Voyage AI embedding models on Amazon SageMaker JumpStart and Anthropic Claude 3 models for generative AI applications.
Databricks | Building DBRX-class Custom LLMs with Mosaic AI Training
Training and optimizing DBRX-class custom LLMs using Mosaic AI Training for enterprise applications.
Meta | Behind the scenes of Threads for web
Exploring the development journey of Threads for web through insights from the engineers on the Threads Web Team at Meta.
AWS ML | Build generative AI applications with Amazon Titan Text Premier, Amazon Bedrock, and AWS CDK
Power text generation in generative AI applications with Amazon Titan Text Premier and AWS CDK.
AWS ML | Incorporate offline and online human – machine workflows into your generative AI applications on AWS
Integrate offline and online human-machine workflows into generative AI applications on AWS.
Google DeepMind | Gemini breaks new ground: a faster model, longer context and AI agents
Gemini introduces a faster model, longer context, and AI agents for improved performance and efficiency.