engblogs

summaries of the latest blog articles from your favorite tech companies.
Apple ML

KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache Generation

KV-Runahead is a parallelization scheme that speeds up the prompt phase of Large Language Model inference by generating the key-value cache in parallel.

5/14/2024
Apple ML

pfl-research: Simulation Framework for Accelerating Research in Private Federated Learning

pfl-research is a simulation framework for accelerating research in private federated learning (PFL).

5/14/2024
Databricks

Research Survey: Productivity benefits from Databricks Assistant

Findings from a research survey on the productivity benefits of Databricks Assistant.

5/14/2024
Modular AI

Modular: MAX Graph API Tutorial

A step-by-step tutorial on using the MAX Graph API to build symbolic graphs, then compile and execute them.

5/14/2024
Google Cloud

Vertex AI at I/O: Bringing new Gemini and Gemma models to Google Cloud customers

Announced at I/O: new Gemini and Gemma models are coming to Google Cloud customers through Vertex AI.

5/14/2024
Google Cloud

Announcing Trillium, the sixth generation of Google Cloud TPU

Introducing Trillium, the sixth-generation Google Cloud TPU, with improved performance and sustainability for advanced AI models.

5/14/2024
AWS ML

RAG architecture with Voyage AI embedding models on Amazon SageMaker JumpStart and Anthropic Claude 3 models

Using a RAG architecture that pairs Voyage AI embedding models on Amazon SageMaker JumpStart with Anthropic Claude 3 models to build generative AI applications.

5/14/2024
Databricks

Building DBRX-class Custom LLMs with Mosaic AI Training

Training and optimizing DBRX-class custom LLMs using Mosaic AI Training for enterprise applications.

5/14/2024
Meta

Behind the scenes of Threads for web

Exploring the development journey of Threads for web through insights from the engineers on the Threads Web Team at Meta.

5/14/2024
AWS ML

Build generative AI applications with Amazon Titan Text Premier, Amazon Bedrock, and AWS CDK

Build generative AI applications for text generation using Amazon Titan Text Premier, Amazon Bedrock, and AWS CDK.

5/14/2024
AWS ML

Incorporate offline and online human – machine workflows into your generative AI applications on AWS

Integrate offline and online human-machine workflows into generative AI applications on AWS.

5/14/2024
Google DeepMind

Gemini breaks new ground: a faster model, longer context and AI agents

Gemini updates bring a faster model, longer context, and AI agents.

5/14/2024