engblogs

Summaries of the latest blog articles from your favorite tech companies.
Apple ML

LLM in a Flash: Efficient Large Language Model Inference with Limited Memory

Efficiently deploying large language models on devices with limited memory using flash memory optimization techniques.

8/5/2024
Apple ML

Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation

Proposing a method for aligning large language models with human expectations through self-rewarding contrastive prompt distillation.

8/5/2024
Netflix

Investigation of a Cross-regional Network Performance Issue

Analysis of a cross-regional network performance issue traces the root cause to a Linux kernel upgrade that affected TCP receive window size calculations.

8/5/2024
AWS ML

Build an end-to-end RAG solution using Knowledge Bases for Amazon Bedrock and AWS CloudFormation

Automate the deployment of an end-to-end RAG solution using Knowledge Bases for Amazon Bedrock and AWS CloudFormation.

8/5/2024
Stripe

Surprising findings from our analysis of 3DS transactions in the US

An analysis of 3DS transactions in the US reveals surprising findings that highlight differences in authentication behavior between regions.

8/5/2024
Google Cloud

Continuous Delivery on Google Cloud with Gitlab CI/CD and Cloud Deploy

Automate software delivery from code commit to production release on Google Cloud using GitLab CI/CD and Cloud Deploy.

8/5/2024
AWS ML

Catalog, query, and search audio programs with Amazon Transcribe and Knowledge Bases for Amazon Bedrock

Catalog, query, and search audio programs efficiently using Amazon Transcribe and Knowledge Bases for Amazon Bedrock.

8/5/2024
AWS ML

Faster LLMs with speculative decoding and AWS Inferentia2

Accelerate large language model inference with speculative decoding on AWS Inferentia2.

8/5/2024
Meta

DCPerf: An open source benchmark suite for hyperscale compute applications

DCPerf is an open source benchmark suite for hyperscale compute applications, aimed at improving hardware and software optimization and platform design.

8/5/2024
Meta

A RoCE network for distributed AI training at scale

Building a robust RoCE network for large-scale distributed AI training workloads at Meta.

8/5/2024
Google DeepMind

A new generation of African talent brings cutting-edge AI to scientific challenges

A new generation of African talent applies cutting-edge AI to scientific challenges through the AI for Science Master’s program at AIMS.

8/5/2024
Duolingo

Friend Streak: a new way to stay motivated together

Introducing Friend Streak, a social motivation feature for learning together with friends on Duolingo.

8/5/2024