
mRAKL: Multilingual Retrieval-Augmented Knowledge Graph Construction for Low-Resourced Languages
mRAKL leverages retrieval-augmented generation to enhance multilingual knowledge graph construction for low-resourced languages by reformulating it as a question answering task and employing cross-lingual transfer.
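A minimal sketch of how triple completion in a knowledge graph can be recast as retrieval-augmented question answering; the prompt template, toy retriever, and example corpus below are illustrative assumptions, not the paper's actual pipeline.

```python
# Illustrative sketch: recasting knowledge-graph triple completion as
# retrieval-augmented question answering (templates and retriever are assumptions).

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Toy keyword retriever standing in for a real multilingual retriever."""
    scored = sorted(corpus, key=lambda p: -sum(w in p.lower() for w in query.lower().split()))
    return scored[:k]

def triple_to_question(subject: str, relation: str) -> str:
    """Verbalize an incomplete triple (subject, relation, ?) as a question."""
    return f"What is the {relation.replace('_', ' ')} of {subject}?"

corpus = [
    "Kigali is the capital and largest city of Rwanda.",
    "Kinyarwanda, English, French and Swahili are official languages of Rwanda.",
]

question = triple_to_question("Rwanda", "capital")
context = "\n".join(retrieve(question, corpus))
prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer with the missing entity:"

# In the full setting, `prompt` would go to a multilingual generative model;
# cross-lingual transfer comes from training on higher-resourced languages
# and answering in the low-resourced target language.
print(prompt)
```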

MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains
The MMAU benchmark offers a comprehensive and reproducible evaluation framework for assessing large language models' capabilities across diverse domains and core skills without complex environment setups.

How Global Calibration Strengthens Multiaccuracy
Explores how combining multiaccuracy with a global calibration requirement strengthens its guarantees, yielding stronger fairness properties and improved agnostic learning.
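For reference, the standard definitions from the fairness literature (assumed forms; the paper's exact conditions may differ): a predictor f is multiaccurate with respect to a class of test functions if its residual is nearly uncorrelated with every test function, and calibrated if its residual vanishes conditioned on its own prediction.

```latex
% Assumed standard definitions; the paper's exact conditions may differ.
\[
  \text{Multiaccuracy:}\quad
  \bigl|\,\mathbb{E}\bigl[c(x)\,(f(x)-y)\bigr]\bigr| \le \alpha
  \quad \text{for all } c \in \mathcal{C},
\]
\[
  \text{Calibration:}\quad
  \mathbb{E}\bigl[\,y - f(x) \mid f(x) = v\,\bigr] = 0
  \quad \text{for all } v \in \mathrm{range}(f).
\]
```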

Can External Validation Tools Improve Annotation Quality for LLM-as-a-Judge?
Investigating how augmenting LLM-based annotation systems with external validation tools such as web search and code execution can enhance the quality of pairwise preference data in complex domains such as long-form factual content, math, and code.
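A rough sketch of what a tool-augmented pairwise judge can look like; the tool interfaces, judging logic, and names below are hypothetical stand-ins, not the system described in the paper.

```python
# Illustrative tool-augmented pairwise judge; tool stubs and the final
# decision rule are placeholders, not the paper's implementation.
from dataclasses import dataclass

@dataclass
class Verdict:
    preferred: str          # "A" or "B"
    evidence: list[str]     # tool outputs the judgment relied on

def web_search(query: str) -> str:
    """Stub for an external search tool used to validate factual claims."""
    return f"[search results for: {query}]"

def run_code(snippet: str) -> str:
    """Stub for a sandboxed execution tool used to validate code answers."""
    return f"[execution output of: {snippet[:40]}...]"

def judge(prompt: str, answer_a: str, answer_b: str) -> Verdict:
    evidence = []
    # Validate factual claims and any code in each answer before comparing.
    for answer in (answer_a, answer_b):
        evidence.append(web_search(f"verify: {answer[:80]}"))
        if "def " in answer or "import " in answer:
            evidence.append(run_code(answer))
    # A judge LLM would weigh the evidence; a trivial placeholder rule is used here.
    preferred = "A" if len(answer_a) >= len(answer_b) else "B"
    return Verdict(preferred=preferred, evidence=evidence)
```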

ASPERA: A Simulated Environment to Evaluate Planning for Complex Action Execution
ASPERA introduces a simulated environment and dataset to rigorously evaluate large language models' ability to generate complex action execution plans using custom assistant libraries.

Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models
Investigating how targeted neuron interventions enhance cross-lingual alignment and improve retrieval accuracy in multilingual language models without costly fine-tuning.
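One common form of model intervention is adding a steering vector to hidden activations at a chosen layer. The sketch below (a PyTorch forward hook with a placeholder layer index and direction) illustrates that general mechanism only, not the specific interventions analyzed in the paper.

```python
# Illustrative activation-steering sketch (PyTorch); layer index and
# steering vector are placeholders, not the paper's intervention.
import torch

def make_steering_hook(steering_vector: torch.Tensor, scale: float = 1.0):
    """Return a forward hook that shifts a layer's hidden states."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + scale * steering_vector  # nudge activations toward the target subspace
        return (hidden, *output[1:]) if isinstance(output, tuple) else hidden
    return hook

# Usage (assuming a Hugging Face-style causal LM loaded as `model`):
#   layer = model.model.layers[12]                          # hypothetical intervention layer
#   vector = torch.randn(model.config.hidden_size) * 0.01   # placeholder direction
#   handle = layer.register_forward_hook(make_steering_hook(vector))
#   ...run multilingual retrieval / generation...
#   handle.remove()
```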

On the Way to LLM Personalization: Learning to Remember User Conversations
Introduces PLUM, a novel pipeline for LLM personalization that leverages memory of user conversations via sequential data augmentation and parameter-efficient fine-tuning to enhance personalized dialogue accuracy.
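A hedged sketch of the general recipe the summary describes, turning remembered user conversations into supervised examples and training only a small adapter; the model name, hyperparameters, and data format are assumptions, not PLUM's actual pipeline.

```python
# Illustrative parameter-efficient fine-tuning on user conversation data
# (LoRA via the `peft` library); all names and values are assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Llama-2-7b-hf"                     # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Turn remembered conversations into supervised examples (toy augmentation).
conversations = [("Which city did I say I'm moving to?", "You mentioned moving to Lisbon.")]
examples = [f"User: {q}\nAssistant: {a}" for q, a in conversations]

lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)   # only the small adapter weights are trained
model.print_trainable_parameters()
# `examples` would then be tokenized and passed to a standard training loop.
```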

On Information Geometry and Iterative Optimization in Model Compression: Operator Factorization
Explores the use of information geometry and iterative optimization techniques, specifically operator factorization and singular value thresholding, to compress models while better preserving accuracy and trainability.
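A minimal NumPy sketch of one common form of operator factorization, compressing a weight matrix by soft-thresholding its singular values; the threshold value and test matrix are arbitrary assumptions for the example.

```python
# Illustrative low-rank compression via singular value thresholding;
# the threshold tau is an arbitrary choice for demonstration.
import numpy as np

def svt_compress(weight: np.ndarray, tau: float):
    """Factor a weight matrix and soft-threshold its singular values."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)        # soft-threshold: small modes are dropped
    keep = s_shrunk > 0
    a = u[:, keep] * s_shrunk[keep]            # (m, r) factor
    b = vt[keep, :]                            # (r, n) factor; storing (a, b) is cheap at low rank
    return a, b

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 128)) @ np.diag(np.linspace(1.0, 0.01, 128))  # decaying spectrum
a, b = svt_compress(w, tau=0.5)
print("kept rank:", a.shape[1],
      "relative error:", np.linalg.norm(w - a @ b) / np.linalg.norm(w))
```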

The Practitioner’s Ultimate Guide to Scalable Logging
A comprehensive guide to implementing scalable, structured logging and observability in Databricks environments using standardized log collection, JSON formatting, and efficient ingestion pipelines for improved monitoring and troubleshooting.
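As a taste of the structured-logging piece, here is a minimal JSON-formatting logger built on Python's standard library; the field names and handler setup are generic assumptions, not Databricks-specific configuration from the guide.

```python
# Minimal structured JSON logging with the standard library;
# field names and setup are generic, not Databricks-specific.
import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S%z"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Extra fields (e.g. job_id, run_id) can be attached via `extra=`.
        for key in ("job_id", "run_id"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("pipeline")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("batch ingested", extra={"job_id": "etl-42", "run_id": "2024-06-01"})
```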

Resolving digital threats 100x faster with OpenAI
Leveraging OpenAI technology to resolve digital threats up to 100 times faster.

New machine-learning application to help researchers predict chemical properties
ChemXploreML is a user-friendly, offline desktop application that leverages advanced machine learning algorithms and molecular embedders to enable chemists to accurately predict key molecular properties without requiring programming expertise.

Pedestrians now walk faster and linger less, researchers find
A study using AI and computer vision reveals a 15% increase in pedestrian walking speed and a decline in public lingering since 1980, highlighting changes in urban social dynamics and informing future city planning.