@perplexity_ai - AI Dynamics

AI Encoder Tokenizer Performance: 5× Latency Improvement

By

@perplexity_ai

–

27 May 2026 17h55

At production input lengths, the encoder cuts p50 latency by roughly 5× vs. HuggingFace tokenizers, 2× vs. SentencePiece C++, and 1.5× vs. IREE C. At 514 tokens, it runs in 63 µs with zero heap allocations.
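The post does not say how these figures were measured. As a rough illustration of what a p50 comparison involves, here is a minimal Python sketch of a median-latency harness. The names `p50_latency_us`, `speedup`, and `whitespace_tokenize` are hypothetical, and the stand-in tokenizer is not the encoder described above.

```python
import statistics
import time


def p50_latency_us(fn, arg, runs=200, warmup=20):
    """Return the median (p50) wall-clock latency of fn(arg) in microseconds."""
    # Warm-up calls let caches and lazy initialization settle before timing.
    for _ in range(warmup):
        fn(arg)
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        fn(arg)
        samples.append((time.perf_counter_ns() - start) / 1_000)
    return statistics.median(samples)


def speedup(baseline_us, candidate_us):
    """How many times faster the candidate is than the baseline."""
    return baseline_us / candidate_us


def whitespace_tokenize(text):
    # Stand-in tokenizer for illustration only (an assumption, not a real encoder).
    return text.split()


if __name__ == "__main__":
    # Roughly 514 whitespace-separated tokens, to mirror the input length in the post.
    sample = " ".join(["token"] * 514)
    print(f"p50: {p50_latency_us(whitespace_tokenize, sample):.1f} µs")
```

The median is used rather than the mean because a few slow outliers (scheduler pauses, cache misses) would otherwise dominate a microsecond-scale measurement.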

→ View original post on X — @perplexity_ai,

27 May 2026

Bumblebee scanner for AI tool configs on developer machines

By

@perplexity_ai

–

22 May 2026 19h03

Today we're open-sourcing Bumblebee, a read-only scanner for macOS and Linux. It checks developer machines for risky packages, extensions, and AI tool configs. Connected to Computer, it can trigger deeper scans whenever a new supply-chain risk emerges. https://github.com/perplexityai/bumblebee…

→ View original post on X — @perplexity_ai,

22 May 2026

Query-Aware Context Compression in RAG Systems

By

@perplexity_ai

–

20 May 2026 19h26

Context compression isn't new in RAG. Our contribution is making it query-aware, citation-preserving, and fast enough for orchestration. Read the full research blog:
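The post gives no implementation details, so the following is only a toy Python sketch of what "query-aware" and "citation-preserving" could mean in practice: sentences are scored by word overlap with the query, and every kept sentence carries the id of the source it came from. The `Passage` type, the `compress` function, and the overlap scoring are all assumptions for illustration, not Perplexity's method.

```python
import re
from dataclasses import dataclass


@dataclass
class Passage:
    citation_id: int  # hypothetical index of the retrieved source
    text: str


def _tokens(s):
    """Lowercased alphanumeric word set used for overlap scoring."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def compress(query, passages, max_sentences=3):
    """Keep the sentences that overlap most with the query, with their citation ids."""
    q = _tokens(query)
    scored = []
    for p in passages:
        sentences = re.split(r"(?<=[.!?])\s+", p.text.strip())
        for order, sent in enumerate(sentences):
            overlap = len(q & _tokens(sent))
            if overlap:  # drop sentences that share no words with the query
                scored.append((overlap, p.citation_id, order, sent))
    # Highest overlap first; ties fall back to source order.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    kept = scored[:max_sentences]
    # Restore source order so the compressed context reads naturally.
    kept.sort(key=lambda t: (t[1], t[2]))
    return [(cid, sent) for _, cid, _, sent in kept]
```

Because each returned item is a `(citation_id, sentence)` pair, a downstream model can still cite the original source after irrelevant sentences have been dropped. A production system would presumably use a learned relevance model instead of word overlap.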

→ View original post on X — @perplexity_ai,

20 May 2026

Perplexity develops ROSE inference engine with CuTeDSL for faster GPU kernels

By

@perplexity_ai

–

06 May 2026 17h04

We’ve developed our own inference engine, the Runtime-Optimized Serving Engine (ROSE), to serve models ranging from embeddings to trillion-parameter LLMs. With CuTeDSL integrated into our inference engine, Perplexity can build the specialized GPU kernels faster to bring models up to

→ View original post on X — @perplexity_ai,

6 May 2026

GPT-5.5 Launches on Perplexity and as Default Orchestration Model

By

@perplexity_ai

–

24 April 2026 20h44

GPT-5.5 is now available on Perplexity for Max subscribers. GPT-5.5 is also rolling out as the default orchestration model in Computer for both Pro and Max subscribers.

→ View original post on X — @perplexity_ai,

24 April 2026

Moonshot Releases Kimi K2.6 Open-Weight Model

By

@perplexity_ai

–

23 April 2026 4h46

Kimi K2.6, the new state-of-the-art open-weight model from Moonshot, is now available for Pro and Max subscribers.

→ View original post on X — @perplexity_ai,

23 April 2026

Perplexity’s Pipeline Improves Base Model Accuracy and Efficiency

By

@perplexity_ai

–

22 April 2026 20h15

This pipeline is why the same base model produces more accurate, better-cited, and more efficient answers inside Perplexity than out of the box. Read our research:

→ View original post on X — @perplexity_ai,

22 April 2026

Reward Design Balances Correctness, Preference, and Efficiency

By

@perplexity_ai

–

22 April 2026 20h15

Our reward design combines correctness, preference, and efficiency. Preference only counts when the answer is correct. This keeps the model from optimizing for better-sounding wrong answers.
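The gating rule can be sketched in a few lines of Python. This is a minimal illustration, not the published reward: the function name, the weights `w_pref` and `w_eff`, and the choice to add the efficiency term without gating are all assumptions, since the post only states that preference counts when the answer is correct.

```python
def reward(correct, preference, efficiency, w_pref=0.5, w_eff=0.25):
    """Combine correctness, preference, and efficiency into one scalar.

    preference and efficiency are assumed to be scores in [0, 1];
    the weights are hypothetical placeholders.
    """
    correctness = 1.0 if correct else 0.0
    # Preference is gated: a wrong answer earns nothing for sounding good.
    gated_preference = preference if correct else 0.0
    return correctness + w_pref * gated_preference + w_eff * efficiency


# A polished but wrong answer scores below a plain but correct one.
wrong_but_polished = reward(correct=False, preference=1.0, efficiency=0.5)
right_but_plain = reward(correct=True, preference=0.0, efficiency=0.5)
```

With the gate in place, raising `preference` on an incorrect answer leaves its reward unchanged, which removes the incentive to polish wrong answers.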

→ View original post on X — @perplexity_ai,

22 April 2026

Fine-tuning and On-Policy RL for Model Optimization

By

@perplexity_ai

–

22 April 2026 20h15

We first fine-tune the model to follow instructions, stay within guardrails, and keep language consistent. Then we run on‑policy RL to improve search accuracy and tool efficiency while preserving those behaviors.

→ View original post on X — @perplexity_ai,

22 April 2026

New Research: SFT+RL Pipeline Boosts Search-Augmented AI Accuracy

By

@perplexity_ai

–

22 April 2026 20h15

We've published new research on how we post-train models for accurate search-augmented answers. Our SFT + RL pipeline improves search, citation quality, instruction following, and efficiency. With Qwen models, we match or beat GPT models on factuality at a lower cost.

→ View original post on X — @perplexity_ai,

22 April 2026