euclaise

https://euclaise.xyz

AI & ML interests

None yet

Recent Activity

liked a model 4 days ago

Cactus-Compute/needle

upvoted a paper 5 days ago

Can Muon Fine-tune Adam-Pretrained Models?

upvoted a paper 5 days ago

Rethinking State Tracking in Recurrent Models Through Error Control Dynamics

Organizations

upvoted 3 papers 5 days ago

upvoted 10 papers 2 months ago

Efficient Exploration at Scale

Paper • 2603.17378 • Published Mar 18 • 14

V_{0.5}: Generalist Value Model as a Prior for Sparse RL Rollouts

Paper • 2603.10848 • Published Mar 11 • 14

Flash-KMeans: Fast and Memory-Efficient Exact K-Means

Paper • 2603.09229 • Published Mar 10 • 82

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7, 2025 • 189

Attention Residuals

Paper • 2603.15031 • Published Mar 16 • 184

ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates

Paper • 2502.06772 • Published Feb 10, 2025 • 22

RAT: Bridging RNN Efficiency and Attention Accuracy in Language Modeling

Paper • 2507.04416 • Published Jul 6, 2025 • 1

RAT+: Train Dense, Infer Sparse -- Recurrence Augmented Attention for Dilated Inference

Paper • 2602.18196 • Published Feb 20 • 1

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 59

Lost in Backpropagation: The LM Head is a Gradient Bottleneck

Paper • 2603.10145 • Published Mar 10 • 13

upvoted 6 papers 3 months ago

Online Vector Quantized Attention

Paper • 2602.03922 • Published Feb 3 • 1

Softmax Linear Attention: Reclaiming Global Competition

Paper • 2602.01744 • Published Feb 2 • 1

Test-Time Training with KV Binding Is Secretly Linear Attention

Paper • 2602.21204 • Published Feb 24 • 32

On the "Induction Bias" in Sequence Models

Paper • 2602.18333 • Published Feb 20 • 4

Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking

Paper • 2602.21196 • Published Feb 24 • 7

One-step Language Modeling via Continuous Denoising

Paper • 2602.16813 • Published Feb 18 • 4

upvoted an article 3 months ago

Differential Transformer V2

Article • microsoft • Jan 20 • 51
