fangtongen's Collections
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Paper • arXiv:2306.01116 • 43 upvotes
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Paper • arXiv:2205.14135 • 15 upvotes
RoFormer: Enhanced Transformer with Rotary Position Embedding
Paper • arXiv:2104.09864 • 17 upvotes
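As a rough illustration of the technique this title names (a sketch of rotary position embeddings in general, not the paper's reference implementation), the rotation can be applied with a few lines of NumPy. The half-split pairing below (feature i paired with i + dim/2) is a common implementation choice; the original RoFormer formulation interleaves adjacent feature pairs.

```python
import numpy as np

def rotary_embed(x, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, dim), dim even.

    Each position m rotates every feature pair by an angle m * theta_i,
    with a geometric frequency spectrum theta_i = base^(-2i/dim).
    """
    seq_len, dim = x.shape
    half = dim // 2
    inv_freq = 1.0 / (base ** (np.arange(half) / half))          # theta_i
    angles = np.arange(seq_len)[:, None] * inv_freq[None, :]     # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # 2-D rotation of each (x1, x2) pair by its position-dependent angle
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)
```

The property that motivates the method: the dot product between a rotated query and a rotated key depends only on their relative position offset, so shifting both by the same amount leaves attention scores unchanged.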
Language Models are Few-Shot Learners
Paper • arXiv:2005.14165 • 19 upvotes
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Paper • arXiv:2101.00027 • 10 upvotes
Fast Transformer Decoding: One Write-Head is All You Need
Paper • arXiv:1911.02150 • 9 upvotes
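The "one write-head" in this title refers to multi-query attention: each head keeps its own queries, but all heads share a single key/value projection, shrinking the decode-time KV cache. A minimal NumPy sketch of that idea (my own illustration under assumed shapes, not code from the paper; the causal mask is omitted for brevity):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_query_attention(x, w_q, w_k, w_v, n_heads):
    """Multi-query attention: per-head queries, one shared key/value head.

    x: (seq, dim); w_q: (dim, n_heads*head_dim); w_k, w_v: (dim, head_dim).
    """
    seq, _ = x.shape
    head_dim = w_k.shape[1]
    q = (x @ w_q).reshape(seq, n_heads, head_dim)  # per-head queries
    k, v = x @ w_k, x @ w_v                        # single K/V head shared by all query heads
    scores = np.einsum('qhd,kd->hqk', q, k) / np.sqrt(head_dim)
    attn = softmax(scores)                         # (heads, q_pos, k_pos)
    return np.einsum('hqk,kd->qhd', attn, v).reshape(seq, n_heads * head_dim)
```

Because `k` and `v` are computed once rather than per head, the cache that incremental decoding must keep is `n_heads` times smaller than in standard multi-head attention.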
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper • arXiv:2307.09288 • 250 upvotes
LLaMA: Open and Efficient Foundation Language Models
Paper • arXiv:2302.13971 • 20 upvotes
Orca: Progressive Learning from Complex Explanation Traces of GPT-4
Paper • arXiv:2306.02707 • 49 upvotes
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation
Paper • arXiv:2402.03216 • 6 upvotes
Mistral 7B
Paper • arXiv:2310.06825 • 58 upvotes