Commit History
sync : ggml
9f6adbb
cmake: Add ability to pass in GGML_BUILD_NUMBER (ggml/1096)
729db34
Committed by Christian Kastner
readme : add maintenance roadmap
200b3e2
ci : add stalebot
1913738
node : add max_len params in node addon (#2760)
91b432f
Committed by billyct
talk-llama : sync llama.cpp
9b44911
coreml : always convert to "neuralnetwork" (#2770)
e351248
ci : more git
b349a86
ci : install git
3f57eda
ci : use ubuntu-22.04 instead of ubuntu-latest
e96193b
cmake : sync cmake scripts
3b88d3c
sync : ggml
e4e0be6
scripts : fix sync paths
88d6566
CUDA: fix Volta FlashAttention logic (llama/11615)
6df9571
HIP: fix flash_attn_stream_k_fixup warning (llama/11604)
acfd94f
CUDA/HIP: add support for selectable warp size to mmv (llama/11519)
ed08269
Committed by uvos
HIP: add GGML_CUDA_CC_IS_* for amd familys as increasing cc archtectures for amd gpus are not supersets of eatch other (llama/11601)
4850c24
Committed by uvos
CUDA: use mma PTX instructions for FlashAttention (llama/11583)
f328957
`ci`: use sccache on windows instead of ccache (llama/11545)
9ed1962
Committed by Olivier Chafik
HIP: require at least HIP 5.5
72c425b
Committed by uvos
HIP: Prepare reduction operators for wave 64
bc1c1a4
Committed by uvos
CUDA/HIP: add warp_size to cuda_device_info
e538e2c
Committed by uvos
vulkan: implement initial support for IQ2 and IQ3 quantizations (llama/11360)
bd93c1b
vulkan: Catch pipeline creation failure and print an error message (llama/11436)
d4f6b2c
HIP: Supress transformation warning in softmax.cu
72c6f1d
Committed by uvos
HIP: Only call rocblas_initialize on rocblas versions with the multiple instantation bug (llama/11080)
82bb7f3
Committed by Nikita Sarychev
cmake : don't fail on `GGML_CPU=OFF` (llama/11457)
6406a6e
Committed by someone13574
SYCL : SOFTMAX F16 mask support and other fixes (llama/11261)
8aaf0c8
AMD: parse the architecture as supplied by gcnArchName (llama/11244)
04b01d8
Committed by Haus1
metal: Handle null returned from MTLCreateSystemDefaultDevice() (llama/11441)
4e38ed4
Committed by Ihar Hrachyshka
metal : use residency sets (llama/11427)
9da4d68
cmake: add ggml find package (llama/11369)
ca6577f
vulkan: compile shaders on-demand (llama/11406)
5c008f7
Hip: disable VMM on hip as it seams that it dosent work in some configurations (llama/11420)
2cc4df4
Committed by uvos
hip : Add hipGraph and VMM support to ROCM (llama/11362)
089afa0
Committed by uvos
CUDA: fix FP16 cuBLAS GEMM (llama/11396)
7b7c5d3
rocBLAS: Avoid fp32->fp16->fp32 conversion on cdna (llama/11356)
6f5687a
Committed by uvos
CPU/CUDA: fix (GQA) mul mat back, add CUDA support (llama/11380)
855a9fe
cmake : avoid -march=native when reproducible build is wanted (llama/11366)
3cae2d9
Committed by Bernhard M. Wiedemann
Vulkan-run-test: fix mmq_wg_denoms (llama/11343)
133a580
Committed by amd-dwang
vulkan: sort shaders for more deterministic binary (llama/11315)
d7c0046
vulkan: fix diag_mask_inf (llama/11323)
f76204e
rpc : better caching of the base buffer pointer (llama/11331)
81a6cae
metal : fix out-of-bounds write (llama/11314)
1101050
vulkan: fix coopmat2 validation failures (llama/11284)
f2cc7e9
SYCL: Introducing memory host pool (llama/11251)
aedb0b3
Committed by Nicolò Scipione