Package llama.cpp: Information

    Source package: llama.cpp
    Version: 8681-alt1
    Build time: Apr 7, 2026, 01:20 AM in task #414352
    License: MIT
    Summary: LLM inference in C/C++
    Description:
    Plain C/C++ implementation of inference for many LLM models, with no
    external dependencies. AVX, AVX2, AVX512, and AMX support for x86
    architectures. Mixed F16/F32 precision. 1.5-bit, 2-bit, 3-bit, 4-bit,
    5-bit, 6-bit, and 8-bit integer quantization for faster inference and
    reduced memory use. Supports CPU, GPU, and hybrid CPU+GPU inference.
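
    As a rough illustration of what this inference looks like through the
    packaged library (llama.h and libllama from libllama-devel), here is a
    minimal sketch modeled on upstream's "simple" example. The C API
    changes between releases, so names such as llama_model_load_from_file
    and llama_init_from_model match recent builds like this one; model.gguf
    is a placeholder path and the build command is an assumption:

      // minimal greedy text generation sketch
      // build (assumed): g++ -O2 demo.cpp -lllama
      #include <llama.h>
      #include <cstdio>
      #include <string>
      #include <vector>

      int main() {
          llama_backend_init();

          llama_model_params mp = llama_model_default_params();
          mp.n_gpu_layers = 0; // CPU only; raise to offload layers to a GPU

          // placeholder path: models are not provided, see NOTE below
          llama_model * model = llama_model_load_from_file("model.gguf", mp);
          if (model == nullptr) { std::fprintf(stderr, "load failed\n"); return 1; }
          const llama_vocab * vocab = llama_model_get_vocab(model);

          llama_context * ctx = llama_init_from_model(model, llama_context_default_params());

          // tokenize the prompt; a first call with a NULL buffer returns -(needed count)
          std::string prompt = "The capital of France is";
          const int n = -llama_tokenize(vocab, prompt.c_str(), prompt.size(), nullptr, 0, true, true);
          std::vector<llama_token> toks(n);
          llama_tokenize(vocab, prompt.c_str(), prompt.size(), toks.data(), n, true, true);

          // greedy sampler chain
          llama_sampler * smpl = llama_sampler_chain_init(llama_sampler_chain_default_params());
          llama_sampler_chain_add(smpl, llama_sampler_init_greedy());

          // feed the prompt, then generate token by token
          llama_batch batch = llama_batch_get_one(toks.data(), n);
          llama_token t = 0; // lives across iterations: batch points at it
          for (int i = 0; i < 32; i++) {
              if (llama_decode(ctx, batch) != 0) break;
              t = llama_sampler_sample(smpl, ctx, -1);
              if (llama_vocab_is_eog(vocab, t)) break;
              char buf[256];
              const int len = llama_token_to_piece(vocab, t, buf, sizeof(buf), 0, true);
              if (len > 0) std::fwrite(buf, 1, len, stdout);
              batch = llama_batch_get_one(&t, 1);
          }
          std::printf("\n");

          llama_sampler_free(smpl);
          llama_free(ctx);
          llama_model_free(model);
          llama_backend_free();
          return 0;
      }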
    
    Supported models:
    
       LLaMA models, Mistral 7B, Mixtral MoE, Falcon, Chinese LLaMA /
       Alpaca and Chinese LLaMA-2 / Alpaca-2, Vigogne (French), Koala,
       Baichuan 1 & 2 + derivations, Aquila 1 & 2, StarCoder models, Refact,
       Persimmon 8B, MPT, Bloom, Yi models, StableLM models, DeepSeek models,
       Qwen models, PLaMo-13B, Phi models, GPT-2, Orion 14B, InternLM2,
       CodeShell, Gemma, Mamba, Grok-1, Xverse, Command-R models, SEA-LION,
       GritLM-7B + GritLM-8x7B, OLMo, GPT-NeoX + Pythia, Snowflake-Arctic
       MoE, Smaug, Poro 34B, BitNet b1.58 models, Flan T5, OpenELM models,
       ChatGLM3-6B + ChatGLM4-9B + GLMEdge-1.5B + GLMEdge-4B, SmolLM,
       EXAONE-3.0-7.8B-Instruct, FalconMamba models, Jais, Bielik-11B-v2.3,
       RWKV-6, QRWKV-6, GigaChat-20B-A3B, Trillion-7B-preview, Ling models,
       LFM2 models, Hunyuan models, BailingMoeV2 (Ring/Ling 2.0) models
    
    Multimodal models:
    
       LLaVA 1.5 models, BakLLaVA, Obsidian, ShareGPT4V, MobileVLM 1.7B/3B
       models, Yi-VL, Mini CPM, Moondream, Bunny, GLM-EDGE, Qwen2-VL,
       LFM2-VL
    
    NOTE:
      MODELS ARE NOT PROVIDED. You'll need to download them from the original
      sites (or Hugging Face Hub).
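
    Once a model has been downloaded (as a GGUF file), the integer
    quantization mentioned in the description can be applied with a single
    library call. A hedged sketch against recent llama.h: the file names
    are placeholders, and Q4_K_M is just one of the supported output
    formats (upstream also ships this conversion as the llama-quantize
    command-line tool):

      #include <llama.h>
      #include <cstdio>

      int main() {
          llama_model_quantize_params qp = llama_model_quantize_default_params();
          qp.ftype   = LLAMA_FTYPE_MOSTLY_Q4_K_M; // 4-bit "K-quant" output
          qp.nthread = 4;                         // worker threads

          // read an F16/F32 GGUF, write a quantized GGUF (returns 0 on success)
          if (llama_model_quantize("model-f16.gguf", "model-q4_k_m.gguf", &qp) != 0) {
              std::fprintf(stderr, "quantization failed\n");
              return 1;
          }
          return 0;
      }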
    
    Overall, this is raw and EXPERIMENTAL: no warranty, no support.

    List of RPM packages built from this SRPM:
    libllama (x86_64, aarch64)
    libllama-debuginfo (x86_64, aarch64)
    libllama-devel (x86_64, aarch64)
    llama.cpp (x86_64, aarch64)
    llama.cpp-cpu (x86_64, aarch64)
    llama.cpp-cpu-debuginfo (x86_64, aarch64)
    llama.cpp-cuda (x86_64)
    llama.cpp-cuda-debuginfo (x86_64)
    llama.cpp-vulkan (x86_64, aarch64)
    llama.cpp-vulkan-debuginfo (x86_64, aarch64)

    Maintainer: Vitaly Chikunov

    List of contributors:
    Vitaly Chikunov

    Build dependencies:
      1. cmake
      2. nvidia-cuda-devel-static
      3. ctest
      4. python3-module-jinja2
      5. gcc-c++
      6. gcc12-c++
      7. rpm-macros-cmake
      8. glslc
      9. help2man
      10. tinyllamas-gguf
      11. libcurl-devel
      12. libgomp-devel
      13. libssl-devel
      14. libstdc++-devel-static
      15. libvulkan-devel

    Last changed


    April 6, 2026 Vitaly Chikunov 1:8681-alt1
    - Update to b8681 (2026-04-06).
    March 22, 2026 Vitaly Chikunov 1:8470-alt1
    - Update to b8470 (2026-03-22).
    March 3, 2026 Vitaly Chikunov 1:8192-alt1
    - Update to b8192 (2026-03-03).