Comparing JamePeng:main...LSDJesus:main · JamePeng/llama-cpp-python

Commits on Apr 6, 2026

  2. feat: add semantic memory injection research and layer capture/skip C API extensions

    - Add copilot instructions for development workflow and architecture overview
    - Document semantic memory injection methodology and K/V injection findings
    - Add synthetic token architecture brainstorm and KV pruning specifications
    - Implement layer capture/skip C API extensions (llama_set_layer_capture, llama_set_layer_skip, llama_get_embeddings_layer_ith, etc.)
    - Add activation analysis and KV injection/compression/merge experiment scripts
    - Reorganize original docs into docs/Original_repo/ subdirectory
    - Update .gitignore for generated artifacts and analysis outputs
    - Extend llama_cpp.py with new ctypes bindings for layer manipulation
    - Extend _internals.py and llama.py with high-level Python APIs for layer operations
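
The layer capture/skip idea named in this commit can be illustrated with a toy, pure-Python sketch. It does not call the fork's C API: the signatures of llama_set_layer_capture and llama_set_layer_skip are not shown here, so run_layers and its parameters are hypothetical names. The sketch assumes that "capture" means recording the hidden state after a layer and that "skip" means passing the hidden state through a layer unchanged.

```python
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def run_layers(
    layers: Sequence[Layer],
    x: Vector,
    capture: Iterable[int] = (),
    skip: Iterable[int] = (),
) -> Tuple[Vector, Dict[int, Vector]]:
    """Run x through layers, recording and bypassing selected layer indices.

    Hypothetical helper: it models the assumed semantics only, not the
    actual llama.cpp implementation.
    """
    capture_set, skip_set = set(capture), set(skip)
    captured: Dict[int, Vector] = {}
    h = list(x)
    for i, layer in enumerate(layers):
        if i not in skip_set:
            h = layer(h)  # a skipped layer acts as the identity
        if i in capture_set:
            captured[i] = list(h)  # copy so later layers cannot mutate it
    return h, captured


# Three stand-in "layers" operating element-wise on a small vector.
toy_layers: List[Layer] = [
    lambda v: [a + 1.0 for a in v],
    lambda v: [a * 2.0 for a in v],
    lambda v: [a - 3.0 for a in v],
]

final, states = run_layers(toy_layers, [1.0, 2.0], capture=[1])
final_skipped, _ = run_layers(toy_layers, [1.0, 2.0], skip=[1])
```

Here states[1] holds the hidden state after the second layer, and final_skipped shows the effect of bypassing that layer.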

  6. ci: fix release tag_name for manual dispatch

    softprops/action-gh-release requires an explicit tag_name when triggered
    via workflow_dispatch (github.ref is refs/heads/main, not a tag).
    Generate tag from wheel version + short SHA for manual runs.
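
The tag selection described in this commit message can be sketched in Python. The exact tag format used by the workflow is not shown here, so the "v<version>-<short sha>" shape, the 7-character SHA length, and the function name resolve_tag_name are assumptions made for illustration.

```python
def resolve_tag_name(github_ref: str, wheel_version: str, sha: str) -> str:
    """Pick a release tag name for a workflow run.

    Hypothetical helper: a tag-triggered run reuses the pushed tag, while a
    manual run (where the ref is a branch) gets a generated tag.
    """
    prefix = "refs/tags/"
    if github_ref.startswith(prefix):
        # Triggered by a tag push: the ref already names the tag.
        return github_ref[len(prefix):]
    # Manual dispatch: build a tag from the wheel version and a short SHA.
    return f"v{wheel_version}-{sha[:7]}"


manual_tag = resolve_tag_name("refs/heads/main", "0.3.16", "a1b2c3d4e5f6")
pushed_tag = resolve_tag_name("refs/tags/v0.3.16", "0.3.16", "a1b2c3d4e5f6")
```

In a workflow, the resulting string would be the value passed as tag_name to softprops/action-gh-release.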

  7. docs: add qwen3-vl hacking guide and penultimate layer extraction

    - Add comprehensive hacking guide for llama.cpp qwen3-vl model modifications
    - Add penultimate hidden states PR with patch for transformer layer capture
    - Add activation steering script for semantic memory research
    - Add KV fragment testing utilities
    - Add GGUF dequantization test harness
    - Update per-layer embeddings results
    - Add steering delta coefficients for layer injection
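
As a rough illustration of two ideas named in this commit, the sketch below selects the penultimate layer's hidden state and applies a steering delta scaled by a coefficient. It is a generic additive-steering example (h' = h + coeff * delta), not the repository's script: the function names, the list-of-floats representation, and the additive form are all assumptions.

```python
from typing import List, Sequence

Vector = List[float]


def penultimate_state(layer_states: Sequence[Vector]) -> Vector:
    """Return the hidden state of the second-to-last layer (hypothetical helper)."""
    if len(layer_states) < 2:
        raise ValueError("need at least two layers to have a penultimate one")
    return list(layer_states[-2])


def apply_steering(hidden: Vector, delta: Vector, coeff: float) -> Vector:
    """Shift a hidden state along a steering direction: h + coeff * delta."""
    if len(hidden) != len(delta):
        raise ValueError("hidden state and delta must have the same length")
    return [h + coeff * d for h, d in zip(hidden, delta)]


# Per-layer hidden states for one token, as plain lists for illustration.
per_layer = [[0.0, 1.0], [1.0, 2.0], [5.0, 5.0]]
h_pen = penultimate_state(per_layer)
steered = apply_steering(h_pen, delta=[0.5, -1.0], coeff=2.0)
```

A coefficient of 0.0 leaves the hidden state unchanged, which makes it a convenient baseline when comparing steering strengths.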
