Comparing oluka007:main...JamePeng:main · oluka007/llama-cpp-python

Commits on Apr 7, 2026

  3. ci: restrict cudaarch to Volta-Hopper to fix GitHub Actions timeout

    Using the `all` option for `cudaarch` on CUDA 12.4-12.6 causes the compilation process to exceed the 6-hour maximum execution limit on GitHub Actions, leading to cancelled jobs.
    
    To resolve this and reduce build times, the target architectures are now restricted to explicitly support compute capabilities 7.0 through 9.0 (`70-real` to `90-real`). This maintains support for all modern NVIDIA GPUs equipped with Tensor Cores (from Volta up to Hopper architectures) while keeping the build time safely within CI constraints.
    
    Signed-off-by: JamePeng <jame_peng@sina.com>
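
As a rough illustration of the restriction described above, the sketch below builds a semicolon-separated architecture value running from `70-real` to `90-real`. The intermediate capabilities (7.5, 8.0, 8.6, 8.9) and the helper name `cuda_arch_list` are assumptions made for illustration; the commit message only states the two endpoints, and the exact list used in the workflow is not shown here.

```python
# Assumed desktop/datacenter compute capabilities between 7.0 and 9.0.
# Only the endpoints (7.0 and 9.0) come from the commit message.
ASSUMED_CAPABILITIES = ["7.0", "7.5", "8.0", "8.6", "8.9", "9.0"]


def cuda_arch_list(capabilities, suffix="real"):
    """Build a semicolon-separated value in the style of CMAKE_CUDA_ARCHITECTURES.

    The "-real" suffix requests device code for that architecture only,
    without embedding PTX for forward compatibility.
    """
    return ";".join(f"{cap.replace('.', '')}-{suffix}" for cap in capabilities)


arch_value = cuda_arch_list(ASSUMED_CAPABILITIES)
print(arch_value)
```

Listing a handful of architectures explicitly instead of `all` means the compiler generates device code for far fewer targets, which is where the build-time saving comes from.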

  4. Update CI Action runner version

    microsoft/setup-msbuild@v2 -> v3
    actions/checkout@v5 -> v6
    actions/upload-artifact@v4 -> v6
    actions/download-artifact@v4 -> v6
    
    Signed-off-by: JamePeng <jame_peng@sina.com>

  8. feat(types): align with latest OpenAI API spec and fix type issues

    - Expand `CompletionUsage` with `PromptTokensDetails` and `CompletionTokensDetails` for granular token tracking.
    - Add `usage` to `CreateChatCompletionStreamResponse` to support usage reporting in streaming mode.
    - Fix duplicate `object` field in `CreateCompletionResponse`.
    - Update `ChatCompletionRequestAssistantMessage` to accept `None` for `content` and introduce the new `refusal` field.
    - Clean up `ChatCompletionRequestMessage` Union by removing the duplicate user message type.
    - Broaden `ChatCompletionToolChoiceOption` to fully support `allowed_tools` and `custom` tool choice behaviors.
    
    Signed-off-by: JamePeng <jame_peng@sina.com>
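
The type changes listed above could look roughly like the following `TypedDict` sketch. The field names inside the two detail types follow the public OpenAI usage object; whether this commit defines exactly these fields is an assumption, and `AssistantMessageSketch` and `cached_prompt_tokens` are hypothetical names used only for illustration.

```python
from typing import Literal, Optional, TypedDict


class PromptTokensDetails(TypedDict, total=False):
    # Field names assumed from the OpenAI usage object.
    cached_tokens: int
    audio_tokens: int


class CompletionTokensDetails(TypedDict, total=False):
    # Field names assumed from the OpenAI usage object.
    reasoning_tokens: int
    audio_tokens: int
    accepted_prediction_tokens: int
    rejected_prediction_tokens: int


class _CompletionUsageRequired(TypedDict):
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


class CompletionUsage(_CompletionUsageRequired, total=False):
    # The detail blocks are optional, so older responses still type-check.
    prompt_tokens_details: PromptTokensDetails
    completion_tokens_details: CompletionTokensDetails


class AssistantMessageSketch(TypedDict, total=False):
    # Hypothetical stand-in for ChatCompletionRequestAssistantMessage.
    role: Literal["assistant"]
    content: Optional[str]  # None is allowed, e.g. for tool-call-only turns
    refusal: Optional[str]


def cached_prompt_tokens(usage: CompletionUsage) -> int:
    """Return the cached prompt token count, or 0 when details are absent."""
    return usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)


usage: CompletionUsage = {
    "prompt_tokens": 12,
    "completion_tokens": 5,
    "total_tokens": 17,
    "prompt_tokens_details": {"cached_tokens": 8},
}
print(cached_prompt_tokens(usage))
```

Keeping the detail blocks optional lets callers read granular counts when a backend reports them while still accepting the plain three-field usage object.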

Commits on Apr 8, 2026

Commits on Apr 9, 2026

Commits on Apr 11, 2026
