Comparing james-willis:master...graphframes:main · james-willis/graphframes

Commits on Mar 31, 2026

  1. fix: Java 8 compatibility for KMinSampling and benchmarks (graphframes#817)
    
    * fix: replace java.util.List.of with Collections.emptyList for Java 8 compat
    
    List.of() was added in Java 9 and breaks the release build on JDK 8
    (Spark 3.5.x). Fixes graphframes#804.
    
    * ci: add JDK 8 to Scala CI matrix
    
    JDK 8 was only tested during publish (push to main), so Java 8
    incompatibilities like List.of() were not caught on PRs.
    
    * fix: replace Path.of with Paths.get for Java 8 compat in benchmarks
    
    Path.of() was added in Java 11. Paths.get() is available since Java 7.
    
    * ci: skip tests on JDK 8, compile-only
    
    LDBC test data downloads fail on JDK 8 due to Cloudflare TLS
    fingerprinting (JA3) rejecting Java 8's older TLS stack with 403.
    JDK 8 entry only needs to verify compilation; tests run on JDK 11/17.
    
    * Revert "ci: skip tests on JDK 8, compile-only"
    
    This reverts commit 125f367.
    
    * fix: use curl for LDBC downloads to avoid Cloudflare TLS rejection on JDK 8
    
    The LDBC CDN is behind Cloudflare, which uses TLS fingerprinting (JA3)
    to detect automated traffic. Java 8's TLS stack produces a distinctive
    fingerprint that Cloudflare rejects with HTTP 403. Java 11+ rewrote the
    TLS implementation (JEP 332) so its fingerprint is not flagged.
    
    Replace URLConnection with curl in both LDBCUtils and ParquetDataLoader.
    This is consistent with the existing use of shell commands (zstd, tar)
    in these files.
    
    * style: fix scalafmt line length in LDBCUtils
    
    * style: fix scalafix import grouping in ParquetDataLoader
    
    * fix: keep old URLConnection code as commented-out TODO
    
    Per review: preserve the original download code so it can be restored
    after Spark 3.5.x EOL (~April 2026) when JDK 8 support is dropped.
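
The substitutions described in this commit can be sketched in plain Java as below. This is a minimal illustration, not the code from the PR: the class `Java8Compat`, its method names, the example URL, and the particular curl flags are all assumptions made for the example. It shows the Java 8-safe replacements for `List.of()` and `Path.of()`, and one way to shell out to curl instead of using `URLConnection`.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class; none of these names come from graphframes.
public class Java8Compat {

    // List.of() exists only on Java 9+, so it fails to compile against JDK 8.
    // Collections.emptyList() is the Java 8-safe way to get an immutable empty list.
    static <T> List<T> emptyList() {
        return Collections.emptyList();
    }

    // Path.of(...) exists only on Java 11+.
    // Paths.get(...) has been available since Java 7 and builds the same Path.
    static Path resolve(String first, String... more) {
        return Paths.get(first, more);
    }

    // Build the curl argument list for a download.
    // The flag selection here is an assumption, not taken from the PR:
    // fail on HTTP errors, follow redirects, stay quiet except for errors.
    static List<String> curlCommand(String url, Path target) {
        return Arrays.asList(
            "curl", "--fail", "--location", "--silent", "--show-error",
            "--output", target.toString(), url);
    }

    // Run curl as an external process instead of opening a URLConnection,
    // so the TLS handshake is performed by curl rather than by the JVM.
    static void download(String url, Path target)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(curlCommand(url, target))
            .inheritIO()
            .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("curl exited with " + exitCode + " for " + url);
        }
    }

    public static void main(String[] args) {
        // Placeholder URL; nothing is downloaded here, the command is only printed.
        Path target = resolve("data", "ldbc", "example.tar.zst");
        System.out.println(emptyList().size());
        System.out.println(curlCommand("https://example.invalid/example.tar.zst", target));
    }
}
```

Delegating to curl adds a dependency on an external binary, which the commit message treats as acceptable because the same files already call `zstd` and `tar`.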

  2. docs: add 0.11.0 release blog post (graphframes#814)

    * docs: add 0.11.0 release blog post
    
    * docs: address PR review comments on 0.11.0 blog post
    
    - Note Two Phase is the same algorithm as before, just renamed
    - Remove API refactoring section
    - Remove Pregel code example and benchmarks section
    - Add Spark 4.1.0 requirement note for DataSketches triangle counting
    - Remove bug fixes and future steps sections
    
    * docs: fix Property Graph API examples to use string group names
    
    * docs: fix RandomWalkEmbeddings API usage in blog post
    
    * docs: address SemyonSinchenko review comments
    
    - Fix Randomized Contraction description per Sem's feedback
    - Add hub-problem motivation for KMinSampling
    - Use full Hash2Vec citation
