Change generate-copyright to generate HTML, with cargo dependencies included by jonathanpallant · Pull Request #128353 · rust-lang/rust

@rustbot rustbot added S-waiting-on-review

Status: Awaiting review from the assignee but also interested parties.

T-bootstrap

Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)

labels

Jul 29, 2024

Kobzol

thejpster

thejpster

This tool now scans for cargo dependencies and includes any important looking license files.

We do this because cargo package metadata is not sufficient - the Apache-2.0 license says you have to include any NOTICE file, for example. And authors != copyright holders (cargo has the former, we must include the latter).
This format works better with large amounts of structured data.

We also mark which deps are in the stdlib

@jonathanpallant

@jonathanpallant

I can't find a way to derive rinja::Template for Node - I think because it is a recursive type. So I rendered it manually using html_escape.

@jonathanpallant

@jonathanpallant

@jonathanpallant

@jonathanpallant

@jonathanpallant

Kobzol

@bors bors added S-waiting-on-bors

Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

and removed S-waiting-on-review

Status: Awaiting review from the assignee but also interested parties.

labels

Aug 7, 2024

bors added a commit to rust-lang-ci/rust that referenced this pull request

Aug 7, 2024
…iaskrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#128221 (Add implied target features to target_feature attribute)
 - rust-lang#128261 (impl `Default` for collection iterators that don't already have it)
 - rust-lang#128353 (Change generate-copyright to generate HTML, with cargo dependencies included)
 - rust-lang#128679 (codegen: better centralize function declaration attribute computation)
 - rust-lang#128732 (make `import.vis` is immutable)
 - rust-lang#128755 (Integrate crlf directly into related test file instead via of .gitattributes)
 - rust-lang#128772 (rustc_codegen_ssa: Set architecture for object crate for 32-bit SPARC)
 - rust-lang#128782 (unused_parens: do not lint against parens around &raw)

r? `@ghost`
`@rustbot` modify labels: rollup

rust-timer added a commit to rust-lang-ci/rust that referenced this pull request

Aug 7, 2024
Rollup merge of rust-lang#128353 - ferrocene:jonathanpallant/add-dependencies-to-copyright-file, r=Kobzol

Change generate-copyright to generate HTML, with cargo dependencies included

`x.py run generate-copyright` now produces `build/COPYRIGHT.html`. This includes a new format for in-tree dependencies, and also adds out-of-tree cargo dependencies.

After consulting expert opinion, I have elected to include every top-level:

* `*NOTICE*`
* `*AUTHOR*`
* `*LICENSE*`
* `*LICENCE*`, and
* `*COPYRIGHT*` file I can find - case-insensitive.

This is because the cargo package metadata's `author` field is not a list of copyright holders and does not meet the requirements of the Apache-2.0 license (which says you must include a NOTICE file with the binary if one was supplied by the author) nor the MIT license (which says you must include 'the above copyright notice').

I believe it would be appropriate to include this file with every Rust release, in order to do an even better job of appropriately recognising the efforts of the authors of the first-party and third-party libraries we are using here.

The output includes something like 524 copies of the Apache-2.0 text because they are not all identical. I think I count about 50 different variations by shasum - some differ in whitespace, while some have the boilerplate block at the bottom erroneously modified (don't modify the copy in the license, modify the copy you paste into your own source code!). Running `gzip` on the HTML file largely makes this problem go away, and the average browser is far happier with a ~6 MiB HTML file than the average Markdown viewer is with a ~6 MiB markdown file. But, if someone wants to, do they could submit a follow-up which de-dups the license text files and adds back-links to earlier identical copies (for some value of 'identical copy').

```console
$ xpy run generate-copyright
$ cd build
$ gzip -c COPYRIGHT.html > COPYRIGHT.gz
$ xz -c COPYRIGHT.html > COPYRIGHT.xz
$ ls -lh COPYRIGHT.*
-rw-r--r--  1 jonathan  staff   241K 29 Jul 17:19 COPYRIGHT.gz
-rw-r--r--@ 1 jonathan  staff   6.6M 29 Jul 11:30 COPYRIGHT.html
-rw-r--r--  1 jonathan  staff    59K 29 Jul 17:19 COPYRIGHT.xz
```

Here's an example [COPYRIGHT.gz](https://github.com/user-attachments/files/16416147/COPYRIGHT.gz).