feat: add S3 SSE configs (FsspecFileIO only) by achasnovskiy · Pull Request #3173 · apache/iceberg-python
Closes #2329
Rationale for this change
Some environments (e.g. AWS SCPs) require S3 uploads to include server-side encryption parameters (ServerSideEncryption, and for KMS also SSEKMSKeyId). PyIceberg writes table files through FileIO; for the fsspec backend this ultimately uses s3fs.S3FileSystem, which supports passing these fields via s3_additional_kwargs.
Previously, FsspecFileIO did not map any catalog/FileIO properties into s3_additional_kwargs, so users could not configure SSE for S3 writes through PyIceberg.
This change adds two optional configuration keys that are passed through to s3fs as ServerSideEncryption and SSEKMSKeyId respectively.
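The mapping described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the helper name `build_s3_additional_kwargs` and the two constant names are hypothetical, and only the property keys and the `ServerSideEncryption` / `SSEKMSKeyId` field names come from this description.

```python
from typing import Dict

# Property keys as named in this PR description (constant names are assumptions).
S3_SERVER_SIDE_ENCRYPTION = "s3.server-side-encryption"
S3_SSE_KMS_KEY_ID = "s3.sse-kms-key-id"


def build_s3_additional_kwargs(properties: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical helper: translate FileIO properties into s3_additional_kwargs."""
    additional_kwargs: Dict[str, str] = {}

    # Only forward a field when the user actually configured it.
    if sse := properties.get(S3_SERVER_SIDE_ENCRYPTION):
        additional_kwargs["ServerSideEncryption"] = sse
    if kms_key_id := properties.get(S3_SSE_KMS_KEY_ID):
        additional_kwargs["SSEKMSKeyId"] = kms_key_id

    return additional_kwargs
```

The resulting dictionary would then be passed as `s3_additional_kwargs=...` when constructing `s3fs.S3FileSystem`. Both keys are optional, so an unconfigured catalog yields an empty dictionary.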
Note: Default s3:// FileIO selection prefers PyArrowFileIO when installed. These new properties are implemented for FsspecFileIO only; users who need them should set py-io-impl: pyiceberg.io.fsspec.FsspecFileIO (or ensure the fsspec-backed FileIO is selected).
Are these changes tested?
Yes.
make lint
make test
New unit tests assert that s3fs.S3FileSystem is constructed with the expected s3_additional_kwargs when s3.server-side-encryption and (optionally) s3.sse-kms-key-id are set.
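The shape of such a test might look like the sketch below. It is illustrative only: `make_s3_filesystem` is a hypothetical stand-in for the fsspec S3 factory, and a `MagicMock` replaces `s3fs.S3FileSystem` so the example runs without s3fs or AWS access. The PR's real tests may be structured differently.

```python
from unittest import mock


def make_s3_filesystem(properties, filesystem_cls):
    # Hypothetical stand-in for the factory under test; not PyIceberg's actual code.
    additional_kwargs = {}
    if sse := properties.get("s3.server-side-encryption"):
        additional_kwargs["ServerSideEncryption"] = sse
    if kms_key_id := properties.get("s3.sse-kms-key-id"):
        additional_kwargs["SSEKMSKeyId"] = kms_key_id
    return filesystem_cls(s3_additional_kwargs=additional_kwargs)


def test_sse_only():
    fake_s3fs = mock.MagicMock(name="S3FileSystem")
    make_s3_filesystem({"s3.server-side-encryption": "AES256"}, fake_s3fs)
    # The KMS key id is optional, so it must be absent here.
    fake_s3fs.assert_called_once_with(
        s3_additional_kwargs={"ServerSideEncryption": "AES256"}
    )


def test_sse_kms_with_key_id():
    fake_s3fs = mock.MagicMock(name="S3FileSystem")
    make_s3_filesystem(
        {"s3.server-side-encryption": "aws:kms", "s3.sse-kms-key-id": "my-key-id"},
        fake_s3fs,
    )
    fake_s3fs.assert_called_once_with(
        s3_additional_kwargs={
            "ServerSideEncryption": "aws:kms",
            "SSEKMSKeyId": "my-key-id",
        }
    )
```

Asserting on the constructor call keeps the tests independent of network access, since no real S3 request is made.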
Are there any user-facing changes?
Yes. New catalog/FileIO configuration properties:
s3.server-side-encryption: e.g. AES256 or aws:kms (passed as ServerSideEncryption to S3 APIs via s3fs).
s3.sse-kms-key-id: KMS key id or ARN when using SSE-KMS (passed as SSEKMSKeyId).
Documentation updated in the S3 FileIO section of the configuration docs.
Example (.pyiceberg.yaml snippet):
    catalog:
      default:
        type: glue
        py-io-impl: pyiceberg.io.fsspec.FsspecFileIO
        s3.server-side-encryption: aws:kms
        s3.sse-kms-key-id: arn:aws:kms:us-east-1:123456789012:key/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
        # ... other catalog config (region, credentials, etc.)