add check if libmagic fails by aadland6 · Pull Request #4273 · Unstructured-IO/unstructured
# -- on some environments libmagic can return a generic/unhelpful MIME-type
# -- (like "octet-stream") for files that the `filetype` package can identify.
# -- when that happens we retry using `filetype` for `FileType.UNK` results.
if LIBMAGIC_AVAILABLE:
we only need to call this when file_type == FileType.UNK and LIBMAGIC_AVAILABLE, right?
Yes - only cases where libmagic is already available but failed to get the right file type, i.e. when:
- mime_type is not None (otherwise the function already returned None), and
- FileType.from_mime_type(mime_type) returned FileType.UNK, and
- LIBMAGIC_AVAILABLE is True.
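For illustration, the three conditions above could be sketched roughly as follows. This is a self-contained approximation, not the code in this PR: the `FileType` enum here is a tiny stand-in for the library's own, and `detect_with_fallback` and its `fallback` callable (standing in for a `filetype`-based retry) are hypothetical names.

```python
from enum import Enum
from typing import Callable, Optional


class FileType(Enum):
    """Hypothetical stand-in for the library's FileType enum."""

    PDF = "application/pdf"
    UNK = "unknown"

    @classmethod
    def from_mime_type(cls, mime_type: str) -> "FileType":
        # Map a known MIME-type to its member; anything else is UNK.
        for member in cls:
            if member is not cls.UNK and member.value == mime_type:
                return member
        return cls.UNK


def detect_with_fallback(
    mime_type: Optional[str],
    libmagic_available: bool,
    fallback: Callable[[], Optional[FileType]],
) -> Optional[FileType]:
    """Retry with `fallback` only when libmagic ran but gave an unknown type."""
    # Condition 1: no MIME-type at all means there is nothing to retry.
    if mime_type is None:
        return None

    file_type = FileType.from_mime_type(mime_type)

    # Conditions 2 and 3: the MIME-type mapped to UNK and libmagic produced it.
    if file_type is FileType.UNK and libmagic_available:
        retried = fallback()
        if retried is not None:
            return retried

    return file_type
```

With this shape the fallback is never invoked when libmagic already produced a usable type, which is the point raised in the review comment.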
qued approved these changes Mar 3, 2026