parser: add IANA time_zone support for naive timestamps by arnikola · Pull Request #11639 · fluent/fluent-bit
Conversation
Add optional parser setting time_zone (IANA name, e.g. from tzdata) so naive timestamps from time_format are converted with local rules and DST, instead of only fixed offset or system timezone.
[N/A]
Enter [N/A] in the box, if an item is not applicable to your change.
Testing
Before we can approve your change; please submit the following in a comment:
- Example configuration file for the change
- Debug log output from testing the change
- Attached Valgrind output that shows no leaks or memory corruption was found
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
- Run local packaging test showing all targets (including any new ones) build.
- Set
ok-package-testlabel to test for all targets (requires maintainer to do).
Documentation
- Documentation required for this feature
Backporting
- Backport to latest stable release.
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Important
Review skipped
Auto reviews are disabled on base/target branches other than the default branch.
Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 350dd3b9-7a64-4202-bef7-8e0c58c7a54a
You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.
Use the checkbox below for a quick retry:
- 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
- Create PR with unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Comment @coderabbitai help to get the list of available commands and usage tips.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7e01d163f6
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Comment on lines +71 to +74
| if (parser->time_zone) { | ||
| char *old_tz = NULL; | ||
| const char *prev; | ||
| struct tm tmp; |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Skip time_zone conversion when timestamp already has offset
The new time_zone branch always uses mktime in the configured zone, even when time_format includes %z/%Z and the parsed record already carries an explicit offset. In that case flb_parser_time_lookup() has already parsed the source zone, but this path ignores it and reinterprets the wall clock in parser->time_zone, producing shifted epochs (for example, a +0000 input gets converted as local New York time). Please either reject time_zone with %z/%Z formats in flb_parser_create() or bypass this branch when parser->time_with_tz is true.
Useful? React with 👍 / 👎.
Comment on lines +86 to +90
| if (setenv("TZ", parser->time_zone, 1) != 0) { | ||
| flb_free(old_tz); | ||
| pthread_mutex_unlock(&flb_time_zone_mutex); | ||
| return (time_t) -1; | ||
| } |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid mutating process TZ during per-record parsing
This code changes TZ and calls tzset() for every parsed record, which is process-global state. The mutex only serializes this function, but other paths still call mktime without this lock (e.g. flb_parser_tm2time_simple for time_system_timezone), so concurrent parsing can observe the temporary timezone and compute wrong epochs. This creates cross-parser timestamp corruption when one thread uses time_zone and another thread parses local/system timestamps at the same time.
Useful? React with 👍 / 👎.
Example config files:
config.conf:
[SERVICE]
Parsers_File parsers.conf
[INPUT]
Name tail
Path tz-example.log
Tag tz.test
Parser tz_naive
Read_from_Head true
Exit_On_Eof true
[OUTPUT]
Name stdout
Match tz.test
Format json_lines
parsers.conf
[PARSER]
Name tz_naive
Format regex
Regex ^(?<ts>[^ ]+) (?<msg>.+)$
Time_Key ts
Time_Format %Y-%m-%dT%H:%M:%S
Time_Keep On
Time_Strict On
Time_Zone Australia/Sydney
The patch direction looks great but we need to add Signed-off line on every commit. So, could you add this line?
This can be achieved by:
To add your Signed-off-by line to every commit in this branch:
Ensure you have a local copy of your branch by [checking out the pull request locally via command line](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/checking-out-pull-requests-locally).
In your local branch, run: git rebase HEAD~1 --signoff
Force push your changes to overwrite the branch: git push --force-with-lease origin arnikola/parser-timezones-dst
ref: https://github.com/fluent/fluent-bit/pull/11639/checks?check_run_id=69300784903
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters