Do not cache splits as attributes of the `EvaluationWindow` by shchur · Pull Request #86 · autogluon/fev

Issue #, if available:

Problem: Calling task.evaluation_summary() creates a new EvaluationWindow object for each window. Each window then calls _prepare_dataset_dict(), which loads and splits all data from scratch, so a value cached on the window is never reused. This costs more now that all datasets are stored in memory.

Solution:

  • Remove _dataset_dict caching from EvaluationWindow
  • Add return_past and return_future flags to past_future_split() to skip unnecessary slicing
  • get_ground_truth() now processes only the [id, timestamp, target] columns and calls past_future_split() with return_past=False, so past data is never sliced
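
The last two points can be illustrated with a minimal sketch. It is not the fev implementation: the function and flag names come from this description, while the signatures, the `cutoff` and `horizon` parameters, and the list-of-dicts representation of a single series are assumptions made for illustration only.

```python
from typing import Optional

# Hypothetical stand-in for one observation of a series:
# {"id": ..., "timestamp": ..., "target": ..., <other columns>}
Row = dict


def past_future_split(
    rows: list[Row],
    cutoff: int,
    horizon: int,
    return_past: bool = True,
    return_future: bool = True,
) -> tuple[Optional[list[Row]], Optional[list[Row]]]:
    """Split the rows of one series (sorted by timestamp) at position `cutoff`.

    A side that is not requested is never sliced and is returned as None.
    """
    past = rows[:cutoff] if return_past else None
    future = rows[cutoff : cutoff + horizon] if return_future else None
    return past, future


def get_ground_truth(rows: list[Row], cutoff: int, horizon: int) -> list[Row]:
    """Return the future part of the series, restricted to the scoring columns."""
    # Drop every column except id, timestamp and target before splitting,
    # so that the split touches as little data as possible.
    slim = [{key: row[key] for key in ("id", "timestamp", "target")} for row in rows]
    # Only the future side is needed; skip slicing the past.
    _, future = past_future_split(slim, cutoff, horizon, return_past=False)
    return future
```

With nothing cached on the window object, each call recomputes only the slice it needs, and a caller that wants just the ground truth never materializes the past portion.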

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.