Make dataset snippets testable by tseaver · Pull Request #1832 · googleapis/google-cloud-python
…mmediately deploy functions to BigQuery (#1832)

* Refactor function deployment to avoid code duplication

This commit refactors the implementation of immediate deployment for remote functions and UDFs to eliminate code duplication introduced in a previous commit.

Changes:
- The `remote_function` and `udf` methods in `bigframes.functions._function_session.FunctionSession` now accept an optional `deploy_immediately: bool` parameter (defaulting to `False`). The previous `deploy_remote_function` and `deploy_udf` methods in `FunctionSession` have been removed, and their logic is now incorporated into the unified methods.
- The public API functions `bigframes.pandas.deploy_remote_function` and `bigframes.pandas.deploy_udf` now call the corresponding `FunctionSession` methods with `deploy_immediately=True`.
- The public API functions `bigframes.pandas.remote_function` and `bigframes.pandas.udf` call the `FunctionSession` methods with `deploy_immediately=False` (relying on the default).
- Unit tests in `tests/unit/functions/test_remote_function.py` have been updated to patch the unified `FunctionSession` methods and verify the correct `deploy_immediately` boolean is passed based on which public API function is called.

Note: The underlying provisioning logic in `FunctionSession` currently deploys functions immediately regardless of the `deploy_immediately` flag. This flag serves as an indicator of intent and allows for future enhancements to support true lazy deployment if desired, without further API changes.

* Refactor function deployment to use distinct methods

This commit corrects a previous refactoring attempt to eliminate code duplication and properly separates immediate-deployment functions from standard (potentially lazy) functions.

Changes:
- `bigframes.functions._function_session.FunctionSession` now has distinct methods: `remote_function`, `udf`, `deploy_remote_function`, and `deploy_udf`. The `deploy_immediately` flag has been removed from this class.
- `deploy_remote_function` and `deploy_udf` methods in `FunctionSession` are responsible for ensuring immediate deployment by calling the underlying provisioning logic directly. The standard `remote_function` and `udf` methods in `FunctionSession` also currently call this provisioning logic, meaning all functions are deployed immediately as of now, but the structure allows for future lazy evaluation for standard functions without changing the deploy variants' contract.
- Public API functions in `bigframes.pandas` (`remote_function`, `udf`, `deploy_remote_function`, `deploy_udf`) now correctly delegate to their corresponding distinct methods in `FunctionSession` (via the `Session` object).
- Unit tests in `tests/unit/functions/test_remote_function.py` have been updated to mock and verify calls to the correct distinct methods on `bigframes.session.Session`.

This resolves the issue of using a boolean flag to control deployment type and instead relies on calling specific, dedicated methods for immediate deployment, as requested.

* Simplify internal deploy_remote_function and deploy_udf calls

This commit simplifies the implementation of `deploy_remote_function` and `deploy_udf` within `bigframes.functions._function_session.FunctionSession`.

Given that the standard `remote_function` and `udf` methods in `FunctionSession` already perform immediate deployment of resources (as the underlying provisioning logic they call is immediate), the `deploy_remote_function` and `deploy_udf` methods in the same class are simplified to directly call `self.remote_function(...)` and `self.udf(...)` respectively.

This change makes the distinction between the `deploy_` variants and the standard variants in `FunctionSession` primarily a matter of semantic clarity and intent at that level; both paths currently result in immediate deployment.
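
The delegation described in this commit could look roughly like the sketch below. It is illustrative only: `FunctionSessionSketch`, `_provision`, the `provisioned` list, and the `(func, **kwargs)` signatures are assumptions made for the example, not the actual `bigframes` code.

```python
from typing import Any, Callable


class FunctionSessionSketch:
    """Hypothetical stand-in for FunctionSession; not the bigframes implementation."""

    def __init__(self) -> None:
        # Records each provisioning call so the behavior can be inspected.
        self.provisioned = []

    def _provision(self, kind: str, func: Callable[..., Any]) -> Callable[..., Any]:
        # Placeholder for the underlying provisioning logic, which the
        # commit message describes as immediate.
        self.provisioned.append((kind, func.__name__))
        return func

    def remote_function(self, func: Callable[..., Any], **kwargs: Any) -> Callable[..., Any]:
        return self._provision("remote_function", func)

    def udf(self, func: Callable[..., Any], **kwargs: Any) -> Callable[..., Any]:
        return self._provision("udf", func)

    def deploy_remote_function(self, func: Callable[..., Any], **kwargs: Any) -> Callable[..., Any]:
        # Same behavior as remote_function today; the separate name signals intent.
        return self.remote_function(func, **kwargs)

    def deploy_udf(self, func: Callable[..., Any], **kwargs: Any) -> Callable[..., Any]:
        # Same behavior as udf today; the separate name signals intent.
        return self.udf(func, **kwargs)


def add_one(x: int) -> int:
    return x + 1


session = FunctionSessionSketch()
deployed = session.deploy_udf(add_one)
```

Because the `deploy_` variants only forward to the standard methods, a later move to lazy deployment in `remote_function` or `udf` would require the `deploy_` variants to call the provisioning step explicitly again to keep their contract.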
The public API in `bigframes.pandas` continues to offer distinct `deploy_` functions that call these `FunctionSession.deploy_` methods, preserving the user-facing API and its documented behavior of immediate deployment.

No changes were needed for the public API in `bigframes.pandas` or the unit tests, as they were already aligned with calling distinct methods on the `Session` object, which in turn calls the now-simplified `FunctionSession` methods.

* add tests and use kwargs

* add missing func argument to bpd

---------

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
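
The unit-test approach mentioned above (mocking the session and verifying that each public function reaches its own distinct method) might be sketched as follows. `SessionSketch`, `public_udf`, `public_deploy_udf`, and the `name` keyword are hypothetical names used for illustration; the real tests live in `tests/unit/functions/test_remote_function.py` and may be structured differently.

```python
from unittest import mock


class SessionSketch:
    """Hypothetical stand-in for bigframes.session.Session."""

    def udf(self, func, **kwargs):
        raise NotImplementedError

    def deploy_udf(self, func, **kwargs):
        raise NotImplementedError


def public_udf(session, func, **kwargs):
    # Hypothetical wrapper playing the role of the module-level udf function.
    return session.udf(func, **kwargs)


def public_deploy_udf(session, func, **kwargs):
    # Hypothetical wrapper playing the role of the module-level deploy_udf function.
    return session.deploy_udf(func, **kwargs)


def check_delegation():
    # The autospec'd mock rejects calls that do not match the method signatures.
    session = mock.create_autospec(SessionSketch, instance=True)

    def double(x):
        return x * 2

    # The deploy wrapper must reach deploy_udf and leave udf untouched.
    public_deploy_udf(session, double, name="double_fn")
    session.deploy_udf.assert_called_once_with(double, name="double_fn")
    session.udf.assert_not_called()

    # The standard wrapper must reach udf with the same arguments.
    public_udf(session, double, name="double_fn")
    session.udf.assert_called_once_with(double, name="double_fn")
    return True
```

Asserting on distinct method names, rather than on a boolean argument, matches the move away from the `deploy_immediately` flag described in the second commit.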