feat: Added kuberay support · feast-dev/feast@e0b698d
@@ -9,10 +9,11 @@ The Ray offline store is a data I/O implementation that leverages [Ray](https://
 
 The Ray offline store provides:
 - Ray-based data reading from file sources (Parquet, CSV, etc.)
-- Support for both local and distributed Ray clusters
+- Support for local, remote, and KubeRay (Kubernetes-managed) clusters
 - Integration with various storage backends (local files, S3, GCS, HDFS)
 - Efficient data filtering and column selection
 - Timestamp-based data processing with timezone awareness
+- Enterprise-ready KubeRay cluster support via CodeFlare SDK
 
 
 ## Functionality Matrix
@@ -59,9 +60,15 @@ For complex feature processing, historical feature retrieval, and distributed jo
 
 ## Configuration
 
-The Ray offline store can be configured in your `feature_store.yaml` file. Below are two main configuration patterns:
+The Ray offline store can be configured in your `feature_store.yaml` file. It supports **three execution modes**:
 
-### Basic Ray Offline Store
+1. **LOCAL**: Ray runs locally on the same machine (default)
+2. **REMOTE**: Connects to a remote Ray cluster via `ray_address`
+3. **KUBERAY**: Connects to Ray clusters on Kubernetes via the CodeFlare SDK
+
+### Execution Modes
+
+#### Local Mode (Default)
 
 For simple data I/O operations without distributed processing:
 
@@ -72,7 +79,44 @@ provider: local
 offline_store:
   type: ray
   storage_path: data/ray_storage  # Optional: Path for storing datasets
-  ray_address: localhost:10001  # Optional: Ray cluster address
+```
+
+#### Remote Ray Cluster
+
+Connect to an existing Ray cluster:
+
+```yaml
+offline_store:
+  type: ray
+  storage_path: s3://my-bucket/feast-data
+  ray_address: "ray://my-cluster.example.com:10001"
+```
+
+#### KubeRay Cluster (Kubernetes)
+
+Connect to Ray clusters on Kubernetes using the CodeFlare SDK:
+
+```yaml
+offline_store:
+  type: ray
+  storage_path: s3://my-bucket/feast-data
+  use_kuberay: true
+  kuberay_conf:
+    cluster_name: "feast-ray-cluster"
+    namespace: "feast-system"
+    auth_token: "${RAY_AUTH_TOKEN}"
+    auth_server: "https://api.openshift.com:6443"
+    skip_tls: false
+  enable_ray_logging: false
+```
+
+**Environment variables** (alternative to the config file):
+```bash
+export FEAST_RAY_USE_KUBERAY=true
+export FEAST_RAY_CLUSTER_NAME=feast-ray-cluster
+export FEAST_RAY_AUTH_TOKEN=your-token
+export FEAST_RAY_AUTH_SERVER=https://api.openshift.com:6443
+export FEAST_RAY_NAMESPACE=feast-system
 ```
 
 ### Ray Offline Store + Compute Engine
@@ -175,8 +219,29 @@ batch_engine:
 |--------|------|---------|-------------|
 | `type` | string | Required | Must be `feast.offline_stores.contrib.ray_offline_store.ray.RayOfflineStore` or `ray` |
 | `storage_path` | string | None | Path for storing temporary files and datasets |
-| `ray_address` | string | None | Address of the Ray cluster (e.g., "localhost:10001") |
+| `ray_address` | string | None | Ray cluster address (triggers REMOTE mode, e.g., "ray://host:10001") |
+| `use_kuberay` | boolean | None | Enable KubeRay mode (overrides `ray_address`) |
+| `kuberay_conf` | dict | None | **KubeRay configuration dict** with keys: `cluster_name` (required), `namespace` (default: "default"), `auth_token`, `auth_server`, `skip_tls` (default: false) |
+| `enable_ray_logging` | boolean | false | Enable Ray progress bars and verbose logging |
 | `ray_conf` | dict | None | Ray initialization parameters for resource management (e.g., memory, CPU limits) |
+| `broadcast_join_threshold_mb` | int | 100 | Size threshold for broadcast joins (MB) |
+| `enable_distributed_joins` | boolean | true | Enable distributed joins for large datasets |
+| `max_parallelism_multiplier` | int | 2 | Parallelism as a multiple of CPU cores |
+| `target_partition_size_mb` | int | 64 | Target partition size (MB) |
+| `window_size_for_joins` | string | "1H" | Time window for distributed joins |
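The join-tuning rows above work together; as a hedged illustration (the values below are invented for the example, not recommendations), they might be combined in `feature_store.yaml` like this:

```yaml
offline_store:
  type: ray
  storage_path: s3://my-bucket/feast-data
  broadcast_join_threshold_mb: 200   # broadcast entity tables up to 200 MB
  enable_distributed_joins: true     # fall back to distributed joins above that
  max_parallelism_multiplier: 4      # up to 4 tasks per CPU core
  target_partition_size_mb: 128      # fewer, larger partitions
  window_size_for_joins: "1H"        # 1-hour windows for time-based joins
```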
+
+#### Mode Detection Precedence
+
+The Ray offline store automatically detects the execution mode using the following precedence:
+
+1. **Environment Variables** (highest priority)
+   - `FEAST_RAY_USE_KUBERAY`, `FEAST_RAY_CLUSTER_NAME`, etc.
+2. **Config `kuberay_conf`**
+   - If present → KubeRay mode
+3. **Config `ray_address`**
+   - If present → Remote mode
+4. **Default**
+   - Local mode (lowest priority)
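The precedence above can be sketched as a small resolver. This is a hedged illustration only: `resolve_mode` and its config-dict shape are hypothetical names for this example, not the store's actual API.

```python
import os


def resolve_mode(config: dict) -> str:
    """Resolve the execution mode per the documented precedence:
    env vars > kuberay_conf > ray_address > local default.
    (Hypothetical sketch, not the real implementation.)"""
    env_flag = os.environ.get("FEAST_RAY_USE_KUBERAY", "").lower()
    if env_flag in ("1", "true", "yes"):  # 1. environment variables win
        return "KUBERAY"
    if config.get("use_kuberay") or config.get("kuberay_conf"):  # 2. kuberay_conf
        return "KUBERAY"
    if config.get("ray_address"):  # 3. ray_address -> remote cluster
        return "REMOTE"
    return "LOCAL"  # 4. default
```

For example, a config containing only `ray_address` resolves to REMOTE mode, while setting `FEAST_RAY_USE_KUBERAY=true` forces KUBERAY mode regardless of the file config.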
 
 #### Ray Compute Engine Options
 
@@ -385,6 +450,8 @@ job.persist(hdfs_storage, allow_overwrite=True)
 
 ### Using Ray Cluster
 
+#### Standard Ray Cluster
+
 To use Ray in cluster mode for distributed data access:
 
 1. Start a Ray cluster:
@@ -406,6 +473,53 @@ offline_store:
 ray start --address='head-node-ip:10001'
 ```
 
+#### KubeRay Cluster (Kubernetes)
+
+To use Feast with Ray clusters on Kubernetes via the CodeFlare SDK:
+
+**Prerequisites:**
+- KubeRay cluster deployed on Kubernetes
+- CodeFlare SDK installed: `pip install codeflare-sdk`
+- Access credentials for the Kubernetes cluster
+
+**Configuration:**
+
+1. Using a configuration file:
+```yaml
+offline_store:
+  type: ray
+  use_kuberay: true
+  storage_path: s3://my-bucket/feast-data
+  kuberay_conf:
+    cluster_name: "feast-ray-cluster"
+    namespace: "feast-system"
+    auth_token: "${RAY_AUTH_TOKEN}"
+    auth_server: "https://api.openshift.com:6443"
+    skip_tls: false
+  enable_ray_logging: false
+```
+
+2. Using environment variables:
+```bash
+export FEAST_RAY_USE_KUBERAY=true
+export FEAST_RAY_CLUSTER_NAME=feast-ray-cluster
+export FEAST_RAY_AUTH_TOKEN=your-k8s-token
+export FEAST_RAY_AUTH_SERVER=https://api.openshift.com:6443
+export FEAST_RAY_NAMESPACE=feast-system
+export FEAST_RAY_SKIP_TLS=false
+
+# Then run standard Feast code
+python your_feast_script.py
+```
+
+**Features:**
+- The CodeFlare SDK handles cluster connection and authentication
+- Automatic TLS certificate management
+- Authentication with Kubernetes clusters
+- Namespace isolation
+- Secure communication between client and Ray cluster
+- Automatic cluster discovery
+
 ### Data Source Validation
 
 The Ray offline store validates data sources to ensure compatibility:
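As a rough illustration of such a compatibility check, a file source's format can be tested against the formats listed earlier in this document (Parquet, CSV). The helper below is hypothetical, not the store's actual validation code:

```python
# Hypothetical sketch: accept only file formats this document lists as
# readable by the Ray offline store (Parquet, CSV).
SUPPORTED_SUFFIXES = {".parquet", ".csv"}


def is_supported_file_source(path: str) -> bool:
    """Return True if the file source path ends in a supported format
    (illustrative only; the real validation lives in the offline store)."""
    lower = path.lower()
    return any(lower.endswith(sfx) for sfx in SUPPORTED_SUFFIXES)
```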