⚡ perf: Stream BigQuery results to Cloud Storage to prevent OOM by max-ostapenko · Pull Request #259 · HTTPArchive/dataform
Refactored the BigQuery to Google Cloud Storage export process to use streams instead of loading the entire result set into a single in-memory array. This resolves potential Out-Of-Memory (OOM) errors in Cloud Run and reduces peak memory use for large exports.

- Updated `infra/dataform-service/src/index.js` to use `bigquery.queryResultsStream()`.
- Refactored `StorageUpload.exportToJson` in `infra/dataform-service/src/storage.js` to accept a stream.
- Implemented a custom `Transform` stream that formats object chunks into a valid JSON array, buffering rows in batches of 1000 to limit the number of writes.
- Removed the unused, memory-bound `Readable` initialization from the `StorageUpload` constructor.
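For reviewers who want a picture of the batching `Transform` described above, here is a minimal sketch of the technique. It is not the code from this PR: the class name `JsonArrayTransform`, the `batchSize` option, and the helper `_serializeBatch` are hypothetical names chosen for illustration. It assumes row objects arrive one at a time in object mode and that the output should be a single JSON array.

```javascript
const { Transform } = require('node:stream');

// Hypothetical illustration, not the PR's implementation.
// Accepts row objects and emits string chunks that together form one JSON array.
class JsonArrayTransform extends Transform {
  constructor({ batchSize = 1000 } = {}) {
    // Objects come in on the writable side; strings go out on the readable side.
    super({ writableObjectMode: true });
    this.batchSize = batchSize;
    this.batch = [];
    this.wroteAny = false;
  }

  // Serialize the buffered rows as one chunk and reset the buffer.
  // The first chunk opens the array with "["; later chunks start with ",".
  _serializeBatch() {
    const body = this.batch.map((row) => JSON.stringify(row)).join(',');
    const prefix = this.wroteAny ? ',' : '[';
    this.batch = [];
    this.wroteAny = true;
    return prefix + body;
  }

  _transform(row, _encoding, callback) {
    this.batch.push(row);
    if (this.batch.length >= this.batchSize) {
      // Emit a full batch downstream.
      callback(null, this._serializeBatch());
    } else {
      // Keep buffering; nothing to emit yet.
      callback();
    }
  }

  _flush(callback) {
    // Emit any leftover rows, then close the array.
    // An empty input still yields valid JSON: "[]".
    let tail = '';
    if (this.batch.length > 0) {
      tail += this._serializeBatch();
    }
    tail += this.wroteAny ? ']' : '[]';
    callback(null, tail);
  }
}

module.exports = { JsonArrayTransform };
```

A stream like this would typically sit between the row stream and the upload stream, for example via `stream.pipeline(rowStream, new JsonArrayTransform(), uploadStream, callback)`. Because only one batch is held at a time, memory use stays roughly proportional to the batch size rather than to the full result set.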