⚡ perf: Stream BigQuery results to Cloud Storage to prevent OOM by max-ostapenko · Pull Request #259 · HTTPArchive/dataform

@max-ostapenko

Refactored the BigQuery to Google Cloud Storage export to stream query results instead of loading the entire result set into a single in-memory array. This resolves potential Out-Of-Memory (OOM) errors in Cloud Run and reduces peak memory use for large exports.

- Updated `infra/dataform-service/src/index.js` to utilize `bigquery.queryResultsStream()`.
- Refactored `StorageUpload.exportToJson` in `infra/dataform-service/src/storage.js` to accept a stream.
- Implemented a custom `Transform` stream that formats object chunks into a valid JSON array, buffering in batches of 1000 so that rows are emitted in fewer, larger chunks.
- Removed unused memory-bound `Readable` initialization from `StorageUpload` constructor.
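
For reviewers who want a picture of the batching `Transform` described above, here is a minimal sketch of the general technique, not the code from this PR. The names `createJsonArrayTransform` and `rowsToJson` are hypothetical, and the in-memory `Writable` sink only stands in for the real upload destination so the example can run on its own.

```javascript
const { Readable, Transform, Writable } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Hypothetical helper: turns a stream of row objects into the text of a
// single JSON array, emitting rows in batches instead of one by one.
function createJsonArrayTransform(batchSize = 1000) {
  let batch = [];        // serialized rows waiting to be emitted
  let wroteAny = false;  // whether the opening '[' has been emitted yet

  const flushBatch = (stream) => {
    if (batch.length === 0) return;
    // The first batch opens the array; later batches start with a comma.
    stream.push((wroteAny ? ',' : '[') + batch.join(','));
    wroteAny = true;
    batch = [];
  };

  return new Transform({
    writableObjectMode: true, // input: row objects, output: text chunks
    transform(row, _encoding, callback) {
      try {
        batch.push(JSON.stringify(row));
        if (batch.length >= batchSize) flushBatch(this);
        callback();
      } catch (err) {
        callback(err); // e.g. a row that cannot be serialized
      }
    },
    flush(callback) {
      // Emit the remaining rows, then close the array.
      flushBatch(this);
      this.push(wroteAny ? ']' : '[]');
      callback();
    },
  });
}

// Hypothetical demo: collects the output in memory purely for illustration.
async function rowsToJson(rows, batchSize = 1000) {
  const chunks = [];
  const sink = new Writable({
    write(chunk, _encoding, callback) {
      chunks.push(chunk);
      callback();
    },
  });
  await pipeline(Readable.from(rows), createJsonArrayTransform(batchSize), sink);
  return Buffer.concat(chunks).toString('utf8');
}
```

In a real export, the sink would be a writable stream for the destination object rather than an in-memory buffer, so that at most one batch of serialized rows is held by the transform at a time. Using `pipeline` also propagates errors and backpressure between the source, the transform, and the sink.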