Convert Initializers to OrtValues Phase 2 by yuslepukhin · Pull Request #25320 · microsoft/onnxruntime
added 2 commits
July 7, 2025 19:35
jywu-msft
changed the title from Convert Initializers to OrtValues to Convert Initializers to OrtValues Phase 2
adrianlizarraga added a commit that referenced this pull request
Jul 24, 2025

…tValues (#25482)

### Description

- Adds APIs to get information (file path, file offset, byte size) for initializers with data in external files. This allows EPs to do their own custom memory-mapping of initializer data. By default, EPs that don't have specific requirements can still use `ValueInfo_GetInitializerValue` to get an `OrtValue` with memory-mapped initializer data.
- Updates `OrtGraph` to only load the `OrtValue` for an external initializer on demand. This avoids memory-mapping all external initializers before the first call to `OrtEp::GetCapability`.

Follow up to #25320

New API functions:

| Function | Summary |
|----------|---------|
| `ValueInfo_GetExternalInitializerInfo` | Gets `OrtExternalInitializerInfo` from `OrtValueInfo` (or `NULL`). Must be released with `ReleaseExternalInitializerInfo` |
| `ReleaseExternalInitializerInfo` | Releases the `OrtExternalInitializerInfo` instance |
| `ExternalInitializerInfo_GetFilePath` | Returns the relative path to the file that stores the initializer's data |
| `ExternalInitializerInfo_GetFileOffset` | Returns the byte offset within the file where the initializer's data is stored |
| `ExternalInitializerInfo_GetByteSize` | Returns the size in bytes of the initializer's data within the file |

Co-authored-by: Dmitri Smirnov <dmitrism@microsoft.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
RyanMetcalfeInt8 pushed a commit to RyanMetcalfeInt8/onnxruntime that referenced this pull request
Jul 29, 2025

…tValues (microsoft#25482)
This was referenced
Aug 1, 2025

carzh pushed a commit that referenced this pull request
Aug 7, 2025

### Description

Make protobuf weights refer to OrtValues on load. Create OrtValues for initializers loaded from ORT format, for uniformity. Adjust exporting via Graph::ToGraphProto() so it does not export in-memory references in external data. Make CoreML process external data, including in-memory references, so it can copy it.

### Motivation and Context

Follow up for #23979
adrianlizarraga pushed a commit that referenced this pull request
Aug 8, 2025

### Description

Related to #25320 and #23979. Enables tensor raw data sharing for tensor protos externalized with kTensorProtoMemoryAddressTag.

### Motivation and Context

With #25320 and #23979, all initialized tensor protos are associated with an OrtValue; the VitisAI EP needs to adapt to this change.

Co-authored-by: mingyue <mingyue@amd.com>
adrianlizarraga pushed a commit that referenced this pull request
Aug 9, 2025
sanketkaleoss pushed a commit to sanketkaleoss/onnxruntime that referenced this pull request
Aug 11, 2025
sanketkaleoss pushed a commit to sanketkaleoss/onnxruntime that referenced this pull request
Aug 11, 2025

…tValues (microsoft#25482)
sanketkaleoss pushed a commit to sanketkaleoss/onnxruntime that referenced this pull request
Aug 11, 2025
gedoensmax pushed a commit to gedoensmax/onnxruntime that referenced this pull request
Sep 2, 2025
snnn mentioned this pull request
snnn pushed a commit that referenced this pull request
Oct 8, 2025

When Constant nodes have tensors larger than 127 bytes, they are converted to OrtValues with in-memory external data for efficiency. However, ONNX shape inference rejects TensorProtos with data_location=EXTERNAL, as it cannot distinguish between in-memory and file-based external data.

This fix modifies InferenceContextImpl::getInputData() to detect in-memory external data and materialize it into a temporary TensorProto with embedded data that ONNX shape inference can process.

Fixes #26261

The issue was introduced in commit 3b97d79 (PR #25320), which converted large initializers to OrtValues. This regression caused models with Constant nodes holding tensors just over 127 bytes to fail loading with shape inference errors.

Changes:
- Modified getInputData() to check for in-memory external data using utils::HasExternalDataInMemory()
- When detected, retrieves the OrtValue and creates a temporary TensorProto with embedded data (use_tensor_buffer=false)
- Added a temp_tensor_protos_ member to store these temporary protos so they outlive the shape inference call
snnn mentioned this pull request
yuslepukhin added a commit that referenced this pull request
Oct 14, 2025

## Description

Fixes #26261

This PR resolves a regression introduced in v1.23.0 where models with Constant nodes containing tensors larger than 127 bytes fail to load with a shape inference error.

### Root Cause

Commit 3b97d79 (PR #25320) introduced an optimization to convert large Constant node tensors (> 127 bytes) into OrtValues with in-memory external data references for better memory management. However, ONNX shape inference cannot distinguish between in-memory and file-based external data, and rejects any TensorProto with `data_location = EXTERNAL`.

### The Fix

Modified `InferenceContextImpl::getInputData()` to:

1. Detect tensors with in-memory external data using `utils::HasExternalDataInMemory()`
2. Retrieve the corresponding OrtValue
3. Create a temporary TensorProto with embedded data (not an external reference)
4. Provide this temporary proto to ONNX shape inference

This allows ONNX shape inference to access the actual tensor data without rejecting it as external.

### Memory Impact

This fix introduces a minor, temporary increase in memory usage during the model loading phase.

- **When:** The additional memory is allocated only when the shape inference engine needs to access the data of a constant tensor larger than 127 bytes. This is a one-time event during the initial analysis of the model.
- **What:** The fix creates a temporary in-memory copy of the tensor data.
- **Duration:** The temporary copy is released as soon as shape inference completes.

The impact on the application's overall peak memory usage is expected to be negligible, and memory usage during inference is not affected. While it is theoretically possible for the temporary tensor to be large if a multi-gigabyte constant tensor is used for shape inference, this is highly unlikely in practice for well-designed models.

### Testing

- Tested with the problematic model from issue #26261
- All optimization levels now work correctly (DISABLE_ALL, BASIC, EXTENDED, ALL)
- Unit tests to be added

### Changes

- **onnxruntime/core/graph/graph.cc**:
  - Modified the `getInputData()` method in the `InferenceContextImpl` class
  - Added a `temp_tensor_protos_` member to store temporary TensorProtos during shape inference

## TODO

- [ ] Add unit tests
- [ ] Run full test suite

Co-authored-by: Dmitri Smirnov <dmitrism@microsoft.com>
apsonawane pushed a commit that referenced this pull request
Oct 17, 2025
apsonawane pushed a commit that referenced this pull request
Oct 20, 2025
fs-eire pushed a commit that referenced this pull request
Oct 24, 2025
naomiOvad pushed a commit to naomiOvad/onnxruntime that referenced this pull request
Nov 2, 2025

…6263)