chore(deps): update dependency pyarrow to v23 by renovate[bot] · Pull Request #242 · A-aung/python-docs-samples
ℹ️ Note
This PR body was truncated due to platform limits.
This PR contains the following updates:
| Package | Change |
|---|---|
| pyarrow | `==3.0.0` → `==23.0.1` |
⚠️ Warning
Some dependencies could not be looked up. Check the Dependency Dashboard for more information.
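Because this update moves the pin from `==3.0.0` to `==23.0.1`, it may help to quantify the jump before merging. The following is a minimal sketch using only the standard library; the helper names `parse_pin` and `major_versions_skipped` are hypothetical and not part of Renovate or pyarrow, and the parser handles only simple exact pins with numeric versions.

```python
import re
from typing import Tuple


def parse_pin(requirement: str) -> Tuple[str, Tuple[int, ...]]:
    """Split an exact pin such as 'pyarrow==3.0.0' into (name, version tuple).

    Hypothetical helper: only simple 'name==X.Y.Z' pins with numeric parts are handled.
    """
    match = re.fullmatch(
        r"\s*([A-Za-z0-9_.-]+)\s*==\s*(\d+(?:\.\d+)*)\s*", requirement
    )
    if match is None:
        raise ValueError(f"not an exact numeric pin: {requirement!r}")
    name, version = match.groups()
    return name, tuple(int(part) for part in version.split("."))


def major_versions_skipped(old_pin: str, new_pin: str) -> int:
    """Return the difference between the major versions of two pins."""
    old_name, old_version = parse_pin(old_pin)
    new_name, new_version = parse_pin(new_pin)
    if old_name != new_name:
        raise ValueError("pins refer to different packages")
    return new_version[0] - old_version[0]


# The two pins below are taken from the update table in this PR.
jump = major_versions_skipped("pyarrow==3.0.0", "pyarrow==23.0.1")
print(f"pyarrow major version jump: {jump}")
```

A jump this large spans many release notes (only part of which fit in this truncated PR body), so running the affected samples' tests against the new pin is likely a better signal than reading the changelog alone.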
Release Notes
apache/arrow (pyarrow)
v6.0.1
Bug Fixes
- ARROW-14437 - [Python] Make CSV cancellation test more robust
- ARROW-14492 - [JS] Fix export for browser bundles
- ARROW-14513 - [Release][Go] Add /v6 suffix to release-6.0.0
- ARROW-14519 - [C++] joins segfault when data contains list column
- ARROW-14523 - [C++] Fix potential data loss in S3 multipart upload
- ARROW-14538 - [R] Work around empty tr call on Solaris
- ARROW-14550 - [Doc] Remove the JSON license; a non-free one.
- ARROW-14583 - [R][C++] Crash when summarizing after filtering to no rows on partitioned data
- ARROW-14584 - [Python][CI] Python sdist installation fails with latest setuptools 58.5
- ARROW-14620 - [Python] Missing bindings for existing_data_behavior makes it impossible to maintain old behavior
- ARROW-14630 - [C++] DCHECK in GroupByNode when error encountered
- ARROW-14739 - [JS][Docs] Point to wrong source
- ARROW-15071 - [C#] Fixed a bug in Column.cs ValidateArrayDataTypes method
- ARROW-15072 - [R] Error: This build of the arrow package does not support Datasets
New Features and Improvements
- ARROW-13156 - [R] bindings for str_count
- ARROW-14181 - [C++][Compute] Hash Join support for dictionary
- ARROW-14189 - [Docs] Add version dropdown to the sphinx docs
- ARROW-14310 - [R] Make expect_dplyr_equal() more intuitive
- ARROW-14365 - [R] Update README example to reflect new capabilities
- ARROW-14390 - [Packaging][Ubuntu] Add support for Ubuntu 21.10
- ARROW-14433 - [Release][APT] Skip arm64 Ubuntu 21.04 verification
- ARROW-14450 - [R] Old macos build error
- ARROW-14459 - [Doc] Update the pinned sphinx version to 4.2
- ARROW-14480 - [R] Expose arrow::dataset::ExistingDataBehavior to R
- ARROW-14486 - [Packaging][deb] Add missing libthrift-dev dependency
- ARROW-14490 - [Doc] Regenerate CHANGELOG.md to include all versions
- ARROW-14496 - [Docs] Create relative links for R / JS / C/Glib references in the sphinx toctree using stub pages
- ARROW-14499 - [Docs] Version dropdown side-by-side with search box
- ARROW-14514 - [C++][R] UBSAN error on round kernel
- ARROW-14580 - [Python] update trove classifiers to include Python 3.10
- ARROW-14623 - [Packaging][Java] Upload not only .jar but also .pom
- ARROW-14628 - [Release][Python] Use python -m pytest
- ARROW-15058 - [Java] Remove log4j2 dependency in performance module
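The release notes cover every Arrow implementation, while this repository only consumes the Python package. As a rough sketch, assuming each entry keeps the `- ISSUE - [Component] title` shape shown in the lists above, the entries can be filtered by component tag. The names `Entry`, `parse_entry`, and `ENTRY_RE` are hypothetical, and the sample lines are copied from the v6.0.1 lists above.

```python
import re
from typing import List, NamedTuple


class Entry(NamedTuple):
    issue: str             # e.g. "ARROW-14437"
    components: List[str]  # e.g. ["Python", "CI"]; empty if the entry has no tags
    title: str


# Matches "- ARROW-123 - [Tag][Tag] title"; the tag group is optional.
ENTRY_RE = re.compile(
    r"^-\s+((?:ARROW|PARQUET)-\d+)\s+-\s+((?:\[[^\]]+\]\s*)*)(.*)$"
)


def parse_entry(line: str) -> Entry:
    """Parse one changelog bullet into its issue key, component tags, and title."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized changelog line: {line!r}")
    issue, tags, title = match.groups()
    components = re.findall(r"\[([^\]]+)\]", tags)
    return Entry(issue, components, title.strip())


sample_lines = [
    "- ARROW-14437 - [Python] Make CSV cancellation test more robust",
    "- ARROW-14492 - [JS] Fix export for browser bundles",
    "- ARROW-14584 - [Python][CI] Python sdist installation fails with latest setuptools 58.5",
    "- ARROW-14580 - [Python] update trove classifiers to include Python 3.10",
]

entries = [parse_entry(line) for line in sample_lines]
python_entries = [entry for entry in entries if "Python" in entry.components]

for entry in python_entries:
    print(entry.issue, "-", entry.title)
```

Note that filtering on `[Python]` alone would miss `[C++]` fixes that also affect pyarrow, since the Python package wraps the C++ library; widening the filter to both tags is probably the safer default.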
v6.0.0
Bug Fixes
- ARROW-6946 - [Go] Run tests with assert build tag enabled to ensure safety
- ARROW-8452 - [Go] support proper nested nullable flags
- ARROW-8453 - [Go][Integration] Support and enable recursive nested type integration tests
- ARROW-8999 - [Python][C++] Non-deterministic segfault in "AMD64 MacOS 10.15 Python 3.7" build
- ARROW-9948 - [C++] Fix scale handling in Decimal{128, 256}::FromString
- ARROW-10213 - [C++] Temporal cast from timestamp to date rounds instead of extracting date component
- ARROW-10373 - [C++] Validate null_count in Array::ValidateFull()
- ARROW-10773 - [R] parallel as.data.frame.Table hangs indefinitely on Windows
- ARROW-11518 - [C++][Parquet] Fix buffer allocation when reading/skipping boolean columns
- ARROW-11579 - [R] read_feather hanging on Windows
- ARROW-11634 - [C++][Parquet] Parquet statistics (min/max) for dictionary columns are incorrect
- ARROW-11729 - [R] Add examples to datasets documentation
- ARROW-12011 - [C++] Fix crashes and incorrect results when printing extreme date values
- ARROW-12072 - [Go] Fix panics in ipc writer for sliced records
- ARROW-12087 - [C++] Allow sorting durations, timestamps with timezones
- ARROW-12321 - [R][C++] Arrow opens too many files at once when writing a dataset
- ARROW-12513 - [C++][Parquet] Parquet Writer always puts null_count=0 in Parquet statistics for dictionary-encoded array with nulls
- ARROW-12540 - [C++] Implementing casting support from date32/date64 to utf8/large_utf8
- ARROW-12636 - [JS] ESM Tree-Shaking produces broken code
- ARROW-12700 - [R] Read/Write_feather stuck forever after bad write, R, Win32
- ARROW-12837 - [C++] Do not crash when printing invalid arrays
- ARROW-13134 - [C++][CI] Unpin conda package for aws-sdk-cpp
- ARROW-13151 - [C++][Parquet] Propagate schema changes from selection all the way up the stack
- ARROW-13198 - [C++][Dataset] Async scanner occasionally segfaulting in CI
- ARROW-13293 - [R] open_dataset followed by collect hangs (while compute works)
- ARROW-13304 - [C++] Unable to install nightly on Ubuntu 21.04 due to day of week options
- ARROW-13336 - [Doc] Make clean in docs should clean generated docs
- ARROW-13422 - [R] Clarify README about S3 support on Windows
- ARROW-13424 - [C++] Remove needless workaround for conda and benchmark
- ARROW-13425 - [Archery] Avoid importing PyArrow indirectly
- ARROW-13429 - [C++][Gandiva] Fix Gandiva codegen for if-else expression with binary type
- ARROW-13430 - [Go] fix handling of zero value for FromBigInt
- ARROW-13436 - [Python][Doc] Clarify what should be expected if read_table is passed an empty list of columns
- ARROW-13437 - [C++] Relax FixedSizeList validation to allow excess child values
- ARROW-13441 - [C++][CSV] Skip empty batches in column decoder
- ARROW-13443 - [C++] : Fix the incorrect mapping from flatbuf::MetadataVersion to arrow::ipc::MetadataVersion
- ARROW-13445 - [Java][Packaging] Fix artifact patterns for the Java jars
- ARROW-13446 - [Release] Fix verification on amazon linux
- ARROW-13447 - [Release] Verification script for arm64 and universal2 macOS wheels
- ARROW-13450 - [Python][Packaging] Set deployment target to 10.13 for universal2 wheels
- ARROW-13469 - [C++] Suppress -Wmissing-field-initializers in DayMilliseconds arrow/type.h
- ARROW-13474 - [Python] Fix crash in take/filter of empty ExtensionArray
- ARROW-13477 - [Release] Pass ARTIFACTORY_API_KEY to the upload script
- ARROW-13484 - [Release] Add support for uploading Amazon Linux 2 packages
- ARROW-13490 - [R][CI] Need to gate duckdb examples on duckdb version
- ARROW-13492 - [R][CI] Move r tools 35 build back to per-commit/pre-PR
- ARROW-13493 - [C++] Anonymous structs in an anonymous union are a GNU extension
- ARROW-13495 - [C++][Compute] Fixing unaligned memory access in GrouperFastImpl
- ARROW-13496 - [CI][R] Repair r-sanitizer job
- ARROW-13497 - [C++][R] FunctionOptions not used by aggregation nodes
- ARROW-13499 - [R] Aggregation on expression doesn't NSE correctly
- ARROW-13500 - [C++] Fix using '-Wno-unknown-warning-option' with GCC
- ARROW-13504 - [Python] Move marks from fixtures to individual tests/params
- ARROW-13507 - [R] LTO job on CRAN fails
- ARROW-13509 - [C++] Take kernel with empty inputs
- ARROW-13522 - [C++] Fix regression in UTF8 trim functions
- ARROW-13523 - [C++] Normalize test executable name
- ARROW-13524 - [C++] Fix description for ApplicationVersion::VersionEq
- ARROW-13529 - [Go] Fixing too many releases in IPC writer
- ARROW-13538 - [R][CI] Don't test DuckDB in the minimal build
- ARROW-13543 - [R] Handle summarize() with 0 arguments or no aggregate functions
- ARROW-13556 - [C++] Add protobuf to linking for flight
- ARROW-13559 - [CI][C++] Move the test-conda-cpp-valgrind nightly build to azure
- ARROW-13560 - [R] Allow Scanner$create() to accept filter / project even with arrow_dplyr_querys
- ARROW-13580 - [C++] quoted_strings_can_be_null only applied to string columns
- ARROW-13597 - [C++][Compute] Remove AddOnLoad helper
- ARROW-13600 - [C++] Fix maybe uninitialized warnings
- ARROW-13602 - [C++] Fix strict aliasing warning in bit util test
- ARROW-13603 - [GLib] Fix typos in GARROW_VERSION_CHECK()
- ARROW-13605 - [C++] Capture node with shared_ptr to avoid TSan warning
- ARROW-13608 - [R] vendor cpp11 to fix segfault under LTO
- ARROW-13611 - [C++] Scanning datasets does not enforce back pressure
- ARROW-13624 - [R] readr short type mapping has T and t backwards
- ARROW-13628 - [Format][C++][Java] Add MONTH_DAY_NANO interval type
- ARROW-13630 - [CI][C++][s390x] Reduce parallelism to build Arrow library
- ARROW-13632 - [C++] Fix filtering of sliced FixedSizeList array
- ARROW-13638 - [C++] Hold owned copy of function options in GroupByNode
- ARROW-13639 - [C++] Fix out-of-bounds access in Concatenate with null slots and empty dictionary
- ARROW-13654 - [C++][Parquet] Avoid infinite loop when appending a FileMetaData to itself
- ARROW-13655 - [C++][Parquet] Disable Thrift message size protections
- ARROW-13662 - [CI] Fix failing strftime test with older pandas
- ARROW-13662 - [CI] Failing test test_extract_datetime_components with pandas 0.24
- ARROW-13669 - [C++] Fix variant emplace methods (add brackets)
- ARROW-13671 - [Dev] Fix conda recipe on Arm 64k page system
- ARROW-13676 - [C++][Parquet] Avoid potential invalid access.
- ARROW-13681 - [C++] Fix list_parent_indices behaviour on chunked array
- ARROW-13685 - [C++] Cannot write dataset to S3FileSystem if bucket already exists
- ARROW-13689 - [C#][Integration] Initial commit of C# Integration tests
- ARROW-13694 - [R] Arrow filter crashes (R aborted session)
- ARROW-13743 - [CI] OSX job fails due to incompatible git and libcurl
- ARROW-13744 - [CI] c++14 and 17 nightly job fails
- ARROW-13747 - [Python][CI] Requiring s3fs >= 2021.8
- ARROW-13755 - [Python] Allow writing datasets using a partitioning that only specifies field_names
- ARROW-13761 - [R] arrow::filter() crashes (aborts R session)
- ARROW-13784 - [Python] Table.from_arrays should raise an error when array is empty but names is not
- ARROW-13786 - [R][CI] Don't fail the RCHK build if arrow doesn't build
- ARROW-13788 - [C++] Temporal component extraction functions don't support date32/64
- ARROW-13792 - [Java] : The toString representation is incorrect for unsigned integer vectors
- ARROW-13799 - [R] case_when error handling is capturing strings
- ARROW-13800 - [R] Use divide instead of divide_checked
- ARROW-13812 - [C++] Fix Valgrind error in Grouper.BooleanKey test
- ARROW-13814 - [CI] Fix Spark master integration tests
- ARROW-13819 - [C++] Initialize subseconds in value_parsing.h
- ARROW-13846 - [C++] Fix crashes on invalid IPC file
- ARROW-13850 - [C++] Fix crashes on invalid Parquet data
- ARROW-13860 - [R] arrow 5.0.0 write_parquet throws error writing grouped data.frame
- ARROW-13865 - [C++][R] Writing moderate-size parquet files of nested dataframes from R slows down/process hangs
- ARROW-13872 - [Java] ExtensionTypeVector does not work with RangeEqualsVisitor
- ARROW-13876 - [C++] Add trivial null kernels to arithmetic, sort functions
- ARROW-13877 - [C++] Support FixedSizeList in generic list kernels
- ARROW-13878 - [C++] Implement fixed-size-binary support for several kernels
- ARROW-13880 - [C++] Compute function sort_indices does not support timestamps with time zones
- ARROW-13881 - [C++][FlightRPC][Packaging] Ensure Flight is packaged with advanced TLS options on Windows
- ARROW-13882 - [C++] Improve min_max/hash_min_max type support
- ARROW-13884 - [JS] Move source files into a separate directory
- ARROW-13912 - [R] TrimOptions implementation breaks test-r-minimal-build due to dependencies
- ARROW-13913 - [C++] Don't segfault if IndexOptions omitted
- ARROW-13915 - [R][CI] R UCRT C++ bundles are incomplete
- ARROW-13916 - [C++] Implement strftime on date32/64 types
- ARROW-13921 - [Python][Packaging] Pin minimum setuptools version for the macos wheels
- ARROW-13940 - [R] Turn on multithreading with Arrow engine queries
- ARROW-13961 - [C++] Fix use of non-const references, declaration without initialization
- ARROW-13976 - [C++] Add path to libjvm.so in ARM CPU
- ARROW-13978 - [C++] Bump gtest to 1.11 to unbreak builds with recent clang
- ARROW-13981 - [Java] VectorSchemaRootAppender doesn't work for BitVector
- ARROW-13982 - [C++] Don't stall in async scanner if a fragment generates no batches
- ARROW-13983 - [C++] Avoid raising error if fadvise() isn't supported
- ARROW-13996 - [Go][Parquet] Fix file offsets in go impl
- ARROW-13997 - [C++] restore exec node based query performance
- ARROW-14001 - [Go] Fixing AppendBoolean function in BitmapWriter
- ARROW-14004 - [Python][Doc] Document nullable dtypes handling and usage of types_mapper in to_pandas conversion
- ARROW-14014 - [Java] Fix Flight parseTrailers for :status keys
- ARROW-14017 - [C++] NULLPTR is not included in type_fwd.h
- ARROW-14020 - [R] Writing dataframes with list columns is slow and scales poorly with nesting level
- ARROW-14024 - [C++] Test that batch size is respected for IPC/CSV
- ARROW-14026 - [C++] Enable batch parallelism in Parquet scanner
- ARROW-14027 - [C++] Handle scalars in Grouper
- ARROW-14040 - [C++] Fix result order dependence in scanner test
- ARROW-14053 - [C++][CSV] Use atomic counter for async tests
- ARROW-14057 - [C++] Bump aws-c-common version
- ARROW-14063 - [R] open_dataset() does not work on CSVs without header rows
- ARROW-14076 - Unable to use `red-arrow` gem on Heroku/Ubuntu 20.04 (focal)
- ARROW-14090 - [C++][Parquet] rows_written_ should be int64_t instead of int
- ARROW-14103 - [R] [C++] Allow min/max in grouped aggregation
- ARROW-14109 - [C++] Fix segfault when parsing JSON with duplicate keys.
- ARROW-14124 - [R] Timezone support in R <= 3.4
- ARROW-14129 - [C++][Python] Fix unique/value_counts on empty dictionary arrays
- ARROW-14139 - [IR][C++] Table flatbuffer object fails to compile on older GCCs
- ARROW-14141 - [IR][C++] Join missing from RelationImpl
- ARROW-14156 - [C++] Properly synthesize validity buffer in StructArray::Flatten
- ARROW-14162 - [R] Simple arrange %>% head does not respect ordering
- ARROW-14173 - [IR] Allow typed null literals to be represented
- ARROW-14179 - [C++][C] Do not export/import null bitmap for union and null types
- ARROW-14184 - [C++] allow joins where the keys include new columns on the left
- ARROW-14192 - [C++][Dataset] Backpressure broken on ordered scans
- ARROW-14195 - [R] Fix ExecPlan binding annotations
- ARROW-14197 - [C++][Compute] Fixing wrong buffer size in GrouperFastImpl
- ARROW-14200 - [R] strftime on a date should not use or be confused by timezones
- ARROW-14203 - [C++] Fix description of ExecBatch.length for Scalars in aggregate kernels
- ARROW-14204 - [C++] Fails to compile Arrow without RE2 due to missing ifdef guard in MatchLike
- ARROW-14206 - [Go][Parquet] Clean up s390x and arm build code
- ARROW-14206 - [Go][CI] Fix build on s390x and ARM
- ARROW-14208 - [C++] Fix compilation on Windows
- ARROW-14210 - [C++] Add AR and RANLIB flags to bzip2
- ARROW-14211 - [C++][Compute] Fixing thread sanitizer problems in hash join node
- ARROW-14214 - [Python][CI] Fix tests using OrcFileFormat for Python 3.6 + orc not built
- ARROW-14216 - [R] Disable auto-cleaning of duckdb tables
- ARROW-14219 - [R][CI] DuckDB valgrind failure
- ARROW-14220 - [C++] Missing ending quote in thirdpartyversions
- ARROW-14221 - [R][CI] DuckDB tests fail on R < 4.0
- ARROW-14223 - [C++] add missing third-party dependency
- ARROW-14224 - [C++] Try to reduce build time/memory usage
- ARROW-14226 - [R] Handle n_distinct() (and others) with args != 1
- ARROW-14237 - [R][CI] Disable altrep in R <= 3.5
- ARROW-14240 - [C++] Fix wrong nlohmann-json header path
- ARROW-14246 - [C++] Fix wrong find_package() usage in build_google_cloud_cpp_storage()
- ARROW-14247 - [C++] Fix Valgrind errors in parquet-arrow-test
- ARROW-14249 - [R] Slow down in dataframe-to-table benchmark
- ARROW-14252 - [R] Partial matching of arguments warning
- ARROW-14255 - [Python] Fix FlightClient.do_action
- ARROW-14257 - [Python][Docs] Fix usage of sync scanner in dataset writing docs
- ARROW-14260 - [C++] GTest linker error with vcpkg and Visual Studio 2019
- ARROW-14283 - [CI][C++] Use LLVM 12 on macOS GHA builds
- ARROW-14285 - [C++] Fix crashes when pretty-printing data from valid IPC file
- ARROW-14299 - [Dev][CI] Avoid downloading MinIO multiple times
- ARROW-14300 - [C++][R][CI] Work around missing include in xsimd
- ARROW-14301 - [C++] use consistent CMAKE_CXX_STANDARD definition
- ARROW-14302 - [C++] Valgrind errors
- ARROW-14305 - [C++][Compute] Fixing Valgrind errors in hash join node tests
- ARROW-14307 - [R] crashes when reading empty feather with POSIXct column
- ARROW-14313 - [Doc] Make Archery installation docs more accurate
- ARROW-14321 - [R] segfault converting dictionary ChunkedArray with 0 chunks
- ARROW-14340 - [C++] Bump xsimd to fix build error on Apple M1
- ARROW-14370 - [C++] Fix memory leak in SeqMergedGeneratorTestFixture.ErrorItem
- ARROW-14373 - [Packaging][Java] Missing LLVM dependency in the macOS java-jars build
- ARROW-14377 - [Packaging][Python] Python 3.9 installation fails in macOS wheel build
- ARROW-14381 - [CI][Python] Fix Spark integration failures
- ARROW-14382 - [C++][Compute] Remove duplicated ThreadIndexer definition
- ARROW-14392 - [C++] Bundled gRPC misses bundled Abseil include path
- ARROW-14393 - [C++] GTest linking errors during the source release verification
- ARROW-14397 - [C++] Fix valgrind error in test utility
- ARROW-14406 - [CI] Skip failing test on dask-master nightly build
- ARROW-14411 - [Release][Integration] Go integration tests fail for 6.0.0-RC1
- ARROW-14417 - [R] Joins ignore projection on left dataset
- ARROW-14423 - [Python] Fix version constraints in pyproject.toml
- ARROW-14424 - [Packaging][Python] Disable windows wheel testing for python 3.6
- ARROW-14434 - R crashes when making an empty selection for Datasets with DateTime
- ARROW-14439 - [Python][C++] Segfault with read_json when a field is missing
- PARQUET-2067 - [C++][Parquet] Fix Parquet null count stats for enclosing null lists
- PARQUET-2089 - [C++] Align RowGroup file_offset with specification
New Features and Improvements
- ARROW-1565 - [C++] Implement TopK/BottomK
- ARROW-1568 - [C++] Implement Drop Null Kernel for Arrays
- ARROW-4333 - [C++] Sketch out design for kernels and "query" execution in compute layer
- ARROW-4700 - [C++] Added support for decimal128 and decimal256 json converted
- ARROW-5002 - [C++] Implement Hash Aggregation query execution node
- ARROW-5244 - [C++] Remove experimental marker from some APIs
- ARROW-6072 - [C++] Implement casting List <-> LargeList
- ARROW-6607 - [Python] Support for set/list columns when converting from Pandas
- ARROW-6626 - [Python] Support converting nested sets when converting to arrow
- ARROW-6870 - [C#] Add Support for Dictionary Arrays and Dictionary Encoding
- ARROW-7102 - [Python] Make filesystems compatible with fsspec
- ARROW-7179 - [C++][Python][R] Consolidate coalesce/fill_null
- ARROW-7901 - [Go][Integration] enable integration tests for null case
- ARROW-8022 - [C++] Add static and small vector implementations
- ARROW-8147 - [C++] add GCS library to ThirdpartyToolchain
- ARROW-8379 - [R] Investigate/fix thread safety issues (esp. Windows)
- ARROW-8621 - [Release] Add post release step to add tags for Go versioning
- ARROW-8780 - [Python][Doc] Document the fsspec wrapper for pyarrow.fs filesystems
- ARROW-8928 - [C++] Add microbenchmarks to help measure ExecBatchIterator overhead
- ARROW-9226 - [Python] Support core-site.xml default filesystem.
- ARROW-9434 - [C++] Store type code in UnionScalar
- ARROW-9719 - [Python] Improve HadoopFileSystem docstring
- ARROW-10094 - [Python][Doc] Document missing pandas to arrow conversions
- ARROW-10415 - [R] Support for dplyr::distinct()
- ARROW-10898 - [C++] Improve table sort performance
- ARROW-11238 - [Python] Make SubTreeFileSystem print method more informative
- ARROW-11243 - [C++] Recognize time types in CSV files
- ARROW-11460 - [R] Use system libraries if present on Linux
- ARROW-11691 - [Developer][CI] Provide a consolidated .env file for benchmark-relevant environment variables
- ARROW-11748 - [C++] Ensure Decimal fields are in native endian order
- ARROW-11828 - [C++] Expose CSVWriter object in api
- ARROW-11885 - [R] Turn off some capabilities when LIBARROW_MINIMAL=true
- ARROW-11981 - [C++] Implement Union ExecNode
- ARROW-12063 - [C++] Add null placement option to sort functions
- ARROW-12181 - [C++][R] The "CSV dataset" in test-dataset.R is failing on RTools 3.5
- ARROW-12216 - [R] Proactively disable multithreading on RTools3.5 (32bit?)
- ARROW-12359 - [C++] Deprecate FileSystem::OpenAppendStream
- ARROW-12388 - [C++][Gandiva] Implement cast numbers from varbinary functions in gandiva
- ARROW-12410 - [C++][Gandiva] Implement regexp_replace function on Gandiva
- ARROW-12479 - [C++][Gandiva] Implement castBigInt, castInt, castIntervalDay and castIntervalYear extra functions
- ARROW-12563 - [C++][Gandiva] Add add_months and datediff functions for string
- ARROW-12615 - [C++] Add options for handling NAs to stddev and variance
- ARROW-12650 - [Doc][Python] Improve documentation regarding dealing with memory mapped files
- ARROW-12657 - [C++] Adding String hex to numeric conversion
- ARROW-12669 - [C++][Python] Implement a new scalar function: list_element
- ARROW-12673 - [C++] Add callback to handle incorrect column counts
- ARROW-12688 - [R] Use DuckDB to query an Arrow Dataset
- ARROW-12714 - [C++] String title case kernel
- ARROW-12725 - [C++][Compute] Column at a time hash and comparison in group by
- ARROW-12728 - [C++] Implement count_distinct/distinct hash aggregate kernels
- ARROW-12744 - [C++][Compute] Add rounding kernel
- ARROW-12759 - [C++][Compute] Add ExecNode for group by
- ARROW-12763 - [R] Optimize dplyr queries that use head/tail after arrange
- ARROW-12846 - [Release] Reduce download/upload bandwidth for APT/Yum repositories
- ARROW-12866 - [C++][Gandiva] Implement STRPOS function on Gandiva
- ARROW-12871 - [R] upgrade to testthat 3e
- ARROW-12876 - [R] Fix build flags on Raspberry Pi
- ARROW-12944 - [C++] String capitalize kernel
- ARROW-12946 - [C++] String swap case kernel
- ARROW-12953 - [C++][Compute] Refactor CheckScalar* to take Datum arguments
- ARROW-12959 - [C++][R] Option for is_null(NaN) to evaluate to true
- ARROW-12965 - [Java] C Data Interface implementation
- ARROW-12980 - [C++] Kernels to extract datetime components should be timezone aware
- ARROW-12981 - [R] Install source package from CRAN alone
- ARROW-13033 - [C++] Kernel to localize naive timestamps to a timezone (preserving clock-time)
- ARROW-13056 - [MATLAB] Add a matlab label for dev Pull Requests
- ARROW-13067 - [C++][Compute] Implement integer to decimal cast
- ARROW-13089 - [Python] Allow creating RecordBatch from Python dict
- ARROW-13112 - [R] altrep vectors for strings and other types
- ARROW-13132 - [C++] Add Scalar validation
- ARROW-13138 - [C++][R] Implement extract temporal components (year, month, day, etc) from date32/64 types
- ARROW-13141 - [Python] Update HadoopFileSystem docs to clarify setting CLASSPATH env variable is required
- ARROW-13163 - [C++][Gandiva] Implement REPEAT function on Gandiva
- ARROW-13164 - [R] altrep vectors from Array with nulls
- ARROW-13172 - [Java] Make TYPE_WIDTH publicly accessible
- ARROW-13174 - [C++][Compute] Add strftime kernel
- ARROW-13202 - [MATLAB] Enable GitHub Actions CI for MATLAB Interface on Linux
- ARROW-13218 - [Format] Clarify interpretation of timestamp values
- ARROW-13220 - [C++] Implement 'choose' function
- ARROW-13222 - [C++] Improve type support for case_when
- ARROW-13227 - [Documentation][Compute] Document ExecNode
- ARROW-13257 - [Java][Dataset] Allow passing empty columns for projection
- ARROW-13268 - [C++][Compute] Add ExecNode for semi and anti-semi join
- ARROW-13279 - [R] Use C++ DayOfWeekOptions in wday implementation instead of manually calculating via Expression
- ARROW-13287 - [C++] [Dataset] FileSystemDataset::Write should use an async scan
- ARROW-13295 - [C++] add hash_mean, hash_variance, hash_stddev kernels
- ARROW-13298 - [C++] Implement any/all hash aggregate kernels
- ARROW-13307 - [C++] Remove reflection-based enums
- ARROW-13311 - [C++][Documentation] Document hash aggregate kernels
- ARROW-13317 - [Python] Improve documentation on what 'use_threads' does in 'read_feather'
- ARROW-13326 - [R][Archery] Add linting to dev CI
- ARROW-13327 - [C++][Python] Improve consistency of explicit C++ types in PyArrow files
- ARROW-13330 - [Go][Parquet] Add the rest of the Encoding package
- ARROW-13344 - [R] Initial bindings for ExecPlan/ExecNode
- ARROW-13345 - [C++] Add basic implementation for log to base b
- ARROW-13358 - [C++] Improve type support in if_else
- ARROW-13379 - [Dev][Docs] Improvements to archery docs
- ARROW-13390 - [C++] Implement coalesce for remaining types
- ARROW-13397 - [R] Update arrow.Rmd vignette
- ARROW-13399 - [R] Update dataset.Rmd vignette
- ARROW-13402 - [R] Update flight.Rmd vignette
- ARROW-13403 - [R] Update developing.Rmd vignette
- ARROW-13404 - [Doc][Python] Improve PyArrow documentation for new users
- ARROW-13405 - [Doc] Guide users to the documentation for their own platform
- ARROW-13416 - [C++] Implement mod compute function
- ARROW-13420 - [JS] Update dependencies
- ARROW-13421 - [C++][Python] Add CSV convert option to change decimal point
- ARROW-13433 - [R] Remove CLI hack from Valgrind test
- ARROW-13434 - [R] group_by() with an unnamed expression
- ARROW-13435 - [R] Add function arrow_table() as alias for Table$create()
- ARROW-13444 - [C++] Remove usage of deprecated std::result_of
- ARROW-13448 - [R] Bindings for strftime
- ARROW-13453 - [R] DuckDB has not yet released 0.2.8
- ARROW-13455 - [C++][Docs] Typo in RecordBatch::SetColumn
- ARROW-13458 - [C++][Docs] Typo in RecordBatch::schema
- ARROW-13459 - [C++][Docs] Missing param docs for RecordBatch::SetColumn
- ARROW-13461 - [Python][Packaging] Build M1 wheels for python 3.8
- ARROW-13463 - [Release][Python] Verify python 3.8 macOS arm64 wheel
- ARROW-13465 - [R] to_arrow() from duckdb
- ARROW-13466 - [R] make installation fail if Arrow C++ dependencies cannot be installed
- ARROW-13468 - [Release] Fix binary download/upload failures
- ARROW-13472 - [R] Remove .engine = "duckdb" argument
- ARROW-13475 - [Release] Don't consider rust tarballs when cleaning up old releases
- ARROW-13476 - [Doc][Python] Switch ipc/io doc to use context managers
- ARROW-13478 - [Release] Unnecessary rc-number argument for the version bumping post-release script
- ARROW-13480 - [C++] Fix possible deadlock when dataset produces an error
- ARROW-13482 - [C++][Compute] Refactoring away from hard coded ExecNode factories to a registry
- ARROW-13485 - [Release] Replace ${PREVIOUS_RELEASE}.9000 in r/NEWS.md by post-12-bump-versions.sh
- ARROW-13488 - [Website] Update Linux packages install information for 5.0.0
- ARROW-13489 - [R] Bump CI jobs after 5.0.0
- ARROW-13501 - [R] Bindings for count aggregation
- ARROW-13502 - [R] Bindings for min/max aggregation
- ARROW-13503 - [GLib][Ruby][Flight] Add support for DoGet
- ARROW-13506 - [C++][Java] Upgrade ORC to 1.6.9
- ARROW-13508 - [C++] Support custom retry strategies in S3Options
- ARROW-13510 - [CI][R][C++] Add -Wall to fedora-clang-devel as-cran checks
- ARROW-13511 - [CI][R] Fail in the docker build step if R deps don't install
- ARROW-13516 - [C++] Detect --version-script flag availability
- ARROW-13519 - [R] Make doc examples less noisy
- ARROW-13520 - [C++] Implement hash_aggregate tdigest kernel
- ARROW-13521 - [C++][Docs] Add note about tdigest in compute functions docs
- ARROW-13525 - [Python] Mention alternative deprecation message for ParquetDataset.partitions
- ARROW-13528 - [R] Bindings for mean, var, sd aggregation
- ARROW-13532 - [C++][Compute] - adding set membership type filtering to hash table interface
- ARROW-13534 - [C++] Improve csv chunker
- ARROW-13540 - [C++] Add order by sink node
- ARROW-13541 - [C++][Python] Implement ExtensionScalar
- ARROW-13542 - [C++][Compute][Dataset] Add dataset::WriteNode for writing rows from an ExecPlan to disk
- ARROW-13544 - [Java] : Remove APIs that have been deprecated for long (Changes to ArrowBuf)
- ARROW-13544 - [Java] : Remove APIs that have been deprecated for long (Changes to JDBC)
- ARROW-13544 - [Java] : Remove APIs that have been deprecated for long (Changes to Vectors)
- ARROW-13548 - [C++] Implement temporal difference kernels
- ARROW-13549 - [C++] Add casts from timestamp to date/time
- ARROW-13550 - [R] Support .groups argument to dplyr::summarize()
- ARROW-13552 - [C++] Remove deprecated APIs
- ARROW-13557 - [Packaging][Python] Skip test_cancellation test case on M1
- ARROW-13561 - [C++] Implement week kernel that accepts WeekOptions
- ARROW-13562 - [R] Styler followups
- ARROW-13565 - [Packaging][Ubuntu] Drop support for 20.10
- ARROW-13572 - [C++][Datasets] Add ORC support to Datasets API
- ARROW-13573 - [C++] Support dictionaries natively in case_when
- ARROW-13574 - [C++] Add 'count all' option to count kernels
- ARROW-13575 - [C++] Add hash_product kernel
- ARROW-13576 - [C++] Replace ExecNode::InputReceived with ::MakeTask
- ARROW-13577 - [Python][FlightRPC] pyarrow client do_put close method after write_table did not throw flight error
- ARROW-13585 - [GLib] Add support for C ABI interface
- ARROW-13587 - [R] Handle --use-LTO override
- ARROW-13595 - [C++] Add debug mode check for compute kernel output type
- ARROW-13604 - [Java] : Remove deprecation annotations for APIs representing unsupported operations
- ARROW-13606 - [R] Actually disable LTO
- ARROW-13613 - [C++] Add decimal support to (hash) sum/mean/product
- ARROW-13614 - [C++] Add decimal support to min_max/hash_min_max
- ARROW-13618 - [R] Use Arrow engine for summarize() by default
- ARROW-13620 - [R] Binding for n_distinct()
- ARROW-13626 - [R] Bindings for log base b
- ARROW-13627 - [C++] Fully support ScalarAggregateOptions in (hash) any/all/sum/product/mean
- ARROW-13629 - [Ruby] Add support for building/converting map
- ARROW-13633 - [Packaging][Debian] Add support for bookworm
- ARROW-13634 - [R] Update distro() in nixlibs.R to map from "bookworm" to 12
- ARROW-13635 - [Packaging][Python] Define --with-lg-page for jemalloc in the arm manylinux builds
- ARROW-13637 - [Python] Fix docstrings
- ARROW-13642 - [C++][Compute] Hash join node supporting all semi, anti, inner, outer join types
- ARROW-13645 - [Java] : Allow NullVectors to have distinct field names
- ARROW-13646 - [Go][Parquet] adding the parquet metadata package
- ARROW-13648 - [Dev] Use #!/usr/bin/env instead of #!/bin where possible
- ARROW-13650 - [C++] Create dataset writer to encapsulate dataset writer logic
- ARROW-13651 - [Ruby][Symbol] to Arrow array
- ARROW-13652 - [Python] Expose copy_files in pyarrow.fs
- ARROW-13660 - [C++] Remove seq_num from ExecNode::InputReceived
- ARROW-13670 - [C++] add virtual destructors
- ARROW-13674 - [CI] PR checks should check for JIRA components
- ARROW-13675 - [Doc][Python] Add a recipe on how to save partitioned datasets to the Cookbook
- ARROW-13679 - [GLib][Ruby] Add support for group aggregation
- ARROW-13680 - [C++] Create an asynchronous nursery to simplify capture logic
- ARROW-13682 - [C++] Add TDigest API to merge one TDigest
- ARROW-13684 - [C++][Compute] Strftime kernel follow-up
- ARROW-13686 - [Python] Update deprecated pytest yield_fixture functions
- ARROW-13687 - [Ruby] Add support for loading table by Arrow Dataset
- ARROW-13691 - [C++] Support skip_nulls/min_count in VarianceOptions
- ARROW-13693 - [Website] arrow-site should pin down a specific Ruby version and leverage toolings like rbenv
- ARROW-13696 - [Python] Support for MapType with Fields
- ARROW-13699 - [Python][Docs] Improve filesystem documentation
- ARROW-13700 - [Docs][C++] Clarify DayOfWeekOptions args
- ARROW-13702 - [Python] Add dataset mark to test_parquet_dataset_deprecated_properties
- ARROW-13704 - [C#] Add support for reading streaming format delta dictionaries
- ARROW-13705 - [Website] Pin node version
- ARROW-13721 - [Doc][Cookbook] Specifying Schemas - Python
- ARROW-13733 - [Java] : Allow JDBC adapters to
Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Never, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
- [ ] If you want to rebase/retry this PR, check this box
This PR was generated by Mend Renovate. View the repository job log.