ARROW-3366: [R] Dockerfile for docker-compose setup by jameslamb · Pull Request #2770 · apache/arrow
Thank you to @romainfrancois and others who have pushed forward the R side of this project!
This PR is my attempt to address ARROW-3336, providing a testing container for the R package.
This follows up on work done by @kszucs in #2572 in an R-specific way.
NOTE: This PR is a WIP. R CMD INSTALL currently fails because it cannot find wherever I installed arrow to. But I felt that this is far enough along to put up for review.
@jameslamb great, thanks for implementing it!
I've recently changed the semantics of the docker-compose containers to optimize the build times, We need to adjust the image a little bit.
@wesm @romainfrancois sorry for the delay, no this is not ready to go yet.
I will update it this weekend and @ you when it's ready. Hit a weird problem with how R's search path for packages works, I think I know how to resolve it.
@kszucs I got past the install issue (by updating LD_LIBRARY_PATH) but hitting a new error.
Running
docker-compose build cpp
docker-compose build r
docker-compose run r
Makes it through building the C++ package and compiling the source code in the R package, but fails to install the R library with this error:
installing to /usr/local/lib/R/site-library/arrow/libs
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
Error: package or namespace load failed for 'arrow' in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/usr/local/lib/R/site-library/arrow/libs/arrow.so':
/usr/local/lib/R/site-library/arrow/libs/arrow.so: undefined symbol: _ZN5arrow9timestampENS_8TimeUnit4typeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
Error: loading failed
Execution halted
ERROR: loading failed
* removing '/usr/local/lib/R/site-library/arrow'
I found the same problem reported in #1319 so going to look into that diff.
Oh also I rebased to master as of an hour ago, so this is reflective of the most recent changes to the project.
I put the R build instructions (like R CMD INSTALL and R CMD check calls) into a ci/docker_build_r.sh for consistency with how other languages set up their stuff.
But still stuck on this undefined symbol error. Any advice would be appreciated!
@jameslamb it looks like one part of the toolchain used -D_GLIBCXX_USE_CXX11_ABI=0 and another didn't
R CMD check runs now, however a couple of tests are failing.
docker-compose run r recompiles the R package every time, is it possible to cache that in the mounted /build/r directory?
@kszucs thanks for the help with this!
IMHO it wouldn't be worthwhile to cache the built R package, since checking whether it can be recompiled (given some change in the arrow C++ lib) is an important piece of testing it.
As for the test errors...I can replicate them inside the container but they don't show up on my Mac or on Travis, so definitely something I need to look into. I don't think we need to worry about the portability warning in the Makefile right now (since this project has a way to go before it is CRAN-ready anyway) but I do think those test failures should block this PR. Going to look into them.
I have add #2888 to deal with one problem with the R package (undeclared dependency) but I'm still struggling. Current error I get when trying to install the package:
** testing if installed package can be loaded
Error: package or namespace load failed for 'arrow' in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/arrow/r/arrow.Rcheck/arrow/libs/arrow.so':
/arrow/r/arrow.Rcheck/arrow/libs/arrow.so: undefined symbol: _ZN5arrow9timestampENS_8TimeUnit4typeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
Error: loading failed
Execution halted
ERROR: loading failed
I'm actually able to install the package with
R CMD INSTALL arrow_0.0.0.9000.tar.gz
And can technically library it in
Rscript -e "library(arrow)"
But that undefined symbol error comes up when I run devtools::load_all(), devtools::test(), or R CMD check arrow_0.0.0.9000.tar.gz --as-cran.
I'm not sure what to do next. I've looked through the Travis setup for R (was succeeding at least as of this build) but nothing in there is striking me as materially different from what I've done.
coming back to this tonight. I'm still stuck on that undefined symbol error. Will report back if I make progress.
You have a build toolchain problem (compiler flags). One component is using the gcc5 ABI while another is not.
ahhhh I see this in .travis.yml
MATRIX_EVAL="CC=gcc-4.9 && CXX=g++-4.9"
And confirmed I'm running gcc 7.x in the container. Ok this is progress!
See my comment above for the flag you need to pass to the part that is being built with the newer ABI
I'm already using this:
-D_GLIBCXX_USE_CXX11_ABI=0 -Wabi-tag
And I can tell from the R logs that R is respecting that flag. Is that the flag you're talking about?
The symbol it's complaining about is in c_glib/example/lua/stream-to-torch-tensor.lua.
Is there some other environment variable I need to be setting to avoid trying to build the torch stuff? It's not obvious to me looking at .travis.yml.
So in the error message
/arrow/r/arrow.Rcheck/arrow/libs/arrow.so: undefined symbol: _ZN5arrow9timestampENS_8TimeUnit4typeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
This means that whatever shared library is linking to libarrow.so has inlined some functions from the Arrow headers with the gcc5 ABI. Unmangled the symbol is
arrow::timestamp(arrow::TimeUnit::type, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
so in some compilation unit the CXXFLAGS isn't being propagated
This was referenced
Nov 10, 2018Here's what I'm seeing running this locally:
https://gist.github.com/wesm/ac8f6f8072620e0bb4c9de07f7deb5f3
So the extension appears to link fine. In this part
* checking whether package 'arrow' can be installed ... OK
* checking installed package size ... NOTE
installed size is 9.5Mb
sub-directories of 1Mb or more:
R 1.6Mb
libs 7.8Mb
* checking package directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... NOTE
Files 'README.md' or 'NEWS.md' cannot be checked without 'pandoc' being installed.
"checking whether package 'arrow' can be installed ... OK" took a LONG time. Is it compiling the extension again? I speculate that it's at this point that the CXXFLAGS in the environment is being cleansed. It seems logical to me that R CMD CHECK would want to be robust to snowflake-y environment variables
@romainfrancois is there a way to force a certain value in CXXFLAGS to be appended without relying on an environment variable?
@wesm I'm going to try setting it directly in r/src/Makevars.in. I bet that's it.
Will report back shortly
That wouldn't be a long term solution (since this variable needs to match libarrow.so, which might have been built with the newer ABI), but at least it will unblock us here for now
yeah I mean at least if that works, we've at least isolated the problem and I can open an issue documenting it for someone to pick up in the future.
I ALMOST GOT IT!!!!
I think I'm one failing unit test away from this working. This test (https://github.com/apache/arrow/blob/master/r/tests/testthat/test-RecordBatch.R#L140) on RecordBatch is breaking the build. Somehow testthat::expect_error is not trapping the error, and the std::bad_alloc that gets thrown is breaking the tests.
I have a few ideas, testing now.
(also, btw, I force pushed because I rebased to as-of-5-minutes-ago master).
Updates to follow.
| batch4 <- read_record_batch(mmap_file) | ||
| batch5 <- read_record_batch(bytes) | ||
| batch6 <- read_record_batch(buf_reader) | ||
| expect_error(read_record_batch(buf_reader)) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@romainfrancois I believe this PR is now working! However, I want to talk about this test.
I had to remove this to get R CMD CHECK or devtools::test() to complete. I think the issue is that expect_error cannot trap C++ exceptions. When running with this test restored, testing immediately stops with error:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted
Have you ever seen that issue with expect_error() before? I tried googling around and nothing obvious jumped out to me. I have no idea how this could be working on Travis but not in my setup on docker :/.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another question is why an exception is being thrown at all
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That’s what Rcpp::stop does, but this is supposed to be caught by the generated code.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters