ARROW-3366: [R] Dockerfile for docker-compose setup by jameslamb · Pull Request #2770 · apache/arrow

@jameslamb

Thank you to @romainfrancois and others who have pushed forward the R side of this project!

This PR is my attempt to address ARROW-3336, providing a testing container for the R package.

This follows up on work done by @kszucs in #2572 in an R-specific way.

NOTE: This PR is a WIP. R CMD INSTALL currently fails because it cannot find wherever I installed arrow to. But I felt that this is far enough along to put up for review.

@romainfrancois

I don't really speak docker, but that LGTM

@kszucs

@jameslamb great, thanks for implementing it!

I've recently changed the semantics of the docker-compose containers to optimize the build times, We need to adjust the image a little bit.

kszucs

kszucs

kszucs

kszucs

@wesm

@jameslamb

@wesm @romainfrancois sorry for the delay, no this is not ready to go yet.

I will update it this weekend and @ you when it's ready. Hit a weird problem with how R's search path for packages works, I think I know how to resolve it.

@jameslamb

@kszucs I got past the install issue (by updating LD_LIBRARY_PATH) but hitting a new error.

Running

docker-compose build cpp
docker-compose build r
docker-compose run r

Makes it through building the C++ package and compiling the source code in the R package, but fails to install the R library with this error:

installing to /usr/local/lib/R/site-library/arrow/libs
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
Error: package or namespace load failed for 'arrow' in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/usr/local/lib/R/site-library/arrow/libs/arrow.so':
  /usr/local/lib/R/site-library/arrow/libs/arrow.so: undefined symbol: _ZN5arrow9timestampENS_8TimeUnit4typeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
Error: loading failed
Execution halted
ERROR: loading failed
* removing '/usr/local/lib/R/site-library/arrow'

I found the same problem reported in #1319 so going to look into that diff.

@jameslamb

Oh also I rebased to master as of an hour ago, so this is reflective of the most recent changes to the project.

@codecov-io

@jameslamb

I put the R build instructions (like R CMD INSTALL and R CMD check calls) into a ci/docker_build_r.sh for consistency with how other languages set up their stuff.

But still stuck on this undefined symbol error. Any advice would be appreciated!

@wesm

@jameslamb it looks like one part of the toolchain used -D_GLIBCXX_USE_CXX11_ABI=0 and another didn't

kszucs

@kszucs

R CMD check runs now, however a couple of tests are failing.

docker-compose run r recompiles the R package every time, is it possible to cache that in the mounted /build/r directory?

@jameslamb

@kszucs thanks for the help with this!

IMHO it wouldn't be worthwhile to cache the built R package, since checking whether it can be recompiled (given some change in the arrow C++ lib) is an important piece of testing it.

As for the test errors...I can replicate them inside the container but they don't show up on my Mac or on Travis, so definitely something I need to look into. I don't think we need to worry about the portability warning in the Makefile right now (since this project has a way to go before it is CRAN-ready anyway) but I do think those test failures should block this PR. Going to look into them.

@jameslamb

I have add #2888 to deal with one problem with the R package (undeclared dependency) but I'm still struggling. Current error I get when trying to install the package:

** testing if installed package can be loaded
Error: package or namespace load failed for 'arrow' in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/arrow/r/arrow.Rcheck/arrow/libs/arrow.so':
  /arrow/r/arrow.Rcheck/arrow/libs/arrow.so: undefined symbol: _ZN5arrow9timestampENS_8TimeUnit4typeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
Error: loading failed
Execution halted
ERROR: loading failed

I'm actually able to install the package with

R CMD INSTALL arrow_0.0.0.9000.tar.gz

And can technically library it in

Rscript -e "library(arrow)"

But that undefined symbol error comes up when I run devtools::load_all(), devtools::test(), or R CMD check arrow_0.0.0.9000.tar.gz --as-cran.

I'm not sure what to do next. I've looked through the Travis setup for R (was succeeding at least as of this build) but nothing in there is striking me as materially different from what I've done.

@jameslamb

coming back to this tonight. I'm still stuck on that undefined symbol error. Will report back if I make progress.

@wesm

You have a build toolchain problem (compiler flags). One component is using the gcc5 ABI while another is not.

@wesm

In Travis CI I think we are using gcc < 5 so this can't happen there

@jameslamb

oh interesting ok! Looking into it.

@jameslamb

ahhhh I see this in .travis.yml

MATRIX_EVAL="CC=gcc-4.9 && CXX=g++-4.9"

And confirmed I'm running gcc 7.x in the container. Ok this is progress!

@wesm

See my comment above for the flag you need to pass to the part that is being built with the newer ABI

@jameslamb

I'm already using this:

-D_GLIBCXX_USE_CXX11_ABI=0 -Wabi-tag

And I can tell from the R logs that R is respecting that flag. Is that the flag you're talking about?

The symbol it's complaining about is in c_glib/example/lua/stream-to-torch-tensor.lua.

Is there some other environment variable I need to be setting to avoid trying to build the torch stuff? It's not obvious to me looking at .travis.yml.

@wesm

So in the error message

/arrow/r/arrow.Rcheck/arrow/libs/arrow.so: undefined symbol: _ZN5arrow9timestampENS_8TimeUnit4typeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE

This means that whatever shared library is linking to libarrow.so has inlined some functions from the Arrow headers with the gcc5 ABI. Unmangled the symbol is

arrow::timestamp(arrow::TimeUnit::type, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)

so in some compilation unit the CXXFLAGS isn't being propagated

This was referenced

Nov 10, 2018

@wesm

Here's what I'm seeing running this locally:

https://gist.github.com/wesm/ac8f6f8072620e0bb4c9de07f7deb5f3

So the extension appears to link fine. In this part

* checking whether package 'arrow' can be installed ... OK
* checking installed package size ... NOTE
  installed size is  9.5Mb
  sub-directories of 1Mb or more:
    R      1.6Mb
    libs   7.8Mb
* checking package directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... NOTE
Files 'README.md' or 'NEWS.md' cannot be checked without 'pandoc' being installed.

"checking whether package 'arrow' can be installed ... OK" took a LONG time. Is it compiling the extension again? I speculate that it's at this point that the CXXFLAGS in the environment is being cleansed. It seems logical to me that R CMD CHECK would want to be robust to snowflake-y environment variables

@romainfrancois is there a way to force a certain value in CXXFLAGS to be appended without relying on an environment variable?

@jameslamb

@wesm I'm going to try setting it directly in r/src/Makevars.in. I bet that's it.

Will report back shortly

@wesm

That wouldn't be a long term solution (since this variable needs to match libarrow.so, which might have been built with the newer ABI), but at least it will unblock us here for now

@jameslamb

yeah I mean at least if that works, we've at least isolated the problem and I can open an issue documenting it for someone to pick up in the future.

@jameslamb

I ALMOST GOT IT!!!!

I think I'm one failing unit test away from this working. This test (https://github.com/apache/arrow/blob/master/r/tests/testthat/test-RecordBatch.R#L140) on RecordBatch is breaking the build. Somehow testthat::expect_error is not trapping the error, and the std::bad_alloc that gets thrown is breaking the tests.

I have a few ideas, testing now.

(also, btw, I force pushed because I rebased to as-of-5-minutes-ago master).

Updates to follow.

@jameslamb

@jameslamb

jameslamb

batch4 <- read_record_batch(mmap_file)
batch5 <- read_record_batch(bytes)
batch6 <- read_record_batch(buf_reader)
expect_error(read_record_batch(buf_reader))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@romainfrancois I believe this PR is now working! However, I want to talk about this test.

I had to remove this to get R CMD CHECK or devtools::test() to complete. I think the issue is that expect_error cannot trap C++ exceptions. When running with this test restored, testing immediately stops with error:

terminate called after throwing an instance of 'std::bad_alloc'
    what():  std::bad_alloc
  Aborted

Have you ever seen that issue with expect_error() before? I tried googling around and nothing obvious jumped out to me. I have no idea how this could be working on Travis but not in my setup on docker :/.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question is why an exception is being thrown at all

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s what Rcpp::stop does, but this is supposed to be caught by the generated code.

@romainfrancois

I’ll have a look but this should not happen.

kszucs

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.