ScalaCL lets you run Scala code on GPUs through OpenCL (BSD-licensed).
WORK IN PROGRESS (see ScalaCL if you want something that works, albeit only on Scala 2.9.x).
See slides from ScalaCL + Reified talk @ scala.io 2013.
Features of the new design (v3, rewritten from scratch again!):
- Much better asynchronicity support (now requires OpenCL 1.1), and much better performance in general
- Support for captures of constants and OpenCL arrays
- Support for lazy clones for fast zipping
- Kernels are now fully specialized on static types and generated at compile-time (allows much faster startup and caching at runtime)
- ScalaCL Collections no longer fit in regular Scala Collections, to avoid silent data transfers / conversions when using unaccelerated methods (syntax stays the same, though)
- No more CLRange: expecting compiler to do its job
TODO
- Finish Scalaxy/Reified integration (started under CLFunc / CLFuncUtils)
- Add more tests: DataIO, CodeConversion, scheduling, uniqueness / caching of kernels
- Implement more DataIO[T], support case classes as tuples
- Catch up with compiler plugin:
- Auto-vectorization
- 1D works
- Add 2D
- add filters
- Import Scalaxy streams, make them work with scala.reflection.api.Universe
- Auto-vectorization
- Plug some v2 runtime code back (filtered array compaction, reduceSymmetric, parallel sums...)
- Benchmarks!
- Wanna help? Ping the NativeLibs4Java mailing-list!
Usage
scalaVersion := "2.11.4" libraryDependencies += "com.nativelibs4java" %% "scalacl" % "0.3-SNAPSHOT" // Avoid sbt-related macro classpath issues. fork := true // Scalaxy/Reified snapshots are published on the Sonatype repository. resolvers += Resolver.sonatypeRepo("snapshots")
Examples
The following example currently works:
import scalacl._ case class Matrix(data: CLArray[Float], rows: Int, columns: Int) (implicit context: Context) { def this(rows: Int, columns: Int) (implicit context: Context) = this(new CLArray[Float](rows * columns), rows, columns) def this(n: Int) (implicit context: Context) = this(n, n) def putProduct(a: Matrix, b: Matrix): Unit = { assert(a.columns == b.rows) assert(a.rows == rows) assert(b.columns == columns) kernel { // This block will either be converted to an OpenCL kernel or cause compilation error for (i <- 0 until rows; j <- 0 until columns) { // c(i, j) = sum(k, a(i, k) * b(k, j)) data(i * columns + j) = ( for (k <- 0 until a.columns) yield a.data(i * a.columns + k) * b.data(k * b.columns + j) ).sum } } } def putSum(a: Matrix, b: Matrix): Unit = { assert(a.columns == b.columns && a.columns == columns) assert(a.rows == b.rows && a.rows == rows) kernel { for (i <- 0 until rows; j <- 0 until columns) { val offset = i * columns + j data(offset) = a.data(offset) + b.data(offset) } } } } implicit val context = Context.best val n = 10 val a = new Matrix(n) val b = new Matrix(n) val out = new Matrix(n) out.putProduct(a, b) println(out.data)