• public class Matrix
    extends Object

    Matrix encapsulates a SystemDS matrix. It converts easily to other formats, such as RDDs, JavaRDDs, DataFrames, and double[][] arrays. After script execution, it offers a convenient way to obtain SystemDS matrix data in Scala tuples.

    • Constructor Summary

      Constructors 
      Constructor Description
      Matrix(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes, MatrixBlock> binaryBlocks, MatrixMetadata matrixMetadata)

      Create a Matrix, specifying the SystemDS binary-block matrix and its metadata.

      Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)

      Convert a Spark DataFrame to a SystemDS binary-block representation.

      Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, long numRows, long numCols)

      Convert a Spark DataFrame to a SystemDS binary-block representation, specifying the number of rows and columns.

      Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, MatrixMetadata matrixMetadata)

      Convert a Spark DataFrame to a SystemDS binary-block representation, specifying the matrix metadata.

      Matrix(MatrixObject matrixObject, SparkExecutionContext sparkExecutionContext)
    • Constructor Detail

      • Matrix

        public Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame,
                      MatrixMetadata matrixMetadata)

        Convert a Spark DataFrame to a SystemDS binary-block representation, specifying the matrix metadata.

        Parameters:
        dataFrame - the Spark DataFrame
        matrixMetadata - matrix metadata, such as number of rows and columns
      • Matrix

        public Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame,
                      long numRows,
                      long numCols)

        Convert a Spark DataFrame to a SystemDS binary-block representation, specifying the number of rows and columns.

        Parameters:
        dataFrame - the Spark DataFrame
        numRows - the number of rows
        numCols - the number of columns
      • Matrix

        public Matrix(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes, MatrixBlock> binaryBlocks,
                      MatrixMetadata matrixMetadata)

        Create a Matrix, specifying the SystemDS binary-block matrix and its metadata.

        Parameters:
        binaryBlocks - the JavaPairRDD<MatrixIndexes, MatrixBlock> matrix
        matrixMetadata - matrix metadata, such as number of rows and columns
      • Matrix

        public Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)

        Convert a Spark DataFrame to a SystemDS binary-block representation.

        Parameters:
        dataFrame - the Spark DataFrame
    • Method Detail

      • toMatrixObject

        public MatrixObject toMatrixObject()

        Obtain the matrix as a SystemDS MatrixObject.

        Returns:
        the matrix as a SystemDS MatrixObject
      • to2DDoubleArray

        public double[][] to2DDoubleArray()

        Obtain the matrix as a two-dimensional double array

        Returns:
        the matrix as a two-dimensional double array
      • toJavaRDDStringIJV

        public org.apache.spark.api.java.JavaRDD<String> toJavaRDDStringIJV()

        Obtain the matrix as a JavaRDD<String> in IJV format

        Returns:
        the matrix as a JavaRDD<String> in IJV format
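
        To illustrate what an IJV line looks like, here is a minimal plain-Java sketch that does not use SystemDS or Spark. It assumes that IJV means one "row column value" triple per line, with 1-based indexes and zero cells omitted. The class IjvSketch and its method toIjv are hypothetical names used only for this illustration, not part of the Matrix API.

```java
import java.util.ArrayList;
import java.util.List;

public class IjvSketch {

    // Converts a dense matrix to IJV lines of the form "i j v".
    // Assumption: indexes are 1-based and zero cells are skipped.
    public static List<String> toIjv(double[][] matrix) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < matrix.length; i++) {
            for (int j = 0; j < matrix[i].length; j++) {
                if (matrix[i][j] != 0.0) {
                    lines.add((i + 1) + " " + (j + 1) + " " + matrix[i][j]);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        double[][] m = { { 1.0, 0.0 }, { 0.0, 4.5 } };
        // Prints one line per nonzero cell.
        toIjv(m).forEach(System.out::println);
    }
}
```

        For the 2x2 matrix in main, only the two nonzero cells produce lines, which is why a sparse matrix can be much smaller in this format than in CSV.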
      • toJavaRDDStringCSV

        public org.apache.spark.api.java.JavaRDD<String> toJavaRDDStringCSV()

        Obtain the matrix as a JavaRDD<String> in CSV format

        Returns:
        the matrix as a JavaRDD<String> in CSV format
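
        As a rough picture of the CSV form, the following plain-Java sketch turns each matrix row into one comma-separated line. It does not call SystemDS or Spark. The class CsvSketch and its method toCsv are hypothetical names, and the exact number formatting SystemDS uses may differ from Double.toString.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

public class CsvSketch {

    // Converts a dense matrix to one CSV line per row.
    // Assumption: every cell is written, including zeros.
    public static List<String> toCsv(double[][] matrix) {
        List<String> lines = new ArrayList<>();
        for (double[] row : matrix) {
            StringJoiner joiner = new StringJoiner(",");
            for (double value : row) {
                joiner.add(Double.toString(value));
            }
            lines.add(joiner.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        double[][] m = { { 1.0, 0.0 }, { 0.0, 4.5 } };
        toCsv(m).forEach(System.out::println);
    }
}
```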
      • toRDDStringCSV

        public org.apache.spark.rdd.RDD<String> toRDDStringCSV()

        Obtain the matrix as an RDD<String> in CSV format

        Returns:
        the matrix as an RDD<String> in CSV format
      • toRDDStringIJV

        public org.apache.spark.rdd.RDD<String> toRDDStringIJV()

        Obtain the matrix as an RDD<String> in IJV format

        Returns:
        the matrix as an RDD<String> in IJV format
      • toDF

        public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDF()

        Obtain the matrix as a DataFrame of doubles with an ID column

        Returns:
        the matrix as a DataFrame of doubles with an ID column
      • toDFDoubleWithIDColumn

        public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFDoubleWithIDColumn()

        Obtain the matrix as a DataFrame of doubles with an ID column

        Returns:
        the matrix as a DataFrame of doubles with an ID column
      • toDFDoubleNoIDColumn

        public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFDoubleNoIDColumn()

        Obtain the matrix as a DataFrame of doubles with no ID column

        Returns:
        the matrix as a DataFrame of doubles with no ID column
      • toDFVectorWithIDColumn

        public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFVectorWithIDColumn()

        Obtain the matrix as a DataFrame of vectors with an ID column

        Returns:
        the matrix as a DataFrame of vectors with an ID column
      • toDFVectorNoIDColumn

        public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFVectorNoIDColumn()

        Obtain the matrix as a DataFrame of vectors with no ID column

        Returns:
        the matrix as a DataFrame of vectors with no ID column
      • toBinaryBlocks

        public org.apache.spark.api.java.JavaPairRDD<MatrixIndexes, MatrixBlock> toBinaryBlocks()

        Obtain the matrix as a JavaPairRDD<MatrixIndexes, MatrixBlock>

        Returns:
        the matrix as a JavaPairRDD<MatrixIndexes, MatrixBlock>
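
        The binary-block form keys each MatrixBlock by a pair of block indexes. The plain-Java sketch below shows the index arithmetic under two assumptions that this page does not state: block indexes are 1-based, and the block size is 1000. BlockIndexSketch, numBlocks, and blockIndex are hypothetical names for illustration only.

```java
public class BlockIndexSketch {

    // Number of blocks needed to cover `length` cells (ceiling division).
    public static long numBlocks(long length, int blockSize) {
        return (length + blockSize - 1) / blockSize;
    }

    // 1-based index of the block that contains the 1-based cell index.
    public static long blockIndex(long cellIndex, int blockSize) {
        return (cellIndex - 1) / blockSize + 1;
    }

    public static void main(String[] args) {
        int blockSize = 1000;           // assumed block size
        long rows = 2500, cols = 1200;  // example matrix dimensions

        System.out.println("row blocks: " + numBlocks(rows, blockSize));
        System.out.println("column blocks: " + numBlocks(cols, blockSize));

        // Block key for the cell at row 2001, column 1000.
        System.out.println("block key: (" + blockIndex(2001, blockSize)
                + ", " + blockIndex(1000, blockSize) + ")");
    }
}
```

        Under these assumptions a 2500 x 1200 matrix is split into a 3 x 2 grid of blocks, and the edge blocks hold fewer than 1000 rows or columns.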
      • toMatrixBlock

        public MatrixBlock toMatrixBlock()

        Obtain the matrix as a MatrixBlock

        Returns:
        the matrix as a MatrixBlock
      • getMatrixMetadata

        public MatrixMetadata getMatrixMetadata()

        Obtain the matrix metadata

        Returns:
        the matrix metadata
      • toString

        public String toString()

        If MatrixObject is available, output MatrixObject.toString(). If MatrixObject is not available but MatrixMetadata is available, output MatrixMetadata.toString(). Otherwise output Object.toString().

        Overrides:
        toString in class Object
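
        The fallback order described for toString can be sketched in plain Java as follows. This is not the SystemDS source: ToStringFallbackSketch is a hypothetical class, and its two Object fields stand in for the MatrixObject and MatrixMetadata that a Matrix may hold, with null assumed to mean "not available".

```java
public class ToStringFallbackSketch {

    // Hypothetical stand-ins for the MatrixObject and MatrixMetadata fields.
    private final Object matrixObject;
    private final Object matrixMetadata;

    public ToStringFallbackSketch(Object matrixObject, Object matrixMetadata) {
        this.matrixObject = matrixObject;
        this.matrixMetadata = matrixMetadata;
    }

    @Override
    public String toString() {
        if (matrixObject != null) {
            return matrixObject.toString();    // first choice
        }
        if (matrixMetadata != null) {
            return matrixMetadata.toString();  // second choice
        }
        return super.toString();               // Object.toString() as last resort
    }

    public static void main(String[] args) {
        System.out.println(new ToStringFallbackSketch("object", "metadata"));
        System.out.println(new ToStringFallbackSketch(null, "metadata"));
        System.out.println(new ToStringFallbackSketch(null, null));
    }
}
```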
      • hasBinaryBlocks

        public boolean hasBinaryBlocks()

        Whether or not this matrix contains data as binary blocks

        Returns:
        true if binary-block data is present, false otherwise.
      • hasMatrixObject

        public boolean hasMatrixObject()

        Whether or not this matrix contains data as a MatrixObject

        Returns:
        true if data as a MatrixObject is present, false otherwise.