The String API is the most basic and straightforward way to select columns in Kotlin DataFrame operations.
In String API operation overloads, selected column names are provided directly as String values in function arguments:
// Select "name" and "info" columns
df.select("name", "info")
String Column Accessors
The String API can also be used inside the Columns Selection DSL and row expressions via String column accessors.
String column accessors let you access nested columns and can be combined with extension properties or with any other Columns Selection DSL methods.
String column accessors are created using special functions. In the Columns Selection DSL, they have the special type ColumnAccessor, while in row expressions they resolve to concrete value types.
You can optionally specify the column type as a type argument of the String column accessor creation function. This is required for row expressions and for some operations with a column selection. If the specified type does not match the actual column type, a runtime exception may be thrown.
Columns Selection DSL | Row Expressions | |
|---|---|---|
|
| Resolves into general |
|
| Resolves into |
|
| Resolves into |
|
| Resolves into |
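The role of the type argument can be illustrated with a minimal sketch. It assumes `dataFrameOf`, `select`, `filter`, and `add` from `org.jetbrains.kotlinx.dataframe.api`; the flat `people` frame and the "broken" column are hypothetical and exist only for this illustration. The last call is wrapped in `runCatching` because a mismatched type argument compiles but may fail at runtime.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical flat frame used only for this illustration.
val people = dataFrameOf("name", "age")(
    "Alice", 23,
    "Bob", 27,
)

fun main() {
    // No type argument: the accessor is only used to pick a column.
    val names = people.select { col("name") }
    println(names.columnNames())

    // Type argument needed: the row expression compares the value as an Int.
    val older = people.filter { "age"<Int>() >= 25 }
    println(older.rowsCount())

    // Mismatched type argument ("name" holds Strings): this compiles,
    // but is expected to fail once the value is used as an Int.
    val mismatch = runCatching { people.add("broken") { "name"<Int>() + 1 } }
    println(mismatch.isFailure)
}
```

Without the type argument, the value in a row expression is untyped, so the comparison in `filter` would not compile.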
Example
Consider a simple hierarchical dataframe from example.csv.
This table consists of two columns: name, which is a String column, and info, which is a column group containing two nested value columns — age of type Int, and height of type Double.
| name | info | |
|---|---|---|
| | age | height |
| Alice | 23 | 175.5 |
| Bob | 27 | 160.2 |
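If example.csv is not at hand, an equivalent frame can be built in code. This is a minimal sketch, assuming `dataFrameOf`, `group`, and `into` from `org.jetbrains.kotlinx.dataframe.api`; the function name `buildExampleFrame` is hypothetical. It creates three flat columns and then moves "age" and "height" into an "info" column group.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical helper: builds the hierarchical frame from the table above.
fun buildExampleFrame(): DataFrame<*> =
    dataFrameOf("name", "age", "height")(
        "Alice", 23, 175.5,
        "Bob", 27, 160.2,
    )
        // Nest the two value columns under a new "info" column group.
        .group("age", "height").into("info")

fun main() {
    val df = buildExampleFrame()
    println(df.columnNames())                        // top-level columns
    println(df.getColumnGroup("info").columnNames()) // nested columns
}
```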
Columns Selection DSL
Get a single "height" subcolumn from the "info" column group
df.getColumn { colGroup("info").col("height") }
Select the "age" subcolumn from the "info" column group and the "name" column
df.select { colGroup("info").col("age") and col("name") }
Calculate the mean value of the ("info"/"age") column; specify the column type as a col type argument
df.mean { colGroup("info").col<Int>("age") }
Combine extension properties and String column accessors. Select the "height" and "name" columns, assuming we have extension properties for the "info" and "name" columns but not for the ("info"/"height") column
df.select { info.col("height") and name }
Combine Columns Selection DSL and String Column Accessors. Remove all Number columns from the dataframe except ("info"/"age")
df.remove { colsAtAnyDepth().colsOf<Number>() except colGroup("info").col("age") }
Select all subcolumns from the "info" column group
df.select { colGroup("info").select { col("age") and col("height") } }
// or
df.select { colGroup("info").allCols() }
Row Expressions
Add a new "heightInt" column by casting the "height" column values to Int
df.add("heightInt") { "info"["height"]<Double>().toInt() }
Filter rows where the ("info"/"age") column value is greater than or equal to 18
df.filter { "info"["age"]<Int>() >= 18 }
Invoked String API
Alternatively, you can create column accessors by invoking a String directly, optionally with a type argument, for example "age"<Int>(). This creates the same column accessors as the accessor functions of the Columns Selection DSL. To access nested columns, use the String.get or String.invoke operators or the String.select {} function, where the receiver is the name of the column group.
// Columns Selection DSL

// Get a single "height" subcolumn from the "info" column group
df.getColumn { "info"["height"]<Double>() }

// Select the "age" subcolumn of the "info" column group
// and the "name" column
df.select { "info"["age"] and "name"() }

// Calculate the mean value of the ("info"/"age") column;
// specify the column type as an invocation type argument
df.mean { "info" { "age"<Int>() } }

// Select all subcolumns from the "info" column group
df.select { "info" { "age"() and "height"() } }
// or
df.select { "info".allCols() }

// Row Expressions

// Add a new "heightInt" column by
// casting the "height" column values to `Int`
df.add("heightInt") { "info"["height"]<Double>().toInt() }

// Filter rows where the ("info"/"age") column value
// is greater than or equal to 18
df.filter { "info"["age"]<Int>() >= 18 }
When should I use the String API?
The String API is a good starting point for learning the library and understanding how column selection works.
For production code we strongly recommend using the Extension Properties API instead. It is more concise, fully type-safe, and provides better IDE support.
However, the Extension Properties API is sometimes unavailable or requires too much extra setup. In such cases, use String Column Accessors.
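One such case is when column names are only known at runtime, for example when they come from user input or a configuration file, so no extension properties can exist for them. The following is a minimal sketch under that assumption; the helper `selectRequested`, the sample frame, and its column names are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical helper: keeps only the requested columns that exist in the frame.
fun selectRequested(df: DataFrame<*>, requested: List<String>): DataFrame<*> {
    val present = requested.filter { it in df.columnNames() }
    // The String API overload accepts the names directly.
    return df.select(*present.toTypedArray())
}

fun main() {
    // Hypothetical sample data.
    val df = dataFrameOf("name", "age", "city")(
        "Alice", 23, "Oslo",
        "Bob", 27, "Rome",
    )
    // Names arrive at runtime; "missing" is not a column and is skipped.
    val result = selectRequested(df, listOf("city", "name", "missing"))
    println(result.columnNames())
}
```

Filtering against `columnNames()` first is a design choice of this sketch: it skips unknown names instead of letting the selection fail.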
10 March 2026