@blosc2.dsl_kernel lets you write kernels with Python function syntax while executing through the miniexpr DSL path.
Use DSL kernels when you want:
- A vectorized UDF model (operate over NDArray chunks/blocks, not Python scalar loops)
- Optional JIT compilation via miniexpr backends (for example tcc/cc) without requiring Numba
- Early syntax validation and actionable diagnostics for unsupported constructs
This tutorial complements 03.lazyarray-udf.ipynb (generic Python UDFs).
For the canonical DSL syntax contract, see the DSL syntax reference.
Choosing the Right Interface

| Goal | Recommended API |
|---|---|
| Elementwise formulas using built-in functions/operators | `blosc2.lazyexpr(...)` |
| Arbitrary Python logic (including numba) over blocks/chunks | `blosc2.lazyudf(...)` |
| DSL subset with early syntax checks and optional miniexpr JIT | `@blosc2.dsl_kernel` + `blosc2.lazyudf(...)` |

import numpy as np
import blosc2

1. Define a DSL Kernel
A valid DSL kernel has to be decorated with @blosc2.dsl_kernel. After that, it can be used with blosc2.lazyudf(...) like a regular UDF.

@blosc2.dsl_kernel
def kernel_index_ramp(x):
    # _i* and _n* are reserved DSL index/shape symbols, so disable linter warnings
    return x + _i0 * _n1 + _i1  # noqa: F821

shape = (5, 10)
x = blosc2.ones(shape, dtype=np.float32)
expr = blosc2.lazyudf(kernel_index_ramp, (x,), dtype=np.float32)
res = expr[:]
res

array([[ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.],
[11., 12., 13., 14., 15., 16., 17., 18., 19., 20.],
[21., 22., 23., 24., 25., 26., 27., 28., 29., 30.],
[31., 32., 33., 34., 35., 36., 37., 38., 39., 40.],
[41., 42., 43., 44., 45., 46., 47., 48., 49., 50.]], dtype=float32)
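
As a cross-check, the same ramp can be reproduced in plain NumPy. This is only a sketch: it assumes `_i0`/`_i1` are the global row/column indices of each element and `_n1` is the length of the second dimension, which is what the output above suggests. It does not use blosc2, and the names `x_np` and `ramp_ref` are illustrative.

```python
import numpy as np

shape = (5, 10)
x_np = np.ones(shape, dtype=np.float32)

# i0 and i1 hold the row and column index of every element
# (assumed meaning of the DSL symbols _i0 and _i1)
i0, i1 = np.indices(shape)
n1 = shape[1]  # assumed meaning of _n1

# i0 * n1 + i1 is the row-major (C-order) flat position of each element
ramp_ref = (x_np + i0 * n1 + i1).astype(np.float32)
print(ramp_ref[0, :3], ramp_ref[-1, -1])
```

Comparing `res` against a reference like `ramp_ref` with `np.array_equal` is a cheap way to confirm that a kernel using index symbols does what you expect.
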
Zero-Parameter DSL Kernel
Kernels with no parameters are also valid. When inputs is empty, you must pass an explicit output shape to lazyudf(...).

@blosc2.dsl_kernel
def kernel_no_inputs():
    return _i0 + 10 * _i1  # noqa: F821

expr0 = blosc2.lazyudf(kernel_no_inputs, (), dtype=np.int32, shape=(3, 4))
res0 = expr0[:]
res0

array([[ 0, 10, 20, 30],
[ 1, 11, 21, 31],
[ 2, 12, 22, 32]], dtype=int32)
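
A zero-input kernel is effectively a function of the element indices alone, which is why the output shape has to be given explicitly. NumPy's `np.fromfunction` follows the same idea, so it makes a convenient reference. The sketch below assumes `_i0` and `_i1` are per-element row and column indices; `gen_ref` is an illustrative name.

```python
import numpy as np

# Build a (3, 4) array purely from index values; the shape must be given
# explicitly because there is no input array to infer it from.
gen_ref = np.fromfunction(lambda i0, i1: i0 + 10 * i1, (3, 4), dtype=np.int32)
print(gen_ref)
```
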
DSL Kernel with Multiple Parameters
Kernels with more than one parameter work the same way; all inputs are passed through lazyudf(...) in a tuple.

@blosc2.dsl_kernel
def kernel_weighted_mix(x, y, b):
    return 0.25 * x + 2.0 * y + b

xw = blosc2.asarray(np.arange(12, dtype=np.float32).reshape(3, 4))
yw = blosc2.ones((3, 4), dtype=np.float32)
bw = 32.4
resw = blosc2.lazyudf(kernel_weighted_mix, (xw, yw, bw), dtype=np.float32)[:]
resw[:2, :3]

array([[34.4 , 34.65, 34.9 ],
[35.4 , 35.65, 35.9 ]], dtype=float32)
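
Note that the third input, `bw`, is a plain Python scalar mixed with two arrays. The NumPy sketch below reproduces the displayed values under the assumption that the scalar is applied to every element, as NumPy broadcasting would do; how the DSL treats scalar inputs internally is not covered here. The `*_np` and `mix_ref` names are illustrative.

```python
import numpy as np

xw_np = np.arange(12, dtype=np.float32).reshape(3, 4)
yw_np = np.ones((3, 4), dtype=np.float32)
bw = 32.4  # Python scalar, combined with every element

# Python float scalars do not upcast float32 arrays, so the result stays float32
mix_ref = 0.25 * xw_np + 2.0 * yw_np + bw
print(mix_ref[:2, :3])
```
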
2. Preflight Validation (validate_dsl)
You can validate a kernel and inspect diagnostics without executing it.

report_ok = blosc2.validate_dsl(kernel_index_ramp)
report_ok

{'valid': True,
'dsl_source': 'def kernel_index_ramp(x):\n # _i* and _n* are reserved DSL index/shape symbols, so disable linter warnings\n return x + _i0 * _n1 + _i1 # noqa: F821',
'input_names': ['x'],
'error': None}
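
Because the report is a plain dict, it is easy to turn into a guard that fails fast before any computation is scheduled. The helper below is a hypothetical convenience wrapper, not part of blosc2; it relies only on the `valid`, `error` and `input_names` keys visible in the output above, and the two reports are hand-written stand-ins.

```python
def require_valid_dsl(report):
    """Return the kernel's input names, or raise if the report marks it invalid.

    `report` is assumed to be a dict shaped like the validate_dsl output above.
    """
    if not report["valid"]:
        raise ValueError(f"DSL kernel rejected: {report['error']}")
    return report["input_names"]


# Hand-written stand-ins for real reports (hypothetical values)
ok_report = {"valid": True, "input_names": ["x"], "error": None}
bad_report = {"valid": False, "error": "Ternary expressions are not supported in DSL"}

print(require_valid_dsl(ok_report))
try:
    require_valid_dsl(bad_report)
except ValueError as exc:
    print(exc)
```

In real use you would pass the result of `blosc2.validate_dsl(kernel)` instead of the stand-in dicts.
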
Invalid Syntax Examples
validate_dsl helps catch unsupported constructs early, before running lazyudf(...). For example:

@blosc2.dsl_kernel
def kernel_invalid_ternary(x):
    return 1 if x else 0

report_bad_ternary = blosc2.validate_dsl(kernel_invalid_ternary)
print(report_bad_ternary["valid"])
print(report_bad_ternary["error"])

False
Ternary expressions are not supported in DSL; use where(cond, a, b) at line 2, column 14
DSL kernel source:
1 | def kernel_invalid_ternary(x):
2 |     return 1 if x else 0
  |              ^
See: https://github.com/Blosc/python-blosc2/blob/main/doc/getting_started/dsl_syntax.md
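
The diagnostic points to `where(cond, a, b)` as the replacement. How `where` is spelled and typed inside a DSL kernel is defined by the DSL syntax reference; the sketch below only illustrates the elementwise semantics using NumPy's `np.where`, and it treats the truthiness test `if x` as `x != 0`, which is an assumption.

```python
import numpy as np

x_np = np.array([-1.0, 0.0, 2.5], dtype=np.float32)

# Scalar ternary `1 if x else 0`, applied per element in a Python loop
ternary_ref = np.array([1 if v else 0 for v in x_np])

# Elementwise selection: pick 1 where the condition holds, 0 elsewhere
where_ref = np.where(x_np != 0, 1, 0)
print(where_ref)
```
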
Common Diagnostics Cheat Sheet
- Ternary expression (`a if cond else b`) is unsupported: use `where(cond, a, b)`.
- Reserved names (`int`, `float`, `bool`, `print`, `_ndim`, `_i*`, `_n*`) cannot be reused.
- Missing return on an executed path can fail at runtime, even if compilation succeeds.
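
To make the first two items concrete, here is a small standalone checker built on Python's `ast` module. It is a sketch of the kind of check a preflight pass can perform, not how `validate_dsl` is implemented. The function name `lint_kernel_source` is hypothetical, and reading `_i*`/`_n*` as "`_i` or `_n` followed by digits" is an assumption.

```python
import ast
import re

# Reserved names taken from the cheat sheet; _i*/_n* are matched by pattern
RESERVED = {"int", "float", "bool", "print", "_ndim"}
RESERVED_PATTERN = re.compile(r"_[in]\d+$")  # assumed spelling of _i* / _n*


def lint_kernel_source(source):
    """Return sorted (line, message) pairs for ternaries and reserved-name reuse."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.IfExp):
            problems.append((node.lineno, "ternary expression; use where(cond, a, b)"))
        # Collect names that are assigned to, plus function parameters
        name = None
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            name = node.id
        elif isinstance(node, ast.arg):
            name = node.arg
        if name and (name in RESERVED or RESERVED_PATTERN.match(name)):
            problems.append((node.lineno, f"reserved name reused: {name}"))
    return sorted(problems)


bad_source = """
def kernel(x, _i0):
    float = x
    return 1 if x else 0
"""
for line, message in lint_kernel_source(bad_source):
    print(line, message)
```

The third item (a missing return on an executed path) depends on runtime values, so a purely syntactic pass like this one cannot catch it in general.
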
4. Control Flow and Casts
The DSL supports if/else blocks and cast intrinsics such as float(...).

@blosc2.dsl_kernel
def kernel_clip_and_scale(x):
    if x < 0:
        y = 0
    else:
        y = x
    return float(y) * 0.5

x2_np = np.linspace(-2.0, 2.0, num=10, dtype=np.float32).reshape(2, 5)
x2 = blosc2.asarray(x2_np)
res2 = blosc2.lazyudf(kernel_clip_and_scale, (x2,), dtype=np.float32)[:]
res2

array([[0. , 0. , 0. , 0. , 0. ],
[0.11111111, 0.33333334, 0.5555556 , 0.7777778 , 1. ]],
dtype=float32)
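
The kernel body reads like scalar code, but its result matches an elementwise evaluation over the whole array. The NumPy-only sketch below compares a literal per-element loop with a vectorized form; the helper `clip_and_scale_scalar` and the `*_ref` names are illustrative and not part of blosc2.

```python
import numpy as np

x2_np = np.linspace(-2.0, 2.0, num=10, dtype=np.float32).reshape(2, 5)


def clip_and_scale_scalar(v):
    # Same branch structure as the kernel, applied to one scalar value
    if v < 0:
        y = 0
    else:
        y = v
    return float(y) * 0.5


# Per-element loop: slow, but mirrors the kernel body line by line
loop_ref = np.array(
    [clip_and_scale_scalar(v) for v in x2_np.ravel()], dtype=np.float32
).reshape(x2_np.shape)

# Vectorized equivalent: negative values clip to 0, then everything is halved
vec_ref = (np.maximum(x2_np, 0) * 0.5).astype(np.float32)
print(np.allclose(loop_ref, vec_ref))
```
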
5. Loops and Reserved ND Symbols
You can use for ... in range(...) together with reserved symbols like _i0, _i1, _n0, _n1 and _flat_idx.

@blosc2.dsl_kernel
def kernel_add_triangular_col_index(x):
    acc = 0
    for j in range(_i1 + 1):  # noqa: F821
        acc += j
    return x + acc

x3 = blosc2.zeros((2, 5), dtype=np.float32)
res3 = blosc2.lazyudf(kernel_add_triangular_col_index, (x3,), dtype=np.float32)[:]
res3

array([[ 0., 1., 3., 6., 10.],
[ 0., 1., 3., 6., 10.]], dtype=float32)

expected = np.array([0, 1, 3, 6, 10], dtype=np.float32)
np.allclose(res3[0], expected), res3[0]

(True, array([ 0., 1., 3., 6., 10.], dtype=float32))
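
The expected values are the triangular numbers: the loop sums 0 through the column index, which has the closed form i1 * (i1 + 1) / 2. A short NumPy sketch (illustrative names, assuming `_i1` is the column index) derives them two ways instead of hard-coding the list:

```python
import numpy as np

n_cols = 5
i1 = np.arange(n_cols)

# Sum of 0..i1 in closed form: i1 * (i1 + 1) / 2 (triangular numbers)
closed_form = i1 * (i1 + 1) // 2

# Same values from a running sum over the column indices
running_sum = np.cumsum(i1)
print(closed_form)
```
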
6. Advanced Examples
For more advanced real-world DSL kernels, see:
- examples/ndarray/mandelbrot-dsl.ipynb
- examples/ndarray/black-scholes_hist-dsl.ipynb
GitHub links: