Discussion: Managing `Data.__str__` with `dask`

This issue is for discussing how `Data.__str__` can be made to perform well when its data is stored in a dask array.

Exposition

The string representation of a Data object is currently inherited from cfdm, and looks like:

>>> import cf
>>> d = cf.example_field(0).data
>>> print(d)
[[0.007, ..., 0.013]] 1

I.e. it prints the first and last elements (and also the second element if there are only three of them).

With dask representing the data, and using the code inherited from cfdm with no changes, printing these elements could

  1. trigger an expensive and slow computation
  2. require reading an entire dask chunk from disk for each element printed. If each chunk has the default size of 128 MiB and the first and last elements lie in different chunks, then that could entail reading 256 MiB from disk just to print two numbers.
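
To make the arithmetic in point 2 concrete, here is a rough pure-Python sketch that counts how many chunks the first and last elements fall in. The helper `chunks_touched` is hypothetical (it is not part of cf, cfdm or dask); it only assumes the dask convention of describing chunks as one tuple of chunk lengths per axis, and the 4096 x 4096 float64 chunk shape is chosen purely so that each chunk is 128 MiB.

```python
MIB = 2**20


def chunks_touched(chunks, indices):
    """Return the positions of the chunks that contain the given elements.

    Hypothetical helper for illustration only.

    `chunks` has one tuple of chunk lengths per axis (the dask convention).
    `indices` is an iterable of element index tuples.
    """
    touched = set()
    for index in indices:
        position = []
        for axis_chunks, i in zip(chunks, index):
            # Walk along the axis until we reach the chunk containing i
            offset = 0
            for n, length in enumerate(axis_chunks):
                if i < offset + length:
                    position.append(n)
                    break
                offset += length
        touched.add(tuple(position))
    return touched


# Assumed layout: a 2-d float64 array split into 2 x 2 chunks
chunks = ((4096, 4096), (4096, 4096))
shape = tuple(sum(c) for c in chunks)
chunk_bytes = 4096 * 4096 * 8  # 128 MiB per chunk

first = (0,) * len(shape)
last = tuple(n - 1 for n in shape)

touched = chunks_touched(chunks, [first, last])
print(len(touched), "chunks,", len(touched) * chunk_bytes // MIB, "MiB read")
```

With this layout the first and last elements sit in opposite corner chunks, so two whole chunks would have to be read. If the array had only one chunk, the same call would report a single chunk, but still the whole of it.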