Add MultiprocessingAuthkeyPlugin to propagate authkey to Dask workers by moi90 · Pull Request #9124 · dask/distributed

@moi90

Closes #9122

  • Tests added / passed
  • Passes pre-commit run --all-files

@moi90

I would still need help to write a useful test...

@github-actions

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

    27 files  ± 0      27 suites  ±0   9h 46m 19s ⏱️ + 3m 6s
 4 113 tests + 1   4 006 ✅  - 1    104 💤 ±0   3 ❌ + 2 
51 529 runs  +13  49 330 ✅ ±0  2 184 💤  - 1  15 ❌ +14 

For more details on these failures, see this check.

Results for commit e54b412. ± Comparison against base commit c9e7aca.

♻️ This comment has been updated with latest results.

jacobtomlinson

Given that this is a PR to distributed, we don't need to do this in a plugin; this could happen directly in the Cluster object for handing off the key and in the Worker object for receiving it. We could use the Dask config as the transport for the key to avoid injecting it into the environment all the time.

For testing we could just ensure the key is successfully set, transmitted, and received.
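To make the proposal concrete, here is a minimal sketch of the hand-off using a plain dict in place of `dask.config`. The names `CONFIG_KEY`, `set_authkey_in_config`, and `install_authkey_from_config` are hypothetical; the real hooks would live in the Cluster and Worker objects in distributed.

```python
import multiprocessing

# Hypothetical config key; the actual name in distributed may differ.
CONFIG_KEY = "distributed.multiprocessing.authkey"


def set_authkey_in_config(config):
    # Cluster side: export the current authkey (hex-encoded, since it is raw bytes).
    config[CONFIG_KEY] = bytes(multiprocessing.current_process().authkey).hex()


def install_authkey_from_config(config):
    # Worker side: adopt the transmitted key as this process's authkey.
    key = bytes.fromhex(config[CONFIG_KEY])
    multiprocessing.current_process().authkey = key
    return key


config = {}
set_authkey_in_config(config)
adopted = install_authkey_from_config(config)
assert adopted == bytes(multiprocessing.current_process().authkey)
```

This also suggests a natural unit test: set the key on one side, read it back on the other, and compare, as proposed above.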

moi90 marked this pull request as draft · October 14, 2025 10:38

@moi90

Could you give me a hint how to test this properly?

The following passes immediately (without any change), because the cluster started by gen_cluster apparently relies on the multiprocessing module, which correctly forwards the authkey to child processes.

def _get_authkey():
    # Runs on the worker: report that process's multiprocessing authkey.
    import multiprocessing
    return multiprocessing.current_process().authkey


@gen_cluster(client=True)
async def test_authkey(c, s, a, b):
    import multiprocessing

    worker_authkey = await c.submit(_get_authkey)

    assert worker_authkey == multiprocessing.current_process().authkey

Is there a factory for a cluster that is started using, say, subprocess instead?

@jacobtomlinson

I don't think there is a factory, but you could use subprocess to run the dask scheduler and dask worker commands separately.

@moi90

you could use subprocess

That seems to work, thanks!

We could use the Dask config as the transport for the key

Is this secure? Is the config visible or even serialized to disk? The Python developers make a big deal about the fact that the authkey must always remain inaccessible from outside...

@jacobtomlinson

Is this secure?

When you add it to the config in a Python process it is only held in memory; we never write config back to disk, and it's only read once when dask.config is imported.

When workers are launched it would be transferred via environment variables, which is the same as the original proposal here.
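The end-to-end mechanism can be sketched with the stdlib: a plain subprocess does not inherit the parent's authkey, but hex-encoding it into an environment variable and installing it on the child side bridges the gap. The variable name `DASK_MULTIPROCESSING_AUTHKEY` is a placeholder, not the name distributed would actually use:

```python
import multiprocessing
import os
import subprocess
import sys

ENV_VAR = "DASK_MULTIPROCESSING_AUTHKEY"  # hypothetical name

parent_key = bytes(multiprocessing.current_process().authkey)

# Parent side: export the key (hex, since authkeys are raw bytes).
env = dict(os.environ, **{ENV_VAR: parent_key.hex()})

# Child side: read the env var and install it as this process's authkey.
child_code = (
    "import multiprocessing, os, sys;"
    f"key = bytes.fromhex(os.environ['{ENV_VAR}']);"
    "multiprocessing.current_process().authkey = key;"
    "sys.stdout.buffer.write(bytes(multiprocessing.current_process().authkey))"
)
out = subprocess.check_output([sys.executable, "-c", child_code], env=env)
assert out == parent_key  # the key survived the process boundary
```

Since the key only ever lives in process memory and the child's environment, the exposure is the same as the original environment-variable proposal.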