Add shared_task<T> and shared_lazy_task<T> classes
Some scenarios require multiple consumers to wait on the result of a single task,
e.g. where you want to pass a prerequisite task into multiple sub-tasks that each need to await that task.
The task<T> and lazy_task<T> classes are move-only and support only a single awaiting coroutine at a time.
This issue proposes adding a shared_task<T> class and a shared_lazy_task<T> class that support copy-construction and copy-assignment with reference-counting semantics, and that allow multiple coroutines to await the same task concurrently.
It should be possible to implement both classes in a lock-free fashion using std::atomic pointers.
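
As a rough illustration of the lock-free idea, here is a minimal sketch of a waiter list built on a single std::atomic pointer. It is not an existing implementation: the names shared_state, waiter, try_add_waiter, set_ready and demo are all hypothetical, and a std::function callback stands in for a real coroutine handle. The assumption is that the shared state holds either a linked list of waiters or a sentinel value meaning "result is ready".

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical sketch only; names and layout are assumptions for illustration.
struct waiter {
    explicit waiter(std::function<void()> f) : resume(std::move(f)) {}
    std::function<void()> resume;  // stand-in for resuming an awaiting coroutine
    waiter* next = nullptr;        // intrusive singly-linked list of waiters
};

class shared_state {
public:
    // Try to enqueue a waiter. Returns false if the result is already
    // available, in which case the caller should continue without suspending.
    bool try_add_waiter(waiter* w) noexcept {
        void* old = m_waiters.load(std::memory_order_acquire);
        do {
            if (old == ready_tag()) {
                return false;
            }
            w->next = static_cast<waiter*>(old);
        } while (!m_waiters.compare_exchange_weak(
            old, w, std::memory_order_acq_rel, std::memory_order_acquire));
        return true;
    }

    // Publish completion and resume every waiter queued so far.
    void set_ready() {
        void* old = m_waiters.exchange(ready_tag(), std::memory_order_acq_rel);
        if (old == ready_tag()) {
            return;  // already completed; nothing to resume
        }
        auto* w = static_cast<waiter*>(old);
        while (w != nullptr) {
            waiter* next = w->next;  // read before resuming; w may be destroyed
            w->resume();
            w = next;
        }
    }

    bool is_ready() const noexcept {
        return m_waiters.load(std::memory_order_acquire) ==
               static_cast<const void*>(this);
    }

private:
    // The object's own address is used as the "ready" sentinel, since it can
    // never be the address of a waiter.
    void* ready_tag() noexcept { return this; }

    std::atomic<void*> m_waiters{nullptr};
};

// Small demonstration: two waiters are queued, then resumed on completion;
// a third waiter arriving afterwards is not queued.
int demo() {
    shared_state state;
    int resumed = 0;
    waiter a([&] { ++resumed; });
    waiter b([&] { ++resumed; });
    waiter late([&] { ++resumed; });
    state.try_add_waiter(&a);
    state.try_add_waiter(&b);
    state.set_ready();
    bool queued = state.try_add_waiter(&late);
    return queued ? -1 : resumed;
}
```

In this sketch, waiters are resumed in reverse order of arrival because the list is pushed at the head. Reference counting of the shared state and storage of the result value are left out; they would presumably sit alongside the atomic pointer.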