Supertrait Auto-impl by dingxiangfei2009 · Pull Request #3851 · rust-lang/rfcs

I still do not quite get what you are getting to and I am curious about your ideas. Would you mind elaborating? A few sketches will help.

Sure, let me elaborate: Let’s say we have this kind of boiler-platy list of implementations:

struct Foo(u8);
impl Ord for Foo {
    fn cmp(&self, other: &Self) -> std::cmp::Ordering {
        self.0.cmp(&other.0)
    }
}
impl PartialOrd for Foo {
    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
        Some(self.cmp(other))
    }
}
impl Eq for Foo {}
impl PartialEq for Foo {
    fn eq(&self, other: &Self) -> bool {
        self.0 == other.0
    }
}
impl std::hash::Hash for Foo {
    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
        self.0.hash(state);
    }
}

Currently, one could simplify this with macros. Proc macros would probably be the most convenient for users, but let's stick with a simple macro_rules macro for the purposes of this demo, e.g.:

// a general helper crate for Ord and Hash impls

mod convenient_impl_helpers {

    #[macro_export]
    macro_rules! eq_ord_by_key {
        ($(@[$($TyArgs:tt)*] $(where [$($Bounds:tt)*])?)? $T:ty, |$x:ident| $key:expr) => {
            impl$(<$($TyArgs)*>)? Ord for $T $($(
                where $($Bounds)*
            )?)?
            {
                fn cmp(&self, other: &Self) -> std::cmp::Ordering {
                    let $x = self;
                    let key1 = &$key;
                    let $x = other;
                    let key2 = &$key;
                    Ord::cmp(key1, key2)
                }
            }
            impl$(<$($TyArgs)*>)? PartialOrd for $T $($(
                where $($Bounds)*
            )?)?
            {
                fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
                    Some(self.cmp(other))
                }
            }

            impl$(<$($TyArgs)*>)? Eq for $T $($(
                where $($Bounds)*
            )?)? {}
            impl$(<$($TyArgs)*>)? PartialEq for $T $($(
                where $($Bounds)*
            )?)?
            {
                fn eq(&self, other: &Self) -> bool {
                    let $x = self;
                    let key1 = &$key;
                    let $x = other;
                    let key2 = &$key;
                    *key1 == *key2
                }
            }
        }
    }

    #[macro_export]
    macro_rules! hash_by_key {
        ($(@[$($TyArgs:tt)*] $(where [$($Bounds:tt)*])?)? $T:ty, |$x:ident| $key:expr) => {
            impl$(<$($TyArgs)*>)? std::hash::Hash for $T $($(
                where $($Bounds)*
            )?)?
            {
                fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
                    let $x = self;
                    let key = &$key;
                    std::hash::Hash::hash(key, state);
                }
            }
        }
    }
}

// let's use that nice helper crate someone else (ideally) has already written now!

struct Foo2(u8);

eq_ord_by_key!(Foo2, |x| x.0);
hash_by_key!(Foo2, |x| x.0);

(playground)

Macros are kind of annoying to use, though: special syntax and visibility rules for macro_rules, or crate separation for proc macros; macro_rules is also annoying with generic parameters (since < > are not true parentheses)… Of course these points aren’t completely on-topic and most are fixable in the long run. Last but not least, though, a macro isn’t type-checked! It’s like a C++ template but even worse; you don’t get any errors until you actually use it, not only for type / resolution errors but also for basic syntax errors.
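To make the "no errors until you use it" point concrete, here is a minimal sketch in today's Rust. The names broken_helper and CheckedHelper are made up for illustration and are not part of the RFC or of any real crate.

```rust
// A macro whose body contains a type error. Macro bodies are only checked
// after expansion, so this definition compiles as long as the macro is
// never invoked.
#[allow(unused_macros)]
macro_rules! broken_helper {
    ($T:ty) => {
        impl $T {
            fn answer(&self) -> u8 {
                "not a number" // type error, reported only on expansion
            }
        }
    };
}

// A trait, by contrast, is checked at its definition site: the same mistake
// in this default method body would be rejected immediately.
trait CheckedHelper {
    fn answer(&self) -> u8 {
        42
    }
}

struct Foo(u8);
impl CheckedHelper for Foo {}

fn main() {
    // Uncommenting the next line is what finally surfaces the macro's error:
    // broken_helper!(Foo);
    assert_eq!(Foo(0).answer(), 42);
}
```

The file builds cleanly with the invocation commented out, which is exactly the problem: the helper's author gets no feedback until some user expands the macro.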

So generally traits are Rust’s solution for nicer metaprogramming like this, with maximal type-checking at definition time, no weird syntax etc… Hence naturally this RFC’s features would definitely be used by people to improve on macros such as the one presented above. So now, your helper crate author can rewrite their macro into (this much nicer code):

// a general helper crate for Ord and Hash impls
mod convenient_impl_helpers {
    use std::hash::Hash;

    pub trait EqOrdByKey: Ord {
        fn key(&self) -> &impl Ord;
        auto impl Ord {
            fn cmp(&self, other: &Self) -> std::cmp::Ordering {
                self.key().cmp(other.key())
            }
        }
        auto impl PartialOrd {
            fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
                Some(self.cmp(other))
            }
        }
        auto impl Eq {}
        auto impl PartialEq {
            fn eq(&self, other: &Self) -> bool {
                self.key() == other.key()
            }
        }
    }

    pub trait HashByKey: Hash {
        fn key(&self) -> &impl Hash;
        auto impl std::hash::Hash {
            fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
                self.key().hash(state);
            }
        }
    }
}

and as the user we can write the following to use it:

struct Foo3(u8);

impl EqOrdByKey for Foo3 {
    fn key(&self) -> &impl Ord {
        &self.0
    }
}
impl HashByKey for Foo3 {
    fn key(&self) -> &impl Hash {
        &self.0
    }
}

To demonstrate the semver point in this example: Let’s say we now realize, Foo3 should also offer a Borrow<u8> implementation, so let’s just add it:

impl Borrow<u8> for Foo3 {
    fn borrow(&self) -> &u8 {
        &self.0
    }
}

and by now it’s starting to become boilerplaty again.

We find out there’s a different helper crate that offers exactly what we need to handle our use-case more cleanly. With the different helper crate, we can simply write:

struct Foo3(u8);
impl BorrowOrdHash<u8> for Foo3 {
    fn borrow(&self) -> &u8 {
        &self.0
    }
}

and it’ll handle the rest for us!

Implementation of that “different helper crate”…

// a different helper crate

mod borrow_delegate_convenience {
    use std::{borrow::Borrow, hash::Hash};

    pub trait BorrowOrdHash<T: Ord + Hash>: Borrow<T> + Ord + Hash {
        fn borrow(&self) -> &T;
        auto impl Ord {
            fn cmp(&self, other: &Self) -> std::cmp::Ordering {
                self.borrow().cmp(other.borrow())
            }
        }
        auto impl PartialOrd {
            fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
                Some(self.cmp(other))
            }
        }
        auto impl Eq {}
        auto impl PartialEq {
            fn eq(&self, other: &Self) -> bool {
                self.borrow() == other.borrow()
            }
        }
        auto impl std::hash::Hash {
            fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
                self.borrow().hash(state);
            }
        }
        auto impl std::borrow::Borrow<T> {
            fn borrow(&self) -> &T {
                BorrowOrdHash::borrow(self)
            }
        }
    }
}

This is all very nice, clean and convenient, and people will use it this way. But compared to the macro-based approach, it results in publicly visible, yet unwanted, implementations of helper traits like EqOrdByKey, HashByKey or BorrowOrdHash. Switching to the more convenient non-macro solution from borrow_delegate_convenience additionally constituted a breaking API change on Foo3, annoyingly caused only by those publicly visible implementations of EqOrdByKey and HashByKey that we didn’t really want in the first place.
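The hazard can already be reproduced with an ordinary trait in today's Rust. The following is only a sketch: KeyHelper and downstream are hypothetical stand-ins for a helper-crate trait and for third-party code, not names from the RFC.

```rust
// Stand-in for a helper trait from a hypothetical helper crate.
pub trait KeyHelper {
    fn key(&self) -> &u8;
}

pub struct Foo3(pub u8);

// Written only for convenience, but it is now part of Foo3's public API.
impl KeyHelper for Foo3 {
    fn key(&self) -> &u8 {
        &self.0
    }
}

// Hypothetical downstream code: nothing stops it from relying on the impl,
// both as a bound and by calling the helper method directly.
fn downstream<T: KeyHelper>(value: &T) -> u8 {
    *value.key()
}

fn main() {
    // Removing `impl KeyHelper for Foo3` (e.g. when switching helper crates)
    // would break this call.
    assert_eq!(downstream(&Foo3(7)), 7);
}
```

A macro-generated set of impls leaves no such trace: only Ord, Hash and friends would be visible on Foo3.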

The idea I mentioned of having “something like a trait” is to make these helper things, EqOrdByKey, HashByKey, BorrowOrdHash, into things that are partially like a trait in that you can write an impl for them; but explicitly they’re not like a trait in that you cannot:

- call any of their methods yourself: the methods (key, or borrow) only serve as input for the impl, not as additional API that is supposed to be made available
- use their bound in any constraints: i.e. you cannot write T: EqOrdByKey anywhere. This way implementations of these not-a-trait things can be removed without any breakage.

As one possible way to implement this let me just introduce a new keyword, call these things something like template trait and keep the rest of the syntax like in this RFC, and we can just change it into:

pub template trait BorrowOrdHash<T: Ord + Hash>: Borrow<T> + Ord + Hash {
    fn borrow(&self) -> &T;
    auto impl Ord {
        fn cmp(&self, other: &Self) -> std::cmp::Ordering {
            self.borrow().cmp(other.borrow())
        }
    }
    auto impl PartialOrd {
        fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
            Some(self.cmp(other))
        }
    }
    auto impl Eq {}
    auto impl PartialEq {
        fn eq(&self, other: &Self) -> bool {
            self.borrow() == other.borrow()
        }
    }
    auto impl std::hash::Hash {
        fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
            self.borrow().hash(state);
        }
    }
    auto impl std::borrow::Borrow<T> {
        fn borrow(&self) -> &T {
            BorrowOrdHash::borrow(self)
        }
    }
}

and we’d use it still like before

struct Foo4(u8);
impl BorrowOrdHash<u8> for Foo4 {
    fn borrow(&self) -> &u8 {
        &self.0
    }
}

but Foo4 does not visibly implement BorrowOrdHash now because BorrowOrdHash isn’t actually a trait; just Borrow<u8> and Ord and Hash.

This kind of feature is of course not really needed for the concrete motivation of refactoring something like std::fmt::Read – or, one other thing I have in mind is refactoring (eventually) Iterator into a special case of a generic “lending iterator” kind of trait.

Here’s a fun (or “interesting”?) follow-up, especially given your RFC already mentioned blanket impls for comparison: this kind of feature for refactoring traits into a hierarchy may potentially even go hand-in-hand with a blanket implementation but the other way around. That is: Assume besides Iterator { type Item; } we gain LendingIterator { type LendingItem<'lt>; }, then any lending iterator where the LendingItem type does not depend on 'lt can in principle always be used as an Iterator, and this fact/implication can become a blanket impl, like

impl<T, Itm> Iterator for T
where
    for<'lt> T: LendingIterator<LendingItem<'lt> = Itm>,
{
    type Item = Itm;
    // methods left out for brevity
}

though this also only works if the RFC is somehow extended… significantly… essentially:

- first, we need to allow something like the template trait mentioned above:
```
template trait IteratorTempl {
    type Item;
    // methods left out for brevity
    auto impl LendingIterator {
        type LendingItem<'_lt> = Self::Item;
        // methods left out for brevity
    }
}
```
  and then whoever writes impl IteratorTempl for MyType actually implements LendingIterator through the template; the blanket impl in turn gives you a MyType: Iterator impl.
- next, for backwards compatibility, we need a way to allow IteratorTempl to be re-exported under the same name Iterator as the Iterator trait itself. (Maybe literally pub use IteratorTempl as Iterator into the same module as the real trait Iterator.) To avoid ambiguity, maybe Iterator can become a language-supported sealed trait; then it’s going to be clear that you would never write a downstream impl for that, and we could say there is never ambiguity between a sealed trait and a template trait because they are used downstream for mutually-exclusive purposes (the latter only for writing impls, the former for everything else).
- to minimize confusion, this construct of fusing together a sealed trait and a template trait into a single public-API name could come with a compiler-enforced limitation to cases where the template trait does result (either directly or through blanket impls) in an actual implementation of the trait whose name it shares. This would, however, preclude the possibility of opting out of some of the auto impls. Maybe, as a minimal change, the syntax solution could be to make the auto keyword optional, and allow the extern opt-out only when auto was present. Alternatively the final keyword, like for final methods (RFC 3678), could be considered.
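For readers unfamiliar with sealing: there is no language-supported sealed trait today, but the usual library-level approximation is a public trait with a supertrait that lives in a private module. The sketch below uses made-up names (library, Countdown, FromThree) purely for illustration; it shows the "usable everywhere, implementable only upstream" half of the idea, not the template trait half.

```rust
mod library {
    // Private module: code outside `library` cannot name `Sealed`,
    // so it cannot implement it.
    mod private {
        pub trait Sealed {}
    }

    // Usable in bounds and method calls everywhere, implementable only here.
    pub trait Countdown: private::Sealed {
        fn next_value(&mut self) -> Option<u32>;
    }

    pub struct FromThree(pub u32);

    impl private::Sealed for FromThree {}
    impl Countdown for FromThree {
        fn next_value(&mut self) -> Option<u32> {
            if self.0 == 0 {
                None
            } else {
                self.0 -= 1;
                Some(self.0)
            }
        }
    }
}

use library::{Countdown, FromThree};

// Outside code can still use the sealed trait as a bound.
fn drain<C: Countdown>(mut c: C) -> Vec<u32> {
    let mut out = Vec::new();
    while let Some(v) = c.next_value() {
        out.push(v);
    }
    out
}

fn main() {
    assert_eq!(drain(FromThree(3)), vec![2, 1, 0]);
}
```

With a sealed Iterator of this kind, any impl written downstream under the name Iterator could only be meant for the template, which is the disambiguation argument made above.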

In case you wonder how I’m coming up with these thoughts on-the-fly: I don’t. Rather, I have occasionally spent some time thinking through this general problem space for a decent while already… let me check… apparently it has been ~4 years already 😲!

If you’ll allow me, let me share just two more thoughts I found worth mentioning (and by all means, please feel free to move the conversation elsewhere; separate review thread; or Zulip; etc; if you want to continue any of the more tangential discussion points):

- AFAICT, the RFC limits the auto impl connection to cases where the Self type is shared between the outer and the auto impl’d trait, while allowing more freedom with other trait parameters. As someone who came into Rust from Haskell, which has no concept of Self and just treats all parameters to type classes the same, I’m always in the camp of viewing “generic traits” just as “relations of more than 1 argument”, and would love to minimize the special treatment of Self parameters in traits.
  - this is however a thing that can be left out of an MVP, as it doesn’t really suffer the property of “annoying limitation that leads people to write more brittle APIs”; it’s only an “annoying limitation” which can be lifted at a later point. It may be worth considering as a future possibility, though, and with this possibility in mind, possibly the syntax should be auto impl Trait for Self { … }, not just auto impl Trait { … } [though I’m not sure yet what I’d prefer; maybe keep the latter as sugar, anyway?]
- another possible concern [touching on sort-of a counter-point to the above] is user-facing error messages from concrete auto impl powered impls that turn out to be illegal for coherence-rule kinds of reasons; there are a few error cases here that only appear once the auto impl is actually used.
  - Of course there is the possibility of overlapping-impl error messages that stem from the auto impl, not the top-level one; those are nothing new though, you can also get these from blanket impls.
  - A unique/new concern, however, comes from orphan rules. To show a nontrivial example: if you have this:
```
// methods left out for brevity
trait Foo<A, B> {
    auto impl Bar<B, A> {
        // ...
    }
}
trait Bar<B, A> {}
```
  then a downstream user might write an impl as follows
```
struct MyStruct {}
impl<T> Foo<MyStruct, T> for String {}
```
  this would then run into an orphan-rule violation, not for Foo but for Bar (due to the order of the parameters), coming from the auto impl (to see the kind of error message, see this playground)
  This isn’t a huge concern, but it’s at least worth considering. If some of these possible “template instantiation”-time-ish errors are deemed potentially-too-confusing¹, we could consider whether it’s possible and desirable to define additional limitations on the auto impls that somehow prevent such orphan-rule issues.

¹ In particular, I could imagine that leaving this unaccounted for could allow users to refactor traits in ways that are breaking changes even when they weren’t supposed to be breaking, and where catching the breaking use-case is sufficiently nontrivial that it’s easily overlooked by people not 100% fluent with the last details of Rust’s orphan/coherence rules.