Email or username:

Password:

Forgot your password?
Joshua Barretto

Fielding opinions on this because I've not yet heard a clear consensus, despite much back and forth: Do you consider #[doc(hidden)] on a trait method to be sufficient to pull aspects of the trait's implementation out of a crate's public API and hence not subject to semver concerns? I've heard several perspectives on this, and it's a decision that has practical consequences for one of my crates. #rustlang

Anonymous poll

Poll

Yes, doc(hidden) is sufficient
31
40.3%
No, techniques like trait sealing are required
22
28.6%
Don't know / see results
24
31.2%
77 people voted.
Voting ended 31 Oct 2023 at 12:12.
43 comments
Joshua Barretto

To be more specific: I want to avoid crate users implementing the `Parser` trait here: docs.rs/chumsky/1.0.0-alpha.6/

Right now, I'm being ultra-conservative with a sealed trait. Problem is, this sealed trait seriously degrades the quality of error messages and the searchability of trait impls.

I'm thinking of removing the sealed trait in favour of doc(hidden). Obviously such a change would make the semver guarantee *social* rather than *mechanical*.

TheZoq2

@jsbarretto I'd say #[doc(hidden)] with a comment saying this is not public API should be sufficient

Predrag Gruevski

@jsbarretto I think it has to be sealed. Tools like cargo-semver-checks would probably consider it public API otherwise.

Have you tried all the various different ways of sealing the trait, though? Do all of them have bad error messages?

I would have assumed that taking a pub-in-priv token type as a fn argument wouldn't change the error messages much, but perhaps there's something I failed to consider.

predr.ag/blog/definitive-guide

Joshua Barretto

@predrag Yes, I've taken a look at that. We currently use cargo-semver-checks (and I think we've spoken before about similar things!). The problem with the pub-in-priv token approach is that it doesn't actually protect downstream users: Rust doesn't express divergence in the type system so it's perfectly possible to write things like

my_foo.call(loop {})

and have that compile just fine, even if the type of the argument cannot be constructed or even named via the public API.

Predrag Gruevski

@jsbarretto I'm not sure I understand why diverging expressions are a problem.

When the argument expression diverges, `my_foo.call()` is never actually called. So I'm not sure how this is different than any use of a diverging expression not related to this trait. That would already produce a dead code lint, so perhaps the compiler is already sufficiently trying to protect the user from themselves?

I'm quite curious to hear more, if you don't mind writing it!

Joshua Barretto

@predrag But something being called or not called isn't the point of semver, right? Or, at least, that's my interpretation. Divergence (things that can return `!`) is useful in a whole load of legitimate domains and from the perspective of the compiler, you'd still end up with a breaking change that halts compilation. IMO, if a change means that code that previously compiled suddenly fails to compile, that's a semver break.

Predrag Gruevski

@jsbarretto I am of course not an authoritative source on Rust semver, but IMHO that falls under the "useless code" carve-out in the API evolution RFC.

Paraphrasing, it says that we should ignore breaking changes that can only happen in code that serves no purpose other than to be broken. In my book, breakage in unreachable code seems to fit under that umbrella.

I dig into this policy in this post, complete with example code and a direct link to that RFC section: predr.ag/blog/some-rust-breaki

@jsbarretto I am of course not an authoritative source on Rust semver, but IMHO that falls under the "useless code" carve-out in the API evolution RFC.

Paraphrasing, it says that we should ignore breaking changes that can only happen in code that serves no purpose other than to be broken. In my book, breakage in unreachable code seems to fit under that umbrella.

Joshua Barretto

@predrag Hmm, I'm not sure I'm convinced by that argument. Divergence isn't just a theoretical thing, it's often a useful way to construct APIs that guarantee non-termination or that certain things will or won't be called, particularly when combined with higher-order functions. I could definitely come up with realistic examples that would break but also aren't 'useless' by the definition given in that post.

Predrag Gruevski

@jsbarretto Definitely not saying it's theoretical. Merely that in this case, intentionally "calling" a function with a diverging expression feels like it falls under that umbrella.

I'm not currently familiar with any examples of useful divergence that would break on an API change in unreachable code. I'm very curious about it, so if you have the cycles to write that example, I'd absolutely love to see it!

Joshua Barretto

@ekuber Sure. So my current approach is to have a `ParserSealed` trait that's a supertrait of `Parser`, then a blanket impl `impl<T: ParserSealed> Parser for T { ... }`. The result of this is that implementors of `ParserSealed` are not visible in the crate docs. Consider the difference in the accessibility of the docs for the `Just` combinator before I switched to sealed traits here:

docs.rs/chumsky/1.0.0-alpha.2/

with the docs afterward here:

docs.rs/chumsky/1.0.0-alpha.6/

@ekuber Sure. So my current approach is to have a `ParserSealed` trait that's a supertrait of `Parser`, then a blanket impl `impl<T: ParserSealed> Parser for T { ... }`. The result of this is that implementors of `ParserSealed` are not visible in the crate docs. Consider the difference in the accessibility of the docs for the `Just` combinator before I switched to sealed traits here:

Joshua Barretto

@ekuber In the former case you can see that the value you give to `Just` needs to implement `OrderedSeq`, and it's much easier to figure out more concrete obligations. In the latter case, we just get 'Just must implement ParserSealed to implement Parser' but with no information about the requirements of the former.

Esteban K�ber :rust:

@jsbarretto could you make a min repro of this last case? I think one of the merged linked PRs might be addressing that.

Joshua Barretto

@ekuber Oh, in this specific case I'm talking purely about rustdoc visibility, which I know is at least mostly independent of compiler diagnostics

Joshua Barretto

@ekuber I also have some more specific ideas/recommendations for improving compiler errors around complex APIs like this if you're interested?

Joshua Barretto

@ekuber So just to give some background: chumsky is a parser combinator crate, and the `Parser` trait is, at least in terms of the role it plays, kind of like `Iterator` but on steroids. You combine parsers together and end up with enormous parser types that require `impl Trait` syntax to manipulate.

Joshua Barretto

@ekuber Ideally, the compiler would try to hide the specific type wherever it can: It's not functionally useful, all that matters is the exact form of the `Parser` trait that it does or doesn't implement. It's pretty common for us to get complaints from users that look like this: github.com/zesterer/chumsky/is

Joshua Barretto

@ekuber Something I've noticed a lot is that the compiler often translates some subtle but failed obligation way, way down in the gritty details of a type (let's say, a missing `Clone` bound/impl) into saying that the *entire type* doesn't implement `Parser`. This is a big confusion for users because they just typed out some combinators and the compiler is effectively denying reality for them. A better approach, in cases where there is one 'obvious' (not sure how to define this) impl (1/2)

Joshua Barretto

@ekuber Would be to try to instead report the failed obligation right at the lowest depth if at all possible. The compiler *knows* that the problem is a missing `Clone` bound, but it doesn't report it: instead, it goes for a breadth-first approach and reports the highest level obligation that failed.

Esteban K�ber :rust:

@jsbarretto I could have sworn we show both, and that the most common problem is the other way around. A repro to play with would help me figure out what the issue is. As for why these errors are hard to deal with for us, trait bound obligations are conceptually a tree, but the current trait solver treats them at best as a linked list, and each chain in that list needs to be added manually to avoid breaking the chain (so we might be missing cases). There's a new solver in the works.

Joshua Barretto

@ekuber That makes sense. As of a few months ago I've written a trait solver myself, so I can understand how logistically complicated it is to coerce it into doing the right thing, especially when there's a tension between diagnostic quality and other factors like performance. I'll see what I can do to come up with an example.

Esteban K�ber :rust:

@jsbarretto our current solution there is to cut the types to a manageable size and write them to disk if you really need them. There was a recent change to E0277 to do that in more cases, but if you can help us identify where we're not doing that already we can significantly clean those up very easily.

Joshua Barretto

@ekuber Yeah, that was definitely a very appreciated change. The errors at least stay on users screen now, I think a lot of people have found value there. That said, it doesn't do much to actually aid the user in solving the problem: the reported error is still very, very distant compared to the problem that they need to fix to get the thing working.

Esteban K�ber :rust:

@jsbarretto for sure. I would like to see the failure patterns to see if we could teach rustc to understand them better, but of course it is tricky.

Joshua Barretto

@ekuber I must admit, a few of the uglier cases I've noticed in the past have improved substantially! I just tried a case that produced a horrible obligation error a few months ago and it's now giving a very reasonable error instead, so congratulations on that! There are a few other cases that I've seen issues with though, so I'll try to get you a solid example of that soon.

Predrag Gruevski

@jsbarretto btw, might the degraded diagnostics quality and searchability be worth filing some issues?

I'd bet other crates are affected by this as well. It might be a motivating use case and a good opportunity to improve the UX for everyone.

Nilstrieb

@jsbarretto I'd make a doc(hidden) method with a funny name.

Andrew Lilley Brinker

@jsbarretto *whispers* SemVer has _always_ been a social contract

:verified: Pasties

@jsbarretto I think it depends on the seriousness of the consequences of someone misuses the interface. Usually doc(hidden), maybe with an extra comment explicitly indicating it's not part of any versioning stability for people that find it via the source.

If the guarantees you're providing by your crate depend on a correct implementation and the potential consequences would be severe such as loss of data due to a corrupt file or cryptographic internals that is when I think sealed traits are appropriate.

If there are any cases where someone might have legitimate useful and safe ways to use it, even without compatibility guarantees I try to avoid sealed traits.

@jsbarretto I think it depends on the seriousness of the consequences of someone misuses the interface. Usually doc(hidden), maybe with an extra comment explicitly indicating it's not part of any versioning stability for people that find it via the source.

If the guarantees you're providing by your crate depend on a correct implementation and the potential consequences would be severe such as loss of data due to a corrupt file or cryptographic internals that is when I think sealed traits are appropriate.

Joshua Barretto

Wow, seems like I'm not the only one that can't come to a solid decision about this. I think that's at least a bit of a concern given that people both are and aren't choosing to bake these two approaches into supposedly stable APIs today.

Peter Hartley

@jsbarretto I feel that Hyrum's Law applies: given enough users, _all_ observable behaviors of your system
will be depended on by somebody. And if someone _does_ ill-advisedly use your hidden-but-public feature, it's not primarily them who'll suffer from it: it's users of their crate, whose builds will suddenly start failing after a cargo update. I'd argue for making those invalid states unrepresentable.

Joshua Barretto

@TalesFromTheArmchair I would definitely agree if that were a position without compromise: but going for the sealed approach comes with substantial downsides for most users, so I think there's space for both perspectives here.

Alonely0 🦀

@jsbarretto if your private methods leak into public traits, you're doing it wrong; you should be using sealed supertraits. However, I do like the idea of having `doc(hidden)` stuff be implementation details that you shouldn't have to use but are kinda allowed to; thus being semi-private interfaces that aren't subject to semver.

Joshua Barretto

@Alonely0 I'm already using sealed supertraits. Problem is, they make impls much harder to observe in the docs and result in substantially worse error messages.

Alonely0 🦀

@jsbarretto I think I've found the definitive answer to your answer. It turns out `doc(hidden)` is not part of the public API, and as such, is considered private in terms of semver. rust-lang.github.io/api-guidel

Joshua Barretto

@Alonely0 The phrase 'not allowed for user to call' is, IMO, not an explicit enough guarantee here, and it seems there's at least a little divergence of opinion in the community over this.

Kornel

@jsbarretto I’d say if you combine it with a scary name like “_private_unstable_dont_use” then doc(hidden) is fine. Method privacy is mainly for humans, so you just need to prevent usage by mistake.

Soso

@jsbarretto `doc(hidden)` is most often used as a way to "hide" internal unstable APIs, mostly for macros.

There are only few uses of doc(hidden) in real-world crates that are still meant to be used. I reviewed a lot of the most downloaded crates for uses of doc(hidden) and only found 1 actual use of doc(hidden) on a stable API. I later found another one.

I suggested a clippy lint for that, which I believe should be warn by default: github.com/rust-lang/rust-clip

Joshua Barretto

@sgued I'd love to see this lint be a thing, I think it's a great idea! Nice to hear that someone's done the legwork of looking through the ecosystem. I still have some reservations because contract-by-documentation is very much opposed to the Rust ethos, but I suspect it's probably satisfactory for now.

Go Up