Distinguish between fenced and indented code blocks in error message #64162

Open
lightclient opened this issue Sep 5, 2019 · 15 comments
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools A-doctests Area: Documentation tests, run by rustdoc C-enhancement Category: An issue proposing an enhancement or a PR with one. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.

Comments

@lightclient

lightclient commented Sep 5, 2019

EDIT: It turns out that the issue I was facing was caused by my doc comment being interpreted as a code block, based on the indented code block rule in markdown.

--

Summary

The following doc comment fails to compile using cargo test --doc:

///
///    \
///x

Example repository here.

Error Output

> cargo test --doc
Finished dev [unoptimized + debuginfo] target(s) in 0.00s
Doc-tests weird

running 1 test
test src/lib.rs - main (line 2) ... FAILED

failures:

---- src/lib.rs - main (line 2) stdout ----
error: unknown start of token: \
 --> src/lib.rs:3:1
  |
3 | \
  | ^

error: aborting due to previous error

Couldn't compile the test.

failures:
    src/lib.rs - main (line 2)

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out

error: test failed, to rerun pass '--doc'

What I've Tried

I've found a few things (I'm sure this is not an exhaustive list to describe the problem):

  • There must be an empty line above the backslash
  • There must be a non-empty line below the backslash
  • There must be at least 4 spaces between the backslash and the doc comment ///

Expected

The snippet compiles without issue.

Meta

I've verified the issue is present using 1.37.0 on both x86_64-apple-darwin and x86_64-unknown-linux-gnu.

@estebank
Contributor

estebank commented Sep 5, 2019

The four leading spaces start a markdown indented code block. The error is the same as having a .rs file whose only contents are a backslash.

@lightclient
Author

lightclient commented Sep 5, 2019

@estebank is this the expected behavior? I can't find it documented anywhere.

@estebank
Contributor

estebank commented Sep 5, 2019

It is. It is equivalent to writing

```
\
```

Which is the prevalent style in the ecosystem, but you can also write it

    \

like you did here. These code blocks are identified as Rust code and are compiled and run as tests.
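
As a rough illustration of the two forms described above, the sketch below documents two functions, one with a fenced block and one with an indented block. The function names are made up for this example. Going by the explanation in this thread, rustdoc would be expected to pick up both blocks as Rust doctests under `cargo test --doc`.

```rust
/// Doubles `x`. The example below uses a fenced code block.
///
/// ```
/// assert_eq!(2 * 2, 4);
/// ```
pub fn double_fenced(x: i32) -> i32 {
    x * 2
}

/// Doubles `x`. The example below uses an indented code block:
/// a blank line, then four spaces of indentation relative to the prose.
///
///     assert_eq!(2 * 2, 4);
pub fn double_indented(x: i32) -> i32 {
    x * 2
}
```

Replacing the indented `assert_eq!` line with a lone backslash would give the same shape as the failing doc comment in the original report.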

@ehuss
Contributor

ehuss commented Sep 5, 2019

I don't think it is clearly documented as such, but indented content is a code block per the markdown spec, and here it is documented that code blocks default to the Rust language, and are tested by default.

@lightclient
Author

If I replace my above example with:

///    \

It compiles successfully, which seems to contradict the markdown specification @ehuss pointed to. Am I missing something more?

@ehuss
Contributor

ehuss commented Sep 5, 2019

I believe that is caused by the unindent pass which detects the indentation level and removes it.
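
To make the interaction concrete, here is a simplified model of what an unindent step might do. This is an assumption-laden sketch, not rustdoc's actual implementation: the function name `unindent` and its exact rules (spaces only, blank lines ignored when computing the minimum) are chosen for illustration.

```rust
/// Simplified model of an "unindent" step (not rustdoc's real code):
/// find the smallest leading-space count among non-blank lines and
/// strip that many spaces from every non-blank line.
fn unindent(lines: &[&str]) -> Vec<String> {
    let min_indent = lines
        .iter()
        .filter(|l| !l.trim().is_empty())
        .map(|l| l.len() - l.trim_start_matches(' ').len())
        .min()
        .unwrap_or(0);

    lines
        .iter()
        .map(|l| {
            if l.trim().is_empty() {
                // Blank lines carry no indentation information.
                String::new()
            } else {
                // Safe: every non-blank line has at least `min_indent` leading spaces.
                l[min_indent..].to_string()
            }
        })
        .collect()
}
```

Under this model, a doc comment whose only line is `    \` has its four spaces stripped, so no indented code block remains. In the original report, the unindented `x` line forces the minimum indentation to zero, so the backslash keeps its four spaces and becomes a code block.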

@lightclient
Author

Cool, thanks for clearing this up for me @ehuss & @estebank.

Please close this if you feel like no clarifications to the documentation should be made. I'd be happy to submit a PR if, however, that would be a helpful addition.

@ehuss
Contributor

ehuss commented Sep 5, 2019

It does occasionally trip people up. I think there are a few things that could be done:

This is also mentioned in syntax reference. I'm not sure if repeating it in the docs will really help, since I doubt most people will thoroughly read the entire rustdoc manual (and remember everything).

But I think it's up to the rustdoc team if they want more documentation, or changing the error message.

@estebank estebank added A-diagnostics Area: Messages for errors, warnings, and lints A-doctests Area: Documentation tests, run by rustdoc A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools labels Sep 5, 2019
@estebank
Contributor

estebank commented Sep 5, 2019

I do think it would be interesting to extend parser errors emitted by rustdoc to have a note similar to "errors found while parsing code block in doc string". The interaction of indented code blocks and unindent is non-obvious and I would guess not entirely intended.

@GuillaumeGomez
Member

> I do think it would be interesting to extend parser errors emitted by rustdoc to have a note similar to "errors found while parsing code block in doc string". The interaction of indented code blocks and unindent is non-obvious and I would guess not entirely intended.

I agree; people have already run into this a few times.

@lightclient lightclient changed the title Doc comment fails to compile with specific backslash pattern Distinguish between fenced and indented code blocks in error message Sep 5, 2019
@jonas-schievink jonas-schievink added T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. C-enhancement Category: An issue proposing an enhancement or a PR with one. labels Mar 6, 2020
@kupiakos
Contributor

kupiakos commented Jun 7, 2022

Rustdoc auto-treating indented text as a Rust code block has been a massive pain point for us transitioning from C code with existing documentation. The default should be to treat all indented code blocks as if the text language were specified:

Syscall info:

    Arg1: Takes a thing
    Arg2: Takes another thing
    Returns: Whether the thing was thinged

should be treated semantically identically to:

Syscall info:

```text
Arg1: Takes a thing
Arg2: Takes another thing
Returns: Whether the thing was thinged
```

rather than trying to interpret the indents as Rust code. Is there data on how often rustdoc tests are written with indented blocks? My suspicion is next-to-never.
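
For reference, the explicit form available today looks roughly like this in a Rust doc comment. The function `thing_syscall` and its signature are hypothetical, standing in for a C API being ported; the `text` annotation tells rustdoc not to treat the block as Rust.

```rust
/// Syscall info:
///
/// ```text
/// Arg1: Takes a thing
/// Arg2: Takes another thing
/// Returns: Whether the thing was thinged
/// ```
pub fn thing_syscall(arg1: u32, arg2: u32) -> bool {
    // Placeholder body so the sketch compiles; a real binding would call into C.
    arg1 != 0 && arg2 != 0
}
```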

@hkBst
Member

hkBst commented Mar 12, 2025

@kupiakos If you remove the blank line, then it should not interpret it as a code block anymore:

Syscall info:
    Arg1: Takes a thing
    Arg2: Takes another thing
    Returns: Whether the thing was thinged

according to "An indented code block is composed of one or more indented chunks separated by blank lines."
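
The rule quoted above can be approximated in a few lines. The sketch below is a deliberately simplified model, not a CommonMark parser: it ignores lists, fenced blocks, block quotes, and tabs, and the function name `indented_code_lines` is invented for this example. It reports the zero-based indices of lines that would fall inside an indented code block, on the assumption that an indented line cannot interrupt a paragraph.

```rust
/// Returns zero-based indices of lines that a simplified Markdown model
/// would treat as part of an indented code block.
fn indented_code_lines(doc: &str) -> Vec<usize> {
    let mut result = Vec::new();
    // True while we are inside a paragraph that indented lines merely continue.
    let mut in_paragraph = false;

    for (i, line) in doc.lines().enumerate() {
        if line.trim().is_empty() {
            // A blank line ends the current paragraph.
            in_paragraph = false;
            continue;
        }
        let indent = line.len() - line.trim_start_matches(' ').len();
        if indent >= 4 && !in_paragraph {
            // Indented by four or more spaces and not continuing a paragraph.
            result.push(i);
        } else {
            // Ordinary text, or an indented line continuing a paragraph.
            in_paragraph = true;
        }
    }
    result
}
```

With the blank line after "Syscall info:" the indented lines are reported as code; without it they are treated as a continuation of the paragraph and nothing is reported.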

@kupiakos
Contributor

@hkBst Tips on how to modify the text to avoid accidental parsing as code are not helpful. There's already a better workaround - surround everything with backticks. The issue is that the rustdoc default to treat all indented blocks as Rust code is unintuitive as evidenced by the number of similar issues. It requires cognizant modification to docs ported from C (or opting out of Rustdoc) in order to prevent breakage, rather than simply rendering poorly to be fixed later.

I would be surprised if distinguishing indented code blocks from fenced code blocks actually broke anyone, especially if it were done on an edition boundary.

@hkBst
Member

hkBst commented Mar 14, 2025

> There's already a better workaround - surround everything with backticks.

I don't see why that would be better. The blank lines are essential to the interpretation as code blocks, so if they are not there, then there is no code block and no need to mark it as "text..." or use a workaround.

> The issue is that the rustdoc default to treat all indented blocks as Rust code is unintuitive as evidenced by the number of similar issues. It requires cognizant modification to docs ported from C (or opting out of Rustdoc) in order to prevent breakage, rather than simply rendering poorly to be fixed later.

I also don't see how this breakage which can be easily fixed by removing erroneous/extraneous double newlines is so much worse than a poor rendering. But if that really is a problem, can't you get back to poor rendering by just removing all double newlines?

> I would be surprised if distinguishing indented code blocks from fenced code blocks actually broke anyone, especially if it were done on an edition boundary.

If you silently reinterpret code blocks that were previously treated as doc tests as mere text, then doctests would silently stop working. The only indication would be the sudden decrease in the number of such tests reported when running cargo test manually... How is that any good?

@kupiakos
Contributor

> I don't see why that would be better

@hkBst Because placing a text backtick fence around a few paragraphs fixes multiple problem spots at once, is simpler to do en masse, and doesn't modify the appearance of the original author's text. For these blocks not written with markdown in mind, it's a more faithful rendering anyway.

> If you silently reinterpret code blocks that were previously treated as doc tests as mere text, then doctests would silently stop working. The only indication would be the sudden decrease in the number of such tests reported when running cargo test manually... How is that any good?

I asserted that few people intentionally write tests like this, not that we should ever make the change without checking. The right way to migrate users from current behavior to a newer default is to identify the impact and then inform them of the change. Something like:

  1. Solidify which edition this change is made in, or if it must be tied to an edition.
  2. Make all indented code blocks fail, then run a crater test - what proportion of the ecosystem is affected? This is key data to drive the rest of the process forward - are people actually using indented code blocks in rustdoc?
  3. Add a warn-by-default lint for indentation-doctests, informing users of the change in a future edition.
  4. Ensure cargo fix --edition can adjust affected doctests.
  5. Release associated blog posts informing users of the change.
  6. Land the new edition changing the default.
