Rust: Raw string literals

r#"What is this?"#

While working with Rust, you will often come across r#"something like this"#, especially when working with JSON and TOML files. It defines a raw string literal. When would you use a raw string literal and what makes a valid raw string literal?

When would you use a raw string literal?

First, let’s understand what a string literal is. According to The Rust Reference1, a string literal is a sequence of any Unicode characters enclosed within two U+0022 (double-quote) characters, with the exception of U+0022 itself2. Escape sequences in the string literal body are processed. The string body cannot contain an unescaped double-quote; if you need to insert one, you have to escape it like this: \".

Escaping double-quotes can be cumbersome in some cases such as writing regular expressions or defining a JSON object as a string literal. In these situations, raw string literals are helpful since they allow you to write the literal without requiring escapes.
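To make the difference concrete, here is a minimal sketch that writes the same regular-expression-style pattern both ways. The function names are made up for illustration, and no regex crate is involved; the pattern is only treated as text.

```rust
/// The pattern "(\d+)" written as an ordinary string literal:
/// both the quotes and the backslash have to be escaped.
fn escaped_pattern() -> &'static str {
    "\"(\\d+)\""
}

/// The same pattern as a raw string literal: every character stands for itself.
fn raw_pattern() -> &'static str {
    r#""(\d+)""#
}

fn main() {
    // Both literals denote exactly the same seven characters.
    assert_eq!(escaped_pattern(), raw_pattern());
    println!("{}", raw_pattern());
}
```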

Here is a snippet from the toml3 crate:
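A minimal sketch in the same spirit, assuming a small hypothetical manifest rather than the crate's own code. The name `sample_toml` is invented here, and the toml crate itself is not called; the point is only that the quoted values need no escaping.

```rust
/// A hypothetical TOML document held in a raw string literal.
/// The quoted values inside it are written exactly as they would appear in a file.
fn sample_toml() -> &'static str {
    r#"
[package]
name = "demo"
version = "0.1.0"
"#
}

fn main() {
    // The text could now be handed to a TOML parser; here it is just printed.
    print!("{}", sample_toml());
}
```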

Or another from serde-rs4:
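Again as a hedged sketch rather than the project's own code: a hypothetical JSON document stored in a raw string literal. The name `sample_json` and the data are assumptions, and no serde function is called.

```rust
/// A hypothetical JSON document held in a raw string literal.
/// None of the inner double-quotes need a backslash.
fn sample_json() -> &'static str {
    r#"
    {
        "name": "John Doe",
        "age": 43,
        "phones": ["+44 1234567", "+44 2345678"]
    }"#
}

fn main() {
    // The text could now be handed to a JSON parser; here it is just printed.
    println!("{}", sample_json());
}
```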

So, raw string literals are helpful, but what makes a valid one?

What makes a raw string literal?

The Rust Reference defines a raw string literal as starting with the character U+0072 (r), followed by zero or more of the character U+0023 (#) and a U+0022 (double-quote) character. The raw string body can contain any sequence of Unicode characters and is terminated only by another U+0022 (double-quote) character, followed by the same number of U+0023 (#) characters that preceded the opening U+0022 (double-quote) character.5

Escape characters in the raw string body are not processed.

Therefore the following raw string literals are all valid:
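A small sketch with illustrative literals (the function name and the strings are made up):

```rust
/// A few valid raw string literals, with zero, one, and two hashes.
fn valid_raw_strings() -> [&'static str; 4] {
    [
        r"no hashes needed",
        r#"one hash"#,
        r##"two hashes"##,
        // Escapes are not processed: this ends with a backslash and an 'n'.
        r"not a newline: \n",
    ]
}

fn main() {
    for s in valid_raw_strings().iter() {
        println!("{}", s);
    }
}
```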


If you need to include a double-quote character in a raw string, you must tag the start and end of the raw string with hash/pound signs (#).
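For example, a minimal sketch with an invented sentence:

```rust
/// A raw string containing double-quotes, delimited by one hash on each side.
fn quoted() -> &'static str {
    r#"She said "hello""#
}

fn main() {
    // Equivalent to the escaped form of the same text.
    assert_eq!(quoted(), "She said \"hello\"");
    println!("{}", quoted());
}
```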


With a single #, the raw string body can contain any sequence of Unicode characters except "#, since that would terminate the literal. If you want to include that sequence, you have to increase the number of # characters that precede the opening double-quote. For instance:
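A minimal sketch using two hashes so that the body may contain a quote followed by a single hash (the function name and text are illustrative):

```rust
/// With two hashes as delimiters, a quote followed by one hash
/// no longer terminates the literal.
fn contains_quote_hash() -> &'static str {
    r##"The sequence "# is fine here"##
}

fn main() {
    assert_eq!(contains_quote_hash(), "The sequence \"# is fine here");
    println!("{}", contains_quote_hash());
}
```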


Likewise, if "## is to be included, you can add another # to the starting and ending delimiters.
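The general rule can be sketched as a small helper: the delimiter needs one more # than the longest run of # that directly follows a double-quote in the body. Both `min_hashes` and `to_raw_literal` are hypothetical names written for this post, not part of any library.

```rust
/// Returns the smallest number of `#` characters needed to wrap `body`
/// in a raw string literal.
fn min_hashes(body: &str) -> usize {
    let mut needed: usize = 0;
    let mut chars = body.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '"' {
            // Count the run of '#' immediately following this quote.
            let mut run: usize = 0;
            while chars.peek() == Some(&'#') {
                chars.next();
                run += 1;
            }
            // The delimiter must use more hashes than any such run.
            needed = needed.max(run + 1);
        }
    }
    needed
}

/// Builds the source text of a raw string literal for `body`.
fn to_raw_literal(body: &str) -> String {
    let hashes = "#".repeat(min_hashes(body));
    format!("r{0}\"{1}\"{0}", hashes, body)
}

fn main() {
    println!("{}", to_raw_literal("plain text"));
    println!("{}", to_raw_literal("say \"hi\""));
    println!("{}", to_raw_literal("tricky \"## sequence"));
}
```

A body without any double-quote needs no hashes at all, which matches the zero-hash form of the literal.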

Wrap Up

Raw string literals are helpful when you need to avoid escaping characters within a literal. The characters in a raw string represent themselves. Informally, a raw string literal is an r, followed by N hashes (where N can be zero), a quote, any characters, then a quote followed by N hashes.6

Here’s a visualisation7 of raw string literals that works for me:

Raw string literal railroad
Image generated using Railroad-Diagram-Generator

That’s it for now!

