Text: Characters and Strings


Character and string representations

The Rust char type is 32 bits (four bytes) in size and holds a single Unicode scalar value. From Rust by Example: Strings:

A String is stored as a vector of bytes (Vec<u8>), but guaranteed to always be a valid UTF-8 sequence. String is heap allocated, growable and not null terminated. &str is a slice (&[u8]) that always points to a valid UTF-8 sequence, and can be used to view into a String, just like &[T] is a view into Vec<T>.
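
The difference between the fixed-size char and the variable-width UTF-8 encoding used inside a String can be seen in a minimal sketch like the one below. It uses only the standard library; the helper name utf8_width and the sample text are illustrative assumptions, not part of the original page.

```rust
use std::mem::size_of;

/// Hypothetical helper: number of bytes `c` occupies when encoded as UTF-8 (1 to 4).
fn utf8_width(c: char) -> usize {
    c.len_utf8()
}

fn main() {
    // A char always occupies four bytes in memory, whatever character it holds.
    println!("size_of::<char>() = {}", size_of::<char>());

    // Inside a String the same characters are UTF-8 encoded, so their width varies.
    // Sample text: 'a', U+00E9 (e with acute accent), U+20AC (euro sign).
    let s = String::from("a\u{e9}\u{20ac}");
    for c in s.chars() {
        println!("{:?} takes {} byte(s) in UTF-8", c, utf8_width(c));
    }

    // len() counts bytes, not characters.
    println!("bytes: {}, chars: {}", s.len(), s.chars().count());

    // A &str is a view into the String. Slice indices are byte offsets and
    // must fall on character boundaries, otherwise the slice panics.
    let view: &str = &s[0..3]; // 'a' (1 byte) followed by U+00E9 (2 bytes)
    println!("view = {}", view);
}
```

Because slicing works on byte offsets, code that needs to split text at arbitrary positions is usually safer iterating with chars() or char_indices() than computing indices by hand.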

Unicode normalization

Several crates on crates.io offer Unicode normalization; the standard library does not provide it.

Converting strings to and from UTF-8

Rust strings have the method as_bytes(), which returns a byte slice of the string's contents.

A slice of u8 (presumably UTF-8 bytes) can be converted to a string slice with str::from_utf8(), which returns an error if the bytes are not valid UTF-8.
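
A round trip between a string and its UTF-8 bytes might look like the sketch below. It relies only on as_bytes(), str::from_utf8() and String::from_utf8() from the standard library; the helper decode_utf8 and the sample data are hypothetical and exist only for illustration.

```rust
use std::str;

/// Hypothetical helper: decodes `bytes` as UTF-8, returning None when they are invalid.
fn decode_utf8(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

fn main() {
    // String to bytes: as_bytes() borrows the existing UTF-8 buffer without copying.
    let text = "h\u{e9}llo"; // U+00E9 takes two bytes in UTF-8
    let bytes: &[u8] = text.as_bytes();
    println!("{} bytes: {:?}", bytes.len(), bytes);

    // Bytes to string: from_utf8() validates the input and returns a Result.
    match str::from_utf8(bytes) {
        Ok(s) => println!("decoded: {}", s),
        Err(e) => println!("invalid UTF-8: {}", e),
    }

    // Invalid input is rejected rather than silently accepted.
    let invalid = [0xff_u8, 0xfe];
    println!("invalid decodes to {:?}", decode_utf8(&invalid));

    // For an owned Vec<u8>, String::from_utf8() performs the same check
    // and yields an owned String on success.
    let owned = String::from_utf8(bytes.to_vec());
    println!("owned: {:?}", owned);
}
```

When invalid bytes should be replaced instead of rejected, String::from_utf8_lossy() substitutes U+FFFD for each invalid sequence.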

Internationalized domain names

Several crates on crates.io offer conversion to and from internationalized domain names (IDN); the standard library does not provide it.

