Commit 2d9c4792ae

Andrew Kelley <andrew@ziglang.org>
2024-01-22 00:19:54
std.fmt: clarify the use of "character"
Currently, std.fmt has a misguided, half-assed Unicode implementation with an ambiguous definition of the word "character". This commit does almost nothing to mitigate the problem, but it lets me close an open PR.

In the future I will revert 473cb1fd74d6d478bb3d5fda4707ce3f6e6e5bf6 as well as 279607cae58f7be46335793df6a4a753d0a800aa, and redo the whole std.fmt API, breaking everyone's code and unfortunately causing nearly every Zig user to have a bad day.

std.fmt will go back to only dealing in bytes, with zero Unicode awareness whatsoever. I suggest a third party package provide Unicode functionality as well as a more advanced text formatting function for when Unicode awareness is needed. I have always suggested this, and I sincerely apologize for merging pull requests that compromised my stance on this matter.

Most applications should, instead, strive to make their code independent of Unicode, dealing strictly in encoded UTF-8 bytes, and never attempt operations such as: substring manipulation, capitalization, alignment, word replacement, or column number calculations. Exceptions to this include web browsers, GUI toolkits, and terminals. If you're not making one of these, any dependency on Unicode is probably a bug or worse, a poor design decision.

closes #18536
1 parent 559bbf1
Changed files (1)
lib/std/fmt.zig
@@ -40,9 +40,9 @@ pub const FormatOptions = struct {
 ///   - when using a field name, you are required to enclose the field name (an identifier) in square
 ///     brackets, e.g. {[score]...} as opposed to the numeric index form which can be written e.g. {2...}
 /// - *specifier* is a type-dependent formatting option that determines how a type should formatted (see below)
-/// - *fill* is a single character which is used to pad the formatted text
-/// - *alignment* is one of the three characters `<`, `^`, or `>` to make the text left-, center-, or right-aligned, respectively
-/// - *width* is the total width of the field in characters
+/// - *fill* is a single unicode codepoint which is used to pad the formatted text
+/// - *alignment* is one of the three bytes '<', '^', or '>' to make the text left-, center-, or right-aligned, respectively
+/// - *width* is the total width of the field in unicode codepoints
 /// - *precision* specifies how many decimals a formatted number should have
 ///
 /// Note that most of the parameters are optional and may be omitted. Also you can leave out separators like `:` and `.` when
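
The doc comment in this hunk describes *fill*, *alignment*, and *width*. The sketch below is illustrative only and is not part of the commit. It assumes a Zig toolchain from around the time of this commit, and the test names are made up for the example. The first test formats ASCII text, where bytes and codepoints coincide. The second test only shows that a byte count and a codepoint count can differ for the same string, which is the ambiguity the commit message is about. It makes no assertion about how padding behaves for non-ASCII text, since the commit message says that behavior is expected to change.

```zig
const std = @import("std");

test "fill, alignment, and width with ASCII text" {
    var buf: [32]u8 = undefined;
    // Placeholder "{s:*^9}": specifier `s`, fill '*', alignment '^' (center), width 9.
    const out = try std.fmt.bufPrint(&buf, "{s:*^9}", .{"abc"});
    try std.testing.expectEqualStrings("***abc***", out);
}

test "byte length and codepoint count can differ" {
    // U+00E9 (e with acute accent) is one codepoint encoded as two UTF-8 bytes.
    const s: []const u8 = "\u{e9}";
    try std.testing.expectEqual(@as(usize, 2), s.len);
    try std.testing.expectEqual(@as(usize, 1), try std.unicode.utf8CountCodepoints(s));
}
```

Assuming the snippet is saved as `example.zig` (a hypothetical file name), it can be run with `zig test example.zig`.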