really cool how we've stored the sum of all human knowledge in a format that is near impossible to parse
i'm not aware of any wikitext parser in existence that isn't imperfect in some way, they're all just *slightly* divergent as far as i can tell, and wikitext is so context-dependent that without having knowledge of the contents of every template used in the page (and every template used by said templates) you always need to just, guess how it should be interpreted
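as a tiny illustration (with made-up template names, just to show the shape of the problem): whether a line starting with | is table markup depends entirely on what some template earlier on the page expands to

{{infobox-start}}
| name = example
{{infobox-end}}

if {{infobox-start}} happens to expand to {| (table-start markup), then | name = example is a table row; if it doesn't, that line is just ordinary text that starts with a pipe. a parser that can't expand the template has no choice but to guess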
this isn't even a case of "mediawiki's parser is the only correct parser that really exists", it's a case of "mediawiki doesn't actually have a parser at all, it just incrementally translates wikitext to html, and does so imperfectly at that, since it's possible for valid wikitext (there is no such thing as invalid wikitext) to generate invalid html"
but the "imperfections" in mediawiki are actually just how wikitext works, since there isn't any specification; mediawiki's implementation just *is* the standard
and like, idk, i feel like this is kinda a big problem? that there's no entirely correct way to programmatically work with the format used by the majority of wikis, including the one which happens to be the largest and most up-to-date encyclopedia in the world? that there's no actual specification for interpreting wikitext outside of just trying to copy what mediawiki does?
and like, it's not *terrible*. well, ok, it is, but, like, not entirely. it's possible to make a parser that's close enough to what mediawiki does that in practice it works fine; many such parsers exist. but for such a ubiquitous format, i feel like there should be a much higher bar
signed integer overflow is undefined in C, but that's only half of the story.
in fact, overflowing a signed integer whose size is less than sizeof(int) is usually well-defined (i'm pretty sure it's actually *always* well-defined, but i don't yet feel confident enough to make a bold claim like that)
#include <stdio.h>

int main(void) {
    signed char i;
    scanf("%hhd", &i); // 127
    if (i < 0) return 0;
    if (++i < 0) printf("%hhd < 0\n", i);
}
the above program, when compiled and run with all optimizations enabled, will correctly execute the printf when the user enters "127" (SCHAR_MAX). however, changing signed char to int and %hhd to %d will cause the printf to be entirely optimized out.
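for comparison, here's that int variant spelled out (it's just the same program with the substitutions described above, and INT_MAX in place of 127):

#include <stdio.h>

int main(void) {
    int i;
    scanf("%d", &i); // 2147483647 (INT_MAX)
    if (i < 0) return 0;
    // ++i here overflows int itself (there's no wider type to promote to),
    // which is undefined behavior, so the compiler may assume this branch
    // is unreachable and discard the printf entirely
    if (++i < 0) printf("%d < 0\n", i);
}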
the latter behavior (optimizing out the printf) is what one would expect given that signed integer overflow is undefined. the reason this doesn't happen with signed char is that C performs "integer promotions" on the operands of an arithmetic expression before evaluating it: any integer type whose rank is below that of int, and whose every value is representable in an int, implicitly has its type "promoted" to int. furthermore, `++i` is defined to be identical to `i += 1`, which in turn is defined as identical to `i = i + (1)` (except that `i` is only evaluated once).
all of this combined means that, when doing `++i` on a signed char, `i` is first promoted to an int. then 1 is added to it, which is well-defined since the result (128) is representable in an int. finally, the result is converted back into a signed char, truncating the upper bits. (technically, as far as i can tell, this conversion is implementation-defined when the value doesn't fit in the signed destination type, but i have no reason to believe that any implementation using two's-complement arithmetic would behave any differently here)
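spelled out step by step, `++i` on a signed char is effectively this (a sketch, assuming a typical two's-complement implementation):

#include <limits.h>

int main(void) {
    signed char i = SCHAR_MAX;  // 127
    int promoted = i;           // step 1: integer promotion, signed char -> int
    int sum = promoted + 1;     // step 2: 128 fits in an int, so this is well-defined
    i = (signed char)sum;       // step 3: converting back is implementation-defined,
                                // wrapping to -128 on two's-complement targets
    return i == -128 ? 0 : 1;   // holds on every mainstream implementation i know of
}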
so, feel free to overflow small signed integers to your heart's content! or better yet, don't do that, but sleep easy knowing that if you wanted to you (probably) could