mhoye

The number of people in this thread who are willing to tell me that the real problem is how stupid everyone is is incredibly embarrassing.

Look, here's what happens when you compile Hello World with single quotes instead of double quotes.

Note, please, that these are warnings, not errors. GCC finishes and produces an a.out executable.

73 comments
aburka 🫣

@mhoye morbidly wondering what a.out does if you run it

DrYak

@aburka it will segfault and crash.
The key is a type error: in the C language "a" is a string, but 'a' has a completely different meaning: it's a character. Basically a number in the ASCII table (I am oversimplifying).
So when GCC sees 'Hello world', it interprets it as 'H' (and complains about the excess letters - they are not proper UTF-8 multibyte) and interprets it as 72. Then C will automatically convert the number 72 into a pointer (hence the other warning).
...

DrYak

@aburka ...Running it, printf will attempt to find a format string at location 72 of the memory. And crash.

There are helpers in GCC for printf, but currently they only check if *the static string* is a proper format and match the argument. They currently cannot check that the wrong type (a character instead of a string) was provided.
C++ is more picky about types and would probably complain about not converting int to pointer, but still no explanation why 'H' is considered int.

DrYak

@dryak that would require an intimate knowledge of the C language and its typing system (not trivial depending on one's background).
It's mostly a leaky abstraction of the underlying machine code.

And a complete overhaul of how C is parsed in order to carry over the necessary extra information about type to detect this.

BUT!...

DrYak

@dryak ...BUT a proper linter should immediately spot the problem:

There is way too much stuff between the single quotes in the source code for a proper character.
Are you sure you really want a char/an int8/an integer? Didn't you mean a string instead and forgot the double quotes?

Peter Ludemann

@dryak @aburka Programmers don't like to specify conversions manually, so that's what you get, although C makes it extra spicy by quietly ignoring loss of precision or sign (and compilers don't catch all such errors, it seems).
Speaking of conversions, I got some amazing error messages from some C++ code that had user-defined conversion operators for integers, pointers, and subscripting.

mhoye

Geez, that doesn't look good, but they're just warnings, maybe not deal-breakers. So let's run it and see what happens! An... immediate segmentation fault.

Ok, well all those warnings are in printf so let's go look at the manual for printf.

man7.org/linux/man-pages/man3/

Oh, yeah, check this out. You can tell we're on the right track:

mhoye

It really gets rolling when we learn about the format of the format string. We wanted to print a string, not a constant character pointer to a "format string", but we can roll with that.

"The overall syntax of a conversion specification is: %[$][flags][width][.precision][length modifier]conversion"

We're getting warmer you know? You can really feel it. We're close to figuring this out.

mhoye

"The arguments must correspond properly (after type promotion) with the conversion specifier. By default, the arguments are used in the order given, where each '*' (see Field width and Precision below) and each conversion specifier asks for the next argument (and it is an error if insufficiently many arguments are given)."

Uh... ok... look maybe we're getting off track here, that first error message said "character constant too long for its type", let's look for that.

mhoye

Let's try something different, maybe gcc --help will do something.

And holy shit does it ever.

Greg Wilson

@mhoye I am currently working on a short tutorial about date-time manipulation in SQLite. I just reached the part where I explain what Julian days are.

mhoye

Well, let's see if we can find the error we got in that first message somewhere in the help file, the "-Wmultichar" thing we got in that first warning message. Let's see....

nope?

mhoye

The printf manual doesn't admit the existence of "multichar" either.

mhoye

"apropos multichar" comes up empty too, assuming I know about it.

Tell me what I should have done differently.

This is a _typo in 'hello world'_.

What trail of breadcrumbs should I have followed that would have gotten me from "incompatible integer to pointer conversion passing 'int' to parameter of type 'const char *'" to "You want double-quotes, not single quotes there", using the error messages and docs?

You complain about "StacktGPT Developers?" Well, this is what people are up against.

immibis
@mhoye in this particular case the trail of breadcrumbs would have been basic language knowledge.
powersoffour

@mhoye I think this is an excellent walkthrough of an all-too-common situation. Thanks for taking the time to make it.

immibis replied to powersoffour
@powersoffour but the example is something you would see if you tried to learn the language by trial and error. It reminds me a little bit of the question about trying to compile a PNG file in that the problem is someone using the language without knowing the fundamentals. Otherwise we can imagine many other similarly confused questions: A very common one is

void f(char *s) {
    s = malloc .......
}
followed by asking why calling the function doesn't modify its parameter. How do you tell someone this without teaching them how pointers work?
Owen (spoopy aspect)

@mhoye The number of responses in the thread treating this as purely a documentation problem is also really alarming, tbh. There are human factors issues at multiple stages of this! The earliest one is probably the design of C, where the difference between values occupying types with wildly different semantics is expressed via the shift key alone, for example.

Owen (spoopy aspect) replied to Owen (spoopy aspect)

@mhoye Docs and error messages retrench that design problem without meaningfully remediating it, and that gets us to docs as one of the multitude of cultural failure points in this chain.

Sherri W (SyntaxSeed)

@mhoye This 100%!!

My god I can't believe how many times I've looked with despair at convoluted docs when a series of explained code examples would have gone 1000x better.

That's why people love StackOverflow, because the answers usually include *example code*.

It's the programming equivalent of 'a picture is worth a thousand words'.

Show me don't tell me.

maegul replied to Sherri W (SyntaxSeed)

@syntaxseed @mhoye

I often suspect if the ultimate “zen” of documentation is just a well structured and indexed series of code examples with outputs and inputs.

Similarly for research papers except with figures and plots.

Graydon Hoare

@mhoye On the one hand I agree that the error messages are unhelpful here and, if we were a gcc dev, we could set off on working on this.

On the other hand, I think that (gestures to mountain of even worse unix and C diagnostics, docs, and even conceptual structures) realistically isn't going to have a dent put in it in our lifetimes. Too much sunk cost, too much compatibility burden, too much else competing for attention of the people working in that space.

I think there's a way to approach it as a historical expedition, where you accept it on its own terms and adapt yourself as you would if learning to build a shelter out of sticks or whatever; and there's a way to deal with it by mere avoidance; and definitely a constructive thing for those of us with the scar tissue to do (produce more appropriate tools that function at a similar level). But I think the "retrofit in place" approach is potentially even more work than that, and will produce worse results.

mhoye replied to Graydon

@graydon yeah. We’ll get to the point where this is understood as a load-bearing archaeology exercise, I think, that yes there’s lead in the pipes and asbestos in the walls and you don’t even wanna know how big those roaches are, but _some_ of these pipes still bring people drinkable water and heating, and here are the skills and tools you need to work safely in this environment, but we’re not building anything new like this ever, whatever those weird PDP-11 Reenactment Society people say.

Yahe

@mhoye It also told you that the format string is not a string literal?

LisPi

@mhoye What's interesting is that all of those would've worked in #Elisp without ever leaving #Emacs.

Much of it with #CommonLisp and #SLIME too. Most of the implementations also come with #info #manuals and their source-code (for those things not mentioned in the manual), and hyperspec.el also handles searching through the #HyperSpec locally on one's machine (it's packaged on Debian) and provides keybinds to search symbols automatically right from your editor.

C lacks most of this.

LisPi replied to LisPi

@mhoye #Interlisp did it better than most modern #CommonLisp environments though, if what I recall hearing is correct, as the documentation didn't require additional setup (pointing the editor at the docs' location).

Mark W. Gabby-Li 🐌

@mhoye The problem is not documentation, it's the C programming language.

While C is an excellent and elegant language in many ways, it's simply not designed with user-friendliness in mind.
It doesn't even offer basic type safety, as your example plainly demonstrates, letting the user do extremely destructive things that they would almost never want to do, with almost no guardrails.

It is legal to send an integer where a pointer is expected. That's C. Can't fix that with an instruction manual.

delta :akko_bean: :verified_gay:

@mhoye@mastodon.social this is one of the reasons i generally stray away from things written in C(++) and why to some "not written in C" is seen as a selling point

sure C is fine if you're already an expert in it, but if you're not? its just pure pain

Niclas Hedhman

@mhoye

Almost all projects/products have two mutually exclusive problems at the same time; "Too Much Information" and "Too Little Information", depending on WHO you ask and WHEN you ask.

I think the "real" problem is much more about psychology and personal character. Why would software require less "practice" to reach "proficiency" than for instance becoming a doctor, a lawyer, a musician,...?

I would argue that no matter what docs you have, it can't replace experience/practice.

Irenes (many)

@mhoye also please note that this is the wrong printf manual, this is the manpage for the printf command-line tool not for the C function. if anything that strengthens your point, heh

Nelson Minar 🧚‍♂️

@mhoye no no you're supposed to know gcc and a few other highly doctrinaire programs have a totally different help system called "info" that no one remembers to use.

Nelson Minar 🧚‍♂️

@mhoye I totally agree with your larger point! I'm making fun of the GNU info program, which (I believe?) is still the authoritative docs for the GNU C Compiler. Your answer isn't those docs either.
Part of the blame here lies with C as a programming language and some of the weirdness of the type signature of printf. It's tricky for a C compiler to print a more useful error here, although I think printf would be worth a special case in the error handling. I'm not sure any set of docs can swiftly get to this answer, a Google search actually works OK.
I do appreciate that Python tries to be more humane about this kind of thing, but it has plenty of rough spots too.

LisPi

@mhoye I'm not a fan of GCC's manual for a number of reasons, but it's infuriating how many distros feel like it's necessary to split out the documentation from it.

Just package it as gcc-full (aliased to gcc) by default and if someone *absolutely* wants it without documentation then they can install gcc-bin or whatever.

Kevin Lyda

@mhoye you complain there's no docs and you stop reading the docs where there are examples using double quotes. Plus your issue is you're not specifying a string literal correctly. This is covered in the very concise K&R C book written over four decades ago.

Every language has rules about literals. You do need to learn a language to use it...

publius

@mhoye

Without disagreeing with you, I wonder whether it's possible (without true Artificial Intelligence) to consistently provide really useful error messages, given the immense flexibility and power of even the simplest of programming languages (and the immense variety of ways a programmer can screw up).

Documentation, on the other hand, suffers from having been written from the perspective of the people creating the program, not the people using it. I don't know how to fix that, either.

Dana Fried

@publius @mhoye if you use your own thing, and you hire or work with people who are not as familiar with the thing as you are, or you just occasionally read the top Stack Overflow questions about your thing, it becomes fairly obvious where the documentation and/or error messages need to be improved.

"For experts by experts" isn't a way to make software for anyone but yourself.

mhoye

@tess @publius yeah, the big winners from this AI in everything push aren’t going to be the people getting the answers, it’s going to be the people hoarding the questions.

LisPi

@tess @publius @mhoye That certainly does seem like a decent heuristic in absence of bug reports or contributions to the documentation.

wakame

@lispi314 @tess @publius @mhoye

A little more complicated, more a thought experiment:

Every time a compiler (anywhere in the world) stumbles and falls over a line, log that somewhere centrally.

If compiling succeeds later, log also if and how the offending line has been changed.

Then, combine the most common(?) or most unsimilar looking(hello Levenshtein) lines and solutions and show some of them to the user (in case a compile fails).

For a project, it might also be helpful if e.g. a common method is invoked with false parameters a lot, since this could hint at inconsistencies or missing documentation.

LisPi

@wakame @tess @publius @mhoye That sounds like a very dangerous experiment with significant privacy issues.

In highly expressive and/or dynamic languages, it could also be of very limited use.

(But then such dynamic languages usually also have dynamic checks that can tell you *why* the type of something is wrong.)

wakame

@lispi314 @tess @publius @mhoye

That's why I introduced it as a thought experiment.

From a usability (or generally practical) perspective:

Of course expressive error messages are (likely) the best solution. With common, handwritten examples perhaps.

And if your interpreter/compiler understands what you are writing, that is always a plus.

But generally it feels as if especially software development is kind of resistant to invent and introduce new tools to make our work easier.

(This post is in danger of walking into "better UI development tools" territory, so I better stop writing now. :blobcatgiggle:)

LisPi

@wakame @tess @publius @mhoye Anything too reminiscent of Lisp (Machines) causes fear and rejection in its detractors, which is a significant part of what I blame for the resistance.

MSavoritias

@publius @mhoye

The thing is though that Rust has fixed that problem. And it is a language pitched as an alternative to C++, so it is very complex. Also Python is doing a lot of work to solve it, and it is another complex language.

The thing is the tech community hasn't bothered to write good errors or documentation in the majority of projects. Because if we make it too easy the wrong people will get in. (Actual argument i have heard), or they are so out of touch with people actually using computers that they think the error is obvious (?!). (Usually for gatekeeping and superiority complexes).

My point is AI cant fix the reason we don't have good errors or docs. Because the reasons are social. We could have good errors or docs without it.

LisPi

@mhoye A whole lot of that is just C being an awful language.

Which okay might be related to the unix brainworms, but it's a bit chicken and egg at that point.

Also the last error message literally tells you what to do right with a short example too.

Which I think might be related to relatively recent improvements in GCC's messages (there's ongoing work there).

AN/CRM-114

@lispi314 @mhoye the third leg on that stool is K&R, or short of that an undergrad course with C. There is a barrier of entry where you have to know what you are doing to know what you are doing. You have to get whacked by the paddle before they teach you the secret handshake. And like other kinds of hazing, or weeder courses, making people want to quit is the point

LisPi

@flyingsaceur @mhoye I didn't have many issues with such courses, but then I was already *working* in C by the time I had it so I unfortunately wouldn't quite know.

I'm not sure which book I used to learn it, but definitely I wouldn't say that C is the kind of language you have a fun time learning exploratively. It's no Racket (which incidentally also has pretty great docs and error messages thanks to all the dynamic checks and contract stuff).

AN/CRM-114

@lispi314 @mhoye Yup - the UNIX way is to build up calluses and muscle memory. You probably saw that failing example and remember the figurative nun rapping your knuckles for mixing them up and never forgot. Pain, some will say, is the best teacher.

On the other extreme, I’ve worked with Python stuff where I had to read the source to find things they never bothered to document. I’m still not sure how I feel about that

LisPi

@flyingsaceur @mhoye I'm not very fond of languages and systems that are not self-documenting at this point (which basically removes all non-dynamic languages).

#Scheme implementations tend to not do this as well as #Lisp implementations in my experience, and rely more on external tooling to handle it.

#Racket again does it somewhat better than other Schemes, but yeah.

Unfortunately, I also recognize the need in some places for static languages, so I insist on duly standardized ones.

Owen (spoopy aspect)

@mhoye It's always funny to reply to Mr. Chu with a link to the CVE list for openldap.

If he's going to argue the Tired Uncle Bob position of "just git gud, what's the problem," he should have his nose rubbed in his own results for it.

Justin Thomas 🛡

@mhoye I've been doing a lot in Rust lately. The error messages are ultimately helpful, but can be overwhelming, especially when you start dealing with lifetimes or lambda functions. And I do get that "just learn to read" vibe when I see folks asking for help deciphering them. It's unfortunate.

Timothy Clark

@mhoye It's not the *most* toxically-mastodon thread I've ever seen but "people are illiterate and stupid" -> "you just need regex bro" is quite the take.

Reina

@mhoye This somehow looks worse than Python's errors ... It only barely gives a single hint as to what the issue could be ...

Reina

@mhoye I need to recheck Python tho. This could actually still be better xD

Ben Cole

@mhoye I dunno man, I've seen some garbage error messages from compilers but this one is literally pointing at the mistake.

I've had to walk a lot of juniors through how to read error messages and stack traces, but I'm not really sure of a better way to indicate issues (especially in a way that's applicable to multiple possible problems).

It might be nice to have a somewhat less judgmental place (than stack overflow) to search for common issues though, especially for beginners.

mhoye

@pxlplz I understand your point, but there's only so far away from the actual error you can get with Hello World. It's a Hello World example, you know?

We can try a slightly different example, with the quotes fixed but % in the string, but this one is a lot less dramatic - a person with the on-device docs would have a real shot at quickly realizing that "%" is a special-case escape character (at least, if they're lucky enough to read the BSD docs, which are notably better than the Linux versions.)

DrYak

@mhoye That's indeed a perfect example of PDP-11 concepts leaking into modern coding errors: As I've discussed elsewhere in the thread the main problem is C itself, it's too leaky as an abstraction of assembler, and its type system makes this code perfectly valid (but it doesn't do what one thinks it does).

If one isn't a C guru, it's difficult to track down.

A linter to spot the pattern would have helped.

The "multichar warning" could be expanded a bit to cover that user error in detail.

joat

@mhoye I note as an aside that some newer projects have learned these lessons. Here is what the Crystal compiler does in the same situation:

Pxl Phile

@mhoye I had a good laugh at this. Not defending GCC here, I am all in favor of languages that help to avoid errors in all stages.

Anyhow, the last line should be a giveaway ("%s"), but it does a pretty bad job of that indeed.

And while lots of docs are garbage, how about to condone the usage of C/C++ in the first place?
