Hrefna (DHC)

4) The DS includes sections of code that look… well they look like generated templates waiting for someone else to fill them in. There are even comments to this effect. It creates a dead branch to… write a comment that says someone else can put something here.

Notably, this text is a lot like how _tutorial projects_ are written. That stands out to me and tells me something about the training set.

HS's approach of course doesn't do that.

4/

Hrefna (DHC)

5) Finally, the biggest difference, and one of the things that _really_ stood out to me looking at the two solutions is this:

The #Devin solution solves the problem _as it is written_ (changing the color). It follows, to-the-letter, the instructions in the query.

The human solution actually reads the _problem that the requester was trying to solve_ and takes a stab at solving that in a slightly different way from the specifics of how the requester requested it.

That's _brilliant_.

5/5

Mike P

@hrefna Interesting! Thank you. Some things that stand out to me:

1 - This confirms what I already thought: humans can understand things, AIs can't.

2 - As an experienced programmer who knows no Rust at all, I found the human PR far more understandable, and I even felt that it _taught_ me a little about the language, as opposed to the AI PR which taught me nothing at all.

3 - WTF is with that "New method..." comment? If I was reviewing this, I'd say "hell no" to that.

Hrefna (DHC)

@FenTiger yeah, 100% agreed. The "new method" comment stood out to me as well as an example of "this thing is bad at writing documentation by the standards of bad documentation"

Yvan DS 🗺️ :ferris: :go:

@hrefna @FenTiger what worries me even more is that this is a "good case", with all the techniques we have at the moment.
Everything that I have tested recently on all the LLMs out there showed me that all the answers on the same subject tend to converge, even on very different models.

I don't have data, but my theory is that they are all trained on more or less the same datasets. So their answers and capabilities converge.

It's going to look like this: inadequate in non-simple cases.

Yvan DS 🗺️ :ferris: :go:

@hrefna @FenTiger the style might be different.
But the intent seems to be the same everywhere.

Eric McCorkle

@hrefna

This reminds me a lot of how things went during the whole outsourcing debacle of the mid-2000s. The nature of the documentation, the odd and simplistic technical choices, the obviously copied template code, and solutions that spec-lawyer their way to an unsatisfactory result.

Hrefna (DHC)

@emc2 Yep, and even earlier. Yourdon's _Decline and Fall of the American Programmer_ was written in 1992 and was discussing these problems with outsourcing.
