Emily M. Bender (she/her)

But they do make sure to spend a page and a half talking about how they vewwy carefuwwy tested to make sure that it doesn't have "emergent properties" that would let it "create and act on long-term plans" (sec 2.9).

Emily M. Bender (she/her)

I also lol'ed at "GPT-4 was evaluated on a variety of exams originally designed for humans": They seem to think this is a point of pride, but it's actually a scientific failure. No one has established the construct validity of these "exams" vis-à-vis language models.

For more on missing construct validity and how it undermines claims of 'general' 'AI' capabilities, see:

datasets-benchmarks-proceeding
