Dr. Robert M Flight

@mhoye Wait, what? So their claim is that anything that is decent at compression should also be decent at prediction? And therefore we've missed the boat, because all the work on compression over the past few decades means we already have really good predictors?

Definitely going to be reading this.

Gabriele Svelto

@rmflight @mhoye symbol prediction is part of many of the best lossless compression algorithms, isn't it?
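Background on that point: by Shannon's source-coding theorem, a model that predicts the next symbol with probability p can encode it in about -log2(p) bits via arithmetic coding, which is exactly the sense in which good predictors make good compressors (PPM and context-mixing codecs are built on this). Below is a minimal sketch of that link; the order-n character model, add-one smoothing, and toy corpus are illustrative assumptions, not any specific codec.

```python
import math
from collections import Counter, defaultdict

def ideal_code_length(text: str, context: int = 1) -> float:
    """Bits an ideal entropy coder would need for `text`, driven by an
    order-`context` character predictor trained on the text itself
    (add-one smoothing; illustrative, not a real codec)."""
    alphabet = sorted(set(text))
    counts = defaultdict(Counter)
    for i in range(context, len(text)):
        counts[text[i - context:i]][text[i]] += 1
    bits = 0.0
    for i in range(context, len(text)):
        ctx = text[i - context:i]
        total = sum(counts[ctx].values()) + len(alphabet)  # add-one smoothing
        p = (counts[ctx][text[i]] + 1) / total
        bits += -math.log2(p)  # Shannon code length for this symbol
    return bits

text = "the quick brown fox jumps over the lazy dog " * 20
print(ideal_code_length(text, context=0))  # weaker predictor -> more bits
print(ideal_code_length(text, context=2))  # stronger predictor -> fewer bits
```

On the repeated toy text, the higher-order model assigns higher probability to each next character and so needs fewer total bits: the prediction-compression equivalence in miniature.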

mhoye

@gabrielesvelto @rmflight It is, and if this paper's results hold up, we're talking about how large-scale deep-learning networks are fundamentally a technical dead end, and how something that takes a datacenter to do with a DNN can be done better with a clever application of gzip on a phone.

Nick Wood

@mhoye @gabrielesvelto @rmflight So what you’re saying is that I’m about to get a screaming deal on a graphics card?

Choong Ng

@mhoye @gabrielesvelto @rmflight On first look, I think what this paper suggests is 1) for some classification tasks there's a nicely simple approach that works well, and 2) this is a promising path towards better feature engineering for language models, which will in turn result in better accuracy versus cost.

Choong Ng

@mhoye @gabrielesvelto @rmflight If this works out well, we'll see better and smaller models for all tasks (not just classification) that outperform both current DNNs and the NCD technique they use, at moderate cost. There's precedent for this kind of approach succeeding, for example using frequency-domain data for audio models instead of raw PCM. There's also precedent for finding ways that DNNs waste a lot of capacity effectively routing data around, and restructuring them to fix it (ResNets, for example).

Choong Ng

@mhoye @gabrielesvelto @rmflight Overall, though, data-driven approaches have tended to win in recent history, so I would expect the useful bits to get incorporated into DNNs rather than DNNs being obsoleted in almost any context. My favorite essay on that topic, by Rich Sutton: incompleteideas.net/IncIdeas/B

Paul M. Heider

@rmflight @mhoye Sounds about right. My old coworker Tom always joked that NLP was just a compression algorithm on the input text.

mhoye

@paulmheider @rmflight And Ted Chiang has referred to ML-generated text as "a jpeg of a language", yeah. But to see that come together in fifteen lines of Python that outdo these massive, crazy-expensive DNN models is bonkers, jaws-on-the-floor material.
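For the concrete shape of what mhoye is describing: the paper's classifier ranks training texts by normalized compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is compressed length, and then takes a majority vote over the k nearest neighbors. A minimal sketch along those lines follows; the space-separated concatenation, helper names, and toy data are assumptions for illustration, not the authors' code.

```python
import gzip
from collections import Counter

def C(s: str) -> int:
    # compressed length in bytes, as a rough stand-in for Kolmogorov complexity
    return len(gzip.compress(s.encode("utf-8")))

def ncd(x: str, y: str) -> float:
    # normalized compression distance: small when x and y share structure,
    # because compressing them together then costs little extra
    cx, cy = C(x), C(y)
    return (C(x + " " + y) - min(cx, cy)) / max(cx, cy)

def classify(test: str, train: list, k: int = 3) -> str:
    # rank (text, label) training pairs by NCD to the test text,
    # then majority-vote over the k nearest labels
    neighbors = sorted(train, key=lambda pair: ncd(test, pair[0]))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

# toy usage (texts and labels are made up for illustration)
train = [
    ("the team won the match in extra time", "sports"),
    ("the striker scored two goals", "sports"),
    ("the court heard closing arguments", "law"),
    ("the judge dismissed the appeal", "law"),
]
print(classify("a late goal decided the match", train, k=3))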
