@thinkling "all they need to do is identify the correct two sections of the page and copy them word for word"

You say you are a software developer. Can you imagine a simple way of doing the "all they need to do" part there? Think of all the different ways a web page can be laid out, and all the different ways these sections can be written. Is this really an "all they need..." thing, or is it a tad more complex?

22 January at 8:33 | Open on tech.lgbt

3 comments

Maarten

@lettosprey it’s not complex. On most sites each is going to be in a div of its own, so you’re looking to identify two divs. One is going to have ingredient and quantity words. The other is the prose description and will have lots of action verbs. It doesn’t need AI. Like I said up front, apps did this before LLMs came along. And I’m not sure extracting this with LLMs would work well—literal word for word reproduction is not what they’re made to do.
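For illustration only, here is a rough Python sketch of the two-div heuristic described above. It is not taken from any existing app. The word lists `UNITS` and `VERBS`, the class `DivText`, and the function `find_recipe_divs` are all hypothetical names, and the sketch assumes the page really does keep each section in its own div, which is exactly the point under dispute in this thread.

```python
import re
from html.parser import HTMLParser

# Hypothetical, tiny vocabularies; a real tool would need far larger ones.
UNITS = {"cup", "cups", "tbsp", "tsp", "g", "kg", "ml", "l", "oz", "lb", "pinch"}
VERBS = {"mix", "stir", "bake", "chop", "add", "pour", "whisk", "heat", "boil", "fry"}


class DivText(HTMLParser):
    """Collect the text found directly inside each <div>."""

    def __init__(self):
        super().__init__()
        self.stack = []  # indices of the currently open divs
        self.texts = []  # list of text fragments, one list per div

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.stack.append(len(self.texts))
            self.texts.append([])

    def handle_endtag(self, tag):
        if tag == "div" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Attribute text to the innermost open div only.
        if self.stack and data.strip():
            self.texts[self.stack[-1]].append(data.strip())


def score(text, vocab):
    """Fraction of the words in `text` that appear in `vocab`."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(w in vocab for w in words) / len(words) if words else 0.0


def find_recipe_divs(html):
    """Return (ingredients_text, method_text), copied verbatim from the page."""
    parser = DivText()
    parser.feed(html)
    blocks = [" ".join(parts) for parts in parser.texts if parts]
    if not blocks:
        return None, None
    ingredients = max(blocks, key=lambda b: score(b, UNITS))
    method = max(blocks, key=lambda b: score(b, VERBS))
    return ingredients, method
```

The sketch only ever picks the highest-scoring div, so it will return something even on a page with no recipe at all; a usable version would need a minimum score threshold at the very least.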

25 January at 6:58 | Open on mastodon.social

Maarten

@lettosprey I suppose you could use trained classifiers to identify the divs, and one could call that AI (now that it’s well understood we tend to call it machine learning) but it’s vastly simpler than using LLMs.

25 January at 7:02 | Open on mastodon.social

Lett Osprey

@thinkling So, no, you don't quite understand the complexity of this task, then.

Just for the hell of it, I took a look at 3 different sites. They all used different tags, and some used JSON objects for the ingredients.
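To make the JSON case concrete, here is a minimal Python sketch. It assumes, and this is an assumption rather than something checked on those three sites, that the JSON is schema.org `Recipe` data embedded as JSON-LD in a `<script type="application/ld+json">` tag. The names `JsonLdCollector` and `recipe_from_jsonld` are made up for the example.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the raw contents of every JSON-LD <script> block."""

    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.chunks = []
        self.blobs = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.in_jsonld = True
            self.chunks = []

    def handle_data(self, data):
        if self.in_jsonld:
            self.chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self.in_jsonld:
            self.in_jsonld = False
            self.blobs.append("".join(self.chunks))


def recipe_from_jsonld(html):
    """Return (ingredients, instructions) from the first Recipe object, or None.

    Simplification: only top-level objects or lists are inspected; "@graph"
    wrappers and list-valued "@type" fields are not handled here.
    """
    collector = JsonLdCollector()
    collector.feed(html)
    for blob in collector.blobs:
        try:
            data = json.loads(blob)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than fail the whole page
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and item.get("@type") == "Recipe":
                return (
                    item.get("recipeIngredient", []),
                    item.get("recipeInstructions", []),
                )
    return None
```

Even this narrow case needs its own code path next to any tag-based heuristic, and the simplifications noted in the docstring are the kind of variation that differs from site to site.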

Yes, LLMs seem to handle these kinds of tasks very well.

I am not saying this cannot be done without AI, but it requires QUITE a lot more than "just look for a couple of divs", and will probably fail on a lot more sites than LLM extraction would.

25 January at 7:22 | Open on tech.lgbt