@lettosprey it’s not complex. On most sites each is...

(Undercover)

@lettosprey it’s not complex. On most sites each is going to be in a div of its own, so you’re looking to identify two divs. One is going to have ingredient and quantity words. The other is the prose description and will have lots of action verbs. It doesn’t need AI. Like I said up front, apps did this before LLMs came along. And I’m not sure extracting this with LLMs would work well—literal word for word reproduction is not what they’re made to do.
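A minimal sketch of the "two divs" heuristic described above, in Python with only the standard library. The word lists, class name, and function names are all made up for illustration, and it assumes the text of interest sits in the innermost div; real pages will often break that assumption.

```python
import re
from html.parser import HTMLParser

# Hypothetical word lists; a real tool would need far larger ones.
QUANTITY_WORDS = {"cup", "cups", "tbsp", "tsp", "g", "kg", "ml", "l",
                  "oz", "pinch", "clove", "cloves"}
ACTION_VERBS = {"mix", "stir", "bake", "chop", "boil", "fry", "pour",
                "whisk", "add", "heat", "simmer", "serve"}


class DivTextCollector(HTMLParser):
    """Collect the text inside each <div>, credited to the innermost open div."""

    def __init__(self):
        super().__init__()
        self._stack = []   # one list of text chunks per currently open div
        self.blocks = []   # text of every div that has been closed

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self._stack.append([])

    def handle_endtag(self, tag):
        if tag == "div" and self._stack:
            text = " ".join(self._stack.pop()).strip()
            if text:
                self.blocks.append(text)

    def handle_data(self, data):
        if self._stack and data.strip():
            self._stack[-1].append(data.strip())


def _score(text, vocabulary):
    # Count how many words of the text appear in the given vocabulary.
    words = re.findall(r"[a-z]+", text.lower())
    return sum(word in vocabulary for word in words)


def _quantity_score(text):
    # Numbers count as quantity signals alongside unit words.
    return _score(text, QUANTITY_WORDS) + len(re.findall(r"\d+", text))


def find_recipe_blocks(html):
    """Return (ingredients_text, method_text), or (None, None) if no divs."""
    parser = DivTextCollector()
    parser.feed(html)
    parser.close()
    if not parser.blocks:
        return None, None
    ingredients = max(parser.blocks, key=_quantity_score)
    method = max(parser.blocks, key=lambda t: _score(t, ACTION_VERBS))
    return ingredients, method
```

Calling `find_recipe_blocks(page_html)` returns the two best-scoring div texts. It does nothing for pages that use other tags or nested layouts, which is exactly the kind of variation the replies bring up.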

25 January at 6:58 | Open on mastodon.social

2 comments

Maarten

@lettosprey I suppose you could use trained classifiers to identify the divs, and one could call that AI (now that it’s well understood we tend to call it machine learning) but it’s vastly simpler than using LLMs.

25 January at 7:02 | Open on mastodon.social

Lett Osprey

@thinkling So, no, you don't quite understand the complexity of this task, then.

Just for the hell of it, I took a look at 3 different sites. They all used different tags, and some used JSON objects for the ingredients.

Yes, LLMs seem to handle these kinds of tasks very well.

I am not saying this cannot be done without AI, but it requires QUITE a lot more than "just look for a couple of divs", and will probably fail on a lot more sites than LLM extraction would.
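To illustrate the JSON case, here is a rough Python sketch (standard library only). It assumes the JSON is embedded as schema.org Recipe data in a `<script type="application/ld+json">` tag with a `recipeIngredient` list; the post does not say which JSON format those sites used, so that is a guess, and the class and function names are made up.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect every parseable JSON-LD document embedded in the page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._chunks)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the page


def find_ingredients(node):
    """Recursively search parsed JSON for a 'recipeIngredient' list."""
    if isinstance(node, dict):
        if isinstance(node.get("recipeIngredient"), list):
            return node["recipeIngredient"]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_ingredients(child)
        if found is not None:
            return found
    return None


def ingredients_from_html(html):
    """Return the ingredient list from embedded JSON-LD, or None."""
    parser = JsonLdCollector()
    parser.feed(html)
    parser.close()
    return find_ingredients(parser.documents)
```

A site that embeds its ingredients some other way returns `None` here and would need yet another code path, which is the "QUITE a lot more" part.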

25 January at 7:22 | Open on tech.lgbt