Maarten

@lettosprey it’s not complex. On most sites the ingredient list and the method are each going to be in a div of their own, so you’re looking to identify two divs. One is going to have ingredient and quantity words. The other is the prose description and will have lots of action verbs. It doesn’t need AI. Like I said up front, apps did this before LLMs came along. And I’m not sure extracting this with LLMs would work well: literal word-for-word reproduction is not what they’re made to do.
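
For illustration, the two-div idea above could be sketched roughly like this in Python. This is only a minimal sketch, not how any particular app does it: the word lists, the function names and the assumption that the page has already been split into candidate text blocks are all hypothetical.

```python
import re

# Hypothetical, deliberately tiny word lists; a real tool would need far larger ones.
UNIT_WORDS = {"cup", "cups", "tbsp", "tsp", "g", "kg", "ml", "l", "oz", "pinch"}
ACTION_VERBS = {"mix", "stir", "bake", "chop", "add", "pour", "whisk", "heat", "simmer", "fold"}


def tokens(text):
    """Lowercase the text and split it into runs of letters or runs of digits."""
    return re.findall(r"[a-z]+|\d+", text.lower())


def ingredient_score(text):
    """Fraction of tokens that are numbers or unit words."""
    toks = tokens(text)
    if not toks:
        return 0.0
    hits = sum(1 for t in toks if t in UNIT_WORDS or t.isdigit())
    return hits / len(toks)


def method_score(text):
    """Fraction of tokens that are cooking action verbs."""
    toks = tokens(text)
    if not toks:
        return 0.0
    hits = sum(1 for t in toks if t in ACTION_VERBS)
    return hits / len(toks)


def pick_blocks(blocks):
    """Return (ingredients_block, method_block) from a list of candidate texts."""
    ingredients = max(blocks, key=ingredient_score)
    method = max(blocks, key=method_score)
    return ingredients, method
```

Usage would be something like `pick_blocks([div_text_1, div_text_2, ...])`, where each string is the text of one candidate div. Getting those candidate texts out of the page is assumed to have happened already and is not covered by the sketch.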

Maarten

@lettosprey I suppose you could use trained classifiers to identify the divs, and one could call that AI (now that it’s well understood, we tend to call it machine learning), but it’s vastly simpler than using LLMs.

Lett Osprey

@thinkling So, no, you don't quite understand the complexity of this task, then.

Just for the hell of it, I took a look at 3 different sites. They all used different tags, and some used JSON objects for the ingredients.

Yes, LLMs seem to handle these kinds of tasks very well.

I am not saying this cannot be done without AI, but it requires QUITE a lot more than "just look for a couple of divs", and will probably fail on a lot more sites than LLM extraction would.
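
To make the "JSON objects" case concrete, here is a rough Python sketch. It assumes the JSON in question is schema.org `Recipe` data embedded in a `<script type="application/ld+json">` tag, which is one common way sites do this but is not something confirmed for the three sites mentioned above. The class and function names are made up for the example.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buf = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buf)))
            except json.JSONDecodeError:
                pass  # Skip blocks that are not valid JSON.


def find_ingredients(html):
    """Return the recipeIngredient list of the first Recipe object, or None.

    Simplification: only handles a top-level object or list whose "@type"
    is exactly the string "Recipe"; nested "@graph" layouts are ignored.
    """
    collector = JsonLdCollector()
    collector.feed(html)
    for doc in collector.documents:
        candidates = doc if isinstance(doc, list) else [doc]
        for item in candidates:
            if isinstance(item, dict) and item.get("@type") == "Recipe":
                return item.get("recipeIngredient", [])
    return None
```

A page without such a block returns `None`, so a tool built this way would still need a fallback for sites that only use plain tags.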
