Ollama 0.2 adds concurrency support: it can process requests in parallel and keep multiple models loaded at the same time. This benefits use cases such as handling several chat sessions at once and running different agents side by side, while making better use of available memory and GPU resources.
https://alternativeto.net/news/2024/7/ollama-0-2-brings-parallel-requests-and-the-ability-to-run-multiple-models-simultaneously/
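
To make the parallel-request behavior concrete, here is a minimal client-side sketch in Python. It assumes a local Ollama server at the default address `http://localhost:11434` and uses the `/api/generate` endpoint with streaming disabled. The helper names (`build_payload`, `send_to_ollama`, `run_concurrently`) and the model names in the usage comment are illustrative choices, not part of Ollama itself.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model, prompt):
    """Build a non-streaming request body for /api/generate."""
    return {"model": model, "prompt": prompt, "stream": False}


def send_to_ollama(payload, url=OLLAMA_URL, timeout=120):
    """POST one request and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"]


def run_concurrently(jobs, send=send_to_ollama, max_workers=4):
    """Send (model, prompt) jobs from parallel threads; results keep job order."""
    payloads = [build_payload(model, prompt) for model, prompt in jobs]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send, payloads))


# Example usage (model names are placeholders for models pulled locally):
# jobs = [
#     ("llama3", "Summarize what a mutex is in one sentence."),
#     ("llama3", "Name one use of a hash table."),
#     ("gemma2", "Translate 'good morning' into French."),
# ]
# for (model, _), answer in zip(jobs, run_concurrently(jobs)):
#     print(f"[{model}] {answer}")
```

The client only issues the requests at the same time; whether the server actually handles them in parallel or queues them depends on its configuration and on available memory. Mixing model names in one batch, as in the commented usage, exercises the multiple-models case as well.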