Jim Griffin
June 26, 2024
There’s a big breakthrough that just came out for handling large language models on smartphones. It’s called PowerInfer-2, and what it does is examine every option for processing an LLM on a particular smartphone and pick the fastest way for that particular LLM on that particular device. For […]
Jim Griffin
June 20, 2024
Last week, NVIDIA announced Nemotron-4, which consists of three models: Base, Instruct and Reward. These three models work together within the NeMo framework to enable the creation and fine-tuning of new large language models. At 340 billion parameters, this new entrant is far bigger than any other open source model, but […]
Jim Griffin
June 13, 2024
Ollama is a popular platform for running language models on your local machine, with access to almost 100 different open source models, including llama-3 from Meta, Phi3 from Microsoft, Aya 23 from Cohere, the Gemma models from DeepMind and Mistral. This video shows llama-3 being run on a laptop, using […]
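As a rough illustration of how simple the local workflow can be, here is a minimal sketch using Ollama's Python client, assuming Ollama is installed and the llama3 model has already been pulled (for example with "ollama pull llama3"):

    # Minimal sketch: query a locally running llama-3 model through Ollama's Python client.
    import ollama

    response = ollama.chat(
        model="llama3",
        messages=[{"role": "user", "content": "Explain what Ollama does in one sentence."}],
    )
    # The reply is generated entirely on the local machine, no cloud API involved.
    print(response["message"]["content"])

The same request can also be sent to Ollama's local REST endpoint, which is how many third-party tools integrate with it.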