
Running LLMs Locally with Ollama

I’ve been surprised at how many folks are unaware that you can run LLMs of impressive quality locally, on your own personal machine. This means that with a bit of initial set-up, you can run an LLM without access to the internet, without giving OpenAI or Anthropic or Google any money, and without handing your data over to others.

Your hardware will set limits on what you can do, of course, but running a local LLM is entirely possible and surprisingly easy. If you have a laptop or desktop purchased in the last few years, there’s a good chance you can run a fairly advanced LLM on your machine.

I’ll come back later and make an argument for why you might want to do this - even if you are a fierce AI critic - and I’ll also share some additional methods and strategies for using this stuff, but right now I just want to focus on getting started. In my experience so far, one of the quickest routes to success is to use a tool called Ollama.

For everything that follows, I’m running Fedora 41. All you need is some basic command line knowledge.

Installing Ollama

Ollama is a tool that helps you run different LLMs locally. Installation varies by OS - the Ollama homepage should point you in the right direction - but on Linux, it’s just a quick curl command in the terminal:

curl -fsSL https://ollama.com/install.sh | sh

You should see the terminal do a bit of work, some downloading and installation. When done, you’ll get a message noting that the installation is complete:

>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
>>> AMD GPU ready.

In Ollama, you can use the list command to view the models you have installed locally.

ollama list
>>> NAME    ID    SIZE    MODIFIED 

We haven’t installed any models yet, so it returns nothing but the column headers. Let’s download and run a model.

Running a Model in Ollama

You can run a model with the command ollama run [model_id], like so:

ollama run llama3.2:1b

This downloads the model, if it isn’t already on your machine, and immediately drops you into a session with it. There’s a command menu, which you can access at any time with /?, but you can also just start typing, like you would with any of the big cloud-based models:

>>> What is the longest word in your training data?
My training data was last updated in 2023, and I do not have information on very 
long words or phrases. However, I can tell you that according to the Oxford 
English Dictionary, some of the longest words in English are:

* pneumonoultramicroscopicsilicovolcanoconiosis (a type of lung disease)
* floccinaucinihilipilification (the act of estimating something as worthless)
* pteromedullosity (the quality of being able to fly)
* supramaximal (exceeding or surpassing maximum)

Keep in mind that these words are not commonly used in everyday language, and 
they may be considered too long or technical for most purposes.

>>> In the last you entry you provided a list of words. What is the second word in tha 
... t list?
The second word in the list I provided is "floccinaucinihilipilification".

Use /bye to leave the session.

You can see a full list of available models for Ollama at their website.
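If you spot something on that list you want to try but don’t want to jump straight into a chat, pull downloads a model without starting a session, and rm removes it when you want the disk space back (the model name here is just an example):

ollama pull mistral
ollama list
ollama rm mistral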

Using Ollama from the Command Line

The nice thing about Ollama is that it standardizes your interactions with a variety of models, so you can try as many as you like – or as many as your hardware can stand – and not have to learn a new trick every time.

The README at the Ollama repo is both concise and thorough for getting started. One of the biggest payoffs of running a local LLM is the potential to apply it to your own files. To start experimenting, see the Multimodal models and Pass the prompt as an argument sections of the README.
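As a small taste of the latter, you can hand a prompt - including the contents of a file - to a model in a single shell command. A quick sketch, with the model tag and file name as placeholders:

ollama run llama3.2:1b "Summarize this file: $(cat notes.txt)"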

In a future post, I’ll try to share some things I’ve learned re: Ollama and local LLM use, particularly in relation to using your own files.

A Graphic Interface for Ollama: open-webui

open-webui is effectively a GUI wrapper for working with Ollama and other LLM tools. It makes Ollama more accessible for those who may not be as comfortable on the command line.

It is 100% local: it runs in your web browser, but, just like Ollama on the command line, everything happens on your own machine.

Installation is a bit more involved. Start with the README at the open-webui repo.

I installed via the Docker image (“Installation with Default Configuration”) since I already had Ollama installed, but there is a bundled version that will handle Ollama for you, too. Installation takes some time, as it preps a number of models out of the box, and it can be a bit finicky. (I had to do some decent troubleshooting after my initial install to get Ollama and open-webui to talk to each other.)
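For reference, the default-configuration Docker command for a machine that already has Ollama installed looks roughly like this at the time of writing (check the README for the current flags), and it should leave the interface available at http://localhost:3000 in your browser:

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main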

I’ll share some screenshots and tips for open-webui later.

Originally published by Trevor Burrows