For the last decade or so, I used Seafile, ownCloud and then Nextcloud to self-host my data on a small home server. This has worked wonderfully, and I have nothing but respect for the communities that built these powerful tools.

But one thing that never worked as smoothly as I wanted it to was the photo upload from my smartphone to Nextcloud. The upload works, and it rarely fails, but it's never instant: it takes anywhere between 30 seconds and many minutes to sync. So it's not as seamless as taking a picture, turning on the PC, and having it just be there.

I am not the only person who would like a better self-hosted photo solution. There are at least three active competitors that could replace Nextcloud for photo uploads for me:

  1. Ente
  2. PhotoPrism
  3. Immich

But the issue is that I have already invested some time into automatically sorting my photos (see Sorting Images with ML), and I would love to have at least feature parity.

After some reading, I noticed that Immich drives its “smart search” feature with a very similar tech stack to the one I used for my image classifier: CLIP embeddings.

And as it is 100% self-hosted, I thought I would give it a try and see how hard it would be to replicate the classifier setup.

The smart search in Immich lets the user enter “dog” into the search bar, and pictures that are close to the word “dog” in the embedding space show up.

Of course, this also allows full sentence search, such as “a dog walking in a park, sunset in the background, highlighting the church tower” or similar. That’s why CLIP embeddings are so cool.
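To make the idea concrete: CLIP maps both images and text into the same embedding space, so a text query can be compared directly against image vectors. Here is a minimal sketch of that mechanism, using the sentence-transformers CLIP wrapper for illustration (this is not Immich's actual code):

import numpy as np
from PIL import Image
from sentence_transformers import SentenceTransformer

# CLIP embeds images and text into the same vector space.
model = SentenceTransformer("clip-ViT-B-32")

image_embeddings = model.encode([Image.open("dog.jpg"), Image.open("beach.jpg")])
query_embedding = model.encode(["a dog walking in a park at sunset"])[0]

# Cosine similarity: the picture closest to the query text ranks first.
scores = image_embeddings @ query_embedding / (
    np.linalg.norm(image_embeddings, axis=1) * np.linalg.norm(query_embedding)
)
print(scores)  # higher score = better match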

But instead of searching ad hoc, I like my photos to be tagged. For example, I want pictures I might consider printing in a hardcover photo album to be tagged with the concept “photo album”.

This is possible with a classifier: if we have a tag “photo album”, we can fetch the embeddings of all pictures carrying it and train a logistic regression model to distinguish them from all other pictures.
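In scikit-learn terms, that training step could look roughly like this. It's a sketch, not the plugin's actual code, and the embedding arrays here are random placeholders standing in for real CLIP vectors:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholders: in reality these are CLIP embeddings fetched from Immich.
album_embeddings = np.random.rand(20, 512)   # pictures tagged "photo album"
other_embeddings = np.random.rand(200, 512)  # a sample of everything else

X = np.vstack([album_embeddings, other_embeddings])
y = np.concatenate([np.ones(len(album_embeddings)), np.zeros(len(other_embeddings))])

clf = LogisticRegression(max_iter=1000).fit(X, y)

# Probability that a new picture belongs in the photo album
new_embedding = np.random.rand(512)
prob = clf.predict_proba(new_embedding.reshape(1, -1))[0, 1]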

The Setup

To accomplish this, I make use of the Immich API that each Immich instance provides (kudos for implementing this; it is amazing to have). But as the API does not currently expose the embeddings, my plugin also needs to speak directly to the PostgreSQL database.
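To sketch what that direct database access looks like: recent Immich versions keep the CLIP vectors in a smart_search table with an assetId and a pgvector embedding column. That schema is an internal implementation detail that may change between releases, so treat the following as a sketch rather than a stable interface:

import json
import numpy as np
import psycopg

# Connection details match the .env file shown further below.
conn = psycopg.connect(
    host="localhost", port=5432,
    dbname="immich", user="postgres", password="THEDBPASSWORD",
)

with conn.cursor() as cur:
    # Assumption: table and column names as found in my Immich instance.
    cur.execute('SELECT "assetId", embedding FROM smart_search')
    rows = cur.fetchall()

asset_ids = [asset_id for asset_id, _ in rows]
# Without the pgvector adapter registered, vectors arrive as "[0.1,0.2,...]"
# strings, which happen to parse as JSON arrays.
embeddings = np.array([json.loads(vec) for _, vec in rows], dtype=np.float32)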

As I am very familiar with Python, I chose to make it a Python package that can be deployed via Docker.

The code for this project is hosted on GitHub (github.com/openpaul/Immich-ML-Tag).

How does it work

Docker Compose

In this example, let's assume we use Docker Compose for deployment. Note that this runs the Immich service locally; for production or any real use, you'd want to put it behind a reverse proxy at least.

We can start with the Docker Compose file provided by Immich:

#
# WARNING: To install Immich, follow our guide: https://docs.immich.app/install/docker-compose
#
# Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    # extends:
    #   file: hwaccel.transcoding.yml
    #   service: cpu # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    volumes:
      # Do not edit the next line. If you want to change the media storage location on your system, edit the value of UPLOAD_LOCATION in the .env file
      - ${UPLOAD_LOCATION}:/data
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - '2283:2283'
    depends_on:
      - redis
      - database
    restart: always
    healthcheck:
      disable: false

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, rocm, openvino, rknn] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    # extends: # uncomment this section for hardware acceleration - see https://docs.immich.app/features/ml-hardware-acceleration
    #   file: hwaccel.ml.yml
    #   service: cpu # set to one of [armnn, cuda, rocm, openvino, openvino-wsl, rknn] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always
    healthcheck:
      disable: false

  redis:
    container_name: immich_redis
    image: docker.io/valkey/valkey:9@sha256:4503e204c900a00ad393bec83c8c7c4c76b0529cd629e23b34b52011aefd1d27
    healthcheck:
      test: redis-cli ping || exit 1
    restart: always

  database:
    container_name: immich_postgres
    image: ghcr.io/immich-app/postgres:14-vectorchord0.4.3-pgvectors0.2.0@sha256:bcf63357191b76a916ae5eb93464d65c07511da41e3bf7a8416db519b40b1c23
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      POSTGRES_INITDB_ARGS: '--data-checksums'
      # Uncomment the DB_STORAGE_TYPE: 'HDD' var if your database isn't stored on SSDs
      # DB_STORAGE_TYPE: 'HDD'
    ports:
 - "5432:5432"
    volumes:
      # Do not edit the next line. If you want to change the database storage location on your system, edit the value of DB_DATA_LOCATION in the .env file
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    shm_size: 128mb
    restart: always

volumes:
  model-cache:

Notice I exposed the PostgreSQL port here.

To enable tagging, we need to add this section to the compose file:

immich-ml-tag:
  container_name: immich_ml_tag
  #image: ghcr.io/openpaul/immich-ml-tag:latest
  image: immich-ml-tag:latest
  volumes:
    - ml-tag-data:/data/ml_resources
  env_file:
    - .env
  environment:
    - ML_RESOURCE_PATH=/data/ml_resources
    # Optional settings
    # - TRAIN_TIME=02:00
    # - INFERENCE_INTERVAL=5
    # - THRESHOLD=0.5
    # - MIN_SAMPLES=10
  depends_on:
    - database
    - immich-server
  restart: unless-stopped

And add a volume for persistent storage:

volumes:
  ml-tag-data:

With that completed, it’s important to configure a .env file:

API_KEY="YOUR_API_KEY"
DB_PASSWORD="THEDBPASSWORD"
DB_USERNAME="postgres"
DB_DATABASE_NAME="immich"
DB_HOST="database"
DB_PORT=5432
URL="http://immich-server:2283"
ML_TAG_SUFFIX="_predicted"
ML_TAG_NEGATIVE_NAME="z_ml_negative_examples"
UPLOAD_LOCATION=./library
DB_DATA_LOCATION=./postgres
IMMICH_VERSION=v2

With that configured, you can bring up the services like so:

docker compose up -d

Now you can navigate to http://localhost:2283/ and set up a user account.

immich_welcome.png

Getting an API key

Now, let's get an API key with the following permissions:

  • asset
    • asset.read
  • tag
    • tag.create
    • tag.delete
    • tag.read
    • tag.asset

Immich API screenshot

Update the API_KEY in the .env file and restart the tag container:

docker compose restart immich-ml-tag
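A quick way to verify the key and its permissions before moving on is a request against the tag endpoint. Immich expects the key in an x-api-key header; this is just a hypothetical sanity check:

import requests

# API key and URL as configured in the .env file
response = requests.get(
    "http://localhost:2283/api/tags",
    headers={"x-api-key": "YOUR_API_KEY"},
)
response.raise_for_status()
print(response.json())  # the tags visible to this key (empty list at first)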

Adding pictures

With this, the setup for this test is almost done. Next, you need to upload some pictures to Immich, which can be done by drag and drop. Immich should then automatically import and embed the images. You can follow the progress in the jobs panel in the admin settings.

Once the smart search job has finished indexing your pictures, you can try searching for keywords. Let's search for cows:

immich_cows.png

That works so well and makes me happy.

ML tagging

If you now want to use the ML-assisted tagging, it's very easy:

  1. Create a tag
  2. Add at least 10 pictures to the tag
  3. Wait until the training happens (2 am) or kick off a run manually
  4. Add negative examples and more in-group pictures
  5. Wait for retraining (again, 2 am) or kick it off manually

Let’s do it.

Dogs vs Cats

As an example, I want all dog pictures tagged as dogs and all cats as cats. But we will quickly see that the ML model needs negative examples to make this distinction.

1. Create the Tags

Easy enough, we enable the tag feature for the user and then create the tags.

immich_tag_sidebar.png

And now we tag some pictures. It’s easy to do using the smart search. After you have added 10 pictures to each category, you can start a training run like so:

docker compose exec immich-ml-tag immich-ml-tag train --force 

The classifier then fetches the tagged images plus a sample of unlabeled images and tries to draw a line between the two in the high-dimensional embedding space. It then scores all unlabeled pictures, creates the new tags Dog_predicted and Cat_predicted, and applies them to the images the model predicts as matches.
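Conceptually, that prediction-and-tagging step boils down to something like the sketch below. It reuses clf, asset_ids, and embeddings from the earlier sketches, the THRESHOLD mirrors the optional setting in the compose file, and the bulk tag endpoint (PUT /api/tags/{id}/assets) is my reading of the Immich API, so double-check it against the docs for your version:

import requests

THRESHOLD = 0.5  # mirrors the optional THRESHOLD setting above

# Score every picture and keep the ones the model is confident about.
probabilities = clf.predict_proba(embeddings)[:, 1]
predicted_ids = [a for a, p in zip(asset_ids, probabilities) if p >= THRESHOLD]

# Attach the Dog_predicted tag to all matches in one bulk call.
# dog_predicted_tag_id is hypothetical; fetch real tag ids via GET /api/tags.
requests.put(
    f"http://localhost:2283/api/tags/{dog_predicted_tag_id}/assets",
    headers={"x-api-key": "YOUR_API_KEY"},
    json={"ids": predicted_ids},
)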

As such, it will create new tags as you can see in this screenshot:

tags.png

Notice how there is also a tag called z_ml_negative_examples; we will talk about that in the next section.

Due to how tags work in Immich, you can then browse the predicted Cat and Dog tags and see whether any pictures show up that are not actually dogs or cats.

Removing false positives

To help the classifier out, we can label some of the pictures it got wrong as essentially Not a dog or Not a cat. These pictures will be considered during training and will help improve the classifier.

For this purpose, the plugin has created the tags z_ml_negative_examples/Cat and z_ml_negative_examples/Dog. The prefix can be set in the .env file. I chose this as it sorts low in the interface, but anything is possible. Maybe _NOT ?
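In terms of the earlier training sketch, these negative examples simply join the zero class alongside the random background sample, which pushes the decision boundary away from the classifier's past mistakes. A minimal sketch with placeholder vectors:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholders standing in for CLIP embeddings fetched from the database
cat_embeddings = np.random.rand(25, 512)          # tagged "Cat"
negative_embeddings = np.random.rand(8, 512)      # tagged z_ml_negative_examples/Cat
background_embeddings = np.random.rand(200, 512)  # random unlabeled sample

# Explicit negatives are simply more zero-class examples.
X = np.vstack([cat_embeddings, negative_embeddings, background_embeddings])
y = np.concatenate([
    np.ones(len(cat_embeddings)),
    np.zeros(len(negative_embeddings) + len(background_embeddings)),
])
clf = LogisticRegression(max_iter=1000).fit(X, y)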

Now, looking at the predicted cat labels, it’s easy to spot some negative examples.

cat_1.png

Those can be tagged with z_ml_negative_examples/Cat in the UI, which can then look something like this:

not_cats.png

After that is done, I kick off training again (or wait until the next day):

docker compose exec immich-ml-tag immich-ml-tag train --force 

And suddenly the predictions are much better:

cats.png

That's really all there is to it. Once a label has been trained with a few examples, in my experience the predictions get quite good and stay pretty solid over time.

Dogs and cats are simple concepts that ML models have been trained on for years now. But concepts like favourite picture, decoration ideas, or pictures linked to a hobby, say basketball or painting, might include more than just pictures of basketballs or paint brushes for you. They might include the team, or pictures you took as inspiration. Such tags can be learned by a model like this, and hopefully the automated nature will bring some order into a photo collection, narrowing it down one concept at a time.

Conclusion

In the end, this experiment worked out rather well, and I might even like this setup better than what I had before, as now photos can have multiple tags instead of being sorted into folders.

The tool is by no means finished; help is welcome if you are into tagging your pictures with ML. Possible avenues for contributions are code review, better ML models, ML model evaluations, multi-user setups (not tested as of now) and much more.

Link to the project: github.com/openpaul/Immich-ML-Tag