wrong krunkit warning when no models yet running? #301

Closed
maxandersen opened this issue Oct 14, 2024 · 13 comments

@maxandersen
Contributor

When I run ramalama list I get a warning about not using krunkit (Warning: podman needs to be configured to use krunkit for AI Workloads, running without containers).

If I run podman machine info, it says vmtype applehv, which uses krunkit.

podman machine info
host:
    arch: arm64
    currentmachine: ""
    defaultmachine: ""
    eventsdir: /var/folders/mm/z7zzmyl15bd8byr8vsdxf1740000gn/T/storage-run-501/podman
    machineconfigdir: /Users/manderse/.config/containers/podman/machine/applehv
    machineimagedir: /Users/manderse/.local/share/containers/podman/machine/applehv
    machinestate: ""
    numberofmachines: 0
    os: darwin
    vmtype: applehv
version:
    apiversion: 5.2.0
    version: 5.2.0
    goversion: go1.22.5
    gitcommit: b22d5c61eef93475413724f49fd6a32980d2c746
    builttime: Fri Aug  2 14:05:53 2024
    built: 1722600353
    osarch: darwin/arm64
    os: darwin

Looking at the code, the check it performs is running podman machine list, which is empty on my machine.
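
A minimal sketch of an alternative check, not ramalama's actual code: it reads the vmtype key from the text that podman machine info prints, which is reported even when numberofmachines is 0, as in the output above. The function names parse_vmtype and provider_is_libkrun are hypothetical, and the text is passed in as a string so the sketch does not depend on podman being installed.

```python
from typing import Optional


def parse_vmtype(info_text: str) -> Optional[str]:
    """Return the value of the `vmtype:` key from `podman machine info` output."""
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key == "vmtype":
            return value.strip() or None
    return None


def provider_is_libkrun(info_text: str) -> bool:
    """True only when the configured machine provider is reported as libkrun."""
    return parse_vmtype(info_text) == "libkrun"
```

With the output posted above, parse_vmtype returns "applehv", so a check like this would still warn on this machine, but for the configured provider rather than for an empty machine list.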

@maxandersen maxandersen changed the title wrong warning when no models yet running wrong krunkit warning when no models yet running? Oct 14, 2024
@rhatdan
Member

rhatdan commented Oct 14, 2024

You are running with vmtype: applehv

You need to switch this to libkrun (krunkit).

containers/common#2200

https://blog.podman.io/2024/07/podman-and-libkrun/

I am not sure if this fully works yet, although hopefully soon. You can run the workload within a container on a Podman Machine with libkrun (krunkit) but I don't think the container is fully turning on the GPU.

@slp is working on this.

@ericcurtin
Collaborator

ericcurtin commented Oct 14, 2024

I'm still on applehv myself and must switch to krunkit. At the moment, even if somebody switched to krunkit, inference would still run on the CPU until we get this PR merged and the container image pushed:

#235

$ podman machine info
host:
    arch: arm64
    currentmachine: podman-machine-default
    defaultmachine: podman-machine-default
    eventsdir: /var/folders/q3/5msn8s8j02d62vcdg_y2hh2w0000gn/T/storage-run-501/podman
    machineconfigdir: /Users/ecurtin/.config/containers/podman/machine/applehv
    machineimagedir: /Users/ecurtin/.local/share/containers/podman/machine/applehv
    machinestate: Stopped
    numberofmachines: 1
    os: darwin
    vmtype: applehv
version:
    apiversion: 5.2.3
    version: 5.2.3
    goversion: go1.23.1
    gitcommit: c5366a308e89edd9636b66faf79bd5cb18ed0905
    builttime: Tue Sep 24 16:21:03 2024
    built: 1727191263
    osarch: darwin/arm64
    os: darwin

One piece of advice I got: don't install via brew install podman. Instead, install from the Podman website, https://podman.io/.

I recently switched to the podman.io website version and it still seems to default to vfkit.

In this blog post:

https://blog.podman.io/2024/07/podman-and-libkrun/

it seems like the key change is this file ~/.config/containers/containers.conf:

[machine]
provider="libkrun"

Is this the most up to date blog post @slp @baude ?

Ah @rhatdan posted before I completed writing this response 😄

I hope libkrun becomes default out of the box soon.

@ericcurtin
Collaborator

ericcurtin commented Oct 14, 2024

I'm not having great success myself if I'm honest:

Looking up Podman Machine image at quay.io/podman/machine-os:5.2 to create VM
Getting image source signatures
Copying blob 91ab7a44509f done   |
Copying config 44136fa355 done   |
Writing manifest to image destination
91ab7a44509f0dda83ecbef03d9079431b32f2f839f729efc47aff2e41276584
Extracting compressed file: podman-machine-default-arm64.raw: done
Machine init complete
Starting machine "podman-machine-default"
Error: krunkit exited unexpectedly with exit code 1

@ericcurtin
Collaborator

I found my issue... @slp is this an artificial limit?

/opt/podman/bin/krunkit --cpus 12 --memory 32768
Error: too many vCPUs configured (max 8)

I have 12 cores on my machine, so I tried to configure that.

@ericcurtin
Collaborator

Not a fix but loosely related:

containers/podman#24257

@ericcurtin
Collaborator

I'm in business at least:

$ podman machine list
NAME                     VM TYPE     CREATED        LAST UP            CPUS        MEMORY      DISK SIZE
podman-machine-default*  libkrun     5 minutes ago  Currently running  8           32GiB       256GiB
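
For illustration only, a sketch of pulling the VM TYPE column out of the table that podman machine list prints. It assumes columns are separated by two or more spaces, as in the output above; the function name machine_vm_types is hypothetical, and this is not presented as the check ramalama uses.

```python
import re
from typing import Dict


def machine_vm_types(list_text: str) -> Dict[str, str]:
    """Map machine name -> VM type from `podman machine list` table output."""
    lines = [line for line in list_text.splitlines() if line.strip()]
    if not lines:
        return {}
    # Columns are split on runs of 2+ spaces so "VM TYPE" stays one field.
    header = re.split(r"\s{2,}", lines[0].strip())
    if "NAME" not in header or "VM TYPE" not in header:
        return {}
    name_i, type_i = header.index("NAME"), header.index("VM TYPE")
    result: Dict[str, str] = {}
    for line in lines[1:]:
        cols = re.split(r"\s{2,}", line.strip())
        if len(cols) > max(name_i, type_i):
            # The default machine is marked with a trailing "*".
            result[cols[name_i].rstrip("*")] = cols[type_i]
    return result
```

When no machine exists the table has only a header row, so the result is an empty dict, which is the situation described at the top of this issue.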

@ericcurtin
Collaborator

ericcurtin commented Oct 14, 2024

I followed the trail to here:

https://github.com/containers/libkrunfw?tab=readme-ov-file#known-limitations

But in the context of podman machine/krunkit we seem to run the standard everyday Fedora kernel, which has a limit of 4096 CPUs?

$ uname -a; grep -r NR_CPU /ostree/deploy/fedora-coreos/deploy/65e61c38846df03b75d93090cb78f7ec2337e591a8b16c0f4922eea0b7d0c6cf.0/usr/lib/modules/6.10.10-200.fc40.aarch64/config
Linux localhost.localdomain 6.10.10-200.fc40.aarch64 #1 SMP PREEMPT_DYNAMIC Thu Sep 12 18:52:07 UTC 2024 aarch64 GNU/Linux
CONFIG_NR_CPUS=4096

@tylerfanelli @slp do we actually have this 8 CPU restriction in the podman machine/krunkit case? I'm not a podman machine/krunkit guru or anything though

@maxandersen
Contributor Author

AFAIK I didn't install via brew, just Podman Desktop, and in Podman Desktop I get: Podman v5.2.0 GPU enabled (LibKrun), hence I thought all was fine.

I wonder why podman on the CLI seems to report something different :/

@ericcurtin
Collaborator

ericcurtin commented Oct 14, 2024

@maxandersen I populated this file ~/.config/containers/containers.conf:

[machine]
provider="libkrun"

Then I ran:

podman machine reset # deletes everything
podman machine init --cpus 8 --disk-size 256 -m 32768 --username $USER --now # 8 cores, 32GB ram, 256GB disk

and all went well. Not ideal because I deleted all my existing podman machine stuff, dunno if that's necessary, but it worked...
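
A hedged sketch of building that init command from a script while respecting the limit seen earlier in this thread. The constant KRUNKIT_MAX_VCPUS comes only from the "too many vCPUs configured (max 8)" error message above and may change; clamp_cpus and machine_init_args are hypothetical helper names. The flags are the ones used in the command above.

```python
from typing import List

# Taken from the krunkit error message in this thread; not a documented constant.
KRUNKIT_MAX_VCPUS = 8


def clamp_cpus(requested: int, limit: int = KRUNKIT_MAX_VCPUS) -> int:
    """Cap the requested vCPU count at the limit krunkit reported."""
    if requested < 1:
        raise ValueError("at least one CPU is required")
    return min(requested, limit)


def machine_init_args(cpus: int, memory_mib: int, disk_gib: int, username: str) -> List[str]:
    """Build the argument list for `podman machine init`, with CPUs clamped."""
    return [
        "podman", "machine", "init",
        "--cpus", str(clamp_cpus(cpus)),
        "--disk-size", str(disk_gib),
        "-m", str(memory_mib),
        "--username", username,
        "--now",
    ]
```

On a 12-core host, machine_init_args(12, 32768, 256, user) asks for 8 CPUs instead of failing at start-up; the list can be passed to subprocess.run once the provider is set and any old machines have been dealt with.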

@ericcurtin
Collaborator

This also might be the fix @maxandersen

#303

This string is libkrun on my system at least

@maxandersen
Contributor Author

@ericcurtin +1; after I recreated my machines as you described, I got libkrun too. Still weird that Podman Desktop reported something else, but the issue in ramalama is fixed with #303.

@ericcurtin
Collaborator

I'll close now; feel free to re-open if need be.

@tylerfanelli

> @tylerfanelli @slp do we actually have this 8 CPU restriction in the podman machine/krunkit case? I'm not a podman machine/krunkit guru or anything though

As I understand it, the 8 CPU restriction is a temporary fix within krunkit, but >8 CPUs will be supported at some point. @slp Do I have this right?
