V2EX hongdengdao
 hongdengdao's recent timeline updates
hongdengdao

hongdengdao

V2EX member #42707, joined on 2013-07-27 15:19:45 +08:00
Today's activity rank 12632
hongdengdao's recent replies
这个应该有点问题,双卡 32g 显存, 8k-64k 肯定是可以运行的

.\kaiwu.exe run Qwen3.6-27B-Q4_K_M.gguf







本地大模型部署器 vv0.1.2 llama.cpp b8864
by llmbbs.ai 本地 AI 技术社区

[1/6] Probing hardware...
GPU: NVIDIA GeForce RTX 4060 Ti × 2 (SM89, 16380 MB VRAM each, 0 GB/s)
RAM: 61 GB DDR5
OS: windows amd64

[2/6] Selecting configuration...
Model: Qwen3.6-27B (dense, 28B)
Quant: Q4_K_M (15.7 GB)
Mode: full_gpu
Accel: Flash Attention

[3/6] Checking files...
Using bundled iso3 binary: llama-server-cuda.exe
Binary: llama-server-cuda.exe [cached]
Model: Qwen3.6-27B-Q4_K_M.gguf [cached]

[4/6] Preflight check...
VRAM sufficient

[5/6] Warmup benchmark...
Probe 1: ctx=256K ... OOM
Probe 2: ctx=128K ... OOM
Probe 3: ctx=64K ... OOM
Probe 4: ctx=32K ... OOM
Probe 5: ctx=16K ... OOM
Probe 6: ctx=8K ... OOM
Warmup failed: all ctx probes failed (tried down to 4K)
Using default parameters

[6/6] Starting server...
llama-server 不支持 iso3 ,回退到 q8_0/q4_0
Waiting for llama-server to be ready (port 11434)...
显存不足,降低上下文至 64K 重试...
Waiting for llama-server to be ready (port 11434)...
显存不足,降低上下文至 32K 重试...
Waiting for llama-server to be ready (port 11434)...
Error: failed to start llama-server: 3 次启动均失败,建议选择更小的模型
Usage:
kaiwu run <model> [flags],
.\kaiwu.exe run Qwen3.6-27B-Q4_K_M.gguf







本地大模型部署器 vv0.1.1 llama.cpp b8864
by llmbbs.ai 本地 AI 技术社区

[1/6] Probing hardware...
GPU: NVIDIA GeForce RTX 4060 Ti (SM89, 16380 MB VRAM, 0 GB/s)
RAM: 61 GB DDR5
OS: windows amd64

[2/6] Selecting configuration...
Model: Qwen3.6-27B (dense, 28B)
Quant: Q4_K_M (15.7 GB)
Mode: full_gpu
Accel: Flash Attention

[3/6] Checking files...
Using bundled iso3 binary: llama-server-cuda.exe
Binary: llama-server-cuda.exe [cached]
Model: Qwen3.6-27B-Q4_K_M.gguf [cached]

[4/6] Preflight check...
VRAM sufficient

[5/6] Warmup benchmark...
Probe 1: ctx=8K ... OOM
Probe 2: ctx=4K ... OOM
Warmup failed: all ctx probes failed (tried down to 4K)
Using default parameters

[6/6] Starting server...
llama-server 不支持 iso3 ,回退到 q8_0/q4_0
Waiting for llama-server to be ready (port 11434)...
显存不足,降低上下文至 4K 重试...
Waiting for llama-server to be ready (port 11434)...
Error: failed to start llama-server: 连续 2 次启动失败,即使最小上下文(4K)也无法运行
建议:选择更小的量化或使用 MoE offload 模型
Usage:
kaiwu run <model> [flags]
0.1.1 我的是你的这个版本, nvidia-smi
Fri Apr 24 22:37:21 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 591.86 Driver Version: 591.86 CUDA Version: 13.1 |
+-----------------------------------------+------------------------+----------------------+
| GPU Name Driver-Model | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4060 Ti WDDM | 00000000:01:00.0 Off | N/A |
| 0% 33C P8 3W / 199W | 8MiB / 16380MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA GeForce RTX 5060 Ti WDDM | 00000000:0D:00.0 On | N/A |
| 0% 38C P5 12W / 180W | 3831MiB / 16311MiB | 4% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
4060ti+5060ti 双卡没有识别出来,只出来了 4060ti
新年快乐
支持关注
一般都是最少给客户 5 天退货时间,应该是哪里出了问题,你这个应该不是普遍情况,可能是 bug 或者是别的问题,可以跟淘宝客服反馈下
@hongdengdao 之前联系过威联通,说要换兼容性列表上的硬盘
About     Help     Advertise     Blog     API     FAQ     Solana     3169 Online   Highest 6679       Select Language
创意工作者们的社区
World is powered by solitude
VERSION: 3.9.8.5 27ms UTC 12:38 PVG 20:38 LAX 05:38 JFK 08:38
Do have faith in what you're doing.
ubao msn snddm index pchome yahoo rakuten mypaper meadowduck bidyahoo youbao zxmzxm asda bnvcg cvbfg dfscv mmhjk xxddc yybgb zznbn ccubao uaitu acv GXCV ET GDG YH FG BCVB FJFH CBRE CBC GDG ET54 WRWR RWER WREW WRWER RWER SDG EW SF DSFSF fbbs ubao fhd dfg ewr dg df ewwr ewwr et ruyut utut dfg fgd gdfgt etg dfgt dfgd ert4 gd fgg wr 235 wer3 we vsdf sdf gdf ert xcv sdf rwer hfd dfg cvb rwf afb dfh jgh bmn lgh rty gfds cxv xcv xcs vdas fdf fgd cv sdf tert sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf shasha9178 shasha9178 shasha9178 shasha9178 shasha9178 liflif2 liflif2 liflif2 liflif2 liflif2 liblib3 liblib3 liblib3 liblib3 liblib3 zhazha444 zhazha444 zhazha444 zhazha444 zhazha444 dende5 dende denden denden2 denden21 fenfen9 fenf619 fen619 fenfe9 fe619 sdf sdf sdf sdf sdf zhazh90 zhazh0 zhaa50 zha90 zh590 zho zhoz zhozh zhozho zhozho2 lislis lls95 lili95 lils5 liss9 sdf0ty987 sdft876 sdft9876 sdf09876 sd0t9876 sdf0ty98 sdf0976 sdf0ty986 sdf0ty96 sdf0t76 sdf0876 df0ty98 sf0t876 sd0ty76 sdy76 sdf76 sdf0t76 sdf0ty9 sdf0ty98 sdf0ty987 sdf0ty98 sdf6676 sdf876 sd876 sd876 sdf6 sdf6 sdf9876 sdf0t sdf06 sdf0ty9776 sdf0ty9776 sdf0ty76 sdf8876 sdf0t sd6 sdf06 s688876 sd688 sdf86