V2EX tubanwu
V2EX member #41869, joined on 2013-07-09 22:10:43 +08:00
Today's activity rank 12191
Per tubanwu's settings, the topics list is hidden
Deals info, including closed deals, is not hidden
tubanwu's recent replies
@KaiWuBOSS It won't run.
Local LLM deployer vv0.3.1 llama.cpp b8864
by llmbbs.ai, local AI technology community

[1/6] Probing hardware...
GPU: NVIDIA GeForce RTX 5060 (SM120, 8151 MB VRAM, 448 GB/s)
RAM: 31 GB DDR4
OS: windows amd64
CUDA 13.2 detected known bug with low-bit quantization
If you see garbled output, downgrade driver to CUDA 13.1
Warning: RTX 50 series with CUDA 13.2 detected
Kaiwu will use CUDA 12.4 binary for stability.

[2/6] Selecting configuration...
Model: Qwen3.6 35B A3B Claude 4.7 Opus Reasoning Distilled (moe, 22B total / 1B active)
Quant: Q22 (13.5 GB)
Mode: moe_partial
Accel: Flash Attention + SWA-Full (hybrid arch)

[3/6] Checking files...
Using bundled iso3 binary: llama-server-cuda.exe
Binary: llama-server-cuda.exe [cached]
Model: Qwen3.6-35B-A3B-Claude-4.7-Opus-Reasoning-Distilled.i1-IQ3_XS.gguf [cached]

[4/6] Preflight check...
VRAM sufficient

[5/6] Warmup benchmark...
RTX 50 series first run, compiling CUDA kernels (about 60 s, one time only)...
CUDA kernel compilation finished; later launches will start instantly
JIT warmup failed: exit status 0xc0000135
Probe 1: ctx=128K ... OOM
Probe 2: ctx=64K ... OOM
Probe 3: ctx=32K ... OOM
Probe 4: ctx=16K ... OOM
Probe 5: ctx=8K ... OOM
Warmup failed: all ctx probes failed (tried down to 4K)
Using default parameters

[6/6] Starting server...
Waiting for llama-server to be ready (port 11434)...
Insufficient VRAM, lowering context to 4K and retrying...
Waiting for llama-server to be ready (port 11434)...
Error: failed to start llama-server: 2 consecutive startup failures; cannot run even at the minimum context (4K)

NVIDIA GeForce RTX 5060: 8151 MB VRAM
Model Qwen3.6 35B A3B Claude 4.7 Opus Reasoning Distilled: ~13813 MB
KV cache (4K, q4_0): ~80 MB
Estimated total required: ~14917 MB

Shortfall: 6766 MB

Suggestions:
1. Choose a smaller quantization (Q4_K_M or Q2_K)
2. Choose a smaller model

Usage:
kaiwu run <model> [flags]

Flags:
--bench Run benchmark after starting
--ctx-size int Manually set the context size (0 = auto)
--fast Skip warmup, use cached profile
-h, --help help for run
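--host string Listen address (default 127.0.0.1; use 0.0.0.0 to expose to the LAN) (default "127.0.0.1")
--llama-server string Use a custom llama-server binary (full path)
--mode string Mode: speed/balanced/context (defaults to the last selection)
--reset Clear the cache and rerun warmup to probe for optimal parameters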
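The memory estimate printed in the log above can be checked by hand. The short Python sketch below only re-does that arithmetic with the figures copied from the log. The remaining 1024 MB is not labelled in the log, so calling it "overhead" is an assumption, and the function name `vram_deficit` is made up for illustration; it is not part of Kaiwu or llama.cpp.

```python
# Figures copied from the log above, all in MB.
VRAM_MB = 8151              # RTX 5060 VRAM reported by the tool
MODEL_MB = 13813            # estimated model size
KV_CACHE_MB = 80            # KV cache at 4K context, q4_0
ESTIMATED_TOTAL_MB = 14917  # "estimated total required" from the log


def vram_deficit(total_needed_mb: int, vram_mb: int) -> int:
    """Return how many MB are missing; 0 if the model would fit."""
    return max(0, total_needed_mb - vram_mb)


# Part of the estimate not explained by model + KV cache.
# Assumption: this is a fixed runtime overhead; the log does not say so.
overhead = ESTIMATED_TOTAL_MB - MODEL_MB - KV_CACHE_MB

deficit = vram_deficit(ESTIMATED_TOTAL_MB, VRAM_MB)

print(f"unlabelled overhead: {overhead} MB")
print(f"shortfall: {deficit} MB")
```

The shortfall matches the 6766 MB in the log, so the failure is consistent with the tool's own estimate: the model alone is larger than the card's VRAM, and lowering the context size cannot close that gap.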
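@andyskaura Thanks, bro
@andyskaura Looking forward to your Windows version next week
Thank you very much
@Vendettar #30 I found it on the Sichuan Unicom Tieba forum; I don't know whether it is still available.
Could you explain how to apply for a dynamic public IP from Chengdu Unicom?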
Dec 25, 2024
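Replied to a topic by yangtianming Q&A On merchants' tricks around the national subsidy
I bought the same air conditioner from JD self-operated during Double 11 for 1580 after the national subsidy. Did you happen to catch the 618 price in June?
Redmi Turbo 3: good performance, cheap, and it supports an 80% charge limit. You could check Xianyu for second-hand units that are already unlocked.
I have bought two Xiaomi washer-dryer combos, and neither seemed to have this kind of obvious pungent smell. They are not the same model, but both should be OEM-built by Jide. I have also used Haier and Hisense condensing washer-dryers, and the smell after a wash-and-dry cycle was not noticeably different from what the Xiaomi machines produce.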