
We are currently looking for a Deep Learning Performance Software Engineer! We are expanding our inference research and development, and we are seeking excellent software engineers and senior software engineers to join our team. We focus on developing GPU-accelerated deep learning software. Researchers around the world are using NVIDIA GPUs to drive the deep learning revolution, which has enabled breakthroughs in many fields. Join our team and build the software that makes new solutions possible. Work with the deep learning community to bring the latest algorithms to public release in TensorRT. We need you to be able to work in a fast-paced, customer-focused team and to have excellent communication skills.
\nWhat you will be doing:
\nWhat we would like to see:
\nIf you are interested, please contact:
\nWeChat: 18867144803\nResume submission: xiaozhao@nvidia.com
\nPlease note the position you are applying for, e.g.: Name + Deep Learning Performance Optimization
\n" }, { "author": { "url": "member/cwjwgg", "name": "cwjwgg", "avatar": "https://cdn.v2ex.com/gravatar/5bf0a4af3e7eca9f77152858ac3dfa26?s=73&d=retro" }, "url": "t/1050060", "title": "Training an SVC voice model: which is faster, a 12G 2060 or an 8G 3060 Ti?", "id": "t/1050060", "date_published": "2024-06-17T01:41:01+00:00", "content_html": "I see that the 2060 has 1920 CUDA cores but 12G\nWe deployed stable-diffusion-webui on k8s\nfor any user who wants to try a Stable Diffusion model.\nAs requests kept coming in, we frequently ran into CUDA OOM errors.\nA small fraction of them really did happen because the request needed more memory than the GPU could provide.
\nThe rest, which are the majority, are puzzling cases like the following.
\n{\"error\": \"OutOfMemoryError\", \"detail\": \"\", \"body\": \"\", \"errors\": \"CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 11.76 GiB total capacity; 7.92 GiB already allocated; 784.31 MiB free; 10.63 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF\"}\n\nBased on our understanding of memory_stats:
\nWhere did this memory go? Why had it still not been reclaimed when the user asked for memory?
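The figures in the error message above can be checked with a little arithmetic. This is a minimal Python sketch that uses only the numbers quoted in the message; the variable names are ours and purely illustrative.

```python
# Numbers copied from the OOM message above.
reserved_gib = 10.63      # reserved in total by PyTorch
allocated_gib = 7.92      # already allocated to the user
requested_mib = 1024.00   # size of the request that failed

# Memory PyTorch holds from the GPU but has not handed to the user.
cached_gib = reserved_gib - allocated_gib
cached_mib = cached_gib * 1024

print(f'cached but unallocated: {cached_gib:.2f} GiB ({cached_mib:.0f} MiB)')
print('request is smaller than the cached total:', requested_mib <= cached_mib)
```

Roughly 2.71 GiB sits inside PyTorch without being allocated, more than twice the 1024 MiB request, yet the request fails. The total alone does not say whether any single free Block is large enough.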
\nWhen a user requests memory, PyTorch's handling can be simplified to the following steps, each one tried only if the previous one did not produce a Block:
\n1. get_free_block looks for a free Block that satisfies the request.
\n2. trigger_free_memory_callbacks reclaims Blocks that are allocated but no longer in use, then get_free_block is tried again.
\n3. alloc_block requests a new Block from the GPU.
\n4. release_available_cached_blocks releases some Blocks that were requested from the GPU but are not allocated, then alloc_block is tried again.
\n5. release_cached_blocks releases all Blocks that were requested from the GPU but are not allocated, then alloc_block is tried again.
\nNote that both the memory PyTorch requests from the GPU and the memory it hands to the user are managed in units of Blocks.\nThe size of a Block requested from the GPU is not fixed; it depends on the size the user asked for at that moment.\nWhen the user frees memory, the Block returns to PyTorch and becomes free.\nOn the next request PyTorch prefers to reuse a free Block rather than ask the GPU directly.
\nIf the requested size is smaller than the free Block that satisfies it, PyTorch performs a split.\nThe Block is divided into two: the memory left over after the requested size becomes a separate Block,\nkept for later use and linked to the Block handed to the user through a doubly linked list.
\nThe reclamation done in trigger_free_memory_callbacks merges adjacent free Blocks, which makes later allocations more flexible.
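The split and merge behaviour described above can be pictured with a toy model. The sketch below is not PyTorch's implementation: it keeps the Blocks of one segment in a plain Python list instead of a doubly linked list, and the names Block, split and free_and_merge are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Block:
    size: int        # size in MiB
    allocated: bool

def split(blocks, index, size):
    '''Hand out `size` MiB from the free block at `index`; the rest becomes a new free block.'''
    block = blocks[index]
    assert not block.allocated and block.size >= size
    remainder = block.size - size
    blocks[index] = Block(size, True)
    if remainder > 0:
        # The leftover stays next to the allocated part, like a linked neighbour.
        blocks.insert(index + 1, Block(remainder, False))

def free_and_merge(blocks, index):
    '''Mark a block free, then coalesce every run of adjacent free blocks.'''
    blocks[index].allocated = False
    merged = []
    for block in blocks:
        if merged and not merged[-1].allocated and not block.allocated:
            merged[-1].size += block.size
        else:
            merged.append(Block(block.size, block.allocated))
    blocks[:] = merged

segment = [Block(256, False)]
split(segment, 0, 28)         # 28 allocated, 228 free
split(segment, 1, 100)        # 28 allocated, 100 allocated, 128 free
free_and_merge(segment, 1)    # the freed 100 merges with the free 128
print([(b.size, b.allocated) for b in segment])
```

Merging only helps when the free Blocks are neighbours; an allocated Block sitting between two free ones keeps them apart.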
Compared with other memory management mechanisms, PyTorch's memory management is relatively simple:
\nThese two points mean that PyTorch can end up with a large amount of memory it cannot use because of Block fragmentation.
\nSuppose that during some allocation PyTorch, following a user request, obtained a 256M Block from the GPU.
\n<-------------------------- 256M ----------------------------->
After many allocations and frees, its usage might look like this.
\n<-- 28M(allocated) --><-- 100M(free) --><-- 28M(allocated) --><-- 100M(free) -->
If the user now requests 160M of memory:
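To make the situation concrete, here is a small Python sketch that encodes the layout from the diagram above and checks the 160M request against it. The tuple representation is our own illustration, not an allocator data structure.

```python
# (size in MiB, allocated?) for the pieces of the 256M Block in the diagram.
layout = [(28, True), (100, False), (28, True), (100, False)]
request = 160

free_sizes = [size for size, allocated in layout if not allocated]
total_free = sum(free_sizes)
# The two free pieces are separated by an allocated one, so they cannot be merged.
largest_free = max(free_sizes)

print('total free:', total_free)
print('largest contiguous free piece:', largest_free)
print('request fits in one piece:', request <= largest_free)
```

200M is free in total, but no single free piece reaches 160M, so the request cannot be served from this Block.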
\nmax_split_size_mb forbids PyTorch from splitting any Block larger than this size, which limits the degree of fragmentation.\nEverything described above is the behaviour when max_split_size_mb is not set explicitly, in which case it takes its default value, MAX_INT.
\nWe did not find an officially recommended value for max_split_size_mb, and we are not familiar enough with PyTorch and NVIDIA to offer a good recommendation ourselves.\nFrom practical use and intuition, values such as 128/256/512 are all reasonable choices; they did avoid the OOM errors and did not cause a noticeable performance cost.
\nBy default PyTorch triggers reclamation only when it cannot find a suitable free Block;\nthis value makes reclamation start proactively once allocated/capacity exceeds it.
\nThe latest PyTorch master branch (>v2.0.1) adds Expandable Segments,\nwhich may also help mitigate fragmentation.
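These allocator options are passed through the PYTORCH_CUDA_ALLOC_CONF environment variable named in the error message. The sketch below assembles such a value in Python. The helper build_alloc_conf is hypothetical, the numbers 256 and 0.8 are illustrative rather than recommended, and we assume the unnamed threshold in the text corresponds to PyTorch's garbage_collection_threshold option.

```python
import os

def build_alloc_conf(max_split_size_mb=None, gc_threshold=None, expandable_segments=None):
    '''Build a PYTORCH_CUDA_ALLOC_CONF value from whichever options are given.'''
    parts = []
    if max_split_size_mb is not None:
        parts.append(f'max_split_size_mb:{max_split_size_mb}')
    if gc_threshold is not None:
        # Assumption: this is the allocated/capacity threshold described in the text.
        parts.append(f'garbage_collection_threshold:{gc_threshold}')
    if expandable_segments is not None:
        parts.append(f'expandable_segments:{expandable_segments}')
    return ','.join(parts)

# Set this before the first CUDA allocation, most simply before importing torch.
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = build_alloc_conf(max_split_size_mb=256, gc_threshold=0.8)
print(os.environ['PYTORCH_CUDA_ALLOC_CONF'])
```

In a k8s deployment the same string can be set as a container environment variable instead of from Python.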
\n[ Location ]: Shanghai/Beijing/Shenzhen
\n[ Send resume to ]: xiaozhao@nvidia.com
\n[ WeChat ]: 18867144803
\nCoding ability > years of experience
\nDeep Learning Performance Architect-Compiler/LLM-TensorRT
\nWork across the end-to-end deep learning AI software stack, including but not limited to training frameworks, core compute libraries, inference optimization tools (such as TensorRT), AI compilers, and model compression, with the opportunity to use that stack to influence feature design for the next one or even two generations of hardware architecture.
\nRequired skills: solid C++ programming; familiarity with the lower layers of the AI software stack or with computer architecture; familiarity with higher-level algorithms and Python is a plus.
\nLocation: Beijing and Shanghai
\nDeep Learning Performance Architect-TensorRT
\nResponsible for the design, development, and maintenance of TensorRT, NVIDIA's deep learning inference engine (e.g. the TensorRT model import flow and related tools, graph optimization, CUDA implementation and code generation for operators, operator performance optimization), and for analyzing and optimizing the inference performance of current mainstream deep learning models on TensorRT. You will also work with the NVIDIA GPU architecture design team to drive hardware/software co-design and R&D for NVIDIA deep learning solutions.
\nBasic requirement: proficiency in C++ programming
\nOther closely related skills/experience: deep learning framework or deep learning compiler development; methodologies and tools for performance analysis, modeling, and optimization; computer architecture knowledge; CUDA kernel development and optimization
\nLocation: Beijing and Shanghai
\nDeep Learning Performance Architect-Operator
\nProvide high-performance basic operators and fused-operator implementations for deep learning operator libraries such as TensorRT, cuDNN, cuBLAS, and cuSPARSE across different GPU architectures, including online code generation, code fusion, and related development work, and use the optimization bottlenecks of current GPUs to influence the design and verification of future hardware architecture features.
\nRequired skills: solid C++ programming; familiarity with computer architecture; development experience with TVM or MLIR is a plus.
\nLocation: Shanghai and Beijing
\nDeep Learning Performance Architect
\nFull-stack optimization around compute architectures, including but not limited to analysis and prediction of deep learning models, architecture performance analysis, compiler performance analysis, and analysis of mainstream compute architectures and software ecosystems, so that NVIDIA's software ecosystem and compute architecture better support mainstream applications.
\nRequired skills: solid C++/Python; familiarity with AI software or computer architecture.
\nLocation: Beijing and Shanghai
\nDeveloper Technology Engineer-AI
\nPort and optimize customers' deep learning and high-performance computing applications on the NVIDIA ecosystem. These applications include large language models, CV, speech, recommender systems, molecular dynamics, computational mechanics, computational quantum chemistry, and more. Deliver system-level optimization solutions through algorithmic and engineering optimization. Work closely with internal architecture and product teams to build and improve NVIDIA's hardware and software acceleration ecosystem.
\nRequired skills: solid C/C++ programming, analytical, and communication skills; familiarity with the deep learning or GPU-accelerated computing software stack; a strong theoretical foundation in deep learning, or expertise in GPU architecture and optimization.
\nLocation: Beijing, Shanghai, and Shenzhen
\n" }, { "author": { "url": "member/leven87", "name": "leven87", "avatar": "https://cdn.v2ex.com/gravatar/b4497986025202f2280dc8497ab80cb7?s=73&d=retro" }, "url": "t/837601", "title": "How to combine OpenMPI and CUDA programming", "id": "t/837601", "date_published": "2022-03-03T02:04:27+00:00", "content_html": "Hello fellow V2EXers, I have just started with CUDA programming. I can now use a single CPU and GPU to accelerate computation. Next I need to use multiple CPUs and GPUs to speed things up further. Judging from examples online, I need OpenMPI with its CUDA support enabled. My questions:\nIs this the right approach?\nWhat else should I watch out for, such as changes to the CUDA code or configuration?
\n" }, { "author": { "url": "member/gouchaoer", "name": "gouchaoer", "avatar": "https://cdn.v2ex.com/avatar/58d5/5587/189082_large.png?m=1480987620" }, "url": "t/831540", "date_modified": "2022-01-31T12:29:28+00:00", "content_html": "A technical question that seemed simple at first, but I searched for a long time without finding an answer... I used NVIDIA's nvdec to decode 4 h264 videos into 4 raw rgba images. How do I output them to the graphics card's 4 DP ports? Ideally with an NVIDIA API; wrapped ones such as OpenGL or drm are fine too.
\n", "date_published": "2022-01-31T12:27:55+00:00", "title": "How do I render and output rgba images held in GPU memory?", "id": "t/831540" }, { "author": { "url": "member/wangx0102", "name": "wangx0102", "avatar": "https://cdn.v2ex.com/avatar/2964/0b5a/432993_large.png?m=1695724205" }, "url": "t/794158", "date_modified": "2021-08-06T11:44:19+00:00", "content_html": "My advisor gave me a program that implements a middleware for load balancing between CPU and GPU computation.
\nMy initial idea is to package the CUDA program as an exe or .so or similar, call it from Python, and use Celery to build a distributed cluster.
\nI hope you have better ideas.
\n", "date_published": "2021-08-06T11:38:23+00:00", "title": "How do I implement distributed parallel computation with CUDA?", "id": "t/794158" }, { "author": { "url": "member/huzhikuizainali", "name": "huzhikuizainali", "avatar": "https://cdn.v2ex.com/avatar/1869/a390/522912_large.png?m=1752498684" }, "url": "t/775344", "title": "What is it like to use CUDA on a gaming laptop?", "id": "t/775344", "date_published": "2021-05-07T01:48:50+00:00", "content_html": "1. Has anyone used CUDA on a gaming laptop? How was it? Weighing the extra weight, shorter battery life, and heat against the gain in compute power, is it worth carrying a gaming laptop to run CUDA? https://devblogs.nvidia.com/announcing-cuda-on-windows-subsystem-for-linux-2/
\n" }, { "author": { "url": "member/different", "name": "different", "avatar": "https://cdn.v2ex.com/avatar/94f6/7670/374456_large.png?m=1661142546" }, "url": "t/591013", "title": "On generating random numbers on the GPU (cuda/opencl)", "id": "t/591013", "date_published": "2019-08-11T14:16:02+00:00", "content_html": "For special reasons (very special reasons (doge)), I cannot use CUDA's built-in random functions.
\nSo things went wrong....
\nGoal: without using CUDA's built-in random functions, generate 10000 Gaussian-distributed random numbers with a single cuda/opencl kernel.
\nI have tried the following steps:
\n1. Generate 10000 random numbers on the CPU (presumably with a linear congruential generator)
\n2. On the CPU, use the Box-Muller transform (which I have heard can go wrong when combined with a linear congruential generator..) to convert the random numbers from step 1 into a normal distribution
\n3. Then test whether the result is normally distributed; it is.
\n4. At this point I have 10000 Gaussian-distributed random numbers, which I store in array a.
\nIn practice, array a has to be regenerated and used continuously.
\nSo I am considering the GPU.
\nAnalysis: the CPU code above runs sequentially, that is, there is a single random seed and one thread generates all 10000 random numbers.
\nI then adapted the code to generate them on the GPU. (The goal is to get exactly the same functionality as CUDA's curandGenerateNormal(cuda::generator, cudaRand, number, 0.0, 1.0);)
\nTo get the same result as curandGenerateNormal, I tried having each kernel thread maintain its own seed, so there are 10000 seeds. (The kernel is launched once with ten thousand threads; each thread uses its own seed to generate one random number, and the results are gathered into array a.)\nBut in my experiments so far, if each kernel thread maintains its own seed and writes a[i] (i is the thread id), the final result is not Gaussian-distributed.
\nIn other words, reading vertically (CPU, serial) gives Gaussian random numbers, but reading horizontally does not.
\nThat is, given arrays a, b, ..., z, each array on its own is Gaussian, but taking one element from each of a...z and combining them does not give a Gaussian distribution.
\nIntuitively that combination should also be Gaussian, but because of how the seeds are chosen, a...z may be correlated. I am not sure of the exact cause.
\nI am not sure I have explained this clearly. Does anyone know anything about this?
\nIn one sentence: the same functionality as curandGenerateNormal...
\nSo I would like to ask whether anyone has looked into this.
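One common way to avoid correlated per-thread streams is counter-based generation: each thread derives its uniforms by hashing the pair (seed, thread index) with a well-mixed function, instead of running its own linear congruential generator from a nearby seed. The sketch below shows the idea in Python for readability. It assumes that correlated seeds are indeed the cause, it uses SplitMix64 as the mixing function (our substitution, not what curandGenerateNormal does), and all function names are hypothetical. The integer arithmetic should port to a cuda/opencl kernel using 64-bit unsigned integers.

```python
import math

MASK64 = (1 << 64) - 1
GAMMA = 0x9E3779B97F4A7C15  # SplitMix64 increment

def splitmix64_at(seed, counter):
    '''Return SplitMix64 output number `counter` (0-based) for `seed`, without iterating.'''
    z = (seed + (counter + 1) * GAMMA) & MASK64
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)

def to_unit_open(x):
    '''Map a 64-bit integer to a float strictly inside (0, 1).'''
    return ((x >> 12) + 0.5) / (1 << 52)

def normal_for_thread(seed, thread_id):
    '''What one GPU thread would compute: a single N(0, 1) sample from its own index.'''
    u1 = to_unit_open(splitmix64_at(seed, 2 * thread_id))
    u2 = to_unit_open(splitmix64_at(seed, 2 * thread_id + 1))
    # Box-Muller transform, as in step 2 of the post.
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

# Emulate one kernel launch of 10000 threads, each writing a[i].
a = [normal_for_thread(seed=12345, thread_id=i) for i in range(10000)]
mean = sum(a) / len(a)
std = math.sqrt(sum((x - mean) ** 2 for x in a) / len(a))
print(f'mean={mean:.3f} std={std:.3f}')
```

To regenerate array a, change the seed or offset the counters by the number of samples already drawn. No per-thread state needs to be stored between launches.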
\n" }, { "author": { "url": "member/different", "name": "different", "avatar": "https://cdn.v2ex.com/avatar/94f6/7670/374456_large.png?m=1661142546" }, "url": "t/580600", "date_modified": "2019-07-06T09:05:15+00:00", "content_html": "I mean double precision.
\nCould it be that double precision needs some extra directives at compile time?
\nBelow is the kernel.
\nvoid CSR(int i,unsigned int N,\nunsigned int *xadj,unsigned int *adjncy,\ndouble *dataxx,double *datayy,double *datazz,\ndouble *Cspin,\ndouble *CHDemag,double *CH)
\n{
\nif(i < N)\n{\n\tdouble dot[3]={0,0,0};\n\tfor(int n = xadj[i] ; n < xadj[i+1]; n++)\n\t{\n\t\tunsigned int neigh=adjncy[n];\n\t\tprintf(\"%d\\n\",n);\n\t\tprintf(\"%f,%f,%f\\n\",dataxx[n],datayy[n],datazz[n]);\n\t\tdouble val[3] = {dataxx[n],datayy[n],datazz[n]};\n\t\tfor(unsigned int co = 0 ; co < 3 ; co++)\n\t\t{\n\t\t\tdot[co]+=(val[co]*Cspin[3*neigh+co]);\n\t\t}\n\t}\n\tdouble a=CHDemag[3*i];\n\tdouble b=CHDemag[3*i+1];\n\tdouble c=CHDemag[3*i+2];\n\tCH[3*i]=a+dot[0];\n\tCH[3*i+1]=b+dot[1];\n\tCH[3*i+2]=c+dot[2];\n}\n\n}
\nGoing by the specs, the rtx should have no double-precision units, while the titan v's double precision should be decent.
\nYet when I ran it, the titan v was one third slower than the rtx..
\nAny explanation?
\n", "date_published": "2019-07-06T08:59:28+00:00", "title": "Why is titan v slower than rtx2080ti for CUDA computation?", "id": "t/580600" }, { "author": { "url": "member/Livid", "name": "Livid", "avatar": "https://cdn.v2ex.com/avatar/c4ca/4238/1_large.png?m=1776858751" }, "url": "t/528949", "title": "DeOldify", "id": "t/528949", "date_published": "2019-01-21T00:27:22+00:00", "content_html": "https://github.com/jantic/DeOldify/blob/master/README.md\nhttp://i.imgur.com/4s1hBfN.png
\nOne more try: does the laptop 1050 support cudnn?
\nSince it supports CUDA, why would it not work?
\nhttp://i.imgur.com/mT99ID0.jpg
\n", "date_published": "2017-06-18T04:02:43+00:00", "title": "One more try: does the laptop 1050 support cudnn?", "id": "t/369282" }, { "author": { "url": "member/OldFinder", "name": "OldFinder", "avatar": "https://cdn.v2ex.com/avatar/3136/7816/191399_large.png?m=1473652706" }, "url": "t/312305", "date_modified": "2016-10-12T11:51:58+00:00", "content_html": "My current needs are image recognition plus data cleanup and statistics; the scale is only a few hundred thousand records, which is not very large.", "date_published": "2016-10-12T11:07:25+00:00", "title": "Python+CUDA: can anyone recommend projects or books worth studying in depth?", "id": "t/312305" }, { "author": { "url": "member/xiangtianxiao", "name": "xiangtianxiao", "avatar": "https://cdn.v2ex.com/avatar/0220/9919/93403_large.png?m=1422028256" }, "url": "t/297613", "date_modified": "2018-06-14T05:50:45+00:00", "content_html": "Take graphics cards at around 3000 yuan, for example: the GTX 1070 has 1920 stream processors and 8G of memory.\nThe Quadro M2000 has only 768 units and 4G of memory.
\nThe gaming card is very tempting: more memory, more units... I know professional cards may have an edge in CAD, but I do not know whether they are optimized for parallel computing such as CUDA, or whether they are more stable.
\n", "date_published": "2016-08-06T13:17:19+00:00", "title": "For writing CUDA, what is the difference between professional cards and gaming cards?", "id": "t/297613" }, { "author": { "url": "member/Jolly23", "name": "Jolly23", "avatar": "https://cdn.v2ex.com/avatar/8f94/502f/172360_large.png?m=1603016148" }, "url": "t/294997", "date_modified": "2016-07-26T04:59:07+00:00", "content_html": "Not for gaming,\nbut for deep learning (CUDA / caffe / DIGITS)
\nPreviously we bought several tesla K40C cards at 30k each, installed ubuntu, and ran deep learning on them; the compute power is really impressive. But another advisor cannot accept that purchase price and can only buy a card at around 5k, so please recommend one! It just needs to work with ubuntu (he previously had a Titan, and installing ubuntu showed unknown chipset maxwell), so presumably Maxwell-architecture cards cannot run ubuntu; I am not sure of the details.
\nRequirement: a card listed in this table\nhttps://developer.nvidia.com/cuda-gpus
\nAround 5k is fine. Please give some recommendations, thanks everyone.
\n", "date_published": "2016-07-26T04:37:59+00:00", "title": "Urgent: please recommend a compute GPU at around 5k RMB; it must work with ubuntu, is for deep learning, and must be in NVIDIA's compute capability table", "id": "t/294997" }, { "author": { "url": "member/andrewzhou", "name": "andrewzhou", "avatar": "https://cdn.v2ex.com/avatar/57e7/710f/42926_large.png?m=1668704021" }, "url": "t/286358", "date_modified": "2016-10-12T15:40:34+00:00", "content_html": "I am considering changing jobs (and industries) and hope to move into new fields such as artificial intelligence, but I have had little exposure to commercial applications in these fields. At the moment I am most interested in work related to CUDA and EMC storage. Any guidance?\rIt turns out to be possible: use each pixel of a canvas to represent one matrix value, then do the computation in the fragment shader to implement matrix multiplication
\n\nPut an approximation of float16 into rgba (a is almost unusable), decode it in the fragment shader, and the image finally drawn to the screen is the result of the computation
\n\nOn a 650m this reaches roughly 150Gflops+ (float16)
\n\nThen I wondered what use this is... (I know there is a WebCL in this world that nobody cares about)
\n", "date_published": "2015-10-04T02:45:09+00:00", "title": "Wondering whether WebGL can be used for scientific computing", "id": "t/225518" }, { "author": { "url": "member/hardware", "name": "hardware", "avatar": "https://cdn.v2ex.com/gravatar/cc4fb30c125148f251f7345709336b55?s=73&d=retro" }, "url": "t/203204", "date_modified": "2016-03-09T10:43:36+00:00", "content_html": "I only have two macbook pro retina machines and want to learn caffe and the like. I am currently running linux in parallel desktop and it feels quite slow. Has anyone done GPU programming with an external graphics card? Does it work?
\n", "date_published": "2015-07-03T15:59:31+00:00", "title": "Has anyone run CUDA on an external graphics card through a Thunderbolt to PCI-e device?", "id": "t/203204" }, { "author": { "url": "member/haoji", "name": "haoji", "avatar": "https://cdn.v2ex.com/avatar/ec8b/57b0/2879_large.png?m=1697947304" }, "url": "t/87690", "date_modified": "2016-03-09T10:42:06+00:00", "content_html": "I have recently been looking into GPU CUDA programming. Are there any V2EXers who know this area well?