PHP crawler: fetch a random image from Meitulu (美图录) - V2EX

PHP crawler: fetch a random image from Meitulu (美图录)

  •   Liulang007 2019-01-31 16:16:43 +08:00 5898 views
    <?php
    require 'phpQuery.php';

    // Base URL of the site
    $basicUrl = 'https://www.meitulu.com/';
    // Category names
    $category = array('nvshen', 'jipin', 'nenmo', 'wangluohongren', 'fengsuniang', 'qizhi', 'youwu', 'baoru', 'xinggan', 'youhuo', 'meixiong', 'shaofu', 'changtui', 'mengmeizi', 'loli', 'keai', 'huwai', 'bijini', 'qingchun', 'weimei', 'qingxin');

    // Fetch $url with the given referer. When $download is true, the
    // response body is written to default.jpg instead of being returned.
    function curl($url, $referer, $download)
    {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_TIMEOUT, 2);          // whole transfer, in seconds
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 500); // connect phase, in seconds
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_HTTPHEADER, array('User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 8_0 like Mac OS X) AppleWebKit/600.1.3 (KHTML, like Gecko) Version/8.0 Mobile/12A4345d Safari/600.1.4'));
        curl_setopt($ch, CURLOPT_REFERER, $referer);
        curl_setopt($ch, CURLOPT_URL, $url);
        // Certificate checks are switched off, as in the original post
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
        curl_setopt($ch, CURLOPT_REDIR_PROTOCOLS, -1);
        $contents = curl_exec($ch);
        curl_close($ch);
        if ($download) {
            $resource = fopen('default.jpg', 'w');
            fwrite($resource, $contents);
            fclose($resource);
            return;
        }
        return $contents;
    }

    // Random category: try up to 10 random listing pages
    $count = 10;
    while ($count > 0) {
        $afterUrl = $basicUrl . 't/' . $category[rand(0, count($category) - 1)] . '/' . rand(2, 5) . '.html';
        $html = curl($afterUrl, $afterUrl, false);
        if (strlen($html) != 0) {
            break;
        }
        $count--;
    }
    if ($count == 0) {
        echo 'Crawl failed!';
        exit;
    }

    $count = 10;
    $afterUrlTmp = $afterUrl;
    $eg = phpQuery::newDocument($html);
    $links = pq('ul.img > li > a');

    // Album: the first link containing 'item' that loads (not actually randomized)
    $afterUrl = '';
    for ($i = 0; $i < count($links); $i++) {
        $afterUrl = (string) $links->eq($i)->attr('href');
        // strpos() must be compared strictly: a match at position 0 is == false
        if (strpos($afterUrl, 'item') !== false) {
            if (strpos($afterUrl, 'https') === false) {
                // Relative link: prepend the domain ("." concatenates, "+" does not)
                $afterUrl = 'https://www.meitulu.com' . $afterUrl;
            }
            $html = curl($afterUrl, $afterUrlTmp, false);
            if (strlen($html) != 0) {
                break;
            }
        }
    }
    $html = curl($afterUrl, $afterUrlTmp, false);
    $eg = phpQuery::newDocument($html);
    $img = pq('img.content_img');
    $afterUrlTmp = $afterUrl;

    // Random image: try up to 10 times to get a non-empty src
    while ($count > 0) {
        $afterUrl = (string) $img->eq(rand(0, count($img) - 1))->attr('src');
        if (strlen($afterUrl) != 0) {
            break;
        }
        $count--;
    }
    if ($count == 0) {
        echo 'Crawl failed!';
        exit;
    }

    // Download the image to default.jpg and display it
    curl($afterUrl, $afterUrlTmp, true);
    echo '<img src="default.jpg">';
    ?>
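The step that turns an album link into a full URL depends on two PHP details: how a strpos() result is compared, and which operator joins strings. Below is a minimal sketch of that step in isolation. The helper name toAbsoluteUrl() and the path /item/123.html are made up for illustration and are not part of the original post.

```php
<?php
// Hypothetical helper (not in the original post): turn an href taken from
// a listing page into an absolute URL.
function toAbsoluteUrl(string $href, string $base = 'https://www.meitulu.com'): string
{
    // strpos() returns 0 when the match is at the very start of the string,
    // and 0 == false is true in PHP, so the result is compared with ===.
    if (strpos($href, 'http') === 0) {
        return $href; // already absolute
    }
    // PHP joins strings with ".", not "+".
    return rtrim($base, '/') . '/' . ltrim($href, '/');
}

// The path below is an invented example, not a real album.
echo toAbsoluteUrl('/item/123.html'), "\n";
echo toAbsoluteUrl('https://www.meitulu.com/item/123.html'), "\n";
```

Both calls print the same absolute URL; the second input is returned unchanged because it already starts with "http".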

    A PHP crawler that fetches a random image from Meitulu (美图录).

    Demo: https://www.liulangboy.com/tools/02/get-meitulu-pic.php
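For the random category step, array_rand() and random_int() are one possible alternative to rand(0, count(...) - 1). This is only a sketch: the helper name randomListingUrl() is hypothetical, while the URL pattern and the page range 2 to 5 are copied from the script in the post.

```php
<?php
// Hypothetical helper (not in the original post): build the URL of a random
// listing page for a randomly chosen category.
function randomListingUrl(array $categories, string $base = 'https://www.meitulu.com/'): string
{
    // array_rand() returns a random key of the array; the array must not be empty.
    $name = $categories[array_rand($categories)];
    // random_int() is inclusive on both ends, like rand(2, 5) in the post.
    $page = random_int(2, 5);
    return $base . 't/' . $name . '/' . $page . '.html';
}

// Prints one URL of the form https://www.meitulu.com/t/<category>/<page>.html
echo randomListingUrl(['nvshen', 'qingchun', 'weimei']), "\n";
```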

    8 replies    2019-02-12 10:40:46 +08:00
    1   rekulas   2019-01-31 17:43:44 +08:00
    This code is of no value to me, but the site is nice, so I've bookmarked it...
    2   eW91IHNlZSBtZQ   2019-02-01 10:59:13 +08:00   1
    3   wensonsmith   2019-02-01 16:36:45 +08:00
    Does the poster above have to be so blunt? I'd say this code is very well written, and I choose to open the website.
    4   richChou   2019-02-01 18:24:06 +08:00   1
    You all have quite the taste. Here's a URL in return: aHR0cCUzQS8vdjIubW14eXoubmV0
    5   rooob1   2019-02-02 09:03:48 +08:00
    @zhoujunjie221 Giving this a like.
    6   gongcheng121   2019-02-02 17:02:57 +08:00
    @zhoujunjie221 Haha, nicely hidden.
    7   dd0754   2019-02-06 12:55:04 +08:00 via iPhone
    Bookmarked the URL.
    8   Thresh   2019-02-12 10:40:46 +08:00
    ....Impressive.... Did you write this yourself?
    @zhoujunjie221