How can I fix HtmlUnit failing to return the HtmlPage object from WebClient.getPage() for a URL ending in .jsp? - V2EX

    tiRolin    2023-10-24 09:37:10 +08:00    926 views

    I want to write a crawler with the HtmlUnit framework. My HtmlUnit configuration is as follows:

    // Imports assume HtmlUnit 3.x (org.htmlunit.*);
    // HtmlUnit 2.x uses com.gargoylesoftware.htmlunit.* instead.
    import java.net.MalformedURLException;
    import java.net.URL;

    import org.htmlunit.BrowserVersion;
    import org.htmlunit.NicelyResynchronizingAjaxController;
    import org.htmlunit.ScriptException;
    import org.htmlunit.WebClient;
    import org.htmlunit.html.HtmlPage;
    import org.htmlunit.javascript.JavaScriptErrorListener;

    public class WebClientUtils {
        public static WebClient getWebClient() {
            WebClient webClient = new WebClient(BrowserVersion.FIREFOX);
            // Configure the WebClient
            webClient.getOptions().setCssEnabled(false);
            webClient.getOptions().setJavaScriptEnabled(true);
            webClient.setAjaxController(new NicelyResynchronizingAjaxController());
            webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
            webClient.getOptions().setThrowExceptionOnScriptError(false);
            webClient.getCookieManager().setCookiesEnabled(true);
            // Silence all JavaScript errors and warnings
            webClient.setJavaScriptErrorListener(new JavaScriptErrorListener() {
                @Override
                public void scriptException(HtmlPage htmlPage, ScriptException e) { }

                @Override
                public void timeoutError(HtmlPage htmlPage, long l, long l1) { }

                @Override
                public void malformedScriptURL(HtmlPage htmlPage, String s, MalformedURLException e) { }

                @Override
                public void loadScriptError(HtmlPage htmlPage, URL url, Exception e) { }

                @Override
                public void warn(String s, String s1, int i, String s2, int i1) { }
            });
            return webClient;
        }
    }

    Later in the code I visit the URL, but I can never get a proper HtmlPage object for it; the result is always null.

    WebClient webClient = WebClientUtils.getWebClient();
    HtmlPage page = webClient.getPage("the URL I want to visit");

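    The print statement that follows shows that the page object is null. I can't share the URL for certain reasons, so I'll leave it out, but it does end in .jsp. Selenium can access this URL correctly, yet HtmlUnit never gives me the object. HtmlUnit works fine when I only visit the Baidu homepage, so the problem seems limited to this .jsp URL.

    I've tried many fixes I found through Baidu and none of them worked. I'm out of ideas, so I'm asking everyone here. The above is basically all the information I have.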
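One way to narrow this down without HtmlUnit is to look at the raw response the server sends to a plain HTTP client: the status code, whether it redirects, and the Content-Type. Below is a minimal sketch using only the JDK (Java 11+). The class name `ResponseProbe` and the helper `diagnose` are hypothetical names made up for this illustration, and the checks are a rough heuristic, not HtmlUnit's actual page-selection logic.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Locale;

public class ResponseProbe {

    /**
     * Hypothetical helper: a coarse label for what a raw response looks like.
     * Returns "redirect", "error status", "html", or "not html".
     */
    static String diagnose(int statusCode, String contentType) {
        if (statusCode >= 300 && statusCode < 400) {
            return "redirect";
        }
        if (statusCode >= 400) {
            return "error status";
        }
        String type = contentType == null
                ? ""
                : contentType.toLowerCase(Locale.ROOT).trim();
        if (type.startsWith("text/html") || type.startsWith("application/xhtml+xml")) {
            return "html";
        }
        return "not html";
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java ResponseProbe.java <url>");
            return;
        }
        // Do not follow redirects, so a login or anti-bot redirect stays visible.
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NEVER)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(args[0])).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        String contentType = response.headers().firstValue("Content-Type").orElse(null);
        System.out.println("status       = " + response.statusCode());
        System.out.println("content type = " + contentType);
        System.out.println("body length  = " + response.body().length());
        System.out.println("diagnosis    = " + diagnose(response.statusCode(), contentType));
    }
}
```

Run it with `java ResponseProbe.java <url>`. The .jsp suffix on its own should not matter to an HTTP client, so a redirect, an error status, or a non-HTML content type here would hint that the difference from Selenium lies in cookies, request headers, or script-driven navigation.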

    2 replies    2023-10-24 10:23:09 +08:00
    1    xiaohundun    2023-10-24 09:56:14 +08:00
    Try calling webClient.waitForBackgroundJavaScript and waiting a little while.
    2    tiRolin (OP)    2023-10-24 10:23:09 +08:00
    @xiaohundun I tried that right at the start and it doesn't help. I still can't get the object, same as before.