Jsoup
1.1、Introduction to Jsoup
What it is: an HTML parser that provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods.
Language: Java
Official site: https://jsoup.org
Source repository: https://github.com/jhy/jsoup
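1.2、Adding Jsoup as a Dependency

The Jsoup jar is available from Maven repositories such as Maven Central; declare it with one of the following: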

Gradle Kotlin DSL

implementation("org.jsoup:jsoup:1.10.2")

Gradle Groovy DSL

implementation 'org.jsoup:jsoup:1.10.2'
1.3、Jsoup

[Figure: Jsoup class structure diagram]

All methods of the Jsoup class are static; they cover tasks such as parsing HTML and issuing HTTP requests.
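
As a minimal sketch of these static entry points, the snippet below parses HTML held in memory, so no network access is needed. The class name JsoupStaticSketch and its helper methods are hypothetical names chosen for illustration; Jsoup.parse and Jsoup.parseBodyFragment are the library calls being shown.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Hypothetical helper class, for illustration only.
public class JsoupStaticSketch {

    // Parses a complete HTML string and returns the text of its <title>.
    public static String titleOf(String html) {
        Document doc = Jsoup.parse(html);
        return doc.title();
    }

    // Parses an HTML fragment; jsoup places it inside a generated <body>.
    public static String fragmentText(String fragment) {
        Document doc = Jsoup.parseBodyFragment(fragment);
        return doc.body().text();
    }

    public static void main(String[] args) {
        System.out.println(titleOf("<html><head><title>Hello</title></head><body></body></html>"));
        System.out.println(fragmentText("<p>Hi <b>there</b></p>"));
    }
}
```

Jsoup.connect(url), used in the examples in section 1.6, is the static entry point for the HTTP side.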

1.4、Connection

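A Connection object represents a network connection and lets you configure many request parameters, such as the user agent, timeout, and cookies.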
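
The following sketch shows how such parameters might be chained on a Connection before any request is sent. The class name ConnectionSketch, the URL https://example.com/, and the specific values are assumptions made for illustration; nothing goes over the network until get() or execute() is called.

```java
import org.jsoup.Connection;
import org.jsoup.Jsoup;

// Hypothetical helper class, for illustration only.
public class ConnectionSketch {

    // Builds a configured Connection without sending the request.
    public static Connection configure(String url) {
        return Jsoup.connect(url)
                .userAgent("Mozilla/5.0 (sketch)")   // sets the User-Agent header
                .timeout(10_000)                     // timeout in milliseconds
                .followRedirects(true)               // follow 3xx redirects
                .ignoreHttpErrors(false)             // throw on 4xx/5xx responses
                .header("Accept-Language", "en")     // arbitrary request header
                .method(Connection.Method.GET);      // HTTP method to use
    }

    public static void main(String[] args) {
        Connection conn = configure("https://example.com/");
        // Inspect the pending request; no HTTP call has been made yet.
        System.out.println(conn.request().timeout());
        System.out.println(conn.request().header("User-Agent"));
    }
}
```

Calling conn.get() on the result would send the request and return the parsed Document, while conn.execute() would return the raw response, including its cookies.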

1.5、Document

A Document represents the parsed result, that is, the DOM tree of the HTML page.
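
To illustrate working with a parsed Document, the sketch below selects links with a CSS selector and resolves them against a base URI. The class name DocumentSketch, the method absoluteLinks, and the sample HTML are hypothetical; select, text, and absUrl are the library calls being demonstrated.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class, for illustration only.
public class DocumentSketch {

    // Returns "text -> absolute URL" for every <a> element that has an href.
    public static List<String> absoluteLinks(String html, String baseUri) {
        // The base URI lets jsoup resolve relative links.
        Document doc = Jsoup.parse(html, baseUri);
        List<String> links = new ArrayList<>();
        for (Element a : doc.select("a[href]")) {
            links.add(a.text() + " -> " + a.absUrl("href"));
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"/about\">About</a> <a>no href</a></p>";
        for (String link : absoluteLinks(html, "https://example.com/")) {
            System.out.println(link);
        }
    }
}
```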

1.6、Examples

Example 1:

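import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
// Fetch the page (get() throws IOException) and print the text of every link
Document document = Jsoup.connect("https://www.baidu.com").userAgent("android").get();
Elements elements = document.select("a");
for (Element element : elements) {
    System.out.println(element.text());
}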

Example 2 (handling cookies):

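// Requires import java.util.Map in addition to the jsoup imports from Example 1
String url = "https://www.baidu.com";

// 1. Fetch the cookies first; execute() returns the raw response
Map<String, String> cookies = Jsoup.connect(url).userAgent("android").timeout(30000).execute().cookies();
System.out.println(cookies.toString());

// 2. Request the page again with the cookies attached
Document document = Jsoup.connect(url).userAgent("android").timeout(30000).cookies(cookies).get();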
Elements elements = document.select("a");
for (Element element : elements) {
    System.out.println(element.text());
}