How can you read the contents of a web page in Java without using any external library?

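The URL class of the java.net package represents a Uniform Resource Locator, which points to a resource on the World Wide Web (a file, a directory, or a reference).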
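As a minimal sketch of what a URL object holds, the snippet below parses an address and prints three of its components. The class name UrlParts, the helper describe, and the /index.html path are illustrative assumptions, not part of the original example. On JDK 20 and later the URL(String) constructor is deprecated in favor of URI.create(...).toURL(), though it still compiles.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlParts {
    // Hypothetical helper: returns "protocol|host|path" for the given URL string.
    static String describe(String spec) {
        try {
            URL url = new URL(spec);
            return url.getProtocol() + "|" + url.getHost() + "|" + url.getPath();
        } catch (MalformedURLException e) {
            // Rethrown unchecked to keep the sketch short
            throw new IllegalArgumentException("Not a valid URL: " + spec, e);
        }
    }

    public static void main(String[] args) {
        // prints: http|www.something.com|/index.html
        System.out.println(describe("http://www.something.com/index.html"));
    }
}
```

Constructing the object only parses the string; no network connection is made at this point.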

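The openStream() method of this class opens a connection to the URL represented by the current object and returns an InputStream object, which you can use to read data from the URL.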
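The following sketch shows openStream() on its own, without a Scanner. So that it runs without network access, it points the URL at a temporary local file (a file: URL) rather than a real web page. The class OpenStreamDemo, both helper names, and the UTF-8 encoding are assumptions made for illustration, and InputStream.readAllBytes() requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class OpenStreamDemo {
    // Reads every byte behind the URL and decodes it as UTF-8 (the encoding is an assumption).
    static String readAll(URL url) {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: writes text to a temp file and returns a file: URL for it.
    static URL tempFileUrl(String text) {
        try {
            Path file = Files.createTempFile("page", ".html");
            file.toFile().deleteOnExit();
            Files.write(file, text.getBytes(StandardCharsets.UTF_8));
            return file.toUri().toURL();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URL url = tempFileUrl("<h1>It works!</h1>");
        System.out.println(readAll(url)); // prints: <h1>It works!</h1>
    }
}
```

The same readAll call should work for an http: or https: URL, since openStream() hides the protocol behind the returned InputStream.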

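Therefore, to read data from a web page using the URL class:

Instantiate the java.net.URL class by passing the URL of the desired web page as an argument to its constructor.
Invoke the openStream() method and retrieve the InputStream object.
Instantiate the Scanner class by passing the InputStream object retrieved above as an argument.
Read the contents with the Scanner's methods, such as hasNext() and next().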
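The Scanner step can be sketched in isolation as follows. This variant sets the delimiter to "\\A" (start of input) so that the whole stream comes back as a single token with its whitespace intact; that is a different technique from token-by-token reading. The class ScannerRead and its readAll helper are hypothetical names, UTF-8 is assumed, and an in-memory stream stands in for the one that url.openStream() would return.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class ScannerRead {
    // Reads the entire stream as one token; "\\A" matches only at the start of input,
    // so no whitespace inside the content is treated as a delimiter.
    static String readAll(InputStream in) {
        try (Scanner sc = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return sc.hasNext() ? sc.next() : "";
        }
    }

    public static void main(String[] args) {
        // Stand-in for url.openStream(), so the sketch runs offline
        InputStream in = new ByteArrayInputStream(
                "<h1>It works!</h1>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(in)); // prints: <h1>It works!</h1>
    }
}
```

Closing the Scanner also closes the underlying stream, which is why the try-with-resources statement wraps the Scanner rather than the InputStream.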

Example

import java.io.IOException;
import java.net.URL;
import java.util.Scanner;

public class ReadingWebPage {
   public static void main(String[] args) throws IOException {
      // Instantiating the URL class
      URL url = new URL("http://www.something.com/");
      // StringBuilder to hold the result
      StringBuilder sb = new StringBuilder();
      // Opening the stream and wrapping it in a Scanner;
      // try-with-resources closes both when reading is done
      try (Scanner sc = new Scanner(url.openStream())) {
         // next() returns whitespace-delimited tokens, so the
         // whitespace between tokens is not kept in the result
         while (sc.hasNext()) {
            sb.append(sc.next());
         }
      }
      // Retrieving the String from the StringBuilder object
      String result = sb.toString();
      System.out.println(result);
      // Removing the HTML tags
      result = result.replaceAll("<[^>]*>", "");
      System.out.println("Contents of the web page: " + result);
   }
}

Output

<html><body><h1>Itworks!</h1></body></html>
Contents of the web page: Itworks!
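The tag-removal step in the example relies on String.replaceAll with the pattern <[^>]*>, which deletes every run of characters that starts with < and ends at the next >. A small sketch of that step on its own, with TagStripper and stripTags as hypothetical names:

```java
class TagStripper {
    // Removes anything that looks like an HTML tag: '<', then any characters except '>', then '>'.
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        // prints: It works!
        System.out.println(stripTags("<html><body><h1>It works!</h1></body></html>"));
    }
}
```

This is adequate for simple markup only. The pattern does not decode entities such as &amp;amp;, leaves the text inside script and style elements in place, and can be confused by a > character inside an attribute value or comment.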

Maruthi Krishna

Updated on: 2019-10-11
