Blog of Bruce: XPath

Reference (English)
Reference (Simplified Chinese)
XPath lives in the javax.xml.xpath package.
The Document, Element, and NodeList used in this post come from the org.w3c.dom package; be careful not to import the wrong ones.

※test.xml

<root>
    <first fa="f1">
        <second sa="s1">a1</second>
        <second>
            <third>
                a2
                <fourth>
                    a3
                </fourth>
            </third>
            b1
        </second>
        <second>c1</second>
    </first>
    
    <first fa="f2">
        <second sa="s2">d1</second>
        <first fa="f2">
            e1
            <second sa="s3">
                e2
                <third>e3</third>
            </second>
        </first>
        <second>f1</second>
    </first>
    
    <first fa="f3">
        <second sa="s3">g1</second>
        <second>
            h1
            <first fa="f3">
                h2
                <third>
                    h3
                    <second>
                        h4
                    </second>
                </third>
            </first>
        </second>
        <second>i1</second>
    </first>
</root>

※The XML used for testing

※Test.java

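import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

try (InputStream is = Test.class.getClassLoader().getResourceAsStream("test.xml")) {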
    try {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(is);
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList list = (NodeList) xpath.evaluate("//first[@fa='f1']", doc, XPathConstants.NODESET);
    
        for (int i = 0; i < list.getLength(); i++) {
            Element element = (Element) list.item(i);
            NodeList list2 = element.getElementsByTagName("second");
            for (int j = 0; j < list2.getLength(); j++) {
                // System.out.println("node name=" + list2.item(j).getNodeName());
                System.out.println("text content=" + list2.item(j).getTextContent());
                
                NamedNodeMap map = list2.item(j).getAttributes();
                System.out.println("attr=>" + map.getNamedItem("sa"));
                System.out.println("==========");
            }
        }
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    } catch (SAXException e) {
        e.printStackTrace();
    } catch (XPathExpressionException e) {
        e.printStackTrace();
    }
} catch (IOException e1) {
    e1.printStackTrace();
}

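※XPathConstants.NODESET tells evaluate() which type to return; there are five choices
NUMBER, STRING, and BOOLEAN need no explanation
NODESET: returns zero or more nodes
NODE: returns zero or one node
The difference between the last two is explained further below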
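
To make the five return types concrete, here is a minimal sketch that evaluates one expression per type. The class name ReturnTypeDemo, the eval helper, and the shortened XML string are assumptions made for illustration; they are not part of the original Test.java.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ReturnTypeDemo {
    // Hypothetical, shortened stand-in for test.xml so the sketch is self-contained.
    static final String XML =
        "<root><first fa='f1'><second sa='s1'>a1</second><second>c1</second></first>"
      + "<first fa='f2'><second sa='s2'>d1</second></first></root>";

    // Evaluates expr against XML and returns the raw result for the requested type.
    static Object eval(String expr, QName type) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc, type);
        } catch (Exception e) {
            // Unchecked wrapper only to keep the demo short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Double count = (Double) eval("count(//second)", XPathConstants.NUMBER);
        String text = (String) eval("/root/first[1]/second[1]", XPathConstants.STRING);
        Boolean exists = (Boolean) eval("//first[@fa='f9']", XPathConstants.BOOLEAN);
        NodeList all = (NodeList) eval("//second", XPathConstants.NODESET);
        Node one = (Node) eval("//second", XPathConstants.NODE);

        System.out.println("NUMBER  = " + count);                // 3.0
        System.out.println("STRING  = " + text);                 // a1
        System.out.println("BOOLEAN = " + exists);               // false, no fa='f9'
        System.out.println("NODESET = " + all.getLength());      // 3
        System.out.println("NODE    = " + one.getTextContent()); // a1
    }
}
```

Each constant determines the Java type that the Object returned by evaluate() can be cast to: Double, String, Boolean, NodeList, and Node respectively.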

※Result of "//first[@fa='f1']":
text content=a1
attr=>sa="s1"
==========
text content=

a2

a3

b1

attr=>null
==========
text content=c1
attr=>null
==========

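There are three second elements, so the loop runs three times. Note that for the second one, getTextContent() returns the text of everything beneath it (descendant elements included), and even the line breaks are kept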
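
If the embedded line breaks are unwanted, one option is to let XPath collapse the whitespace with its normalize-space() function. The sketch below compares both results on a small fragment modelled on the second second element; the class name TextContentDemo, its helpers, and the XML string are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class TextContentDemo {
    // Hypothetical fragment modelled on the second <second> element of test.xml.
    static final String XML =
        "<second>\n  <third>a2<fourth>a3</fourth></third>\n  b1\n</second>";

    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Text of the element and all of its descendants, whitespace included.
    static String raw() {
        return parse().getDocumentElement().getTextContent();
    }

    // Same text, but trimmed and with whitespace runs collapsed by XPath itself.
    static String normalized() {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate("normalize-space(/second)", parse());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("[" + raw() + "]");        // spans several lines
        System.out.println("[" + normalized() + "]"); // [a2a3 b1]
    }
}
```

Note that normalize-space() only touches whitespace; the text of third and fourth is still concatenated into the result.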

※Result of "//first[@fa='f2']":
text content=d1
attr=>sa="s2"
==========
text content=
e2
e3

attr=>sa="s3"
==========
text content=f1
attr=>null
==========
text content=
e2
e3

attr=>sa="s3"
==========

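Two first elements have fa="f2": the outer one contains three second elements and the nested one contains one, so the loop prints four times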

※Result of "//first[@fa='f3']":
text content=g1
attr=>sa="s3"
==========
text content=
h1

h2

h3

h4

attr=>null
==========
text content=
h4

attr=>null
==========
text content=i1
attr=>null
==========
text content=
h4

attr=>null
==========

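Two first elements have fa="f3". The first match has four second descendants,
and the nested match then contributes one more, so there are five outputs in total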
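
The repeated entries come from getElementsByTagName, which returns every descendant with that tag name, not just direct children. As a rough sketch of the difference, the code below counts second elements per match both ways; the relative XPath "second" evaluated against each matched element selects direct children only. The class name DescendantVsChildDemo, the counts helper, and the whitespace-free copy of test.xml are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DescendantVsChildDemo {
    // Same structure as test.xml, with the whitespace removed (an assumption of this sketch).
    static final String XML = "<root>"
        + "<first fa='f1'><second sa='s1'>a1</second>"
        + "<second><third>a2<fourth>a3</fourth></third>b1</second><second>c1</second></first>"
        + "<first fa='f2'><second sa='s2'>d1</second>"
        + "<first fa='f2'>e1<second sa='s3'>e2<third>e3</third></second></first>"
        + "<second>f1</second></first>"
        + "<first fa='f3'><second sa='s3'>g1</second>"
        + "<second>h1<first fa='f3'>h2<third>h3<second>h4</second></third></first></second>"
        + "<second>i1</second></first>"
        + "</root>";

    // For each node matched by expr, counts its <second> elements:
    // direct children only (relative XPath) or all descendants (getElementsByTagName).
    static int[] counts(String expr, boolean childrenOnly) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList firsts = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
            int[] result = new int[firsts.getLength()];
            for (int i = 0; i < firsts.getLength(); i++) {
                Element first = (Element) firsts.item(i);
                NodeList seconds = childrenOnly
                        ? (NodeList) xpath.evaluate("second", first, XPathConstants.NODESET)
                        : first.getElementsByTagName("second");
                result[i] = seconds.getLength();
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(counts("//first[@fa='f3']", false))); // [4, 1]
        System.out.println(Arrays.toString(counts("//first[@fa='f3']", true)));  // [3, 0]
    }
}
```

With descendants counted, the two fa="f3" matches give 4 + 1 = 5 outputs, which matches the listing above; with direct children only, the nested first contributes nothing because its second element is a grandchild.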

※Difference between "//" and "/"

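//: matches at any depth; every level of the document is searched
/: an absolute path that starts from the document root, so the first step has to be the root element, which is root in the example above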

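Take "//first[@fa='f2']" as an example: changing it to "/first[@fa='f2']" returns nothing, because the root element is root
Changing it to "/root/first[@fa='f2']" does find a match, but since the search follows the path from the root, only the first element directly under root is matched. The loop therefore prints only three times; the run for the nested first element with fa="f2" is gone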
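
The three variants can be compared by simply counting the matched nodes. This is a minimal sketch; the class name SlashDemo, the matches helper, and the reduced XML (only the first elements are kept) are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SlashDemo {
    // Hypothetical reduction of test.xml: one fa='f2' under root, one nested inside it.
    static final String XML =
        "<root><first fa='f1'/><first fa='f2'><first fa='f2'/></first></root>";

    // Returns how many nodes the expression selects.
    static int matches(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList list = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            return list.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(matches("//first[@fa='f2']"));     // 2: any depth
        System.out.println(matches("/first[@fa='f2']"));      // 0: the root element is root
        System.out.println(matches("/root/first[@fa='f2']")); // 1: direct child of root only
    }
}
```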

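※In the example above, "/root/first" yields a NodeList of length 3, that is, every first directly under root is selected. To select only the second one, use "/root/first[2]"; note that positions start at 1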
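
A quick way to check the 1-based indexing is to read the fa attribute at a given position. In this sketch the class name PositionDemo, the faAt helper, and the reduced XML are assumptions for illustration; the two-argument evaluate() returns the result as a String.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PositionDemo {
    // Hypothetical reduction of test.xml: only the three top-level first elements.
    static final String XML =
        "<root><first fa='f1'/><first fa='f2'/><first fa='f3'/></root>";

    // Returns the fa attribute of the first element at the given position under root,
    // or an empty string when nothing is selected.
    static String faAt(int position) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath()
                    .evaluate("/root/first[" + position + "]/@fa", doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(faAt(1)); // f1: positions start at 1
        System.out.println(faAt(2)); // f2
        System.out.println("[" + faAt(0) + "]"); // []: position 0 selects nothing
    }
}
```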

※Difference between XPathConstants.NODESET and XPathConstants.NODE

※The same expressions can also return a single Node: change XPathConstants.NODESET in the code above to XPathConstants.NODE, as follows:

try (InputStream is = Test.class.getClassLoader().getResourceAsStream("test.xml")) {
    try {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(is);
        XPath xpath = XPathFactory.newInstance().newXPath();
        Element element = (Element) xpath.evaluate("//first[@fa='f2']", doc, XPathConstants.NODE);
        NodeList list2 = element.getElementsByTagName("second");

        System.out.println(list2.getLength());
        for (int j = 0; j < list2.getLength(); j++) {
            System.out.println("text content=" + list2.item(j).getTextContent());
    
            NamedNodeMap map = list2.item(j).getAttributes();
            System.out.println("attr=>" + map.getNamedItem("sa"));
            System.out.println("==========");
        }
    } catch (ParserConfigurationException e) {
        e.printStackTrace();
    } catch (SAXException e) {
        e.printStackTrace();
    } catch (XPathExpressionException e) {
        e.printStackTrace();
    }
} catch (IOException e1) {
    e1.printStackTrace();
}

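※With NODE the result is cast to Element, and only the first match in document order is returned; with NODESET the result is cast to NodeList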
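
The practical consequence is that NODE silently drops every match after the first and yields null when nothing matches, so a null check is worth adding before the cast result is used. A minimal sketch, where the class name NodeVsNodeSetDemo, its helpers, and the reduced XML are assumptions for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeVsNodeSetDemo {
    // Hypothetical reduction of the fa='f2' part of test.xml.
    static final String XML =
        "<root><first fa='f2'><second sa='s2'>d1</second>"
      + "<first fa='f2'><second sa='s3'>e2</second></first></first></root>";

    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // NODESET: every match is returned.
    static int nodeSetLength(String expr) {
        try {
            NodeList list = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, parse(), XPathConstants.NODESET);
            return list.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // NODE: only the first match in document order, or null when there is none.
    static String firstText(String expr) {
        try {
            Node node = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, parse(), XPathConstants.NODE);
            return node == null ? null : node.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nodeSetLength("//second"));          // 2
        System.out.println(firstText("//second"));              // d1
        System.out.println(nodeSetLength("//second[@sa='zz']")); // 0
        System.out.println(firstText("//second[@sa='zz']"));     // null
    }
}
```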

Sunday, November 13, 2016