在 Python 中解析和修改 XML

2021-08-17 1 minute read

XML 数据

xml_data = '''<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book SN="12460901">
        <name>国富论</name>
        <author nation="English">亚当·斯密</author>
        <price>108</price>
        <ISBN>9787511373229</ISBN>
    </book>
    <book SN="12257413">
        <name>原则</name>
        <author nation="America">瑞·达利欧</author>
        <price>98</price>
        <ISBN>9787508684031</ISBN>
    </book>
</books>
'''

xml.etree.ElementTree

导入

import xml.etree.ElementTree as ET

读取 XML 数据

文件读取

with open(xml_file) as file:
    root = ET.parse(file)

字符串读取

root = ET.fromstring(xml_data)

标签名

root.tag

books

元素文本

root.find('book/name').text

国富论

find 找到的第一个元素

属性值

root.find('book/author').get('nation')
root.find('book/author').attrib['nation']

English
English

通过路径选择元素

root.find('book/author')

<Element 'author' at 0x126d290e8>

通过属性选择元素

root.find('book/[@SN="12257413"]')

<Element 'book' at 0x11ee431d8>

找到所有的元素

for book in root.findall('book'):
    name = book.findtext('name')

国富论
原则

获得元素下面的所有子元素

for book in root.getchildren():
    print(book)

<Element 'book' at 0x11de5e048>
<Element 'book' at 0x11de5e1d8>

遍历所有的元素

for name in root.iter('name'):
    print(name.text)

国富论
原则

示例

import xml.etree.ElementTree as ET

root = ET.fromstring(xml_data)
for book in root.findall('book'):
    SN = book.get('SN')
    name = book.findtext('name')
    author = book.find('author')
    author_name = author.text
    author_nation = author.get('nation')
    price = book.findtext('price')
    ISBN = book.findtext('ISBN')

    book_str = f'Book Name: {name} SN={SN}\n Author Name: {author_name}, Nation: {author_nation}\n Price: {price}\n ISBN: {ISBN}'
    print(book_str)

Book Name: 国富论 SN=12460901
 Author Name: 亚当·斯密, Nation: English
 Price: 108
 ISBN: 9787511373229
Book Name: 原则 SN=12257413
 Author Name: 瑞·达利欧, Nation: America
 Price: 98
 ISBN: 9787508684031

XML 数据

xml.etree.ElementTree

导入

读取 XML 数据

文件读取

字符串读取

标签名

元素文本

属性值

通过路径选择元素

通过属性选择元素

找到所有的元素

获得元素下面的所有子元素

遍历所有的元素

示例

参考资料