Parsing XML in Bash
XML is a popular markup language that is widely used to structure and transmit data, so most developers have to work with it at some point.
This article will show how we can parse XML with Bash.
We will discuss two command-line tools here. The first is xmllint and the second is XMLStarlet.
Neither is built into Bash, so you need to install them before you can use them.
Parsing XML in Bash using xmllint
xmllint is a command-line XML parser that ships with the libxml2 library, and it is one of the most common tools for parsing XML files. You must install it before you can use it.
On Debian-based systems such as Ubuntu, execute the following commands to install it.
sudo apt-get update -qq
sudo apt-get install -y libxml2-utils
We have to install the libxml2-utils package using apt-get.
If you have an XML file named MyXML.xml, you can parse it and print its contents using the following command.
xmllint MyXML.xml
After executing the above command, you will get the output as follows.
<?xml version="1.0"?>
<specification>
<type>Laptop</type>
<model>Macbook</model>
<screenSizeInch>14</screenSizeInch>
</specification>
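Printing the whole file is rarely the final goal; a script usually needs a single value. The following is a minimal sketch, assuming your xmllint build supports the --xpath option (very old releases may lack it). The temporary directory and the variable names are illustrative, and the sample document is recreated so that the script is self-contained.

```shell
#!/usr/bin/env bash
# Recreate the sample document in a temporary directory (illustrative path).
workdir=$(mktemp -d)
xml_file="$workdir/MyXML.xml"
cat > "$xml_file" <<'EOF'
<?xml version="1.0"?>
<specification>
<type>Laptop</type>
<model>Macbook</model>
<screenSizeInch>14</screenSizeInch>
</specification>
EOF

model=""
size=""
if command -v xmllint >/dev/null 2>&1; then
  # string(...) yields the text content of the first node matched by the XPath.
  model=$(xmllint --xpath 'string(/specification/model)' "$xml_file")
  size=$(xmllint --xpath 'string(/specification/screenSizeInch)' "$xml_file")
  echo "Model: $model, screen size: $size inch"
else
  echo "xmllint is not installed; skipping the query" >&2
fi
```

Wrapping the expression in string() returns only the text, without the surrounding element tags, which makes the result easy to store in a shell variable.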
xmllint accepts a number of options, or flags. The available options are listed below.
- --auto - This flag generates a small document for testing purposes.
- --catalogs - This flag uses the catalogs from SGML_CATALOG_FILES. Otherwise, the default is to use /etc/xml/catalog.
- --chkregister - This flag is used to turn on node registration.
- --compress - This flag turns on gzip compression of the output.
- --copy - This flag is used to test the internal copy implementation.
- --c14n - This flag is used to serialize the results of the parsing to stdout using W3C XML Canonicalization (C14N). It also preserves comments in the results.
- --dtdvalid URL - This flag is used to validate using the DTD specified by the URL.
- --dtdvalidfpi FPI - This flag is used to specify the DTD using the public identifier FPI for validation; note that this flag requires a catalog that exports that public identifier to work.
- --debug - This flag is used to parse the file and output an annotated tree of the in-memory version of the document.
- --debugent - This flag is used to debug the entities defined in the document.
- --dropdtd - This flag is used to remove the DTD from the output.
- --dtdattr - This flag will fetch an external DTD. It also populates the tree with inherited attributes.
- --encode ENCODING - This flag will provide output in the given encoding.
- --format - This flag will reformat and re-indent the output.
- --help - This flag will print out a summary of xmllint usage.
- --html - This flag selects the HTML parser instead of the XML parser.
- --htmlout - This flag will display the results as an HTML file. It will output the necessary HTML tags around the result tree output so that the results can be displayed/viewed in a browser.
- --insert - This flag is used to test for valid insertions.
- --loaddtd - This flag is used to fetch an external DTD.
- --load-trace - This flag will display all documents loaded as they are processed to stderr.
- --maxmem NNBYTES - This flag is used to test parser memory support. Here, NNBYTES is the maximum number of bytes that the library can allocate.
- --memory - This flag is used to parse from memory.
- --noblanks - This flag will remove ignorable whitespace.
- --nocatalogs - This flag specifies not to use any catalogs.
- --nocdata - This flag will replace CDATA sections by equivalent text nodes.
- --noent - This flag will replace entity references with entity values.
- --nonet - This flag specifies not to use the Internet to retrieve the DTD or entities.
- --noout - This flag will suppress the output. By default, xmllint will display the output of the result tree.
- --nowarning - This flag specifies that no warnings should be emitted from the validator and/or parser.
- --nowrap - This flag specifies that the HTML document wrapper should not be output.
- --noxincludenode - This flag is used to perform XInclude processing, but specifies that the XInclude start and end nodes are not generated.
- --nsclean - This flag is used to remove redundant namespace declarations.
- --output FILE - This flag defines the file path where xmllint saves the parsing results.
- --path "PATH(S)" - This flag is used to load DTDs or entities using the colon-separated or space-separated list of file system paths specified by PATHS. A space-separated list must be enclosed in quotes.
- --pattern PATTERNVALUE - This flag is used to exercise the pattern recognition engine that can be used with the reader interface. It is also used for debugging.
- --postvalid - This flag is used to validate after parsing is complete.
- --push - This flag enables push mode.
- --recover - This flag is used to output any parsable portions of the invalid document.
- --relaxng SCHEMA - This flag will use a RelaxNG file named SCHEMA for validation.
- --repeat - This flag is used to repeat 100 times for timing or profiling purposes.
- --schema SCHEMA - This flag will use the W3C XML Schema file named SCHEMA for validation.
- --shell - Run the navigation shell.
- --stream - This flag is for the streaming API.
- --testIO - This flag will test user input/output support.
- --timing - This flag will output information about how long xmllint takes to perform various steps.
- --valid - This flag will check the validity of the document.
- --version - This flag will display the version of the library.
- --walker - This flag will test the walker module.
- --xinclude - This flag will perform XInclude processing.
- --xmlout - This flag is mainly used in conjunction with --html. It will save the document using the XML serializer. It is mainly used for conversion from HTML to XHTML.
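Several of these flags are useful in everyday scripts. The sketch below combines three of them: --noout to check whether a file is well-formed using only the exit status, and --format with --output to write a re-indented copy. The file names and the check_xml helper are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
# Illustrative files in a temporary directory (names are hypothetical).
workdir=$(mktemp -d)
good_file="$workdir/good.xml"
bad_file="$workdir/bad.xml"
pretty_file="$workdir/pretty.xml"

# A well-formed document whose body is written on a single line.
printf '%s\n' '<?xml version="1.0"?>' \
  '<specification><type>Laptop</type><model>Macbook</model></specification>' > "$good_file"
# A broken document: the <model> element is never closed.
printf '%s\n' '<specification><model>Macbook</specification>' > "$bad_file"

# Hypothetical helper: report well-formedness from xmllint's exit status.
check_xml() {
  # --noout suppresses the result tree; parse errors on stderr are discarded.
  if xmllint --noout "$1" 2>/dev/null; then
    echo "well-formed"
  else
    echo "not well-formed"
  fi
}

good_status=""
bad_status=""
if command -v xmllint >/dev/null 2>&1; then
  good_status=$(check_xml "$good_file")
  bad_status=$(check_xml "$bad_file")
  echo "good.xml: $good_status"
  echo "bad.xml: $bad_status"

  # --format re-indents the tree; --output writes it to a file instead of stdout.
  xmllint --format --output "$pretty_file" "$good_file"
  cat "$pretty_file"
else
  echo "xmllint is not installed; skipping" >&2
fi
```

Relying on the exit status rather than parsing the error text keeps the check simple to use inside if statements.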
Parsing XML in Bash using XMLStarlet
Another popular tool for parsing XML documents is XMLStarlet. Its main command is xmlstarlet.
On Fedora and other dnf-based distributions, execute the following command to install it. On Debian-based systems, the package has the same name and can be installed with apt-get instead.
sudo dnf install xmlstarlet
It provides subcommands that make it easier to validate, transform, or query XML files. The simplest of them, format, prints the XML file in formatted form.
xmlstarlet format MyXML.xml
After executing the above command, you will see the contents of the XML file as the output below.
<?xml version="1.0"?>
<specification>
<type>Laptop</type>
<model>Macbook</model>
<screenSizeInch>14</screenSizeInch>
</specification>
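Beyond formatting, querying is probably the most common task. As a minimal sketch, the script below uses the sel subcommand to read one value and the ed subcommand to write an edited copy of the document. The temporary files, variable names, and the new value 16 are assumptions made for this illustration; on some systems the command is installed as xml rather than xmlstarlet.

```shell
#!/usr/bin/env bash
# Recreate the sample document in a temporary directory (illustrative paths).
workdir=$(mktemp -d)
xml_file="$workdir/MyXML.xml"
edited_file="$workdir/MyXML-edited.xml"
cat > "$xml_file" <<'EOF'
<?xml version="1.0"?>
<specification>
<type>Laptop</type>
<model>Macbook</model>
<screenSizeInch>14</screenSizeInch>
</specification>
EOF

model=""
new_size=""
if command -v xmlstarlet >/dev/null 2>&1; then
  # sel -t -v prints the value of the given XPath expression.
  model=$(xmlstarlet sel -t -v '/specification/model' "$xml_file")
  echo "Model: $model"

  # ed -u selects nodes by XPath and -v sets their new value. The edited
  # document goes to stdout, so the original file stays unchanged.
  xmlstarlet ed -u '/specification/screenSizeInch' -v '16' "$xml_file" > "$edited_file"
  new_size=$(xmlstarlet sel -t -v '/specification/screenSizeInch' "$edited_file")
  echo "Screen size in the edited copy: $new_size"
else
  echo "xmlstarlet is not installed; skipping" >&2
fi
```

Writing the result of ed to a separate file, as done here, makes it easy to compare the edited copy with the original before replacing it.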
All the code used in this article is written in Bash and is intended for a Linux shell environment.