Parsing XML in Bash

Author: JIYIK  Last Updated: 2025/03/21

XML is a popular markup language that is widely used to structure and transmit data, and most developers will have to work with it sooner or later.

This article shows how to parse XML with Bash.

We will discuss two command-line tools here. The first is xmllint, and the second is XMLStarlet.

Before you can use them, you need to install them.


Parsing XML in Bash using xmllint

xmllint is one of the most commonly used tools for parsing XML files. It ships with the libxml2 project, so you must install it before you can use it.

To install it on Debian or Ubuntu, execute the following commands.

sudo apt-get update -qq
sudo apt-get install -y libxml2-utils

We have to install the libxml2-utils package using apt-get, because this package provides the xmllint command.

If you have an XML file named MyXML.xml, you can parse it and print its contents using the following command.

xmllint MyXML.xml

After executing the above command, you will get output like the following.

<?xml version="1.0"?>
<specification>
        <type>Laptop</type>
        <model>Macbook</model>
        <screenSizeInch>14</screenSizeInch>
</specification>
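Printing the whole document is usually only the first step. A minimal sketch of pulling individual values into Bash variables is shown below. It assumes an xmllint build that supports the --xpath option (present in reasonably recent libxml2 releases); the temporary file and the variable names model and screen are illustrative and not part of the original example.

```shell
# Sketch: extract single values from an XML file with xmllint --xpath.
# The temporary file stands in for MyXML.xml; variable names are illustrative.

xml_file="$(mktemp)"
cat > "$xml_file" <<'EOF'
<?xml version="1.0"?>
<specification>
        <type>Laptop</type>
        <model>Macbook</model>
        <screenSizeInch>14</screenSizeInch>
</specification>
EOF

if command -v xmllint >/dev/null 2>&1; then
    # string(...) yields the text content of the first node matching the path.
    model="$(xmllint --xpath 'string(/specification/model)' "$xml_file")"
    screen="$(xmllint --xpath 'string(/specification/screenSizeInch)' "$xml_file")"
    echo "Model: $model"
    echo "Screen size: $screen inch"
else
    echo "xmllint not found; install libxml2-utils first" >&2
fi

rm -f "$xml_file"
```

Wrapping the XPath expression in string() keeps the surrounding tags out of the result, which makes the value easier to use in later shell logic.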

xmllint accepts many options, or flags. The available options are listed below.

  1. --auto - This flag is used to generate a small document for testing purposes.
  2. --catalogs - This flag is used to use the SGML catalogs from SGML_CATALOG_FILES. Otherwise, the XML catalogs starting from /etc/xml/catalog are used by default.
  3. --chkregister - This flag is used to turn on node registration.
  4. --compress - This flag turns on gzip compression of the output.
  5. --copy - This flag is used to test the internal copy implementation.
  6. --c14n - This flag is used to serialize the results of the parsing to stdout using W3C XML Canonicalization (C14N). It also preserves comments in the results.
  7. --dtdvalid URL - This flag is used to validate using the DTD specified by the URL.
  8. --dtdvalidfpi FPI - This flag is used to validate using the DTD specified by the Formal Public Identifier FPI; note that this flag requires a catalog that exports that Formal Public Identifier to work.
  9. --debug - This flag is used to parse the file and output an annotated tree of the in-memory version of the document.
  10. --debugent - This flag is used to debug the entities defined in the document.
  11. --dropdtd - This flag is used to remove the DTD from the output.
  12. --dtdattr - This flag will fetch an external DTD. It also populates the tree with inherited attributes.
  13. --encode ENCODING - This flag will provide output in the given encoding.
  14. --format - This flag will reformat and re-indent the output.
  15. --help - This flag will print out a summary of xmllint usage.
  16. --html - This flag is used to use the HTML parser.
  17. --htmlout - This flag will display the results as an HTML file. It will output the necessary HTML tags around the result tree output so that the results can be displayed/viewed in a browser.
  18. --insert - This flag is used to test for valid insertions.
  19. --loaddtd - This flag is used to fetch an external DTD.
  20. --load-trace - This flag will display all documents loaded as they are processed to stderr.
  21. --maxmem NNBYTES - This flag is used to test parser memory support. Here, NNBYTES is the maximum number of bytes that the library can allocate.
  22. --memory - This flag is used to parse from memory.
  23. --noblanks - This flag will remove ignorable whitespace.
  24. --nocatalogs - This flag specifies not to use any catalogs.
  25. --nocdata - This flag will replace CDATA sections by equivalent text nodes.
  26. --noent - This flag will replace entity references with entity values.
  27. --nonet - This flag specifies not to use the Internet to retrieve the DTD or entities.
  28. --noout - This flag will suppress the output. By default, xmllint will display the output of the result tree.
  29. --nowarning - This flag specifies that no warnings should be emitted from the validator and/or parser.
  30. --nowrap - This flag specifies that the HTML document wrapper should not be output.
  31. --noxincludenode - This flag is used to perform XInclude processing, but specifies that the XInclude start and end nodes are not generated.
  32. --nsclean - This flag is used to remove redundant namespace declarations.
  33. --output FILE - This flag defines the file path where xmllint saves the parsing results.
  34. --path "PATH(S)" - This flag is used to load DTDs or entities using the colon-separated or space-separated list of file system paths specified by PATHS. Here, the space-separated list is enclosed in quotes.
  35. --pattern PATTERNVALUE - This flag is used to exercise the pattern recognition engine that can be used with the reader interface. It is also used for debugging.
  36. --postvalid - This flag is used to validate after parsing is complete.
  37. --push - This flag enables push mode.
  38. --recover - This flag is used to output any parsable portions of the invalid document.
  39. --relaxng SCHEMA - This flag will use a RelaxNG file named SCHEMA for validation.
  40. --repeat - This flag is used to repeat the processing 100 times for timing or profiling purposes.
  41. --schema SCHEMA - This flag will use the W3C XML Schema file named SCHEMA for validation.
  42. --shell - Run the navigation shell.
  43. --stream - This flag is for the streaming API.
  44. --testIO - This flag will test user input/output support.
  45. --timing - This flag will output information about how long xmllint takes to perform various steps.
  46. --valid - This flag will check the validity of the document.
  47. --version - This flag will display the version of the library.
  48. --walker - This flag will test the walker module.
  49. --xinclude - This flag will perform XInclude processing.
  50. --xmlout - This flag is mainly used in conjunction with --html. It will save the document using the XML serializer. It is mainly used for conversion from HTML to XHTML.
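As an illustration of how these flags are typically combined in a script, the sketch below uses --noout together with the exit status of xmllint to check whether a file is well-formed, and --format to re-indent a document. The file contents and the helper function check_xml are hypothetical and exist only for this example.

```shell
# Sketch: check well-formedness via the exit status, then pretty-print.
# The two temporary files and check_xml are illustrative only.

good="$(mktemp)"
bad="$(mktemp)"
printf '<specification><type>Laptop</type></specification>\n' > "$good"
printf '<specification><type>Laptop</specification>\n' > "$bad"   # mismatched closing tag

check_xml() {
    # --noout suppresses the result tree, so only the exit status is used.
    # Parser error messages on stderr are discarded here.
    if xmllint --noout "$1" 2>/dev/null; then
        echo "well-formed"
    else
        echo "not well-formed"
    fi
}

if command -v xmllint >/dev/null 2>&1; then
    good_result="$(check_xml "$good")"
    bad_result="$(check_xml "$bad")"
    echo "good file: $good_result"
    echo "bad file:  $bad_result"

    # --format re-indents the single-line document for readability.
    xmllint --format "$good"
else
    echo "xmllint not found; install libxml2-utils first" >&2
fi

rm -f "$good" "$bad"
```

Relying on the exit status rather than parsing error messages keeps the check simple to use inside if statements or larger scripts.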

Parsing XML in Bash using XMLStarlet

Another popular tool for parsing XML documents is XMLStarlet. Its main command is xmlstarlet.

On Fedora and other dnf-based distributions, execute the following command to install it. On Debian or Ubuntu, install the xmlstarlet package with apt-get instead.

sudo dnf install xmlstarlet

It contains useful options that make it easier to validate, transform, or query XML files. The simplest way to try it is the format subcommand, which parses an XML file and prints it re-indented.

xmlstarlet format MyXML.xml

After executing the above command, you will see the contents of the XML file as the output below.

<?xml version="1.0"?>
<specification>
        <type>Laptop</type>
        <model>Macbook</model>
        <screenSizeInch>14</screenSizeInch>
</specification>
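To query specific elements rather than print the whole file, XMLStarlet provides the sel (select) subcommand. The following is a minimal sketch under the assumption that the command is installed as xmlstarlet (some systems name it xml); the temporary file and the variable names are illustrative.

```shell
# Sketch: query element values with xmlstarlet sel.
# The temporary file stands in for MyXML.xml; variable names are illustrative.

xml_file="$(mktemp)"
cat > "$xml_file" <<'EOF'
<?xml version="1.0"?>
<specification>
        <type>Laptop</type>
        <model>Macbook</model>
        <screenSizeInch>14</screenSizeInch>
</specification>
EOF

if command -v xmlstarlet >/dev/null 2>&1; then
    # -t starts a template; -v prints the value of an XPath expression.
    model_value="$(xmlstarlet sel -t -v '/specification/model' "$xml_file")"
    echo "Model: $model_value"

    # -m iterates over every child of <specification>;
    # -o prints a literal string and -n prints a newline after each match.
    pairs="$(xmlstarlet sel -t -m '/specification/*' \
        -v 'name()' -o '=' -v '.' -n "$xml_file")"
    echo "$pairs"
else
    echo "xmlstarlet not found; install the xmlstarlet package first" >&2
fi

rm -f "$xml_file"
```

The second query prints one name=value line per child element, a form that is convenient to feed into a while read loop in Bash.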

All the commands used in this article are meant to be run from Bash. They assume a Linux shell environment.
