prixladi/xq

xq

xq is a command-line utility for querying XML. It is written in Haskell and has zero dependencies. Queries are written in XQuery, the project's name for a subset of the XPath language; see the XQuery section below.

Project structure

  • /app - Command line application entry point
  • /lib - Core functionality of xq
  • /test - Unit tests

Building & Installing

xq requires stack, the Haskell build tool used by the commands below.

You can run the program directly with stack:

stack run

This is just for development and testing purposes. If you want to install xq on your system or run benchmarks you should build the binary:

stack build

and copy it to a folder on your PATH, e.g. ~/bin:

cp $(stack path --local-install-root)/bin/xq-exe ~/bin/xq

Usage

The first argument of xq is always the XQuery. The second argument is optional: either pass the path to an XML file (xq <xQuery> <xmlFilePath>), or omit it and provide the XML on stdin (xq <xQuery> {xmlStdin}).

File example:

xq "//book/*" "./bookstore.xml"

Stdin example:

cat "./bookstore.xml" | xq "//book/*"
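The calling convention above can be sketched in a few lines. This is a minimal Python illustration, not xq's Haskell code: the function name read_query_and_xml and the usage message are hypothetical, and only the "file path if present, otherwise stdin" rule comes from the description above.

```python
import io
import sys
from typing import Sequence, TextIO, Tuple


def read_query_and_xml(argv: Sequence[str], stdin: TextIO = sys.stdin) -> Tuple[str, str]:
    """Return (query, xml_text) using the convention described above.

    Hypothetical helper: it mirrors the documented behaviour only.
    """
    if not argv:
        raise SystemExit("usage: xq <xQuery> [xmlFilePath]")
    query = argv[0]
    if len(argv) >= 2:
        # A second argument is present: treat it as the path to the XML file.
        with open(argv[1], encoding="utf-8") as handle:
            return query, handle.read()
    # No second argument: the XML is expected on stdin.
    return query, stdin.read()


# Stdin form, simulated with an in-memory stream instead of a real pipe.
query, xml_text = read_query_and_xml(["//book/*"], io.StringIO("<bookstore/>"))
print(query, xml_text)
```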

XQuery

XQuery (also referred to as XQ) is the language xq uses for querying parsed XML. It is a small subset of the XPath language, and the aim is to replace it with full XPath once all of the functionality is implemented.

Supported features

Supported features are shown in the examples below. More exhaustive documentation is in progress. For now, check the source code directly if you need specifics: the parsing of XQuery is in the parser file, and its use in querying XML is in the runner file.

Examples

Descendant selectors

Descendant selectors must start with '//' or '/'. The relative syntax without the slash qualifier is not supported.

| XQuery | CSS selector equivalent |
| --- | --- |
| //div | div |
| //div//a | div a |
| //div//* | div * |
| //div/* | div > * |
| /body | :root > body |
| /* | :root |
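To see what the // forms select, here is a small sketch that uses Python's xml.etree.ElementTree rather than xq itself. ElementTree implements its own XPath subset and expects paths relative to an element, so each query is written with a leading dot (.//div instead of //div). The sample document is made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; the root element is <html>.
doc = ET.fromstring(
    "<html><body>"
    "<div><a>one</a><p><a>two</a></p></div>"
    "<a>three</a>"
    "</body></html>"
)


def tags(path):
    """Tag names of the elements matched by an ElementTree path."""
    return [el.tag for el in doc.findall(path)]


def texts(path):
    """Text content of the elements matched by an ElementTree path."""
    return [el.text for el in doc.findall(path)]


print(tags(".//div"))      # //div     -> ['div']
print(texts(".//div//a"))  # //div//a  -> ['one', 'two'] (any depth below div)
print(tags(".//div//*"))   # //div//*  -> ['a', 'p', 'a'] (all descendants)
print(tags(".//div/*"))    # //div/*   -> ['a', 'p'] (direct children only)
```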

Attribute selectors

| XQuery | CSS selector equivalent |
| --- | --- |
| //input[@type='submit'] | input[type='submit'] |
| //a[@rel] | a[rel] |
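The two attribute forms can be tried out with Python's xml.etree.ElementTree, which accepts the same predicate syntax. This is a sketch with a made-up document and does not run xq; the leading dot is required by ElementTree.

```python
import xml.etree.ElementTree as ET

# Made-up sample document. Attributes use double quotes, which is also
# the form xq's parser expects.
form = ET.fromstring(
    '<form>'
    '<input type="text" name="q"/>'
    '<input type="submit" name="go"/>'
    '<a rel="nofollow" href="/x">x</a>'
    '<a href="/y">y</a>'
    '</form>'
)

# [@type='submit'] matches on attribute value, [@rel] on attribute presence.
submit_names = [el.get("name") for el in form.findall(".//input[@type='submit']")]
rel_hrefs = [el.get("href") for el in form.findall(".//a[@rel]")]

print(submit_names)  # ['go']
print(rel_hrefs)     # ['/x']
```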

Position selectors

| XQuery | CSS selector equivalent |
| --- | --- |
| //ul/li[position()=1] | ul > li:first-of-type |
| //ul/li[last()] | ul > li:last-of-type |
| //ul[@test='true']/li[@test='true'][last()] | ul[test='true'] > li[test='true']:last-of-type |
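Position selectors are evaluated per parent, so a document with two lists yields one first item and one last item per list. The sketch below shows this with Python's xml.etree.ElementTree instead of xq. ElementTree does not support position()=1, so the equivalent [1] form is used; the document is made up.

```python
import xml.etree.ElementTree as ET

# Made-up sample document with two lists of different lengths.
page = ET.fromstring(
    "<page>"
    "<ul><li>a1</li><li>a2</li><li>a3</li></ul>"
    "<ul><li>b1</li><li>b2</li></ul>"
    "</page>"
)

# //ul/li[position()=1], written as [1] for ElementTree.
first_items = [li.text for li in page.findall(".//ul/li[1]")]
# //ul/li[last()]
last_items = [li.text for li in page.findall(".//ul/li[last()]")]

print(first_items)  # ['a1', 'b1']
print(last_items)   # ['a3', 'b2']
```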

Content selectors

For number comparisons currently only the integer types are supported. A floating point number will result in an XQuery parse error.

| XQuery | Description |
| --- | --- |
| //price[text()='100 EUR'] | Selects all price nodes whose content equals the string "100 EUR" |
| //price[text()!='100 EUR'] | Selects all price nodes whose content does not equal the string "100 EUR" |
| //price[text()>100] | Selects all price nodes whose content parses as an integer greater than 100 |
| //price[text()=100] | Selects all price nodes whose content parses as an integer equal to 100 |
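The difference between string and integer comparisons can be spelled out in plain Python. This sketch assumes that content which does not parse as an integer simply fails a numeric comparison, which is how the descriptions above read; it is not taken from xq's source, and the helper as_int and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Made-up sample data: two integers, one string with a unit, one float.
store = ET.fromstring(
    "<bookstore>"
    "<price>100</price><price>250</price>"
    "<price>100 EUR</price><price>99.5</price>"
    "</bookstore>"
)


def as_int(text: Optional[str]) -> Optional[int]:
    """Parse text as an integer, or return None if it is not one."""
    try:
        return int(text or "")
    except ValueError:
        return None


def int_greater_than(text: Optional[str], bound: int) -> bool:
    """True only when text parses as an integer and exceeds bound."""
    value = as_int(text)
    return value is not None and value > bound


prices = [p.text for p in store.iter("price")]

# //price[text()='100 EUR'] : plain string equality
equals_string = [t for t in prices if t == "100 EUR"]
# //price[text()>100] : integer comparison, non-integers never match
greater_than_100 = [t for t in prices if int_greater_than(t, 100)]
# //price[text()=100] : integer equality
equals_100 = [t for t in prices if as_int(t) == 100]

print(equals_string)     # ['100 EUR']
print(greater_than_100)  # ['250']
print(equals_100)        # ['100']
```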

Child selectors

Child selectors let you nest an additional XQuery inside a selector; its result is treated as a boolean value. Child selectors support the same syntax as a root XQuery with one exception: the first descendant selector must be relative (it must start directly with a tag, without the / or // prefix).

| XQuery | Description |
| --- | --- |
| //book[price[text()<500]]/title | Selects the titles of all books that have a price that is a number less than 500 |
| //bookstore[book[@lang='en']/genre[text()='comedy']] | Selects all bookstores that have at least one book whose 'lang' attribute is set to 'en' and that has a genre subnode with the text 'comedy' |
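Treating the nested query as a boolean means a node is kept when the nested query matches at least once. The first example above, //book[price[text()<500]]/title, can be sketched in plain Python as follows. The helper has_price_below and the sample data are hypothetical, and this illustrates the semantics only, not xq's implementation.

```python
import xml.etree.ElementTree as ET

# Made-up sample data: one cheap book, one expensive, one with no price.
store = ET.fromstring(
    "<bookstore>"
    "<book><title>Cheap</title><price>120</price></book>"
    "<book><title>Pricey</title><price>900</price></book>"
    "<book><title>Unpriced</title></book>"
    "</bookstore>"
)


def has_price_below(book: ET.Element, limit: int) -> bool:
    """Nested query price[text()<limit], reduced to a boolean."""
    for price in book.findall("price"):
        try:
            if int(price.text or "") < limit:
                return True
        except ValueError:
            continue  # Non-integer content never satisfies the comparison.
    return False


# Keep only books for which the nested query matched, then take /title.
titles = [
    title.text
    for book in store.iter("book")
    if has_price_below(book, 500)
    for title in book.findall("title")
]

print(titles)  # ['Cheap']
```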

Known issues and limitations

The following issues and limitations are known to the maintainer/developers.

  1. Missing error hints - When parsing of the XML or XQ fails, there is no hint about the line and character at which the error occurred.

  2. XML parsing -

    • Parsing of the XML is not as strict as it should be. For example:

      • An XML tag with duplicate attributes is not considered invalid, but it should be.
      • Tags that start with the string 'xml' are considered valid.
      • The XML prelude is not validated at all and can be included anywhere in the document.
      • An XML tag/attribute name can contain multiple namespace indicators and can end with a namespace indicator (:).
      • And many more ...

      But this should not be much of an issue, since xq's primary use is querying and not validating XML.

    • On the other hand, many valid XML constructs are not supported. For example:

      • DTD
      • CDATA
      • Single quotes for attribute definitions
      • And many more ...
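Since xq does not aim to validate XML, a separate strict parser can be used to check well-formedness when that matters. The sketch below uses Python's xml.etree.ElementTree for this purpose; it is one possible approach, not part of xq, and the helper name is_well_formed is hypothetical. It also shows that single-quoted attributes and CDATA, which xq does not support, are accepted by a standard parser.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if a strict XML parser accepts the document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Duplicate attributes are rejected by a strict parser.
print(is_well_formed('<a x="1" x="2"/>'))                # False
# Single-quoted attributes and CDATA sections are valid XML.
print(is_well_formed("<a x='1'><![CDATA[<raw>]]></a>"))  # True
```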

Unit tests

Basic xq use cases are covered by tests in the /test folder.

The tests can be run with:

stack test
