Anatomy of XPATH Queries in AEM

All of us have come across XPATH queries while working with AEM, but very few people understand them or can write them without looking up the documentation every time. It is easier to write queries using QueryBuilder, and that is the approach recommended by Adobe, but as developers we sometimes come across situations where we need to understand or write XPATH queries.


Anatomy

To give you a little bit of background, XPATH is a query language which was initially designed by the W3C to search the nodes inside an XML document. Since an XML document has a hierarchical structure similar to the JCR, XPATH was included as a query language in the JCR specification. Without boring you further with history, let us jump into understanding an XPATH query.

Every XPATH query can be divided into 4 parts:

1) PATH Constraint

2) Type Constraint

3) Property Constraint

4) Order specifiers

Let us see each of them one by one:



1) PATH constraint: This restricts your query to a certain branch of the JCR. For example, to search for every node under /content/geometrixx/en you can simply write

/jcr:root/content/geometrixx/en//*

Here /jcr:root stands for the root node of the repository and the trailing //* matches all descendant nodes. The query is restricted to the path /content/geometrixx/en. If you run the same query directly on the root (/jcr:root//*), it will take a lot of time because it has to traverse the complete repository.

2) Type constraint:

/jcr:root/content/geometrixx/en//element(*, cq:Page)

There are 2 things you should notice here:

a) The “//” after the path and before the “element” keyword – this is the XPATH descendant axis. It tells the query engine to look at all descendants of the path, not only its direct children, and it marks where the path constraint ends.

b) The element keyword – this specifies the type constraint. The above query limits the search to nodes of type cq:Page.

Similarly, to search for all nodes of type nt:unstructured you can modify your query like this :

/jcr:root/content/geometrixx/en//element(*, nt:unstructured) 

If you want to find all the assets under your DAM path, you can modify the path constraint and the type constraint like below:

/jcr:root/content/dam/geometrixx//element(*, dam:Asset)

Here you should keep node type inheritance in mind. nt:base is the base node type and all other node types are derived from it. Therefore, if you search for nt:base nodes, the results will include nodes of every derived type, such as nt:unstructured, cq:Page etc. You should always restrict the node type in your search queries, i.e. if your intention is to search for pages, then restrict the type constraint to cq:Page.

3) Property constraint – This defines the property predicates of your search. If you want to search for all pages which contain a property called “test” whose value is “value”, you can add your property constraint like this:

/jcr:root/content/geometrixx//element(*, cq:Page)[jcr:content/@test = 'value']

Notice the square brackets, and the property constraint written inside them as @name = 'value'. The @ sign marks test as a property rather than a child node.

Since the properties of a page are stored on its jcr:content node, we added the relative path jcr:content/ in front of the property name.
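When the value in a property constraint comes from a variable or from user input, it has to be quoted before it goes into the query. The following Python sketch builds the same kind of predicate as the query above. The helper names xpath_literal and property_predicate are hypothetical, and the sketch assumes the JCR XPATH convention that an apostrophe inside a string literal is escaped by doubling it.

```python
def xpath_literal(value):
    """Quote a string for use in a JCR XPath predicate.

    Assumption: an apostrophe inside a literal is escaped by doubling it.
    """
    return "'" + value.replace("'", "''") + "'"


def property_predicate(relative_path, value):
    """Build an equality predicate such as [jcr:content/@test = 'value']."""
    # The last path segment is the property name and gets the '@' prefix;
    # any segments before it are child node names.
    *nodes, prop = relative_path.split("/")
    target = "/".join(nodes + ["@" + prop])
    return f"[{target} = {xpath_literal(value)}]"


query = "/jcr:root/content/geometrixx//element(*, cq:Page)" + property_predicate(
    "jcr:content/test", "value"
)
print(query)
```

Keeping the quoting in one place avoids broken queries when a value happens to contain an apostrophe.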

Now let’s say we have to search for pages which have a title component whose jcr:title is “test”. We can put the relative path to the component in front of the property name like this:

/jcr:root/content/geometrixx//element(*, cq:Page)[jcr:content/par/title/@jcr:title = 'test']

In the above queries we only check whether a property is equal to a given value, but sometimes we need to find out whether the content contains any of the keywords we are searching for. For this kind of query we can use the special functions provided by XPATH in JCR.

1) jcr:contains() – This function runs a full-text search for the specified keywords. Its first argument is the scope of the search: “.” stands for the node itself, and a property name limits the search to that property. For example, the query below will match any of the pages where the word “test” appears in the indexed full-text content of the node:

/jcr:root/content/geometrixx//element(*, cq:Page)[jcr:contains(., 'test')]

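2) jcr:like() – This function matches a single property against a pattern. Unlike jcr:contains, a pattern without wildcards has to match the whole value, so the query below will only return pages whose title is exactly “test”:

/jcr:root/content/geometrixx//element(*, cq:Page)[jcr:like(jcr:content/@jcr:title, 'test')]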

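For jcr:like we must specify the property name. It cannot run on all the properties of a node, and without wildcards it does an exact match. We can make it support wildcard characters by modifying the function call to

jcr:like(jcr:content/@jcr:title, '%test%')

Notice the % symbol. It matches any sequence of zero or more characters in the jcr:title property, so this pattern finds “test” anywhere in the title. The _ symbol can be used in the same way to match exactly one character.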
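To make the wildcard behaviour concrete, here is a small Python sketch that emulates it with a regular expression. It is only an illustration of the matching rules and not how the repository evaluates jcr:like. The function name like_matches is hypothetical, and the sketch assumes that % stands for any run of characters (including none) and _ for exactly one character, and it ignores escape sequences.

```python
import re


def like_matches(pattern, value):
    """Emulate the wildcard semantics of jcr:like for illustration only."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any run of characters, possibly empty
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    # fullmatch: the pattern must cover the whole property value
    return re.fullmatch("".join(parts), value, flags=re.DOTALL) is not None


print(like_matches("test", "my test page"))    # no wildcards: value must equal 'test'
print(like_matches("%test%", "my test page"))  # 'test' may appear anywhere
print(like_matches("test_", "tests"))          # '_' consumes one extra character
```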

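It is advised to prefer jcr:contains over jcr:like wherever a full-text match is sufficient, because contains() is served by the full-text index, whereas a jcr:like pattern that starts with % usually cannot use an index efficiently. Also, the fulltext predicate of QueryBuilder is translated to jcr:contains().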

3) Other operations – we can also use comparison operators like >, <, >=, <= etc.

For example – to find all the pages that were created on or after a given timestamp (say, yesterday):

/jcr:root/content/geometrixx//element(*, cq:Page)[@jcr:created >= xs:dateTime('2015-08-19T18:13:17.022+05:30')]

You can find all the assets which were uploaded since yesterday by modifying the above query to match the dam:Asset node type:

/jcr:root/content/dam/geometrixx//element(*, dam:Asset)[@jcr:created >= xs:dateTime('2015-08-19T18:13:17.022+05:30')]
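The timestamp in these queries is hard-coded, so “yesterday” has to be computed by whatever code builds the query. As a rough Python sketch of that step, the hypothetical helpers xs_datetime_literal and created_since below format a timezone-aware datetime in the same ISO 8601 form used above and place it in the date comparison.

```python
from datetime import datetime, timedelta, timezone


def xs_datetime_literal(moment):
    """Format a timezone-aware datetime as an xs:dateTime(...) expression."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    # isoformat() yields a string of the form 2015-08-19T18:13:17.022+05:30
    return "xs:dateTime('" + moment.isoformat(timespec="milliseconds") + "')"


def created_since(path, node_type, moment):
    """Build a query for nodes of node_type under path created at or after moment."""
    return (
        f"/jcr:root{path}//element(*, {node_type})"
        f"[@jcr:created >= {xs_datetime_literal(moment)}]"
    )


yesterday = datetime.now(timezone.utc) - timedelta(days=1)
print(created_since("/content/dam/geometrixx", "dam:Asset", yesterday))
```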

4) Ordering specifiers – These order your search results by a given property. For example, to sort the search results by the jcr:lastModified date, you can specify the ordering like below:

/jcr:root/content/geometrixx//element(*, cq:Page)[jcr:content/@test = 'value'] order by @jcr:lastModified, @jcr:score

We can specify multiple sort properties by separating them with a comma “,”.
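Putting the four parts together, a query can be assembled mechanically. The Python sketch below is one possible way to do that. The function build_xpath and its parameter names are hypothetical and only mirror the four parts described in this article.

```python
def build_xpath(path, node_type, predicate=None, order_by=()):
    """Assemble a JCR XPath query from its four parts.

    path       -- path constraint, e.g. '/content/geometrixx'
    node_type  -- type constraint, e.g. 'cq:Page'
    predicate  -- property constraint without the brackets, or None
    order_by   -- property names to sort on, in priority order
    """
    query = f"/jcr:root{path}//element(*, {node_type})"
    if predicate:
        query += f"[{predicate}]"
    if order_by:
        query += " order by " + ", ".join("@" + name for name in order_by)
    return query


print(build_xpath(
    "/content/geometrixx",
    "cq:Page",
    predicate="jcr:content/@test = 'value'",
    order_by=["jcr:lastModified", "jcr:score"],
))
```

The path and the node type are required arguments here, in line with the advice above to always restrict both.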







These are the basics of writing any XPATH query. These basic blocks can be combined to create complex queries. I have skipped a few items to keep this an introductory article. If you like this article, please share it with your friends, and if you have any questions, please leave them in the comments. I will be happy to answer them.

