Nested AND and OR Queries with Elasticsearch

This is the third part of my series about Elasticsearch. In the second part I described how to index documents with Elasticsearch and what a very simple query looks like. This part covers how to filter a result set and how to combine filters with logical AND and OR.

Prepare the Elasticsearch Index with Some Test Documents

To have a dataset for our queries, we'll index a few countries with their language, their number of residents and the currency that is used in the country. Because no document ID is given, the documents are sent with POST and Elasticsearch generates the IDs:

curl -XPOST -d '{"name": "spain", "currency": "EUR", "residents": 47000000, "language": "spanish"}' http://localhost:9200/countries/europe
curl -XPOST -d '{"name": "germany", "currency": "EUR", "residents": 81000000, "language": "german"}' http://localhost:9200/countries/europe
curl -XPOST -d '{"name": "usa", "currency": "USD", "residents": 320000000, "language": "english"}' http://localhost:9200/countries/america
curl -XPOST -d '{"name": "mexico", "currency": "MXN", "residents": 123000000, "language": "spanish"}' http://localhost:9200/countries/america
curl -XPOST -d '{"name": "england", "currency": "GBP", "residents": 53000000, "language": "english"}' http://localhost:9200/countries/europe
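The same documents can also be indexed from a script. The following is a minimal Python sketch that only builds the HTTP requests and leaves sending them to a separate function. The names `BASE_URL`, `COUNTRIES`, `build_index_requests` and `send_all` are assumptions made for illustration, and the Content-Type header is added because newer Elasticsearch versions require it.

```python
import json
import urllib.request

# Assumed address of a local node, as in the curl commands above.
BASE_URL = "http://localhost:9200"

# (type, document) pairs mirroring the test data above.
COUNTRIES = [
    ("europe", {"name": "spain", "currency": "EUR", "residents": 47000000, "language": "spanish"}),
    ("europe", {"name": "germany", "currency": "EUR", "residents": 81000000, "language": "german"}),
    ("america", {"name": "usa", "currency": "USD", "residents": 320000000, "language": "english"}),
    ("america", {"name": "mexico", "currency": "MXN", "residents": 123000000, "language": "spanish"}),
    ("europe", {"name": "england", "currency": "GBP", "residents": 53000000, "language": "english"}),
]


def build_index_requests(index="countries"):
    """Return one unsent POST request per test document (hypothetical helper)."""
    requests = []
    for doc_type, doc in COUNTRIES:
        requests.append(
            urllib.request.Request(
                url=f"{BASE_URL}/{index}/{doc_type}",
                data=json.dumps(doc).encode("utf-8"),
                headers={"Content-Type": "application/json"},
                method="POST",
            )
        )
    return requests


def send_all(requests):
    """Send the requests one by one; this needs a running Elasticsearch node."""
    for request in requests:
        with urllib.request.urlopen(request) as response:
            print(response.status, request.full_url)
```

Calling `send_all(build_index_requests())` would index all five documents.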

Elasticsearch Filter Queries and Filtered Queries

In the previous post about "Indexing and simple Queries" we already learned how to use a simple "match" query for a field.

When we adapt this query to the current data structure, it looks like this:

POST http://localhost:9200/countries/_search
{
  "query": {
    "match": {
      "currency": "EUR"
    }
  }
}

In a SQL database, a roughly comparable query would look like this:

SELECT * FROM countries WHERE currency LIKE '%EUR%';
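To run the "match" query from a script instead of a REST client, something like the following Python sketch could be used. It relies only on the standard library. The helper names `build_search_request`, `hit_names` and `search_names` are assumptions made for illustration, and `hit_names` expects the usual search response layout, in which the documents are found under `hits.hits[]._source`.

```python
import json
import urllib.request

# The "match" query from above as a Python dictionary.
MATCH_EUR = {"query": {"match": {"currency": "EUR"}}}


def build_search_request(body, index="countries", base_url="http://localhost:9200"):
    """Build (but do not send) a POST request to the _search endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def hit_names(response):
    """Extract the "name" field from every hit of a parsed search response."""
    return [hit["_source"]["name"] for hit in response["hits"]["hits"]]


def search_names(body):
    """Send the query and return the matching names; needs a running node."""
    with urllib.request.urlopen(build_search_request(body)) as raw:
        return hit_names(json.load(raw))
```

With the test data indexed, `search_names(MATCH_EUR)` should return the names of the countries that use the euro.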

Two aspects matter in a search: filtering, which decides which documents are part of the result, and scoring, which decides the order of the results.

Because the order is the most important factor for good results, Elasticsearch is very flexible in this area.

Filter Query

Elasticsearch queries have two "contexts".

The Query Context - The WHERE: Use this context for operations that influence the scoring, that is, where in the result list a document appears. You enter the query context with the "query" keyword in the request body.

The Filter Context - The WHETHER: The filter context influences the result set, that is, whether a document is included in the result set or not.

Elastic recommends using the filter context whenever possible, that is, for every condition that does not need to affect the score.

Besides the query from the beginning, you can get the same result set with this query:

{
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ],
      "filter": [
        {
          "match": {
            "currency": "EUR"
          }
        }
      ]
    }
  }
}

In this example, "match_all" selects every document and the filter narrows the result set to the documents whose currency matches "EUR".

Filter clauses do not contribute to the score, and Elasticsearch can cache filters that are used often. This pays off most when many queries share the same filters and those filters exclude a large number of documents.

Filtered Query

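An additional variant is the "Filtered Query", which combines a query with a filter that reduces the result set from the beginning.

Reducing the result set early matters especially for large datasets, because fewer documents have to be scored. Note that the "filtered" query was deprecated in Elasticsearch 2.0 and removed in 5.0; on those versions, use a "bool" query with a "filter" clause as shown above.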

{
  "query": {
    "filtered": {
      "filter": {
        "match": {
          "currency": "EUR"
        }
      }
    }
  }
}

When a Filtered Query contains no "query" part, Elasticsearch defaults to matching all documents and applies the filter to them.
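On Elasticsearch versions that no longer accept "filtered", the same request can be expressed as a "bool" query. The following Python sketch shows one way to rewrite a request body: the "query" part moves to "must" (falling back to "match_all") and the "filter" part moves to "filter". The function name `filtered_to_bool` is an assumption made for illustration, and the sketch only handles a "filtered" query at the top level.

```python
def filtered_to_bool(body):
    """Rewrite a top-level "filtered" query as a "bool" query (hypothetical helper).

    Bodies that do not use "filtered" are returned unchanged.
    """
    query = body.get("query", {})
    if "filtered" not in query:
        return body

    filtered = query["filtered"]
    # A missing "query" part means "match all documents".
    bool_query = {"must": [filtered.get("query", {"match_all": {}})]}
    if "filter" in filtered:
        bool_query["filter"] = [filtered["filter"]]

    # Copy the outer dictionary so the caller's body is not modified.
    converted = dict(body)
    converted["query"] = {"bool": bool_query}
    return converted


old_style = {"query": {"filtered": {"filter": {"match": {"currency": "EUR"}}}}}
new_style = filtered_to_bool(old_style)
```

Here `new_style` has the same shape as the "bool" query with "match_all" and "filter" from the Filter Query section.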

Combining Filters with AND / OR - Boolean Filter with must and should

Most of the time a single filter is not enough and you need to combine multiple filters or even nest them. Coming back to our SQL sample, a comparable SQL query could look like this:

SELECT * FROM countries WHERE residents > 60000000 AND language LIKE '%english%';

In this case we would expect to retrieve all countries with more than 60000000 residents and "english" as language. The only match in our test dataset is "usa", because it is the only document that meets both criteria.

In Elasticsearch you can build such a boolean query by using the "must" clause, which combines the filters with a logical AND.

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "match": {
                "language": "english"
              }
            },
            {
              "range": {
                "residents": {
                  "gt": 60000000
                }
              }
            }
          ]
        }
      }
    }
  }
}

If we now combine the filters with OR, as in the following SQL query:

SELECT * FROM countries WHERE residents > 60000000 OR language LIKE '%english%';

we would expect Elasticsearch to return all countries except spain, because all others have more than 60000000 residents or "english" as language.

The equivalent to OR is "should". In a "bool" filter that contains only "should" clauses, at least one of them has to match:

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "should": [
            {
              "match": {
                "language": "english"
              }
            },
            {
              "range": {
                "residents": {
                  "gt": 60000000
                }
              }
            }
          ]
        }
      }
    }
  }
}
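Both operators can also be nested, because a "bool" filter may itself appear as a clause inside "must" or "should". The following Python sketch is not part of Elasticsearch: it emulates "match", "range" and "bool" in a simplified way on the five test documents, so the expected results can be checked without a running node. The names `matches` and `select` and the nested example filter are assumptions made for illustration.

```python
COUNTRIES = [
    {"name": "spain", "currency": "EUR", "residents": 47000000, "language": "spanish"},
    {"name": "germany", "currency": "EUR", "residents": 81000000, "language": "german"},
    {"name": "usa", "currency": "USD", "residents": 320000000, "language": "english"},
    {"name": "mexico", "currency": "MXN", "residents": 123000000, "language": "spanish"},
    {"name": "england", "currency": "GBP", "residents": 53000000, "language": "english"},
]

RANGE_OPS = {
    "gt": lambda a, b: a > b,
    "gte": lambda a, b: a >= b,
    "lt": lambda a, b: a < b,
    "lte": lambda a, b: a <= b,
}


def matches(clause, doc):
    """Return True if doc satisfies a filter clause (simplified emulation)."""
    kind, spec = next(iter(clause.items()))
    if kind == "match":
        # Simplification: case-insensitive comparison of the whole value.
        field, value = next(iter(spec.items()))
        return str(doc.get(field, "")).lower() == str(value).lower()
    if kind == "range":
        field, bounds = next(iter(spec.items()))
        return all(RANGE_OPS[op](doc[field], limit) for op, limit in bounds.items())
    if kind == "bool":
        must_ok = all(matches(c, doc) for c in spec.get("must", []))
        should = spec.get("should", [])
        # Simplification: if "should" clauses exist, at least one has to match.
        should_ok = not should or any(matches(c, doc) for c in should)
        return must_ok and should_ok
    raise ValueError(f"unsupported clause: {kind}")


def select(clause):
    """Return the names of all test documents that satisfy the clause."""
    return [doc["name"] for doc in COUNTRIES if matches(clause, doc)]


english = {"match": {"language": "english"}}
large = {"range": {"residents": {"gt": 60000000}}}

AND_FILTER = {"bool": {"must": [english, large]}}
OR_FILTER = {"bool": {"should": [english, large]}}

# residents > 50000000 AND (language = spanish OR currency = EUR)
NESTED_FILTER = {
    "bool": {
        "must": [
            {"range": {"residents": {"gt": 50000000}}},
            {"bool": {"should": [
                {"match": {"language": "spanish"}},
                {"match": {"currency": "EUR"}},
            ]}},
        ]
    }
}

print(select(AND_FILTER))     # ['usa']
print(select(OR_FILTER))      # ['germany', 'usa', 'mexico', 'england']
print(select(NESTED_FILTER))  # ['germany', 'mexico']
```

To run the nested example against Elasticsearch, place the "bool" object from `NESTED_FILTER` into the "filter" part of the queries above. The OR group is wrapped in its own "bool" on purpose, so that it does not depend on how a "should" next to a "must" is treated.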

This was the first more in-depth tutorial about the Elasticsearch query syntax. I wish you a lot of fun exploring Elasticsearch.

 
