Java MongoDB Driver 3.3.0 - Full Text Search Example

 Posted On  | Yashwant Chavan 

Today we are going to learn how to perform full-text search using the MongoDB Java driver. Full-text search relies on text indexes, is language sensitive, and ranks results by relevance according to their match score.

Tools and Technologies

We are using the following tools and technologies:

  1. Maven 3.0.4
  2. JDK 1.8
  3. Mongo Java Driver 3.3.0

Maven Dependencies

Define mongo-java-driver maven dependencies in pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.technicalkeeda</groupId>
    <artifactId>JavaExamples</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <build>
        <plugins>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.7</source>
                    <target>1.7</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
    <dependencies>
        <dependency>
            <groupId>org.mongodb</groupId>
            <artifactId>mongo-java-driver</artifactId>
            <version>3.3.0</version>
        </dependency>
    </dependencies>
</project>

Connecting to database

MongoClient maintains a connection pool internally, so a single MongoClient instance is usually enough for the entire JVM. To connect to a local MongoDB database, use the host name (localhost) and the default port number (27017).

 MongoClient mongoClient = new MongoClient(new MongoClientURI("mongodb://localhost:27017"));

Get the database instance

Once you are connected, access the MongoDB database, e.g. technicalkeeda.

 MongoDatabase database = mongoClient.getDatabase("technicalkeeda");

Get Collection

Get the articles collection from the database. If the collection does not exist yet, MongoDB creates it when data is first stored in it.

 MongoCollection<Document> articles = database.getCollection("articles");

mongoDB $text operator

$text performs a text search on the content of fields covered by a text index. In the example below, the text index is created on the "subject" field of the article documents. A $text expression has the following syntax:

{
  $text:
    {
      $search: <string>,
      $language: <string>,
      $caseSensitive: <boolean>,
      $diacriticSensitive: <boolean>
    }
}

Before proceeding with the example, you should know about the following search fields.

$search - A string of search terms that MongoDB parses and uses to query the text index. MongoDB performs a logical OR search of the terms unless they are specified as a phrase.

$language - Optional. The language that determines the list of stop words for the search and the rules for the stemmer and tokenizer. The $text operator supports several languages, including Dutch, French, German and Spanish.

$caseSensitive - Optional. A boolean flag to enable or disable case-sensitive search. Defaults to false.

$diacriticSensitive - Optional. A boolean flag to enable or disable diacritic-sensitive search against version 3 text indexes. Defaults to false.
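
As a rough illustration of the syntax above, the following sketch assembles a $text expression as nested maps using only the JDK, adding each optional field only when a value is supplied. The class and method names (TextFilterSketch, textFilter) are made up for this example. Since the driver's Document class can be constructed from a Map<String, Object>, a map built this way could be wrapped with new Document(...) before being passed to find().

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TextFilterSketch {

    // Builds the { $text: { ... } } expression as nested maps.
    // Optional fields are added only when the caller passes a non-null value.
    public static Map<String, Object> textFilter(String search, String language,
                                                 Boolean caseSensitive, Boolean diacriticSensitive) {
        Map<String, Object> options = new LinkedHashMap<>();
        options.put("$search", search); // the only required field
        if (language != null) {
            options.put("$language", language);
        }
        if (caseSensitive != null) {
            options.put("$caseSensitive", caseSensitive);
        }
        if (diacriticSensitive != null) {
            options.put("$diacriticSensitive", diacriticSensitive);
        }

        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("$text", options);
        return filter;
    }

    public static void main(String[] args) {
        // prints {$text={$search=coffee}}
        System.out.println(textFilter("coffee", null, null, null));
        // prints {$text={$search=Coffee, $language=english, $caseSensitive=true}}
        System.out.println(textFilter("Coffee", "english", true, null));
    }
}
```

Leaving an optional field out of the expression lets the server apply its default, which is false for both sensitivity flags.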

To demonstrate, I have inserted a few records into the articles collection; they help illustrate the different full-text search scenarios.

#1 - $search: "coffee"

This query returns the documents that contain the term coffee in the indexed subject field.

 db.articles.find( { $text: { $search: "coffee" } } )
 1] query = coffee , caseSensitive = false, diacriticSensitive = false
 Document{{_id=2, subject=Popular Coffee Shopping}}
 Document{{_id=7, subject=coffee and cream}}
 Document{{_id=1, subject=coffee}}

#2 - $search: "bake coffee cake"

This query returns documents that contain "bake", "coffee" or "cake" in the indexed subject field or, more precisely, the stemmed versions of these words (e.g. bake, baking, baked).

 db.articles.find( { $text: { $search: "bake coffee cake" } } )
 2] query = bake coffee cake , caseSensitive = false, diacriticSensitive = false
 Document{{_id=2, subject=Popular Coffee Shopping}}
 Document{{_id=7, subject=coffee and cream}}
 Document{{_id=1, subject=coffee}}
 Document{{_id=3, subject=Baking a cake}}
 Document{{_id=4, subject=baking}}

#3 - $search: "\"coffee shop\""

This query returns documents that contain the phrase "coffee shop". The phrase is wrapped in escaped double quotes (\") inside the search string.

 db.articles.find( { $text: { $search: "\"coffee shop\"" } } )
 3] query = "coffee shop" , caseSensitive = false, diacriticSensitive = false
 Document{{_id=2, subject=Popular Coffee Shopping}}

#4 - $search: "coffee -shop"

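This query returns documents that contain the word "coffee" but do not contain the term "shop". Prefixing a term with a hyphen (-) negates it. Document 2 is excluded because "Shopping" stems to "shop".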

 db.articles.find( { $text: { $search: "coffee -shop" } } )
 4] query = coffee -shop , caseSensitive = false, diacriticSensitive = false
 Document{{_id=7, subject=coffee and cream}}
 Document{{_id=1, subject=coffee}}
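
To make the three forms used in examples #2 to #4 easier to tell apart, here is a small JDK-only sketch that splits a $search string into plain terms, quoted phrases and negated terms. It is a simplified illustration, not MongoDB's actual parser: the class name SearchStringSketch and its fields are hypothetical, and it ignores stemming, stop words and edge cases such as negated phrases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SearchStringSketch {

    public final List<String> terms = new ArrayList<>();   // OR-ed together
    public final List<String> phrases = new ArrayList<>(); // must appear as written
    public final List<String> negated = new ArrayList<>(); // must not appear

    // Either a double-quoted phrase or a run of non-whitespace characters.
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    public static SearchStringSketch parse(String search) {
        SearchStringSketch parsed = new SearchStringSketch();
        Matcher m = TOKEN.matcher(search);
        while (m.find()) {
            if (m.group(1) != null) {
                parsed.phrases.add(m.group(1));
            } else if (m.group(2).startsWith("-") && m.group(2).length() > 1) {
                parsed.negated.add(m.group(2).substring(1)); // drop the leading hyphen
            } else {
                parsed.terms.add(m.group(2));
            }
        }
        return parsed;
    }

    @Override
    public String toString() {
        return "terms=" + terms + " phrases=" + phrases + " negated=" + negated;
    }

    public static void main(String[] args) {
        // prints terms=[bake, coffee, cake] phrases=[] negated=[]
        System.out.println(parse("bake coffee cake"));
        // prints terms=[] phrases=[coffee shop] negated=[]
        System.out.println(parse("\"coffee shop\""));
        // prints terms=[coffee] phrases=[] negated=[shop]
        System.out.println(parse("coffee -shop"));
    }
}
```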

#5 - Case Sensitive Search - $search: "Coffee"

This query performs a case-sensitive search for the term "Coffee". Setting $caseSensitive: true may reduce search performance.

 db.articles.find( { $text: { $search: "Coffee", $caseSensitive: true } } )
 5] query = Coffee , caseSensitive = true , diacriticSensitive = false
 Document{{_id=2, subject=Popular Coffee Shopping}}

#6 - Diacritic Sensitive Search - $search: "CAFÉ"

This query performs a diacritic-sensitive text search on the term "CAFÉ".

 db.articles.find( { $text: { $search: "CAFÉ", $diacriticSensitive: true } } )
 6] query = CAFÉ , caseSensitive = false , diacriticSensitive = true
 Document{{_id=5, subject=Café Con Leche}}
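
The effect of the two flags in examples #5 and #6 can be pictured as "folding" words before comparing them. The sketch below uses only the JDK (java.text.Normalizer) to strip diacritics and lower-case a word depending on the flags. It is an approximation for illustration, not the algorithm MongoDB uses internally, and the names FoldingSketch, fold and sameWord are invented here.

```java
import java.text.Normalizer;
import java.util.Locale;

public class FoldingSketch {

    // Reduces a word to the form that is compared under the given flags.
    public static String fold(String word, boolean caseSensitive, boolean diacriticSensitive) {
        // Decompose accented letters into base letter + combining mark.
        String result = Normalizer.normalize(word, Normalizer.Form.NFD);
        if (!diacriticSensitive) {
            // Drop the combining marks, so that an accented E becomes a plain E.
            result = result.replaceAll("\\p{M}", "");
        }
        if (!caseSensitive) {
            result = result.toLowerCase(Locale.ROOT);
        }
        return result;
    }

    public static boolean sameWord(String a, String b, boolean caseSensitive, boolean diacriticSensitive) {
        return fold(a, caseSensitive, diacriticSensitive)
                .equals(fold(b, caseSensitive, diacriticSensitive));
    }

    public static void main(String[] args) {
        String query = "CAF\u00C9";    // CAFE with an acute accent on the E
        String accented = "Caf\u00E9"; // as in document 5
        String plain = "Cafe";         // as in document 9

        // Diacritic sensitive, case insensitive: only the accented word matches.
        System.out.println(sameWord(query, accented, false, true)); // prints true
        System.out.println(sameWord(query, plain, false, true));    // prints false

        // Both flags false: the accent is ignored, so the plain word matches too.
        System.out.println(sameWord(query, plain, false, false));   // prints true

        // Case sensitive: "Coffee" and "coffee" are different words.
        System.out.println(sameWord("Coffee", "coffee", true, false)); // prints false
    }
}
```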

Full Example

package com.technicalkeeda.app;

import java.util.logging.Level;
import java.util.logging.Logger;

import org.bson.Document;

import com.mongodb.MongoClient;
import com.mongodb.MongoClientURI;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.model.IndexOptions;


public class App {
    public static void main(String[] args) {
        Logger mongoLogger = Logger.getLogger("org.mongodb.driver");
        mongoLogger.setLevel(Level.SEVERE);

        App app = new App();
        app.insert();
        System.out.println("1] query = coffee , caseSensitive = false, diacriticSensitive = false");
        app.fullTextSearch("coffee", false, false);
        System.out.println("-------------------------------------------------------------------------");

        System.out.println("2] query = bake coffee cake , caseSensitive = false, diacriticSensitive = false");
        app.fullTextSearch("bake coffee cake", false, false);
        System.out.println("-------------------------------------------------------------------------");


        System.out.println("3] query = \"coffee shop\" , caseSensitive = false, diacriticSensitive = false");
        app.fullTextSearch("\"coffee shop\"", false, false);
        System.out.println("-------------------------------------------------------------------------");


        System.out.println("4] query = coffee -shop , caseSensitive = false, diacriticSensitive = false");
        app.fullTextSearch("coffee -shop", false, false);
        System.out.println("-------------------------------------------------------------------------");


        System.out.println("5] query = Coffee , caseSensitive = true , diacriticSensitive = false");
        app.fullTextSearch("Coffee", true, false);
        System.out.println("-------------------------------------------------------------------------");


        System.out.println("6] query = CAFÉ , caseSensitive = false , diacriticSensitive = true");
        app.fullTextSearch("CAFÉ", false, true);
        System.out.println("-------------------------------------------------------------------------");

    }

    public void insert() {
        MongoClient mongoClient = new MongoClient(new MongoClientURI("mongodb://localhost:27017"));
        MongoDatabase database = mongoClient.getDatabase("technicalkeeda");
        MongoCollection<Document> collection = database.getCollection("articles");
        collection.drop();
        collection.createIndex(new Document("subject", "text"), new IndexOptions());
        try {
            collection.insertOne(new Document("_id", 1).append("subject", "coffee"));
            collection.insertOne(new Document("_id", 2).append("subject", "Popular Coffee Shopping"));
            collection.insertOne(new Document("_id", 3).append("subject", "Baking a cake"));
            collection.insertOne(new Document("_id", 4).append("subject", "baking"));
            collection.insertOne(new Document("_id", 5).append("subject", "Café Con Leche"));
            collection.insertOne(new Document("_id", 6).append("subject", "???????"));
            collection.insertOne(new Document("_id", 7).append("subject", "coffee and cream"));
            collection.insertOne(new Document("_id", 9).append("subject", "Cafe con Leche"));

        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            mongoClient.close();
        }
    }

    public void fullTextSearch(String query, boolean caseSensitive, boolean diacriticSensitive) {

        MongoClient mongoClient = new MongoClient(new MongoClientURI("mongodb://localhost:27017"));
        MongoDatabase database = mongoClient.getDatabase("technicalkeeda");
        MongoCollection<Document> collection = database.getCollection("articles");

        // Build the $text filter; autoboxing makes the deprecated new Boolean(...) unnecessary.
        Document textFilter = new Document("$text", new Document("$search", query)
                .append("$caseSensitive", caseSensitive)
                .append("$diacriticSensitive", diacriticSensitive));

        // MongoCursor is Closeable, so try-with-resources closes it even if iteration fails.
        try (MongoCursor<Document> cursor = collection.find(textFilter).iterator()) {
            while (cursor.hasNext()) {
                Document article = cursor.next();
                System.out.println(article);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            mongoClient.close();
        }

    }

}

Inserted Records into mongoDB collection

> use technicalkeeda
switched to db technicalkeeda
> db.articles.find()
{ "_id" : 1, "subject" : "coffee" }
{ "_id" : 2, "subject" : "Popular Coffee Shopping" }
{ "_id" : 3, "subject" : "Baking a cake" }
{ "_id" : 4, "subject" : "baking" }
{ "_id" : 5, "subject" : "Café Con Leche" }
{ "_id" : 6, "subject" : "???????" }
{ "_id" : 7, "subject" : "coffee and cream" }
{ "_id" : 9, "subject" : "Cafe con Leche" }
>

Program Output

1] query = coffee , caseSensitive = false, diacriticSensitive = false
Document{{_id=2, subject=Popular Coffee Shopping}}
Document{{_id=7, subject=coffee and cream}}
Document{{_id=1, subject=coffee}}
-------------------------------------------------------------------------
2] query = bake coffee cake , caseSensitive = false, diacriticSensitive = false
Document{{_id=2, subject=Popular Coffee Shopping}}
Document{{_id=7, subject=coffee and cream}}
Document{{_id=1, subject=coffee}}
Document{{_id=3, subject=Baking a cake}}
Document{{_id=4, subject=baking}}
-------------------------------------------------------------------------
3] query = "coffee shop" , caseSensitive = false, diacriticSensitive = false
Document{{_id=2, subject=Popular Coffee Shopping}}
-------------------------------------------------------------------------
4] query = coffee -shop , caseSensitive = false, diacriticSensitive = false
Document{{_id=7, subject=coffee and cream}}
Document{{_id=1, subject=coffee}}
-------------------------------------------------------------------------
5] query = Coffee , caseSensitive = true , diacriticSensitive = false
Document{{_id=2, subject=Popular Coffee Shopping}}
-------------------------------------------------------------------------
6] query = CAFÉ , caseSensitive = false , diacriticSensitive = true
Document{{_id=5, subject=Café Con Leche}}
-------------------------------------------------------------------------


© technicalkeeda.com 2017
