Bulk Indexing of Data to a Remote Elasticsearch Cluster

SearchBlox lets you grow your search data to very large scale by indexing to a remote Elasticsearch cluster. Whether you need to crawl millions of websites, scan terabytes of network folders, or load your transactional data in real time, multiple SearchBlox servers can bulk index in parallel to the same remote Elasticsearch cluster.

Elasticsearch Connectors

#1 Enable your Elasticsearch Cluster

Please note that this requires a full SearchBlox license, not the free version.

Go to ..//config/elasticsearch.yml and add or enable the following lines in the configuration.

cluster.name: test
script.disable_dynamic: false
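
Once the cluster has been restarted with these settings, it can help to confirm that it really reports the cluster name you configured. The following is a minimal sketch, not part of SearchBlox: it assumes the cluster's HTTP port is reachable on 9200 and uses Elasticsearch's `_cluster/health` endpoint, whose response includes a `cluster_name` field. The function names and the `host` argument are illustrative.

```python
import json
from urllib.request import urlopen


def fetch_cluster_health(host, port=9200, timeout=5):
    """Return the parsed JSON body of GET /_cluster/health.

    `host` is whatever address your remote cluster listens on (assumption:
    plain HTTP, no authentication).
    """
    url = f"http://{host}:{port}/_cluster/health"
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def cluster_name_matches(health, expected_name):
    """True if the health document reports the expected cluster name."""
    return health.get("cluster_name") == expected_name
```

For the configuration above you would call `cluster_name_matches(fetch_cluster_health(host), "test")`, passing your own host address.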

#2 Enable SearchBlox to connect to the remote ES Cluster

Stop SearchBlox, open //webapps/searchblox/WEB-INF/elasticsearch.yml, and provide the address of the remote server and the cluster name. Disable all other lines in the file by prefixing each of them with #.

# Enter the IP of the remote Elasticsearch
searchblox.remote.es:
# Enter the cluster name of the remote Elasticsearch (the default is searchblox)
searchblox.remote.es.cluster.name: test

On the first line, give the IP address of the remote server; on the second line, give the cluster name exactly as it is set on the remote Elasticsearch cluster.
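
Because every other line in this file has to be commented out, a quick check of which settings are still active can catch mistakes before you restart. The sketch below is an illustration only: it treats the file as flat `key: value` lines (which is how the two settings above are written), and the helper names are hypothetical.

```python
# The only two settings this step expects to remain active.
EXPECTED_KEYS = {"searchblox.remote.es", "searchblox.remote.es.cluster.name"}


def active_settings(yml_text):
    """Return {key: value} for every line that is not blank or commented out."""
    settings = {}
    for raw in yml_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def unexpected_keys(yml_text):
    """List active keys other than the two remote-cluster settings."""
    return sorted(set(active_settings(yml_text)) - EXPECTED_KEYS)
```

Reading the file into a string and passing it to `unexpected_keys` should give an empty list when only the two remote settings are enabled.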

#3 Indexing and Searching

After you restart SearchBlox, data is indexed to and searched from the remote Elasticsearch cluster. While SearchBlox starts, the log message “connected to remote server” appears in the SearchBlox console.
Another way to confirm that the indexes were created remotely is to check the remote Elasticsearch cluster itself. You can do this with an Elasticsearch tool such as Sense, or from a browser at http://:9200/_cat/indices
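
The same check can be scripted. This is a rough sketch under a few assumptions: the cluster answers plain HTTP on port 9200, and the request adds the cat API's `h` parameter to ask only for the `index` and `docs.count` columns. The function names are illustrative and not part of SearchBlox.

```python
from urllib.request import urlopen


def parse_cat_indices(text):
    """Parse `_cat/indices?h=index,docs.count` output into {index: doc_count}.

    Lines that do not have exactly two columns (for example, blank lines)
    are skipped.
    """
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        name, docs = parts
        counts[name] = int(docs) if docs.isdigit() else None
    return counts


def fetch_index_counts(host, port=9200, timeout=5):
    """Fetch index names and document counts from the remote cluster."""
    url = f"http://{host}:{port}/_cat/indices?h=index,docs.count"
    with urlopen(url, timeout=timeout) as resp:
        return parse_cat_indices(resp.read().decode("utf-8"))
```

Calling `fetch_index_counts(host)` with the remote server's address returns a dictionary you can compare before and after a crawl to see whether document counts are growing.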

Learn more by visiting our documentation site.

Contact Us to learn more on enabling large scale indexing/searching of your data using Elasticsearch.