Indexer

An Indexer sends processed Documents to a specific destination.

Indexers

An Indexer is a thread that retrieves processed Documents from the end of a Pipeline and sends them in batches to a specific destination. For users of Lucille, this destination will most commonly be a search engine.

Only one Indexer can be defined in a Lucille run. All pipelines feed into that same Indexer.

Indexer configuration has two parts:

  • The generic indexer configuration.

  • Configuration for the implementation you are using. For example, if you are using Solr, you'd provide solr config; for Elasticsearch, elastic; for CSV, csv; and so on.

Here’s what using the SolrIndexer might look like:

# Generic indexer config
indexer {
  type: "solr"
  ignoreFields: ["city_temp"]
  batchSize: 100
}
# Specific implementation (Solr) config
solr {
  useCloudClient: true
  url: "localhost:9200"
  defaultCollection: "test_index"
}

At a minimum, the indexer block must contain either type or class. type is shorthand for an Indexer provided by lucille-core: it can be "Solr", "OpenSearch", "ElasticSearch", or "CSV". The indexer block can contain a variety of additional properties as well, although not every Indexer supports every property. For example, OpenSearchIndexer and ElasticsearchIndexer do not support indexer.indexOverrideField.
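As a rough sketch of the class alternative, the generic block might select the implementation by class name rather than by type. The fully qualified class name below is an assumption made for illustration and is not confirmed by this page; substitute the class of the Indexer you actually use. The implementation-specific block (solr here) would be provided as before.

```
# Generic indexer config, selecting the implementation by class instead of type
indexer {
  # Assumed class name, for illustration only; check your Lucille version
  class: "com.kmwllc.lucille.indexer.SolrIndexer"
  batchSize: 100
}
```

Since the text above says the block needs either type or class, this sketch sets only class.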

The lucille-core module contains a number of commonly used indexers. Additional indexers with a large number of dependencies are provided as optional plugin modules.

Lucille Indexers (Core)

Lucille Indexers (Plugins)