https://doc.lucidworks.com/lucidworks-hdpsearch/2.5/Guide-Solr.html
Solr is a popular open source search platform.
Solr can index content from many sources and has integration points for Apache Tika to index rich text documents (Office documents, PDFs, etc.), JSON files, CSV files and Solr-specific XML.
Cores, Collections and Clusters
Generally speaking, if you use Solr in standalone mode, you have a single core for each index. You can have multiple cores, but they would all be separate indexes.
If you use Solr in SolrCloud mode, which is how this documentation suggests you use Solr with Hadoop, you would have a core on each node of your cluster, and together those cores make up a collection. You can have multiple collections, for separate indexes.
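As a rough illustration of the SolrCloud case, the sketch below builds a request URL for the Solr Collections API CREATE action, which is one way to create a collection whose cores are spread across nodes. The base URL `http://localhost:8983/solr`, the collection name `hadoop_docs`, and the helper `create_collection_url` are assumptions made for this example. The function only constructs the URL and does not send the request.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards,
                          replication_factor=1, config_name=None):
    """Build a Collections API CREATE URL (hypothetical helper; sends nothing)."""
    params = {
        "action": "CREATE",
        "name": name,
        # Number of logical sections (shards) the collection is split into.
        "numShards": num_shards,
        # Number of copies of each shard; each copy is hosted in a core.
        "replicationFactor": replication_factor,
    }
    if config_name is not None:
        # Name of a configuration set to use for the collection.
        params["collection.configName"] = config_name
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Assumed host, port, and collection name, for illustration only.
url = create_collection_url("http://localhost:8983/solr", "hadoop_docs", num_shards=3)
print(url)
```

With three shards and a replication factor of 1, Solr would create three cores in total, which matches the one-core-per-node picture described above on a three-node cluster.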
Terminology: Cores, Collections & Nodes
Core
A single Solr instance, which represents a single Solr index. Each core has its own set of configuration files and schema definitions, separate from those of other cores.
Collection
A group of cores that together form a single logical index. Each collection has its own set of configuration files and schema definitions, separate from those of other collections.
Shard
A logical section of a single collection.
Node
A Java Virtual Machine instance running Solr, commonly known as a server. Multiple cores can run on a node if you wish.
While a single core includes a single index, an index can also be distributed across multiple nodes. In this case, the part of the index on any single node is a shard. Each shard is hosted in a core.
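The relationships between these terms can be sketched as a small data model. This is only an illustration of the definitions above, not how Solr stores its cluster state. The `Core` class, the `shards_by_collection` helper, and all core, node, and collection names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    name: str
    node: str        # hypothetical name of the node (JVM) running this core
    collection: str  # logical index this core belongs to
    shard: str       # logical section of the collection hosted by this core


def shards_by_collection(cores):
    """Group cores into {collection: {shard: [nodes hosting that shard]}}."""
    layout = defaultdict(lambda: defaultdict(list))
    for core in cores:
        layout[core.collection][core.shard].append(core.node)
    return {coll: dict(shards) for coll, shards in layout.items()}


cores = [
    Core("logs_shard1_core", "node1", "logs", "shard1"),
    Core("logs_shard2_core", "node2", "logs", "shard2"),
    # A node can run more than one core: node1 also hosts a core for "docs".
    Core("docs_shard1_core", "node1", "docs", "shard1"),
]

layout = shards_by_collection(cores)
print(layout)
```

Here the `logs` collection is distributed across two nodes as two shards, each hosted in its own core, while `node1` runs cores for two different collections.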