Solr

  • Flask,  Python,  Solr

    Flask and Apache Solr 7.5 using HTML input for searching

    This post follows up on the first one, Flask and Apache Solr 7.5. Now we are going to use several user inputs to return values from Solr. The code is on GitHub: https://github.com/spawnmarvel/solrhttp A quick recap of what we have: Flask and Solr are running on localhost. The template Search.html has an input field and 4 radio buttons. The user enters a string of any kind and selects which field to search in Solr. The fields for now follow the schema we built: id, tag, plant and description. When the user enters an input, it goes first to the view, then to the data access object (both…
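
    The flow described above (a search string plus a radio-button choice of field, turned into a Solr query) could be sketched roughly as follows. This is a minimal sketch, not the code from the repository: the function name build_select_url, the core name my_core and the base URL are assumptions for illustration; only the field names id, tag, plant and description come from the post.

```python
from urllib.parse import urlencode

# Fields the radio buttons may select, taken from the schema named in the post.
ALLOWED_FIELDS = ("id", "tag", "plant", "description")


def build_select_url(field, text, base_url="http://localhost:8983/solr/my_core"):
    """Build a Solr /select URL that searches `text` in one schema field.

    `base_url` and the core name `my_core` are placeholders (assumptions).
    """
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unknown field: {field!r}")
    # Send the user text as a quoted phrase so that spaces or ':' inside it
    # are not interpreted as query syntax; escape embedded backslashes/quotes.
    phrase = text.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": f'{field}:"{phrase}"', "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_select_url("plant", "north sea"))
```

    Rejecting unknown field names up front keeps arbitrary form values from ending up in the query.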

  • Flask,  Python,  Solr

    Flask and Apache Solr 7.5

    Let’s develop a web app with Solr. Short guide to Jinja2: http://jinja.pocoo.org/docs/2.10/ Jinja2 is a modern templating language for Python developers, modelled after Django’s templates. It is used to create HTML, XML or other markup formats that are returned to the user via an HTTP request. Sandboxed execution: it provides a protected framework for automated testing of programs whose behaviour is unknown and must be investigated. HTML escaping: Jinja2 has powerful automatic HTML escaping, which helps prevent cross-site scripting (XSS) attacks. Special characters like >, <, & etc. carry special meaning in templates. So, if you want to use them as regular text in…
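
    To illustrate what HTML escaping does to those special characters, here is a small sketch that uses Python’s standard-library html.escape rather than Jinja2 itself (an assumption made so the snippet runs without extra packages). Jinja2’s autoescaping performs the same kind of substitution, though its exact output for quote characters can differ. The helper name render_comment is hypothetical.

```python
from html import escape


def render_comment(user_text):
    """Wrap untrusted text in a paragraph, escaping HTML special characters first."""
    # escape() replaces &, < and > (and quotes) with HTML entities, so the
    # browser shows the text instead of interpreting it as markup.
    return "<p>" + escape(user_text) + "</p>"


if __name__ == "__main__":
    print(render_comment("<script>alert(1)</script> & more"))
```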

  • Information Retrieval,  Python,  Solr

    Apache Solr 7.5 The Standard Query Parser focus searching

    http://lucene.apache.org/solr/guide/7_5/the-standard-query-parser.html Solr’s default Query Parser is also known as the “lucene” parser. The key advantage of the standard query parser is that it supports a robust and fairly intuitive syntax allowing you to create a variety of structured queries. The largest disadvantage is that it’s very intolerant of syntax errors, as compared with something like the DisMax query parser which is designed to throw as few errors as possible. Standard Query Parser Parameters In addition to the Common Query Parameters, Faceting Parameters, Highlighting Parameters, and MoreLikeThis Parameters, the standard query parser supports the parameters described in the table below. q: Defines a query using standard query syntax. This parameter is…
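
    Because the standard query parser is intolerant of syntax errors, free text is often escaped before it is placed in the q parameter. The following is a minimal sketch of that idea, assuming a hypothetical helper escape_term; it backslash-escapes the characters that have special meaning in the query syntax and is not taken from Solr or from the post.

```python
import re

# Characters with special meaning in the standard ("lucene") query syntax.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|])')


def escape_term(text):
    """Backslash-escape query-syntax characters so `text` is matched literally."""
    return _SPECIAL.sub(r"\\\1", text)


if __name__ == "__main__":
    # Hypothetical tag value containing '-' and ':'.
    print("tag:" + escape_term("P-101:A"))
```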

  • Github,  Information Retrieval,  Python,  Solr

    Apache Solr 7.5 Index build with DirectUpdateHandler2 HTTP (with delete index)

    http://lucene.apache.org/solr/guide/7_5/uploading-data-with-index-handlers.html Index Handlers are Request Handlers designed to add, delete and update documents to the index. In addition to having plugins for importing rich documents using Tika or from structured data sources using the Data Import Handler, Solr natively supports indexing structured documents in XML, CSV and JSON. The default in solrconfig.xml is an UpdateHandler: <!-- The default high-performance update handler --> <updateHandler class="solr.DirectUpdateHandler2"> DirectUpdateHandler2 implements an UpdateHandler where documents are added directly to the main Lucene index as opposed to adding to a separate smaller index. There will be a bit of code from here on; the Python code can be found on GitHub: https://github.com/spawnmarvel/solrhttp OK, let’s index a…
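
    As a rough illustration of sending add and delete commands to an index handler over HTTP, the sketch below builds (but does not send) JSON requests for Solr’s /update endpoint. The core name my_core, the sample document and the helper names are assumptions for illustration and are not taken from the repository.

```python
import json
from urllib.request import Request

# Placeholder core name (assumption); commit=true makes changes visible at once.
UPDATE_URL = "http://localhost:8983/solr/my_core/update?commit=true"


def _json_post(payload):
    """Build a POST request carrying `payload` as a JSON body."""
    return Request(
        UPDATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def add_docs_request(docs):
    """Request that adds a list of documents (a JSON array of objects)."""
    return _json_post(list(docs))


def delete_all_request():
    """Request that deletes every document via a match-all delete query."""
    return _json_post({"delete": {"query": "*:*"}})


if __name__ == "__main__":
    req = add_docs_request([{"id": "1", "tag": "example-tag"}])
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

    Passing either request to urllib.request.urlopen would send it to a running Solr instance.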

  • Information Retrieval,  Python,  Solr

    Apache Solr 7.5 create core, alter schema and query for API data

    With reference to the previous tutorial: Core: a single Solr instance, which represents a single Solr index. A core has a different set of configuration files and schema definitions than other cores. Now let’s create a new core. After downloading and installing the files, have a look at the readme file in the solr-7.5.0 directory (on my PC: C:\solr_test\solr-7.5.0\solr-7.5.0): Stop Solr if it is running in another mode, i.e. cloud. Great, now back to the readme file. Getting Started: To start Solr for the first time after installation, simply do: bin/solr start Let’s check that no core is present: Great, next: This will launch a standalone Solr server in the background of your…

  • Information Retrieval,  Solr

    Solr Terminology: Cores, Collections & Nodes

    https://doc.lucidworks.com/lucidworks-hdpsearch/2.5/Guide-Solr.html Solr is the popular open source search solution. Solr can index content from many sources and has integration points for Apache Tika to index rich text documents (Office documents, PDFs, etc.), JSON files, CSV files and Solr-specific XML. Cores, Collections and Clusters: Generally speaking, if you use Solr in standalone mode, you have a single core for each index. You can have multiple cores, but they would all be separate indexes. If you use Solr in SolrCloud mode, which is how this documentation suggests you use Solr with Hadoop, you would have…

  • Information Retrieval,  Solr

    Apache Solr 7.5 Build book_store index backup / restore

    https://lucene.apache.org/solr/guide/6_6/making-and-restoring-backups.html If you are worried about data loss, and of course you should be, you need a way to back up your Solr indexes so that you can recover quickly in case of catastrophic failure. Solr provides two approaches to backing up and restoring Solr cores or collections, depending on how you are running Solr. If you run in SolrCloud mode, you will use the Collections API. If you run Solr in standalone mode, you will use the replication handler.   The backup API requires sending a command to the /replication handler to back up the system. You can trigger a back-up with an HTTP command like this (replace “gettingstarted”…
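
    For standalone mode, the commands sent to the /replication handler are plain HTTP URLs, so they can be assembled with a small helper. This is a sketch under stated assumptions: the helper name replication_url and the backup name nightly are hypothetical, and book_store is used as the core name only because it appears in the post title.

```python
from urllib.parse import urlencode


def replication_url(core, command, base="http://localhost:8983/solr", **params):
    """Build a URL for the /replication handler of a standalone core."""
    query = urlencode({"command": command, **params})
    return f"{base}/{core}/replication?{query}"


if __name__ == "__main__":
    # Trigger a named backup, check replication details, then restore it.
    print(replication_url("book_store", "backup", name="nightly"))
    print(replication_url("book_store", "details"))
    print(replication_url("book_store", "restore", name="nightly"))
```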

  • Information Retrieval,  Solr

    Apache Solr 7.5 Build book_store index insert/update/delete

    We are starting from this point in https://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-1: “Your Solr server is up and running, but it doesn’t contain any data yet, so we can’t do any queries”. Here, though, we will use our own data and create our own index. Quick recap: start cloud mode with bin\solr.cmd start -e cloud. We will create a collection called book_store with the configuration _default, i.e. following the parameters in tutorial 2 but with a different name and configuration. We start with 1 node on the default port: bin\solr.cmd start -e cloud (if the collection already exists, just type 1 to use the existing one; to stop node1: bin\solr.cmd stop -p 8983). 1 node, running on default port 8983, shared…

  • Solr

    Reindexing in Solr

    https://wiki.apache.org/solr/HowToReindex   The term “reindex” is not a special thing you can do with Solr. It literally means “index again.” You just have to restart Solr (or reload your core), possibly delete the existing index, and then repeat whatever actions you took to build your index in the first place. Indexing (and reindexing) is not something that just happens. Solr has no ability to initiate indexing itself. There is the dataimport handler, but it will not do anything until it is called by something external to Solr. Indexing is something that can be manually done by a person or automatically done by a program, but it is always external to…

  • Information Retrieval,  Solr

    Apache Solr 7.5 (techproducts tutorial) 2

    In this tutorial I have moved the environment outside VirtualBox. I installed the same versions of Solr and the JRE and added the path to the environment variables. I also created two .bat files for easily starting and stopping the server. Below I go over the steps in the tutorial “Index techproducts example data”: https://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-1 To launch Solr on Windows: bin\solr.cmd start -e cloud The first prompt asks how many nodes we want to run. Note the [2] at the end of the last line; that is the default number of nodes. Two is what we want for this example, so you can simply press enter. This will be the port…
