
Apache Solr 7.5 Index build with DirectUpdateHandler2 HTTP

Posted on November 3, 2018, updated November 4, 2018 by espenk

http://lucene.apache.org/solr/guide/7_5/uploading-data-with-index-handlers.html

 

Index Handlers are Request Handlers designed to add, delete and update documents to the index. In addition to having plugins for importing rich documents using Tika or from structured data sources using the Data Import Handler, Solr natively supports indexing structured documents in XML, CSV and JSON.

The default update handler in solrconfig.xml is DirectUpdateHandler2:

<!-- The default high-performance update handler -->
<updateHandler class="solr.DirectUpdateHandler2">

DirectUpdateHandler2 implements an UpdateHandler where documents are added directly to the main Lucene index as opposed to adding to a separate smaller index.

There is a fair bit of code from here on, so the Python code can be found on GitHub:

https://github.com/spawnmarvel/solrhttp

OK, let's index a bunch of data. I have modified the schema a bit:

 

We now have the fields desc, id, plant and tag. We can list them by running get_fields in run_schema.py (from here on I will refer only to the method, not to the file in the code on GitHub):
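For reference, a field listing like get_fields can be sketched against Solr's Schema API. This is a minimal sketch, not the code from the repo: the function names, the localhost:8983 base URL and the core name newcore (taken from the query URL used later in this post) are assumptions.

```python
import json
import urllib.request

SOLR_URL = "http://localhost:8983/solr"  # assumption: Solr runs locally
CORE = "newcore"                         # assumption: core name from the query URL in this post


def fields_url(core=CORE, base=SOLR_URL):
    """Build the Schema API URL that lists the fields of a core."""
    return f"{base}/{core}/schema/fields?wt=json"


def field_names(schema_response):
    """Pull the sorted field names out of a parsed Schema API response."""
    return sorted(field["name"] for field in schema_response.get("fields", []))


def fetch_field_names(core=CORE, base=SOLR_URL):
    """Ask a running Solr for the field names of the core (hypothetical helper)."""
    with urllib.request.urlopen(fields_url(core, base)) as resp:
        return field_names(json.load(resp))
```

With Solr running, fetch_field_names() should include desc, id, plant and tag, next to any fields Solr adds itself.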

OK, let's generate some test data. Run genereate_test_data(); it writes a txt file with 2000 rows that we can insert.
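Generating the test file could look roughly like this. It is a sketch under assumptions: the row layout (comma-separated with a header line), the file name test_data.txt and the generated values are all made up for illustration, and only the four field names come from the schema above.

```python
import csv

FIELDS = ["id", "tag", "plant", "desc"]  # the four fields from the modified schema


def generate_rows(count=2000):
    """Yield `count` test documents with predictable, made-up values."""
    for i in range(1, count + 1):
        yield {
            "id": str(i),
            "tag": f"tag-{i}",
            "plant": f"plant-{i % 10}",
            "desc": f"test description {i}",
        }


def write_test_data(path="test_data.txt", count=2000):
    """Write the rows to a text file, one comma-separated row per line."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(generate_rows(count))
    return path
```

Calling write_test_data() produces a header line followed by 2000 rows.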

OK, the index is empty for now. We can check that with status_core():
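A check like status_core() can be sketched with Solr's CoreAdmin STATUS action, which reports the document count and index size per core. Again a minimal sketch and not the repo code: the function names, the base URL and the core name newcore are assumptions.

```python
import json
import urllib.parse
import urllib.request

SOLR_URL = "http://localhost:8983/solr"  # assumption: Solr runs locally
CORE = "newcore"                         # assumption: core name from the query URL in this post


def status_url(core=CORE, base=SOLR_URL):
    """Build the CoreAdmin STATUS URL for one core."""
    query = urllib.parse.urlencode({"action": "STATUS", "core": core, "wt": "json"})
    return f"{base}/admin/cores?{query}"


def index_summary(status_response, core=CORE):
    """Reduce a parsed STATUS response to the document count and index size."""
    index = status_response["status"][core]["index"]
    return {"numDocs": index["numDocs"], "sizeInBytes": index["sizeInBytes"]}


def fetch_index_summary(core=CORE, base=SOLR_URL):
    """Ask a running Solr for the summary (hypothetical helper)."""
    with urllib.request.urlopen(status_url(core, base)) as resp:
        return index_summary(json.load(resp), core)
```

An empty index reports numDocs as 0, and running the same check after indexing shows the growth in sizeInBytes.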

OK, let's index the 2000 rows. Run index_dt_test_data():
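Indexing over HTTP boils down to a POST of JSON documents to the core's /update handler. The sketch below shows that request shape. It is not the repo's index_dt_test_data(): the function names, base URL and core name are assumptions, and commit=true is used here only to make the documents visible right away.

```python
import json
import urllib.request

SOLR_URL = "http://localhost:8983/solr"  # assumption: Solr runs locally
CORE = "newcore"                         # assumption: core name from the query URL in this post


def build_index_request(docs, core=CORE, base=SOLR_URL):
    """Build a POST request that sends a JSON array of documents to /update."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        f"{base}/{core}/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request to a running Solr and return the parsed JSON reply."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

A call such as send(build_index_request([{"id": "1", "tag": "tag-1", "plant": "plant-1", "desc": "first row"}])) adds one document. For 2000 rows, sending them as one array or in a few batches avoids a commit per document.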

 

Great, all done. Now let's check Solr: in the Query GUI, set start to 0 and rows to 2000:

 

Great, let's have a look at server\logs\solr.log with BareTail while adding the 2000 items, to see what is happening:

 

To get all the docs over HTTP, we need to alter the query the same way we did in the Query GUI: http://localhost:8983/solr/newcore/select?q=*:*&rows=2001

I made get_docs_max() with a default of 10000:
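The idea behind a row limit like this can be sketched as follows: Solr returns only 10 documents unless rows says otherwise, so the query has to ask for more. The function names and the parameters other than rows are assumptions, and this is not the repo's get_docs_max().

```python
import json
import urllib.parse
import urllib.request

SOLR_URL = "http://localhost:8983/solr"  # assumption: Solr runs locally
CORE = "newcore"                         # core name from the query URL above


def select_url(query="*:*", rows=10000, start=0, core=CORE, base=SOLR_URL):
    """Build a /select URL that asks for up to `rows` documents."""
    params = urllib.parse.urlencode(
        {"q": query, "start": start, "rows": rows, "wt": "json"}
    )
    return f"{base}/{core}/select?{params}"


def docs_from(select_response):
    """Return (total hits, documents on this page) from a parsed /select response."""
    response = select_response["response"]
    return response["numFound"], response["docs"]


def fetch_docs(rows=10000, core=CORE, base=SOLR_URL):
    """Query a running Solr for all documents up to `rows` (hypothetical helper)."""
    with urllib.request.urlopen(select_url(rows=rows, core=core, base=base)) as resp:
        return docs_from(json.load(resp))
```

If numFound is larger than the number of docs returned, the rows limit was too low for the index.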

Let's run status_core() again to see the change in index size:

Great, all done with indexing 2000 items, but let's also try deleting everything.

Run index_remove_all(); the index is easy to rebuild.
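Removing everything is a delete-by-query with the match-all query *:* sent to the same /update handler. A minimal sketch, with assumed function names, base URL and core name, and not the repo's index_remove_all():

```python
import json
import urllib.request

SOLR_URL = "http://localhost:8983/solr"  # assumption: Solr runs locally
CORE = "newcore"                         # assumption: core name from the query URL in this post


def build_delete_all_request(core=CORE, base=SOLR_URL):
    """Build a POST request that deletes every document in the core."""
    body = json.dumps({"delete": {"query": "*:*"}}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/{core}/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request to a running Solr and return the parsed JSON reply."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

After send(build_delete_all_request()), a query for *:* should report no documents.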

 

And then we are done. I will commit this tutorial with the message “Index build with DirectUpdateHandler2 HTTP”, so it is easy to find the matching version of the repo.

 

 
