Open Source Search Engine Software for Enterprises
Apache Lucene Core is a widely used, cross-platform open source search engine project that is distributed under the Apache License and written entirely in Java. Although it is pure Java, it has also been ported to other programming languages such as Delphi, Perl, C#, C++, Python, Ruby, and PHP. It performs ranked searching, which means the best results are returned first. Lucene uses pluggable ranking models, including the Vector Space Model and Okapi BM25. It also supports many powerful query types: phrase queries, wildcard queries, proximity queries, range queries, and more.
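To make "best results returned first" concrete, here is a minimal sketch of textbook Okapi BM25 scoring in Python. It is not Lucene's own code: the function names, the toy documents, and the whitespace tokenization are assumptions made for illustration, and Lucene's implementation differs in details such as length encoding.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Textbook Okapi BM25; k1 and b are the usual free parameters.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer documents need more matches.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def ranked(query, docs):
    """Return document indices ordered best match first."""
    scores = bm25_scores(query, docs)
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)

# Hypothetical toy corpus, tokenized on whitespace.
docs = [
    "lucene is a search library".split(),
    "solr is a search server built on lucene".split(),
    "sphinx is written in c++".split(),
]
print(ranked(["lucene", "search"], docs))
```

Both of the first two documents contain the query terms once each, so the shorter one ranks higher because of the length normalization; the third document matches nothing and scores zero.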
Elasticsearch is an open source, distributed, RESTful search and analytics engine based on Apache Lucene. It is highly scalable, which means it can serve anything from small and medium businesses to large enterprises. Elasticsearch provides full-text search capabilities through an HTTP web interface and schema-free JSON documents. Because it is a distributed search system, each index is split into a configurable number of shards. Each shard can have one or more replicas, and read/search operations can be performed on any of the replica shards.
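The sharding idea can be sketched in a few lines of Python. This is a simplification, not Elasticsearch's code: the shard counts and copy names are hypothetical, and `zlib.crc32` stands in for the hash function the real engine uses. The point it illustrates is that a document ID maps deterministically to one shard, while a read can be served by any copy of that shard.

```python
import random
import zlib

NUM_PRIMARY_SHARDS = 3  # hypothetical index setting
NUM_REPLICAS = 1        # hypothetical: one replica per primary shard

def shard_for(doc_id, num_shards=NUM_PRIMARY_SHARDS):
    """Map a document ID to a shard number with hash-modulo routing.

    crc32 is only a stand-in hash chosen for this illustration.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards

def copies_of(shard, num_replicas=NUM_REPLICAS):
    """List the primary and replica copies that hold a given shard."""
    replicas = [f"shard{shard}-replica{r}" for r in range(1, num_replicas + 1)]
    return [f"shard{shard}-primary"] + replicas

def pick_copy_for_read(doc_id, rng=random):
    """Reads may be served by any copy of the shard that owns the document."""
    return rng.choice(copies_of(shard_for(doc_id)))

print(shard_for("doc-1"), pick_copy_for_read("doc-1"))
```

Because routing depends on the number of shards, changing that number moves documents to different shards, which is one reason the shard count is treated as a configuration decision made up front.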
After Elasticsearch, Apache Solr is another popular open source search engine, according to the DB-Engines ranking. It is also written in Java and supports full-text search and real-time indexing. Moreover, like Elasticsearch, Apache Solr is based on Lucene and uses its Java search library. It is a standalone enterprise search server with a REST-like API. You can index documents in Solr via JSON, XML, CSV, or binary over HTTP, and you retrieve results by querying it with HTTP GET.
Solr has a plugin architecture that lets you extend the capabilities of the search engine at both index and query time. Moreover, because it is open source, you can also customize its code so that plugins work according to your requirements.
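As a rough illustration of indexing over HTTP and querying with HTTP GET, the Python sketch below only builds the requests and does not send them. The host, port, and the core name `articles` are assumptions, as are the example field names; adjust them to your own Solr installation.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical Solr location and core name.
SOLR_BASE = "http://localhost:8983/solr/articles"

def build_index_request(docs):
    """Build a POST that sends a JSON array of documents to the update handler."""
    return Request(
        url=f"{SOLR_BASE}/update?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def build_query_url(q, rows=10):
    """Build the URL for an HTTP GET against the select handler."""
    params = urlencode({"q": q, "rows": rows, "wt": "json"})
    return f"{SOLR_BASE}/select?{params}"

# Hypothetical document with made-up field names.
req = build_index_request([{"id": "1", "title": "Lucene in brief"}])
print(req.get_method(), req.full_url)
print(build_query_url("title:lucene"))
```

Against a running Solr instance, the request object and the query URL could each be passed to `urllib.request.urlopen` to perform the actual calls.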
People who have already used Elasticsearch and are looking for another option can try Sphinx. It is also free and open-source information retrieval software that supports full-text search. It runs as a standalone server written in C++ and works on Linux (RedHat, Ubuntu, etc.), Windows, macOS, Solaris, FreeBSD, and a few other systems.
It can index and search data stored in SQL databases and NoSQL storage. It powers several well-known websites that generate millions of search queries per day, such as Craigslist, Living Social, MetaCafe, and Groupon.
In terms of indexing speed, Sphinx can index up to 10-15 MB of text per second per CPU core, which is 60+ MB/sec per server (on a dedicated indexing machine). Its key features include batch and real-time full-text indexes, non-text attribute support, SQL database indexing, easy application integration, advanced full-text search syntax, rich database-like querying features, better relevance ranking, flexible text processing, and distributed searching.
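The two throughput figures can be related with a little arithmetic. The Python sketch below assumes indexing speed scales linearly with the number of cores, which real workloads may not achieve; the helper names and the corpus size are hypothetical.

```python
def implied_cores(server_mb_per_sec, core_mb_per_sec):
    """Cores needed to reach a server-level rate, assuming linear scaling."""
    return server_mb_per_sec / core_mb_per_sec

def indexing_seconds(corpus_mb, core_mb_per_sec, cores):
    """Estimated time to index a corpus, again assuming linear scaling."""
    return corpus_mb / (core_mb_per_sec * cores)

# 60 MB/sec per server at 10-15 MB/sec per core implies about 4 to 6 cores.
print(implied_cores(60, 15), implied_cores(60, 10))

# Hypothetical 6,000 MB corpus on 6 cores at 10 MB/sec each.
print(indexing_seconds(6000, 10, 6))
```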
DataparkSearch Engine is an open source web-based search engine that allows searching within a website, a group of websites, an intranet, or a local system. It supports the http, https, ftp, nntp, and news URL schemes; natively indexes the text/html, text/xml, text/plain, audio/mpeg (MP3), and image/gif MIME types; handles Internationalized Domain Names (IDN); and honors noindex tags such as <!--UdmComment-->, <NOINDEX>, and <!--noindex-->. Google's special comments <!-- google_ad_section_start -->, <!-- google_ad_section_start(weight=ignore) -->, and <!-- google_ad_section_end --> are also treated as tags that include or exclude content. In addition, you can specify a content body tag, and it offers spellchecking and more.
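To show what honoring noindex tags means in practice, here is a small Python sketch that drops marked sections from a page before indexing. It is not DataparkSearch's code. It assumes each marker is closed by a matching form with a leading slash (for example `<!--/noindex-->`), which is not stated above, and the sample page is made up.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL

# Assumed open/close pairs; the closing forms are an assumption of this sketch.
NOINDEX_PATTERNS = [
    re.compile(r"<!--\s*UdmComment\s*-->.*?<!--\s*/UdmComment\s*-->", FLAGS),
    re.compile(r"<noindex>.*?</noindex>", FLAGS),
    re.compile(r"<!--\s*noindex\s*-->.*?<!--\s*/noindex\s*-->", FLAGS),
]

def strip_noindex(html):
    """Remove every section wrapped in one of the noindex markers."""
    for pattern in NOINDEX_PATTERNS:
        html = pattern.sub("", html)
    return html

# Hypothetical page fragment.
page = (
    "<p>Keep this.</p>"
    "<!--noindex--><p>Skip this.</p><!--/noindex-->"
    "<NOINDEX>menu</NOINDEX>"
    "<p>And this.</p>"
)
print(strip_noindex(page))
```

Regular expressions are enough for a sketch like this, but a production crawler would more likely handle these markers inside its HTML parser so that nested or malformed markup is treated consistently.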
Xapian is another open source search engine library. It is written in C++, with bindings that allow use from Perl, Python 2, Python 3, PHP 5, PHP 7, Java, Tcl, C#, Ruby, Lua, Erlang, Node.js, and R.