Sphinx is a full-text search engine distributed under GPL version 2. A commercial license is also available for embedded use.
Generally, it’s a standalone search engine, meant to provide fast, size-efficient, and relevant full-text search functions to other applications. Sphinx was specially designed to integrate well with SQL databases and scripting languages. Currently, the built-in data sources support fetching data either via a direct connection to MySQL or PostgreSQL, or through an XML pipe mechanism (a pipe to indexer in a special XML-based format that Sphinx recognizes).
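To make the XML pipe idea more concrete, here is a minimal Python sketch of a script whose output could be piped to indexer. It assumes the xmlpipe2 layout (`sphinx:docset`, `sphinx:schema`, `sphinx:document`) described in later Sphinx documentation; the exact format expected by the version described here may differ. The function name `xmlpipe2_stream` and the field names `title` and `body` are hypothetical.

```python
import sys
from xml.sax.saxutils import escape, quoteattr


def xmlpipe2_stream(docs, fields):
    """Build an xmlpipe2-style stream (assumed layout) as a single string.

    docs   -- iterable of (integer_id, {field_name: text}) pairs
    fields -- list of full-text field names; assumed to be valid XML names
    """
    lines = ['<?xml version="1.0" encoding="utf-8"?>', "<sphinx:docset>"]

    # Declare the full-text fields once, up front.
    lines.append("<sphinx:schema>")
    for name in fields:
        lines.append("<sphinx:field name=%s/>" % quoteattr(name))
    lines.append("</sphinx:schema>")

    # Emit one element per document, escaping &, < and > in the text.
    for doc_id, values in docs:
        lines.append('<sphinx:document id="%d">' % doc_id)
        for name in fields:
            text = escape(values.get(name, ""))
            lines.append("<%s>%s</%s>" % (name, text, name))
        lines.append("</sphinx:document>")

    lines.append("</sphinx:docset>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [(1, {"title": "Hello", "body": "fish & chips"})]
    sys.stdout.write(xmlpipe2_stream(sample, ["title", "body"]))
```

The point of the design is that the data source can be anything able to print such a stream to standard output, so no database connection is required.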
As for the name, Sphinx is an acronym which is officially decoded as SQL Phrase Index. Yes, I know about CMU’s Sphinx project.
Sphinx features include:
- high indexing speed (up to 10 MB/sec on modern CPUs)
- high search speed (an average query takes under 0.1 sec on 2-4 GB text collections)
- high scalability (up to 100 GB of text, up to 100 M documents on a single CPU)
- supports distributed searching (since v.0.9.6)
- supports MySQL natively (MyISAM and InnoDB tables are both supported)
- supports phrase searching
- supports phrase proximity ranking, providing good relevance
- supports English and Russian stemming
- supports any number of document fields (weights can be changed on the fly)
- supports document groups
- supports stopwords
- supports different search modes (“match all”, “match phrase” and “match any” as of v.0.9.5)
- generic XML interface which greatly simplifies custom integration
- pure-PHP search client API (i.e. no module compiling required)
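The three search modes listed above can be illustrated with a toy Python sketch. It only shows the semantics the mode names suggest, on whitespace-split lowercase words; it is not how Sphinx implements matching, and it ignores stemming, stopwords, and ranking. The names `tokenize` and `matches` are hypothetical.

```python
def tokenize(text):
    """Split text into lowercase words (a deliberately naive tokenizer)."""
    return text.lower().split()


def matches(doc, query, mode):
    """Return True if doc matches query under the given toy mode.

    mode is one of "all", "any", or "phrase".
    """
    doc_words = tokenize(doc)
    query_words = tokenize(query)
    if not query_words:
        return False

    if mode == "all":
        # Every query word must occur somewhere in the document.
        return all(w in doc_words for w in query_words)
    if mode == "any":
        # At least one query word must occur in the document.
        return any(w in doc_words for w in query_words)
    if mode == "phrase":
        # The query words must occur adjacently and in the same order.
        n = len(query_words)
        return any(
            doc_words[i:i + n] == query_words
            for i in range(len(doc_words) - n + 1)
        )
    raise ValueError("unknown mode: %r" % mode)
```

For example, the document "the quick brown fox" matches the query "brown quick" in "all" mode but not in "phrase" mode, because the two words appear in a different order.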
Sphinx distribution includes the following programs:
- indexer: a utility to create full-text indices;
- search: a simple (test) utility to query full-text indices from the command line;
- searchd: a daemon to search through full-text indices from external software (Web scripts using Sphinx API; or MySQL with SphinxSE; or your application server);
- sphinxapi: a set of API libraries for popular Web scripting languages (there are native API ports for PHP, Python, Java, Perl, and Ruby).
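As a rough sketch of how an application might talk to searchd through the Python port of the API, the helper below wraps a query call. It assumes a client object exposing `SetMatchMode`, `Query`, and `GetLastError`, as the `SphinxClient` class in `sphinxapi.py` does; the helper name `search_ids`, the index name `test1`, and the host and port in the usage comment are assumptions, so check them against your own installation.

```python
# Match-mode values as defined in the Python client (assumed to be unchanged
# in your copy of sphinxapi.py).
SPH_MATCH_ALL = 0
SPH_MATCH_ANY = 1
SPH_MATCH_PHRASE = 2


def search_ids(client, text, index="*", mode=SPH_MATCH_ALL):
    """Run one query through a SphinxClient-like object and return document IDs.

    The client is passed in rather than created here, so the helper can be
    exercised without a running searchd.
    """
    client.SetMatchMode(mode)
    result = client.Query(text, index)
    if result is None:
        # The Python client signals failure by returning None.
        raise RuntimeError(client.GetLastError())
    return [match["id"] for match in result.get("matches", [])]


# Hypothetical usage against a running daemon:
#
#   from sphinxapi import SphinxClient
#   client = SphinxClient()
#   client.SetServer("localhost", 3312)   # port is an assumption
#   print(search_ids(client, "hello world", "test1", SPH_MATCH_PHRASE))
```

Note that the API returns document IDs and weights rather than the documents themselves; the application is expected to fetch the actual rows from its own database.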