Search.conf

From CollectiveAccess Documentation
Revision as of 14:23, 23 July 2013 by Jonathan (talk | contribs) (Indexing Tokenizer Regex)

IN PROGRESS

Search configuration

Suffixes to add to searches if they conform to a listed regular expression. Each entry maps a regular expression to the suffix that is appended when a search matches it.

search_suffixes = {
	[\d]+\.[0-9A-Za-z\.]* = *
}
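As a rough illustration of the search_suffixes entry, the Python sketch below appends the suffix * to any search term that matches the listed expression. The function name apply_search_suffixes is hypothetical, and the sketch assumes that the expression has to match the whole term; CollectiveAccess itself may apply the rule differently.

```python
import re

# Mirrors the search_suffixes entry: regular expression -> suffix to append.
SEARCH_SUFFIXES = {
    r"[\d]+\.[0-9A-Za-z\.]*": "*",
}


def apply_search_suffixes(term: str) -> str:
    """Return the term with a suffix appended if it matches a listed regex."""
    for pattern, suffix in SEARCH_SUFFIXES.items():
        # Assumption: the expression must match the entire search term.
        if re.fullmatch(pattern, term):
            return term + suffix
    return term


if __name__ == "__main__":
    for example in ("2013.5", "painting"):
        print(example, "->", apply_search_suffixes(example))
```

Under these assumptions a partial accession number such as 2013.5 becomes the wildcard search 2013.5*, while a plain word is left unchanged.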


Location of the search index configuration file.

search_indexing_config = <ca_conf_dir>/search_indexing.conf

Indexing Tokenizer Regex

This is the regex character class used when indexing. Values matched will be used as token delimiters; in other words, the text being indexed will be broken into words wherever the matched characters occur.

indexing_tokenizer_regex = ^\pL\pN\pNd/_#\@\&\.
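The effect of this character class can be approximated in Python. The sketch below is an illustration only, not the CollectiveAccess implementation. It assumes the value is read as the body of a bracketed class, so that the leading ^ negates it and every character other than a letter, a number, or one of / _ # @ & . acts as a delimiter. Python's re module does not support \pL or \pN, so str.isalnum() stands in for them here; the function names are hypothetical.

```python
from typing import List

# Punctuation that the setting keeps inside tokens: / _ # @ & .
KEPT_PUNCTUATION = set("/_#@&.")


def is_token_char(ch: str) -> bool:
    """True if the character is kept inside a token rather than splitting it."""
    # str.isalnum() approximates the Unicode classes \pL, \pN and \pNd.
    return ch.isalnum() or ch in KEPT_PUNCTUATION


def tokenize(text: str) -> List[str]:
    """Split text into tokens at every character outside the kept set."""
    tokens: List[str] = []
    current: List[str] = []
    for ch in text:
        if is_token_char(ch):
            current.append(ch)
        elif current:
            # A delimiter ends the token collected so far.
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


if __name__ == "__main__":
    print(tokenize("Smith & Wesson, 2013.5/a"))
```

With these assumptions, spaces and commas split the text, while an identifier such as 2013.5/a stays together as a single token.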

Search Tokenizer Regex

Regex character class used when searching; values matched will be used as token delimiters. This works the same way as indexing_tokenizer_regex, except that it is applied when searching rather than when indexing.

search_tokenizer_regex = ^\pL\pN\pNd/_#\@\&\.

"As Is" Regex Matching for Accession Numbers

Here you may enter a list of regular expressions. Search input that matches any of them is treated "as-is", that is, searched without further processing. This is useful for preventing tokenization of accession numbers and other values that rely upon punctuation being kept intact.

asis_regexes = [
	"^[\d]+[\.\-][A-Za-z0-9\.\-]+$"
]
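To see which inputs the default expression would catch, it can be tried out in Python, whose re module accepts this particular pattern unchanged. The helper name is_as_is is hypothetical, and the sketch assumes that one matching expression is enough for the input to be passed through untouched.

```python
import re

# The default asis_regexes list from the configuration.
ASIS_REGEXES = [
    r"^[\d]+[\.\-][A-Za-z0-9\.\-]+$",
]


def is_as_is(search_input: str) -> bool:
    """True if the input matches any expression and should skip processing."""
    return any(re.search(pattern, search_input) for pattern in ASIS_REGEXES)


if __name__ == "__main__":
    for example in ("2013.5.1", "1999-12a", "oil painting", "2013"):
        print(repr(example), is_as_is(example))
```

Inputs such as 2013.5.1 and 1999-12a match. A phrase like "oil painting" or a bare number like 2013 does not, because the expression requires leading digits followed by a dot or hyphen and at least one further character.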

MySQL Fulltext Plugin Configuration

Set to 0 if you don't want search input stemmed (i.e. suffixes removed) prior to search.

The plugin uses the English Snowball stemmer (http://snowball.tartarus.org/) and can give poor results with non-English content. If you are cataloguing non-English material you will probably want to turn this off.

search_mysql_fulltext_do_stemming = 1

Perl-compatible regular expression used to tokenize text for indexing. The text will be broken up into words at any of the characters specified in the regular expression. The expression should be bracketed with start and end markers (e.g. #<regex goes here># or !<regex goes here>!).

If you change this setting you'll have to reindex your database to see a difference.

search_mysql_fulltext_tokenize_preg = #[\.\,\!\?\_\- ]#
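The following Python sketch shows what the default search_mysql_fulltext_tokenize_preg value does to a piece of text. It is an approximation: the helper names are hypothetical, the start and end markers are stripped by hand because Python's re module has no delimiter syntax, and the sketch assumes that empty strings between adjacent delimiters are discarded.

```python
import re
from typing import List

# Default value of search_mysql_fulltext_tokenize_preg, including its markers.
SETTING = r"#[\.\,\!\?\_\- ]#"


def strip_markers(preg: str) -> str:
    """Remove the matching start and end markers (e.g. #...# or !...!)."""
    if len(preg) < 2 or preg[0] != preg[-1]:
        raise ValueError("expression must be bracketed by matching markers")
    return preg[1:-1]


def tokenize_for_indexing(text: str) -> List[str]:
    """Split text at every character listed in the configured expression."""
    pattern = strip_markers(SETTING)
    # Adjacent delimiters produce empty strings, which are dropped here.
    return [token for token in re.split(pattern, text) if token]


if __name__ == "__main__":
    print(tokenize_for_indexing("Hello, world! foo_bar-baz."))
```

Because the underscore and hyphen are listed as delimiters, a value such as foo_bar-baz is indexed as three separate words under these assumptions.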


Solr Plugin Configuration

Enter the Solr home directory here.

search_solr_home_dir = /usr/local/solr/

Enter the Solr URL here.

search_solr_url = http://localhost:9090/solr


SqlSearch Plugin Configuration

Set to 0 if you don't want search input stemmed (i.e. suffixes removed) prior to search.

The plugin uses the English Snowball stemmer (http://snowball.tartarus.org/) and can give poor results with non-English content. If you are cataloguing non-English material you will probably want to turn this off.

search_sql_search_do_stemming = 1


ElasticSearch Plugin Configuration

Enter the ElasticSearch base URL here (without any index names).

search_elasticsearch_base_url = http://localhost:9200/

This is the name of the ElasticSearch index used by CollectiveAccess. You probably don't need to change this unless you're using a single ElasticSearch setup for multiple CollectiveAccess instances and/or other applications.

search_elasticsearch_index_name = collectiveaccess
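One way to picture how the base URL and the index name relate is the Python sketch below. It simply joins the two values into a single address, which is why the base URL should not already contain an index name. The function index_url and the second index name are hypothetical, and how CollectiveAccess actually builds its requests is not described here.

```python
from urllib.parse import urljoin


def index_url(base_url: str, index_name: str) -> str:
    """Join the ElasticSearch base URL and an index name into one address."""
    # A trailing slash makes urljoin append to the path rather than
    # replace its last segment.
    if not base_url.endswith("/"):
        base_url += "/"
    return urljoin(base_url, index_name)


if __name__ == "__main__":
    base = "http://localhost:9200/"
    # Default index name from the configuration.
    print(index_url(base, "collectiveaccess"))
    # Hypothetical name for a second instance sharing the same server.
    print(index_url(base, "collectiveaccess_second"))
```

Giving each CollectiveAccess instance its own index name in this way keeps their data apart on a shared ElasticSearch setup.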
