Search.conf

From CollectiveAccess Documentation
Revision as of 14:37, 3 May 2017 by Jonathan

IN PROGRESS

Indexing Tokenizer Regex

This is the regex character class used when indexing saved text; matched characters are used as token delimiters (in other words, the text being indexed will be broken into words wherever the matched characters occur). Note that the default class, as displayed in the example below, starts with a caret ("^"), which has the effect of negating the class. In other words, the class defines which characters will not be treated as token delimiters.

indexing_tokenizer_regex = ^\pL\pN\pNd/_#\@\&\.
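To see what a negated delimiter class does in practice, the following Python sketch splits text on every run of characters outside an allowed set. It is only an approximation: Python's built-in `re` module does not support `\pL` or `\pN`, so `\w` (Unicode letters, digits and underscore) stands in for them, and the `tokenize` helper is hypothetical rather than part of CollectiveAccess.

```python
import re

# Approximation of the default class ^\pL\pN\pNd/_#\@\&\.
# \w stands in for \pL, \pN and "_" because Python's re has no \p{..} support.
DELIMITERS = re.compile(r"[^\w/#@&.]+")


def tokenize(text):
    """Split text on runs of delimiter characters, dropping empty tokens."""
    return [token for token in DELIMITERS.split(text) if token]


print(tokenize("Vase, blue (acc. 2017.12/3) & lid"))
# ['Vase', 'blue', 'acc.', '2017.12/3', '&', 'lid']
```

Commas, spaces and parentheses act as delimiters here, while ".", "/" and "&" survive inside tokens because the negated class lists them as non-delimiters.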

Search Tokenizer Regex

This is the regex character class used when searching; matched characters are used as token delimiters (this is the same thing as indexing_tokenizer_regex, except that it is used to break user searches into words rather than text to be indexed).

search_tokenizer_regex = ^\pL\pN\pNd/_#\@\&\.

"As Is" Regex Matching for Accession Numbers

Here you may enter a list of regular expressions that, if matched, cause search input to be treated "as-is", that is, searched without being broken up into tokens. This is useful for preventing tokenization of accession numbers and other values that rely upon punctuation being kept intact when being searched.

asis_regexes = [
	"^[\d]+[\.\-][A-Za-z0-9\.\-]+$"
]
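The example pattern can be tried out in isolation. The Python sketch below assumes that each pattern is tested against the complete search string; the `is_as_is` helper and the sample values are illustrative only and do not come from CollectiveAccess.

```python
import re

# The pattern from the asis_regexes example, unchanged.
ASIS_REGEXES = [re.compile(r"^[\d]+[\.\-][A-Za-z0-9\.\-]+$")]


def is_as_is(search_input):
    """Return True if any as-is pattern matches the search input."""
    return any(rx.search(search_input) for rx in ASIS_REGEXES)


for value in ("2017.12.3", "1998-A.4", "blue vase"):
    print(value, is_as_is(value))
```

Values that start with digits followed by a period or hyphen, such as "2017.12.3", match and would be searched intact; free text such as "blue vase" does not match and would be tokenized as usual.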

Changing the layout of quick search results

The layout of quick search results can be altered with a setting named in one of the following formats:

ca_<table>_<type>_quicksearch_result_display_template =

or

ca_<table>_quicksearch_result_display_template =

The value of the template uses the same syntax as bundle displays. Below is an example that adds "artists" to an "artwork" search result layout:

ca_objects_artwork_quicksearch_result_display_template = 
<unit relativeTo='ca_entities' restrictToRelationshipTypes='artist'><u>^ca_entities.preferred_labels.surname, ^ca_entities.preferred_labels.forename</u>:</unit>
<em>^ca_objects.preferred_labels.name</em> (<l>^ca_objects.idno</l>) [^ca_objects.type_id]
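The two naming formats can be illustrated with a small Python sketch that builds the setting names for a given table and type. The `template_setting_names` helper is hypothetical, and the ordering (type-specific name before the table-wide name) is an assumption about precedence that this page does not state.

```python
SUFFIX = "quicksearch_result_display_template"


def template_setting_names(table, type_code=None):
    """Build candidate setting names, type-specific first (assumed order)."""
    names = []
    if type_code:
        names.append(f"{table}_{type_code}_{SUFFIX}")
    names.append(f"{table}_{SUFFIX}")
    return names


print(template_setting_names("ca_objects", "artwork"))
# ['ca_objects_artwork_quicksearch_result_display_template',
#  'ca_objects_quicksearch_result_display_template']
```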

MySQL Fulltext Plugin Configuration

[The MySQL Fulltext plugin is deprecated as of v1.4 and will be removed in v1.5]

Set to 0 if you don't want search input stemmed (i.e. suffixes removed) prior to search.

The plugin uses the English Snowball stemmer (http://snowball.tartarus.org/) and can give poor results with non-English content. If you are cataloguing non-English material you will probably want to turn this off.

search_mysql_fulltext_do_stemming = 1

Perl-compatible regular expression used to tokenize text for indexing. The text will be broken up into words at any of the characters specified in the regular expression. The expression should be bracketed with start and end markers (e.g. #<regex goes here># or !<regex goes here>!).

If you change this setting you'll have to reindex your database to see a difference.

search_mysql_fulltext_tokenize_preg = #[\.\,\!\?\_\- ]#
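As a rough illustration of the default value `#[\.\,\!\?\_\- ]#`, the Python sketch below strips the `#` start and end markers and splits a sample string on the remaining character class. It assumes a single marker character at each end and uses Python's `re` in place of PHP's PCRE functions; the `tokenize` helper and the sample text are made up for the example.

```python
import re

# Default setting value, including its "#" start and end markers.
SETTING = r"#[\.\,\!\?\_\- ]#"

# Drop one marker character from each end to get the bare expression.
PATTERN = re.compile(SETTING[1:-1])


def tokenize(text):
    """Split text at each delimiter character, dropping empty tokens."""
    return [token for token in PATTERN.split(text) if token]


print(tokenize("Hand-painted vase, blue_green!"))
# ['Hand', 'painted', 'vase', 'blue', 'green']
```

Unlike the negated tokenizer classes earlier on this page, this class lists the delimiters themselves, so hyphens and underscores split words apart.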


Solr Plugin Configuration

Enter the Solr home directory here:

search_solr_home_dir = /usr/local/solr/

Enter the Solr URL here:

search_solr_url = http://localhost:9090/solr


SqlSearch Plugin Configuration

Set to 0 if you don't want search input stemmed (i.e. suffixes removed) prior to search.

The plugin uses the English Snowball stemmer (http://snowball.tartarus.org/) and can give poor results with non-English content. If you are cataloguing non-English material you will probably want to turn this off.

search_sql_search_do_stemming = 1


ElasticSearch Plugin Configuration

Enter the ElasticSearch base URL here (without any index names):

search_elasticsearch_base_url = http://localhost:9200/

This is the name of the ElasticSearch index used by CollectiveAccess. You probably don't need to change this unless you're using a single ElasticSearch setup for multiple CollectiveAccess instances and/or other applications.

search_elasticsearch_index_name = collectiveaccess
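ElasticSearch addresses an index as a path under the server's base URL, which is why the base URL is entered without an index name. The Python sketch below shows how the two example values would combine; the `index_url` helper is hypothetical and says nothing about how CollectiveAccess itself builds its requests.

```python
# Values copied from the example settings on this page.
BASE_URL = "http://localhost:9200/"
INDEX_NAME = "collectiveaccess"


def index_url(base, index):
    """Join base URL and index name with exactly one slash between them."""
    return base.rstrip("/") + "/" + index


print(index_url(BASE_URL, INDEX_NAME))
# http://localhost:9200/collectiveaccess
```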
