iOne February 2016

Solr query that contains slash

I've found an interesting Solr query that returns search results, but I don't understand the purpose of the slash between the words.


Does anybody know? Please help.


K.Boy February 2016

I think health/nurse is being treated as a string literal because there are no spaces around the slash. Health / nurse should yield different results than health/nurse, correct? If so, then health/nurse must be an indexed term in your documents.

Uri Shtand February 2016

Simple. You can look at the analyzer chain to understand what happens. My guess is that the analyzer chain turns the / into a space, which turns the query into

duties: health nurse

To find your analyzer chain in the configuration, start by checking the type of the field.

For example:

   <field name="health" type="text_general" indexed="true" stored="true" required="true"/>

Now we look for the definition of the type:

   <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
     <analyzer type="index">
       <tokenizer class="solr.StandardTokenizerFactory"/>
       <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
       <!-- in this example, we will only use synonyms at query time
       <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
       -->
       <filter class="solr.LowerCaseFilterFactory"/>
     </analyzer>
     <analyzer type="query">
       <tokenizer class="solr.StandardTokenizerFactory"/>
       <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
       <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
       <filter class="solr.LowerCaseFilterFactory"/>
     </analyzer>
   </fieldType>

As you can see, we have an index analyzer and a query analyzer.
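
If you would rather not trace this by eye, the same two-step lookup (field, then field type, then analyzer) can be sketched in a few lines of Python. This is only an illustration: the `SCHEMA` string is a trimmed, hypothetical schema modeled on the snippets above, and `analyzer_chain` is a helper name I made up, not part of Solr.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down schema used only for this illustration.
SCHEMA = """
<schema name="example" version="1.5">
  <field name="health" type="text_general" indexed="true" stored="true" required="true"/>
  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
</schema>
"""


def analyzer_chain(schema_xml, field_name, analyzer_type):
    """Return the tokenizer/filter class names applied to a field.

    analyzer_type is "index" or "query".
    """
    root = ET.fromstring(schema_xml)

    # Step 1: find the field and read its type.
    field = next((f for f in root.iter("field") if f.get("name") == field_name), None)
    if field is None:
        raise KeyError("no such field: " + field_name)
    type_name = field.get("type")

    # Step 2: find the definition of that type.
    field_type = next((t for t in root.iter("fieldType") if t.get("name") == type_name), None)
    if field_type is None:
        raise KeyError("no such fieldType: " + type_name)

    # Step 3: pick the matching analyzer. An analyzer without a type
    # attribute is used for both indexing and querying.
    for analyzer in field_type.findall("analyzer"):
        if analyzer.get("type") in (None, analyzer_type):
            return [el.get("class") for el in analyzer
                    if el.tag in ("charFilter", "tokenizer", "filter")]
    return []


for name in analyzer_chain(SCHEMA, "health", "query"):
    print(name)
```

The printed list is the order in which the query text passes through the chain, so the tokenizer always comes first and the filters follow.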

My query analyzer starts with StandardTokenizerFactory, which treats the / as a word boundary and drops it, so health/nurse is split into the two tokens health and nurse.
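
To make the effect concrete, here is a rough stand-in in Python. It is not Lucene's tokenizer: it only imitates the tokenizer plus the lowercase filter for simple ASCII input by treating every run of word characters as a token, and it skips the stop word and synonym filters entirely. The function name `approx_tokens` is made up for this sketch.

```python
import re


def approx_tokens(text):
    """Very rough imitation of StandardTokenizer + LowerCaseFilter.

    Assumption: plain ASCII input. The real tokenizer follows Unicode word
    break rules and differs for cases such as dots or apostrophes inside
    a word.
    """
    # Any character that is not a letter, digit or underscore (including "/")
    # ends the current token and is discarded.
    return [token.lower() for token in re.findall(r"\w+", text)]


print(approx_tokens("health/nurse"))
print(approx_tokens("Health / nurse"))
```

Under this approximation both spellings end up as the same two tokens, which is why the slash appears to make no difference in the results.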

From the Solr wiki entry on StandardTokenizerFactory:


A good general purpose tokenizer that strips many extraneous characters and sets token types to meaningful values. Token types are only useful for subsequent token filters that are type-aware of the same token types. There aren't any filters that u

