Achieve-SANDCamp 2011-Solrx

Nov 17, 2013

Revolutionizing enterprise web development

Searching with Solr

What is Solr?

Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project.


What’s Lucene?

Apache Lucene™ is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

What is Solr?

Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling.
Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.
See http://lucene.apache.org/ for more info.


Why Solr?

Why Solr or why Solr with Drupal?

Core Drupal Search | Solr Search
Reasonable performance only for small sites | Quality performance for all installations, including large deployments
Poor scalability: relies on Drupal’s DB to handle all search results | Quality scalability: single-purpose servers independent of Drupal
Few configuration options (better in D7 than D6) | Significant configuration options out of the box, including configurable filters and indexed material
Few search options | Significant search options out of the box (based on filters above)
No multi-site capability | Multi-site (even non-Drupal sites) capabilities

Where does it fit?

Sits beside your application servers in the stack
PHP communicates with the Solr servers (the Apachesolr module handles this for you)
    Retrieve: URL strings
    Push: XML packets
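
As a rough illustration of the two traffic shapes above, the following Python sketch builds a search URL and an XML add packet by hand. It is only a sketch under assumptions: the base URL is a default local install, the field names (id, title) and the helper function names are hypothetical, and on a Drupal site the Apachesolr module does the equivalent work for you in PHP.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumed base URL for a local Solr instance; adjust for your install.
SOLR_BASE = "http://localhost:8983/solr"


def build_select_url(keywords, rows=10):
    """Retrieve: a search is a GET request whose parameters live in the URL."""
    params = {"q": keywords, "rows": rows}
    return SOLR_BASE + "/select?" + urlencode(params)


def build_add_packet(fields):
    """Push: documents are posted to /update as an <add><doc>...</doc></add> body."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        # Each indexed value becomes a <field name="..."> element.
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical field names; real ones come from schema.xml.
    print(build_select_url("drupal search"))
    print(build_add_packet({"id": "node/1", "title": "Hello Solr"}))
```

Nothing here talks to a server; the point is only that reads are plain URLs and writes are small XML documents, which is why Solr can sit beside any application stack.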

Solr Setup

Options
    Self-Hosted
        http://lucene.apache.org/solr/
        Look for “Download Solr here”
    Service
        Acquia
        http://acquia.com/products-services/acquia-search


Solr Setup

Example directory
    start.jar
        java -jar start.jar &
        java -jar start.jar > /dev/null &
    Solr directory
        Conf directory
            schema.xml
            solrconfig.xml

Solr Setup

Solr admin accessible here: http://localhost:8983/solr/admin



Solr Setup

schema.xml
    Primarily handles what is indexed.

Solr Setup

solrconfig.xml
    Handles general configuration.
    You might need to edit it for replication or if you plan to do file handling on the Solr server.

Drupal + Solr

Core Module: Apachesolr
Optional Modules:
    Apachesolr_multisitesearch
        Self-explanatory
    Apachesolr_attachments
        Requires an additional Solr component (Tika). Allows full-text indexing of docs.
    Apachesolr_views
        Sorta… & maybe someday

Drupal + Solr

Basic Drupal Settings

Drupal + Solr

Examples of filters that can be surfaced
Example: Drupal.org

Solr hooks

Add new data to the index
By default, all data displayed on the node view is indexed. We can also set up additional information to be indexed and/or filtered even if the information is not on the node page.
It’s worth taking a look at apachesolr_node_to_document (in apachesolr.index.inc)

Solr hooks

hook_apachesolr_update_index(&$document, $node, $namespace)
    Allows a module to change the contents of the $document object before it is sent to the Solr server

Solr hooks

Altering the query (3 possible methods)
hook_apachesolr_prepare_query(&$query, &$params, $caller)
    Occurs before the query is cached
    Modifications you make can be used by others

Solr hooks

Altering the query (3 possible methods)
hook_apachesolr_modify_query(&$query, &$params, $caller)
    Occurs after the query is cached
    For modifications that you don’t want other modules to inherit


Solr hooks

Altering the query (3 possible methods)
<caller>_finalize_query(&$query, &$params)
    Occurs after the query is cached
    Technically only for use by modules originating Solr queries (aka custom Solr search invocations, not the search page)
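
The ordering of the three methods above can be modeled in a few lines. This is a language-neutral sketch written in Python, not the Apachesolr module's PHP code: the function names, the dictionary-based query, and the copy-based cache are all assumptions, chosen only to show what "before" and "after" the cache means for other modules.

```python
import copy

# Hypothetical stand-in for the cached query that other modules can read.
_cached_query = None


def cached_query():
    """Return whatever query was cached by the last run_search() call."""
    return _cached_query


def run_search(query, prepare_hooks, modify_hooks, finalize_hook=None):
    """Apply hooks in the order described on the slides (illustrative model only)."""
    global _cached_query

    # 1. prepare_query hooks run first, so their changes end up in the cache.
    for hook in prepare_hooks:
        hook(query)

    # 2. The query is cached; anyone reading the cache sees the prepared query.
    _cached_query = copy.deepcopy(query)

    # 3. modify_query hooks run after caching; their changes stay out of the cache.
    for hook in modify_hooks:
        hook(query)

    # 4. The caller's own finalize step runs last, also after caching.
    if finalize_hook is not None:
        finalize_hook(query)

    return query


def add_sort(query):
    query["sort"] = "created desc"   # visible to others through the cache


def add_private_filter(query):
    query["fq"] = ["type:article"]   # only affects the query actually sent


if __name__ == "__main__":
    sent = run_search({"q": "solr"}, [add_sort], [add_private_filter])
    print("sent:  ", sent)
    print("cached:", cached_query())
```

In this model the sort added during the prepare step appears in both the sent and the cached query, while the filter added during the modify step appears only in the sent query.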

Solr hooks

hook_apachesolr_search_result_alter(&$doc, &$extra)
    Allows for modification of each search result independently


Solr hooks

hook_apachesolr_process_results(&$results)
    Allows for modification of all search results

Solr hooks

Not technically a hook, but worth noting that search theming is identical to that of the core search module.
    search-result.tpl.php
    search-results.tpl.php
If you pass the same values from Solr as you had via node_load, the theming templates become interchangeable.

Summary

The Apachesolr module provides a replacement for core Drupal search with better performance, scalability, and configuration than the Drupal default.
Solr requires a separate service running on Jetty or Tomcat.
hook_apachesolr_update_index provides a way to change what goes into the index.
hook_apachesolr_prepare_query, hook_apachesolr_modify_query and <caller>_finalize_query allow query modifications.
hook_apachesolr_search_result_alter & hook_apachesolr_process_results allow for result modification.
Theming is the same as core.

Thank You

Bill O’Connor, CTO
d.o: csevb10
t: csevb10
e: bill@achieveinternet.com