SolrCloud = Solr 4.6.1 + Apache Tomcat 8.0.1 + ZooKeeper 3.4.5

  • sudo vi /etc/hosts; if it maps the hostname to 127.0.1.1, comment that line out and map the hostname to the server's LAN address instead:
    • #127.0.1.1      ubx1
      192.168.1.106   ubx1
  • zookeeper
    1. set zookeeper conf file  conf/zoo.cfg
      • dataDir=/home/adamslee/zookeeper-3.4.5/data
      • server.1=192.168.1.106:2879:3879
        server.2=192.168.1.107:2879:3879
        server.3=192.168.1.108:2879:3879
      • dataLogDir=/home/myuser/zooA/log
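      • dataLogDir sets a separate directory for the transaction log; keeping the transaction log apart avoids contention with the ordinary logs and the snapshots
    2. Create a myid file in /home/adamslee/zookeeper-3.4.5/data/ whose content is 1.
      sudo sh -c 'echo "1" > myid'
    3. Create a myid file on each of the other servers, containing that server's own number (2 and 3, matching the server.N entries above)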
    4. start zookeeper
      • sh bin/zkServer.sh start
  • install tomcat on all servers 
    1. unzip solr .zip file
    2. unzip apache tomcat .zip file
    3. rename solr-4.6.1/dist/solr-4.6.1.war to solr.war, and copy it to <TOMCAT_HOME>/webapps
    4. unzip solr.war into <TOMCAT_HOME>/webapps/solr
    5. update apache-tomcat-8.0.1/conf/server.xml
      •      <Connector port="8983" maxHttpHeaderSize="8192" maxThreads="150" minSpareThreads="25" maxSpareThreads="75" enableLookups="false" redirectPort="8443" acceptCount="100" connectionTimeout="20000" disableUploadTimeout="true" URIEncoding="UTF-8" />
    6. Create a solr.xml file in <TOMCAT_HOME>/conf/Catalina/localhost with the following content:
      • <?xml version="1.0" encoding="UTF-8" ?>
        <Context docBase="/home/adamslee/apache-tomcat-8.0.1/webapps/solr.war" debug="0" crossContext="true">
            <Environment name="solr/home" type="java.lang.String" value="/home/adamslee/lbse" override="true"/>
             <Resource name="jdbc/my-database" auth="Container" type="javax.sql.DataSource" username="sa" password="" driverClassName="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:hsql://192.168.1.104/ex" maxActive="-1"/>
        </Context> 
    7. copy jar 
      • cp solr-4.6.1/example/lib/ext/* <TOMCAT_HOME>/webapps/solr/WEB-INF/lib/
        mkdir -p <TOMCAT_HOME>/webapps/solr/WEB-INF/classes/
        cp solr-4.6.1/example/resources/log4j.properties <TOMCAT_HOME>/webapps/solr/WEB-INF/classes
      • copy \hsqldb\lib\hsqldb\*.jar if you are using hsqldb (use the JDBC driver jar that matches your database, e.g. oracle10g.jar for Oracle)
    8. update catalina.sh: add line below
      • export JAVA_OPTS="-DzkHost=192.168.1.106:2181,192.168.1.107:2181,192.168.1.108:2181"
  • upload solr cloud conf to zookeeper
    • java -classpath .:/home/adamslee/apache-tomcat-8.0.1/webapps/solr/WEB-INF/lib/*:/home/adamslee/apache-tomcat-8.0.1/lib/* org.apache.solr.cloud.ZkCLI -cmd upconfig -zkhost 192.168.1.106:2181,192.168.1.107:2181,192.168.1.108:2181 -confdir /home/adamslee/exconf -confname exconf
      java -classpath .:/home/adamslee/apache-tomcat-8.0.1/webapps/solr/WEB-INF/lib/*:/home/adamslee/apache-tomcat-8.0.1/lib/* org.apache.solr.cloud.ZkCLI -cmd linkconfig -collection stores -confname exconf -zkhost 192.168.1.106:2181,192.168.1.107:2181,192.168.1.108:2181
      java -classpath .:/home/adamslee/apache-tomcat-8.0.1/webapps/solr/WEB-INF/lib/*:/home/adamslee/apache-tomcat-8.0.1/lib/* org.apache.solr.cloud.ZkCLI -cmd bootstrap -zkhost 192.168.1.106:2181,192.168.1.107:2181,192.168.1.108:2181 -solrhome /home/adamslee/lbse
  • create collection
  • add core into collection
  • notes for conf
    •  <fieldType name="sint" class="solr.TrieIntField" sortMissingLast="true" omitNorms="true"/>
  • data import handler
    • Put apache-solr-dataimportscheduler-1.0.jar, together with the apache-solr-dataimporthandler-.jar and apache-solr-dataimporthandler-extras-.jar that ship with Solr, into the <TOMCAT_HOME>/webapps/solr/lib directory
    • https://code.google.com/p/solr-dataimport-scheduler/
    • Edit WEB-INF/web.xml inside solr.war and add the following before the servlet element:
    •        <listener>
              <listener-class>
                      org.apache.solr.handler.dataimport.scheduler.ApplicationListener
              </listener-class>
             </listener>
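    • Extract dataimport.properties from apache-solr-dataimportscheduler-.jar, adjust it to your environment, and put it in solr.home/conf (not solr.home/core/conf)
    • DIH out-of-memory errors
      DIH is prone to out-of-memory errors. Raising the JVM heap size usually resolves them:
      add SET JAVA_OPTS=-Xms128m -Xmx1024m to tomcat\bin\startup.bat. This sets the maximum heap to 1024 MB; increase it as needed. On the Linux setup above, the equivalent is to add -Xms128m -Xmx1024m to the JAVA_OPTS line in catalina.sh.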
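
The ZooKeeper step in the list above has to be repeated on every ensemble member, with only the myid value changing. As a rough sketch, it could be scripted as below. The function name write_zk_config and its two arguments are made up for illustration, and the tickTime, initLimit, syncLimit and clientPort values are assumptions copied from ZooKeeper's sample config, not settings taken from this post.

```shell
# write_zk_config ZK_HOME MY_ID   (hypothetical helper, not part of ZooKeeper)
# Writes conf/zoo.cfg and data/myid under ZK_HOME for ensemble member MY_ID.
write_zk_config() {
  zk_home=$1
  my_id=$2
  mkdir -p "$zk_home/conf" "$zk_home/data" "$zk_home/log"
  # tickTime/initLimit/syncLimit/clientPort are assumed sample values;
  # the server.N lines are the three hosts used throughout this post.
  cat > "$zk_home/conf/zoo.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=$zk_home/data
dataLogDir=$zk_home/log
server.1=192.168.1.106:2879:3879
server.2=192.168.1.107:2879:3879
server.3=192.168.1.108:2879:3879
EOF
  # myid must contain only this server's number, so overwrite with '>'
  # rather than append with '>>'.
  echo "$my_id" > "$zk_home/data/myid"
}

# Usage on 192.168.1.107 (server.2):
#   write_zk_config /home/adamslee/zookeeper-3.4.5 2
```

Keeping the server.N lines identical on all three machines and varying only the myid argument is what lets each node find its own entry in the list.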

Required Config

All of the required config is already setup in the example configs shipped with Solr. The following is what you need to add if you are migrating old config files, or what you should not remove if you are starting with new config files.

schema.xml

You must have a _version_ field defined:

<field name="_version_" type="long" indexed="true" stored="true" multiValued="false"/>

solrconfig.xml

You must have an UpdateLog defined – this should be defined in the updateHandler section.

    <!-- Enables a transaction log, currently used for real-time get.
         "dir" - the target directory for transaction logs, defaults to the
         solr data directory.  -->
    <updateLog>
      <str name="dir">${solr.data.dir:}</str>
      <!-- if you want to take control of the synchronization you may specify the syncLevel as one of the
           following where "flush" is the default. fsync will reduce throughput.
      <str name="syncLevel">flush|fsync|none</str>
      -->
    </updateLog>

You must have a replication handler called /replication defined:

    <requestHandler name="/replication" class="solr.ReplicationHandler" startup="lazy" />

You must have a realtime get handler called /get defined:

    <requestHandler name="/get" class="solr.RealTimeGetHandler">
      <lst name="defaults">
        <str name="omitHeader">true</str>
     </lst>
    </requestHandler>

You must have the admin handlers defined:

    <requestHandler name="/admin/" class="solr.admin.AdminHandlers" />

The DistributedUpdateProcessor is part of the default update chain and is automatically injected into any of your custom update chains. You can still explicitly add it yourself as follows:

   <updateRequestProcessorChain name="sample">
     <processor class="solr.LogUpdateProcessorFactory" />
     <processor class="solr.DistributedUpdateProcessorFactory"/>
     <processor class="my.package.UpdateFactory"/>
     <processor class="solr.RunUpdateProcessorFactory" />
   </updateRequestProcessorChain>

If you do not want the DistributedUpdateProcessorFactory auto-injected into your chain (say you want to use SolrCloud functionality, but you want to distribute updates yourself), then specify the following update processor factory in your chain: NoOpDistributingUpdateProcessorFactory.
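
When migrating old config files, it can help to check the required items above before uploading the config directory to ZooKeeper. The following is only a sketch: check_solr_conf and require_pattern are hypothetical helper names, and the check is a plain-text grep rather than an XML parse, so an element that is present but commented out would still count as found.

```shell
# require_pattern FILE PATTERN LABEL
# Prints "missing: LABEL" and returns 1 when PATTERN does not occur in FILE.
require_pattern() {
  if ! grep -q -- "$2" "$1" 2>/dev/null; then
    echo "missing: $3"
    return 1
  fi
}

# check_solr_conf CONF_DIR
# Looks for the items listed under "Required Config" in CONF_DIR/schema.xml
# and CONF_DIR/solrconfig.xml; returns non-zero if any of them is absent.
check_solr_conf() {
  conf_dir=$1
  status=0
  require_pattern "$conf_dir/schema.xml" 'name="_version_"' '_version_ field in schema.xml' || status=1
  require_pattern "$conf_dir/solrconfig.xml" '<updateLog' 'updateLog in solrconfig.xml' || status=1
  require_pattern "$conf_dir/solrconfig.xml" 'name="/replication"' '/replication handler in solrconfig.xml' || status=1
  require_pattern "$conf_dir/solrconfig.xml" 'name="/get"' '/get handler in solrconfig.xml' || status=1
  require_pattern "$conf_dir/solrconfig.xml" 'solr.admin.AdminHandlers' 'admin handlers in solrconfig.xml' || status=1
  return $status
}

# Usage, with the config directory uploaded earlier in this post:
#   check_solr_conf /home/adamslee/exconf
```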

If a step goes wrong, use the commands below to clean up:

ZooKeeper cleanup: zkcli.sh -cmd clear -zkhost {zookeeperServer1}:2181,{zookeeperServer2}:2181 /configs/cmsFilesCollection

Solr cleanup: curl 'http://{solrServer}:8983/solr/admin/collections?action=DELETE&name=cmsFilesCollection'
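
The Solr cleanup call above goes through the Collections API, and the same endpoint with action=CREATE is one way to carry out the "create collection" step that the list leaves without a command. The sketch below only builds and prints the request. The helper name collections_url is made up, and numShards=3 and replicationFactor=1 are illustrative values, not settings from this post.

```shell
# collections_url HOST ACTION NAME [EXTRA_PARAMS]   (hypothetical helper)
# Builds a Collections API URL on port 8983, the connector port set in server.xml.
collections_url() {
  url="http://$1:8983/solr/admin/collections?action=$2&name=$3"
  if [ -n "${4:-}" ]; then
    url="$url&$4"
  fi
  echo "$url"
}

# Create the "stores" collection from the "exconf" config uploaded earlier.
# numShards and replicationFactor are assumed example values.
create_url=$(collections_url 192.168.1.106 CREATE stores \
  "numShards=3&replicationFactor=1&collection.configName=exconf")

# Print the command instead of running it, so it can be reviewed first.
echo "curl '$create_url'"
```

Single quotes around the URL matter when the printed command is run, because the shell would otherwise treat each & as a background operator.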


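IK Analyzer configuration

Once everything above is configured, set up IKAnalyzer.

1.     Unpack IKAnalyzer 2012FF_u1, find IKAnalyzer2012FF_u1.jar in the extracted folder and copy it to webapps/solr/WEB-INF/lib under the Tomcat directory; then find IKAnalyzer.cfg.xml and stopword.dic in the same folder and put them in webapps/solr/WEB-INF/classes.

2.     In collection1/conf/schema.xml under the solr_home folder, add: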

    <fieldType name="text_ik" class="solr.TextField">
      <analyzer type="index" isMaxWordLength="false" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
      <analyzer type="query" isMaxWordLength="true" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
    </fieldType>

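3.     Open the Solr admin UI in a browser, go to core -> collection1 -> analysis, enter some Chinese text in the box, select text_ik under "Analyse Fieldname / FieldType", and click "Analyse Values" to see the tokenized result.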
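
The same check can be made from the command line instead of the admin UI. This is a sketch under two assumptions: that the /analysis/field handler (solr.FieldAnalysisRequestHandler) is registered in solrconfig.xml, as it is in the example configs, and that the core is named collection1. The function name analyze_text is made up for illustration.

```shell
# analyze_text HOST FIELDTYPE TEXT   (hypothetical helper)
# Sends TEXT through FIELDTYPE's analyzers and prints Solr's JSON response.
# --data-urlencode takes care of percent-encoding the Chinese input.
analyze_text() {
  curl -sS -G "http://$1:8983/solr/collection1/analysis/field" \
    --data-urlencode "analysis.fieldtype=$2" \
    --data-urlencode "analysis.fieldvalue=$3" \
    --data-urlencode "wt=json"
}

# Usage (needs a running Solr node):
#   analyze_text 192.168.1.106 text_ik "中华人民共和国"
```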
