Lucene Index: Missing documents


We have a pretty basic Lucene setup. We noticed that some documents aren't written to the index.

This is how we create the document:

    private void addToDirectory(SpecialDomainObject specialDomainObject) throws IOException {
        Document document = new Document();
        document.add(new TextField("id", String.valueOf(specialDomainObject.getId()), Field.Store.YES));
        document.add(new TextField("name", specialDomainObject.getName(), Field.Store.YES));
        document.add(new TextField("tags", joinTags(specialDomainObject.getTags()), Field.Store.YES));
        document.add(new TextField("contents", getContents(specialDomainObject), Field.Store.YES));

        // One numeric field per associated language
        for (Language language : getAllAssociatedLanguages(specialDomainObject)) {
            document.add(new IntField("languageid", language.getId(), Field.Store.YES));
        }

        // Replace any existing document with the same id, then commit
        specialDomainObjectIndexWriter.updateDocument(new Term("id", document.getField("id").stringValue()), document);
        specialDomainObjectIndexWriter.commit();
    }

This is how we create the analyzer and the index writer:

    <bean id="luceneVersion" class="org.apache.lucene.util.Version" factory-method="valueOf">
        <constructor-arg value="LUCENE_46"/>
    </bean>

    <bean id="analyzer" class="org.apache.lucene.analysis.standard.StandardAnalyzer">
        <constructor-arg ref="luceneVersion"/>
    </bean>

    <bean id="specialDomainObjectIndexWriter" class="org.apache.lucene.index.IndexWriter">
        <constructor-arg ref="specialDomainObjectDirectory" />
        <constructor-arg>
            <bean class="org.apache.lucene.index.IndexWriterConfig">
                <constructor-arg ref="luceneVersion"/>
                <constructor-arg ref="analyzer" />
                <property name="openMode" value="CREATE_OR_APPEND"/>
            </bean>
        </constructor-arg>
    </bean>

Indexing is done by a scheduled task:

    @Component
    public class ScheduledSpecialDomainObjectIndexCreationTask implements ScheduledIndexCreationTask {

        private static final Logger logger = LoggerFactory.getLogger(ScheduledSpecialDomainObjectIndexCreationTask.class);

        @Autowired
        private IndexOperator specialDomainObjectIndexOperator;

        // Runs again one hour after the previous run has finished
        @Scheduled(fixedDelay = 3600 * 1000)
        @Override
        public void createIndex() {
            Date indexCreationStartDate = new Date();
            try {
                logger.info("updating complete special domain object index...");
                specialDomainObjectIndexOperator.createIndex();
                if (logger.isDebugEnabled()) {
                    Date indexCreationEndDate = new Date();
                    logger.debug("index creation duration: {} ms", indexCreationEndDate.getTime() - indexCreationStartDate.getTime());
                }
            } catch (IOException e) {
                logger.error("could not update complete special domain object index.", e);
            }
        }
    }

createIndex() is implemented as follows:

    @Override
    public void createIndex() throws IOException {
        logger.trace("preparing index generation...");
        IndexWriter indexWriter = getIndexWriter();

        Date start = new Date();

        logger.trace("deleting documents from index...");
        indexWriter.deleteAll();

        logger.trace("starting index generation...");
        long numberOfProcessedObjects = fillIndex();

        logger.debug("index written in " + (new Date().getTime() - start.getTime()) + " milliseconds.");
        logger.debug("number of processed objects: {}", numberOfProcessedObjects);
        logger.debug("number of documents in index: {}", indexWriter.numDocs());

        indexWriter.commit();
        indexWriter.forceMerge(1);
    }

    @Override
    protected long fillIndex() throws IOException {
        // Walk through the repository page by page and index each page
        Page<SpecialDomainObject> specialDomainObjectsPage = specialDomainObjectRepository.findAll(new PageRequest(0, MAXIMUM_PAGE_ELEMENTS));
        while (true) {
            addToDirectory(specialDomainObjectsPage);
            if (specialDomainObjectsPage.hasNextPage()) {
                specialDomainObjectsPage =
                    specialDomainObjectRepository.findAll(new PageRequest(specialDomainObjectsPage.getNumber() + 1, specialDomainObjectsPage.getSize()));
            } else {
                break;
            }
        }
        return specialDomainObjectsPage.getTotalElements();
    }

There are 2000 SpecialDomainObject instances, and 80 of them aren't written to the index (we checked this with Luke).

Is there a cause for the missing documents?

We found the problem: the default encoding of the operating system was not set to UTF-8.
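
The post does not say where exactly the encoding mattered, so the following is only a minimal sketch of the general effect, assuming that somewhere text is converted between bytes and strings without naming a charset. The class EncodingCheck and its methods decode and survivesRoundTrip are hypothetical names used for illustration; they are not part of the code above. The sketch shows that the same bytes yield different strings under different charsets, which is why passing the charset explicitly makes the result independent of the operating system default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original code.
public class EncodingCheck {

    /** Decodes bytes with an explicit charset instead of the platform default. */
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    /** Returns true if the text is unchanged after encoding and decoding with the given charset. */
    public static boolean survivesRoundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return text.equals(new String(encoded, charset));
    }

    public static void main(String[] args) {
        // The charset the JVM falls back to when none is given
        System.out.println("Default charset: " + Charset.defaultCharset());

        String name = "Gr\u00fc\u00dfe"; // "Grüße", written with escapes so the source file encoding does not matter
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);

        // Same bytes, different charsets
        System.out.println("Decoded as UTF-8 matches:      " + decode(utf8, StandardCharsets.UTF_8).equals(name));
        System.out.println("Decoded as ISO-8859-1 matches: " + decode(utf8, StandardCharsets.ISO_8859_1).equals(name));

        // A charset that cannot represent the characters loses them
        System.out.println("Round trip via US-ASCII:       " + survivesRoundTrip(name, StandardCharsets.US_ASCII));
    }
}
```

If changing the operating system setting is not an option, the JVM default can also be set at startup with -Dfile.encoding=UTF-8.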

