Apache Lucene Indexer Search with CJKAnalyzer

I am using Apache Lucene to index and search text, with CJKAnalyzer as the analyzer. The search matches the query character by character: if I search for the Japanese word "ぁｘまｎ", it returns every word that contains any character of that word. That is not what I want. I want to find only the whole word, or words that contain the whole query string.

For example, suppose I have indexed three words: "ぁｘまｎ", "ぁｘま", and "まｎ".

Case 1: If I search for "ぁｘまｎ", it should return only one result.
Case 2: If I search for "ぁｘ", it should return two results.

In my case, however, searching for "ぁｘまｎ" returns three results, which is wrong.
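
To make the difference between the observed and the expected behavior concrete, here is a minimal plain-Java sketch that does not use Lucene at all. It assumes, purely for illustration, that analysis yields one token per character and that a document matches when it shares any token with the query. The class `MatchModel` and its methods are hypothetical names, not part of Lucene, and the model is not a description of what CJKAnalyzer does internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class MatchModel {
    // Hypothetical stand-in for per-character analysis: one token per code point.
    static Set<String> tokens(String text) {
        Set<String> out = new LinkedHashSet<>();
        text.codePoints().forEach(cp -> out.add(new String(Character.toChars(cp))));
        return out;
    }

    // Models the observed behavior: a document matches if it shares ANY token with the query.
    static List<String> matchAnyToken(List<String> docs, String query) {
        Set<String> queryTokens = tokens(query);
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            Set<String> shared = tokens(doc);
            shared.retainAll(queryTokens);
            if (!shared.isEmpty()) {
                hits.add(doc);
            }
        }
        return hits;
    }

    // Models the expected behavior: a document matches only if it contains the whole query string.
    static List<String> matchWholeWord(List<String> docs, String query) {
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            if (doc.contains(query)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("ぁｘまｎ", "ぁｘま", "まｎ");
        System.out.println(matchAnyToken(docs, "ぁｘまｎ").size());  // observed: 3
        System.out.println(matchWholeWord(docs, "ぁｘまｎ").size()); // case 1: 1
        System.out.println(matchWholeWord(docs, "ぁｘ").size());     // case 2: 2
    }
}
```

Under this model, "まｎ" is returned for the query "ぁｘまｎ" only because it shares the characters "ま" and "ｎ" with it, which is the kind of match I want to exclude.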

------------------- Indexing code ---------------------------------

writer = getIndexWriter();
List<Document> documents = new ArrayList<>();
Document document1 = createDocument(1, "ぁｘまｎ", "Richard");
writer.addDocument(document1);
writer.commit();

// Builds one document; firstName and lastName are analyzed TextFields,
// while id is a StringField indexed as a single, unanalyzed token.
private static Document createDocument(Integer id, String firstName, String lastName) {
    Document document = new Document();
    document.add(new StringField("id", id.toString(), Field.Store.YES));
    document.add(new TextField("firstName", firstName, Field.Store.YES));
    document.add(new TextField("lastName", lastName, Field.Store.YES));
    return document;
}

// Opens the index directory and creates a writer that analyzes text with CJKAnalyzer.
private static IndexWriter createWriter() throws IOException {
    FSDirectory dir = FSDirectory.open(Paths.get(INDEX_DIR).toFile());
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_44, new CJKAnalyzer());
    IndexWriter writer = new IndexWriter(dir, config);
    return writer;
}

------------------- Call to search ---------------------------------

TopDocs foundDocs2 = searchByFirstName("*ぁｘまｎ*", searcher);

-------------------------------------------------------------

// Parses the query string against the firstName field and returns the top 10 hits.
private static TopDocs searchByFirstName(String firstName, IndexSearcher searcher) throws Exception {
    MultiFieldQueryParser mqp = new MultiFieldQueryParser(new String[]{"firstName"}, new CJKAnalyzer());
    // Leading wildcards are disabled by default; enable them for the "*...*" pattern.
    mqp.setAllowLeadingWildcard(true);
    Query q = mqp.parse(firstName);
    TopDocs hits = searcher.search(q, 10);
    return hits;
}

Can you add your indexing code? What kind of fields do you use?
– dom
Jul 2 at 8:03

@dom I have added indexing code.
– OnkarG
Jul 3 at 4:18

Alright, and your search code? Do you use the same analyzer for searching too?
– dom
Jul 3 at 6:59

@dom Yes, I am using the same analyzer for searching.
– OnkarG
Jul 3 at 7:31

Hmm, two things: try to analyze your index with Luke (github.com/DmitryKey/luke). And why are you using a MultiFieldQueryParser?
– dom
Jul 3 at 7:36
