About
This blog is maintained by Itamar Syn-Hershko (contact: itamar at this domain).
Itamar Syn-Hershko, a software developer writing mostly for .NET but also in Java and C/C++, really likes fiddling with data, texts especially, so he frequently finds himself working on databases or search engines. Or combining both.
Author of open-source projects like HebMorph and NAppUpdate, a committer to Apache Lucene.NET, and an active participant in others (CLucene, for example), Itamar strongly believes in the power of open-source projects and the creativity they can bring to the table.
Here you'll find code snippets, ideas, thoughts, and links I just had to dump somewhere. Hopefully some of them will prove useful.
I've been working as a core developer on RavenDB for quite a while: writing core features, providing support to users and customers, and co-authoring and delivering the official 2-day RavenDB Workshop. I'm now offering on-site and remote RavenDB consultancy services independently. I'm also available for on-site RavenDB training worldwide (1-, 2-, or 3-day courses).
July 17th, 2011 - 13:01
Hi,
I'm new to Solr
and I would like to use it with HebMorph.
Is there any good tutorial that I can use?
July 17th, 2011 - 13:10
Not a tutorial yet, but it should be quite straightforward. Just plug in the jar and configure the Solr instance.
See:
http://lucene.472066.n3.nabble.com/using-HebMorph-td1826534.html
https://sourceforge.net/mailarchive/forum.php?forum_name=hebmorph-thinktank
Recent updates we made should make this process smoother than it was before.
November 22nd, 2011 - 13:21
Hi Itamar,
I'm trying to use your Hebrew analyzer.
Great job so far!
The problem is that when I send the results to the Highlighter object,
it doesn't highlight the extra word your analyzer found.
For example:
I'm searching for "לבסס".
The extra word your analyzer found is "בסיס",
but the extra word is not highlighted.
Thanks.
Here is the relevant code:
// Initialized somewhere else
private MorphAnalyzer analyzer;

// do some search …
ScoreDoc[] hits = searcher.Search(query, null, 1000).scoreDocs;

void DoHighlights(ref Query query, ref Document doc, string FIELD_NAME)
{
    SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter();
    Highlighter highlighter = new Highlighter(htmlFormatter, new QueryScorer(query));
    highlighter.SetTextFragmenter(new SimpleFragmenter(100));
    System.String text = doc.Get(FIELD_NAME);
    int maxNumFragmentsRequired = 20;
    // Re-analyze the stored text with the same analyzer so the highlighter sees the same tokens
    TokenStream tokenStream = analyzer.TokenStream(FIELD_NAME, new System.IO.StringReader(text));
    System.String[] result = highlighter.GetBestFragments(tokenStream, text, maxNumFragmentsRequired);
    System.String res = "";
    // Concatenate every returned fragment, including the last one
    for (int i = 0; i < result.Length; i++)
    {
        res = res + result[i] + "";
    }
    http.Response.Write("res= " + "" + res);
}
November 22nd, 2011 - 13:59
Hi, thanks
It actually should be highlighted. Take a look at how this is done here: https://github.com/synhershko/HebMorph.CorpusSearcher
March 4th, 2012 - 00:44
Hi Itamar. I'm trying to use "synhershko-HebMorph-eb403a6" with Solr 3.5.0. After compiling the sources with JDK 7 (with the Lucene 3.5.0 jars in the lib folder), putting lucene.hebrew.jar in the lib folder of my Solr server, configuring the "text" fieldType to be analyzed by org.apache.lucene.analysis.hebrew.MorphAnalyzer, and starting the server, I got the exception below:
SEVERE: Cannot load analyzer: org.apache.lucene.analysis.hebrew.MorphAnalyzer
java.lang.ClassCastException: class org.apache.lucene.analysis.hebrew.MorphAnalyzer
Can you please help me with that?
March 4th, 2012 - 04:54
Not sure what to tell you. Can you post this to the mailing list? Also, what happens when you try earlier versions of Solr?