About
This blog is maintained by Itamar Syn-Hershko (contact: itamar at this domain).
Itamar Syn-Hershko, a software developer writing mostly for .NET but also in Java and C/C++, really likes fiddling with data, texts especially, so he frequently finds himself working on databases or search engines. Or combining both.
Author of open-source projects like HebMorph and NAppUpdate, and an active participant in others (CLucene, for example), Itamar strongly believes in the power of open-source projects and the creativity they can bring to the table.
Here you'll find code snippets, ideas, thoughts, and just links I had to dump somewhere. Hopefully some of these will prove useful.
July 17th, 2011 - 13:01
Hi,
I'm new to Solr
and I would like to use it with HebMorph.
Is there any good tutorial that I can use?
July 17th, 2011 - 13:10
Not a tutorial yet, but it should be quite straightforward. Just plug in the jar and configure the Solr instance.
See:
http://lucene.472066.n3.nabble.com/using-HebMorph-td1826534.html
https://sourceforge.net/mailarchive/forum.php?forum_name=hebmorph-thinktank
Recent updates we made should make this process smoother than it was before.
November 22nd, 2011 - 13:21
Hi Itamar,
I'm trying to use your Hebrew analyzer.
Great job so far!
The problem is that when I send the results to the Highlighter object,
it doesn't highlight the extra word your analyzer found.
For example:
I'm searching for "לבסס".
The extra word your analyzer found is "בסיס",
but the extra word is not highlighted.
Thanks.
Here is the relevant code:
// Initialized somewhere else
private MorphAnalyzer analyzer;
// do some search … ScoreDoc[] hits = searcher.Search(query, null, 1000).scoreDocs;
void DoHighlights(ref Query query, ref Document doc, string FIELD_NAME)
{
    SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter();
    Highlighter highlighter = new Highlighter(htmlFormatter, new QueryScorer(query));
    highlighter.SetTextFragmenter(new SimpleFragmenter(100));
    System.String text = doc.Get(FIELD_NAME);
    int maxNumFragmentsRequired = 20;
    // Re-analyze the stored field text to get a token stream for the highlighter
    TokenStream tokenStream = analyzer.TokenStream(FIELD_NAME, new System.IO.StringReader(text));
    System.String[] result = highlighter.GetBestFragments(tokenStream, text, maxNumFragmentsRequired);
    System.String res = "";
    // Concatenate all returned fragments
    for (int i = 0; i < result.Length; i++)
    {
        res = res + result[i] + "";
    }
    http.Response.Write("res= " + "" + res);
}
November 22nd, 2011 - 13:59
Hi, thanks
It actually should be highlighted. Take a look at how this is done here: https://github.com/synhershko/HebMorph.CorpusSearcher