Highlighter returning incomplete field text

2007-02-09       - By Fred Eaker

Is there a limit to how many characters a Highlighter or NullFragmenter will
return?

I have indexed an entire HTML document (145 KB). When I use the Highlighter with
a NullFragmenter, the getBestFragment and getBestFragments methods return only
the first 51316 characters of the field's text.

I have tried indexing other HTML documents as well, but I get the same result.

If I change the Highlighter's Encoder to DefaultEncoder, I get more characters
back, but still not the entire field.

Here is some code:

Highlighter highlighter =
    new Highlighter(new SimpleHTMLFormatter(),
                    new DefaultEncoder(),
                    new QueryScorer(query));

highlighter.setTextFragmenter(new NullFragmenter());

TokenStream tokenStream =
    LuceneUtils.getAnalyzer().tokenStream(
        fieldName,
        new StringReader(hit.get(fieldName)));

String highlightedHit =
    highlighter.getBestFragment(tokenStream, hit.get(fieldName));
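
To pin down exactly where the returned text stops, one option is to compare the highlighted string against the stored field text. The following is only a rough sketch in plain Java with no Lucene dependency. It assumes the default <B> and </B> tags of SimpleHTMLFormatter and a DefaultEncoder (so the text itself is not escaped). The class and method names (TruncationCheck, stripHighlightTags, coveredLength) are made up for illustration and are not part of Lucene.

```java
public class TruncationCheck {

    // Removes the <B>...</B> markers that SimpleHTMLFormatter inserts by default.
    // Assumption: the original field text does not itself contain these exact tags.
    static String stripHighlightTags(String highlighted) {
        return highlighted.replace("<B>", "").replace("</B>", "");
    }

    // Counts how many leading characters of the original text are present,
    // in order, at the start of the highlighted output once tags are stripped.
    static int coveredLength(String original, String highlighted) {
        String plain = stripHighlightTags(highlighted);
        int limit = Math.min(original.length(), plain.length());
        int i = 0;
        while (i < limit && original.charAt(i) == plain.charAt(i)) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        // Toy data standing in for hit.get(fieldName) and highlightedHit.
        String original = "the quick brown fox jumps over the lazy dog";
        String highlighted = "the quick <B>brown</B> fox";

        int covered = coveredLength(original, highlighted);
        System.out.println(covered + " of " + original.length()
                + " characters returned");
    }
}
```

Passing hit.get(fieldName) and highlightedHit to coveredLength in place of the toy strings would show whether the cut-off falls at the same offset for every document. With an encoder that escapes HTML, the comparison would stop at the first escaped character, so this check is only meaningful with DefaultEncoder.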

