From: Paul Hill <paul <at> metajure.com>
Subject: RE: Lucene 4 - POS and Syntactic Tagging
Newsgroups: gmane.comp.jakarta.lucene.user
Date: Monday 2nd April 2012 16:49:19 UTC (over 4 years ago)
> Mark McGuire wrote:
> I'm working on a project where I need to tag both the part of speech and
> other syntactic information on tokens

To pick up on this thread from a few weeks back.

I've never done this myself, but I think that your desire to put extra
information that is not really a token in the index at a particular
location is exactly what Payloads are for.
http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/

The above article even mentions:
"A payload can be used to store weights for specific terms or things like
part of speech tags or other semantic information."
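
To make the idea concrete, here is a rough sketch of how I picture it, in
plain Java.  It is a toy model, not Lucene code: PayloadSketch, Posting, add
and find are names I made up for illustration, and the POS tags are just
example strings.  The point it tries to show is that the lookup is still by
term, and the payload is extra data stored at each position that you can
inspect once you have the matches.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: every class and method here is hypothetical, NOT Lucene's API.
class PayloadSketch {

    // One occurrence of a term: where it appeared, plus opaque payload bytes.
    static final class Posting {
        final int docId;
        final int position;
        final byte[] payload;

        Posting(int docId, int position, byte[] payload) {
            this.docId = docId;
            this.position = position;
            this.payload = payload;
        }

        String payloadAsString() {
            return new String(payload, StandardCharsets.UTF_8);
        }
    }

    // term -> all of its occurrences; the payload travels with each occurrence.
    private final Map<String, List<Posting>> postings = new HashMap<>();

    // Index one token and attach its part-of-speech tag as the payload.
    void add(int docId, int position, String term, String posTag) {
        byte[] payload = posTag.getBytes(StandardCharsets.UTF_8);
        postings.computeIfAbsent(term, k -> new ArrayList<>())
                .add(new Posting(docId, position, payload));
    }

    // Look up by term first, then keep only occurrences whose payload matches.
    List<Posting> find(String term, String wantedTag) {
        List<Posting> hits = new ArrayList<>();
        for (Posting p : postings.getOrDefault(term, new ArrayList<>())) {
            if (p.payloadAsString().equals(wantedTag)) {
                hits.add(p);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        PayloadSketch index = new PayloadSketch();
        // doc 0: "they book flights", doc 1: "the book fell"
        index.add(0, 0, "they", "PRON");
        index.add(0, 1, "book", "VERB");
        index.add(0, 2, "flights", "NOUN");
        index.add(1, 0, "the", "DET");
        index.add(1, 1, "book", "NOUN");
        index.add(1, 2, "fell", "VERB");

        for (Posting p : index.find("book", "NOUN")) {
            System.out.println("doc " + p.docId + " pos " + p.position);
        }
    }
}
```

In this model there is no way to ask for "all NOUN tokens" without naming a
term first, because the tag is not a key in the map.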

I don't think "searching on attributes" is the right way to describe it.
Attributes are features of some Lucene objects, a way to ask a complex
object for something.  Some attributes return information that came from
the index, but attributes themselves are not stored in the index; tokens
and payloads are.  I'm sure my understanding is incomplete too: using a
token type other than "WORD" seems like a possible way to go, but I can't
see how to get a query to search on a particular type of token.

-Paul
 