From: Tommaso Teofili (JIRA) <dev-HHKbSdsD6WQPKjDvHGQMeg <at> public.gmane.org>
Subject: [jira] [Updated] (UIMA-2110) Turn the HMMTagger class into a more generic class for tagging tasks
Newsgroups: gmane.comp.apache.uima.devel
Date: Monday 4th July 2011 10:06:21 UTC
[ https://issues.apache.org/jira/browse/UIMA-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Tommaso Teofili updated UIMA-2110:
----------------------------------

    Attachment: UIMA2110updated.patch

I updated the patch and the tests run correctly. I am now going to test
this patch in a running system.

> Turn the HMMTagger class into a more generic class for tagging tasks  
> ----------------------------------------------------------------------
>
>                 Key: UIMA-2110
>                 URL: https://issues.apache.org/jira/browse/UIMA-2110
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Sandbox-Tagger
>    Affects Versions: 2.3
>         Environment: OS
> Linux version 2.6.32-30-generic ([email protected]) (gcc version 4.4.3
(Ubuntu 4.4.3-4ubuntu5) ) #59-Ubuntu SMP Tue Mar 1 21:30:21 UTC 2011
> JVM
> java version "1.6.0_17"
> Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
> Java HotSpot(TM) Server VM (build 14.3-b01, mixed mode)
>            Reporter: Nicolas Hernandez
>            Priority: Minor
>         Attachments: AMoreGenericHMMTaggerDesc.patch,
AMoreGenericHMMTaggerSrcClass.patch, UIMA2110updated.patch
>
>   Original Estimate: 1.5h
>  Remaining Estimate: 1.5h
>
> Despite its name, the code of the org.apache.uima.examples.tagger.HMMTagger
> class is not totally independent of the POS tagging task.
> In addition, it assumes that the feature path to update with the result of
> the tagging is org.apache.uima.TokenAnnotation:posTag.
> We propose to let users specify the feature path to set through a
> parameter. This parameter is optional. If it is left unset, the tagger
> works as usual, using org.apache.uima.TokenAnnotation:posTag as the
> default value.
>  
> We also propose to add three optional parameters: InputView,
> SentenceType and ModelFile.
> Since the HMM Learner already lets users specify the view used to
> train a model, we give the tagger the same possibility. By default, it
> works on the _InitialView. This is quite useful in practice!
> The org.apache.uima.TokenAnnotation type is not the only annotation type
> assumed to be present in the CAS. The HMMTagger processes tokens
> sentence by sentence, and it uses org.apache.uima.SentenceAnnotation
> to select the tokens. The SentenceType parameter lets users specify
> their own sentence annotation type. The default value is
> org.apache.uima.SentenceAnnotation.
> The ModelFile parameter is an alternative to the resource declaration
> for specifying a model. Left empty, it is ignored. Otherwise it takes
> precedence over the resource declaration.
> When it is specified, multiple deployment of the tagger cannot be allowed,
> but in practice it may be easier for the user to configure a parameter
> through Eclipse.
> Two distinct patches will be provided, one for the class and the other
> for the descriptor.
> A future improvement of the class might offer the possibility to create
> new annotations, not only to update existing ones.
> A future improvement of the descriptor may separate what belongs to the
> generic tagger from what is specific to the POS tagger...
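
The defaulting and precedence rules in the description above could be sketched roughly as follows. This is only an illustration in plain Java, not the code from the attached patches and not the UIMA API: the TaggerSettings class, its resolve method, and the "FeaturePath" parameter name are hypothetical (the issue does not name the feature path parameter). The InputView, SentenceType and ModelFile names and the three default values come from the description.

```java
import java.util.Map;

/** Hypothetical holder for the tagger settings described in the issue; not part of UIMA. */
final class TaggerSettings {
    // Default values quoted in the issue description.
    static final String DEFAULT_FEATURE_PATH = "org.apache.uima.TokenAnnotation:posTag";
    static final String DEFAULT_INPUT_VIEW = "_InitialView";
    static final String DEFAULT_SENTENCE_TYPE = "org.apache.uima.SentenceAnnotation";

    final String featurePath;
    final String inputView;
    final String sentenceType;
    final String modelLocation;

    private TaggerSettings(String featurePath, String inputView,
                           String sentenceType, String modelLocation) {
        this.featurePath = featurePath;
        this.inputView = inputView;
        this.sentenceType = sentenceType;
        this.modelLocation = modelLocation;
    }

    /** Returns the trimmed parameter value, or the fallback when it is absent or blank. */
    static String valueOrDefault(Map<String, String> params, String name, String fallback) {
        String value = params.get(name);
        return (value == null || value.trim().isEmpty()) ? fallback : value.trim();
    }

    /**
     * Resolves the four settings. resourceModel stands for the model coming from
     * the resource declaration; a non-empty ModelFile parameter overrides it.
     */
    static TaggerSettings resolve(Map<String, String> params, String resourceModel) {
        return new TaggerSettings(
            valueOrDefault(params, "FeaturePath", DEFAULT_FEATURE_PATH), // assumed parameter name
            valueOrDefault(params, "InputView", DEFAULT_INPUT_VIEW),
            valueOrDefault(params, "SentenceType", DEFAULT_SENTENCE_TYPE),
            valueOrDefault(params, "ModelFile", resourceModel));
    }
}
```

With an empty parameter map, resolve falls back to the three defaults and to the declared resource, which matches the "works as usual" behaviour the description asks for. Setting ModelFile replaces only the model location.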

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
 