马洪宾 | 16 May 12:49 2012

Fwd: Re: Re: [Moses-support] UPDATED: moses training error

Is it because my training corpus is too small?

For performance reasons I only used 90,000 sentences.

I picked some entries from the phrase table to check, such as "我" and "我国政府".
When I try "我", it is translated to "I",
but "我国政府" still isn't translated (even though it's in the phrase table).

Is that normal?
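
A rough sketch of how the phrase-table check could be scripted rather than done by eye. It assumes the usual Moses text format, with fields separated by ` ||| ` and the source phrase stored as space-separated tokens; `find_source_phrase` is a made-up helper name, not part of Moses.

```python
import gzip


def find_source_phrase(table_path, phrase, limit=5):
    """Return up to `limit` target phrases whose source side equals `phrase`.

    Assumes each line looks like: source ||| target ||| scores ...
    """
    matches = []
    with gzip.open(table_path, "rt", encoding="utf-8") as table:
        for line in table:
            fields = line.rstrip("\n").split(" ||| ")
            # Compare the whole source field, not a substring of it.
            if len(fields) >= 2 and fields[0].strip() == phrase:
                matches.append(fields[1].strip())
                if len(matches) >= limit:
                    break
    return matches
```

Because the match is on the exact source field, a phrase stored as two tokens (say `我国 政府`) would not be found by a lookup for the unsegmented string, so it is worth trying both forms.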

thanks!

---------- Forwarded message ----------
From: 马洪宾 <subuliu-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Wed, May 16, 2012 at 6:22 PM
Subject: Re: Re: Re: [Moses-support] UPDATED: moses training error
To: moses-support-3s7WtUTddSA@public.gmane.org


Hey, guys,

I believe my previous problem was caused by some noise in my corpus.
I've tackled it now.

Training now completes (no tuning yet), and I've got a moses.ini in my train/model/ directory.

I used this moses.ini to run a test (according to the official tutorial, this should work):

hongbin@ubuntu:~/working1/train/model$ echo "由于时间因素至关重要"|~/mosesdecoder/dist/bin/moses -f moses.ini
Defined parameters (per moses.ini or switch):
        config: moses.ini
        distortion-file: 0-0 wbe-msd-bidirectional-fe-allff 6 /home/hongbin/working1/train/model/reordering-table.wbe-msd-bidirectional-fe.gz
        distortion-limit: 6
        input-factors: 0
        lmodel-file: 8 0 3 /home/hongbin/lm/corpus.blm.en
        mapping: 0 T 0
        ttable-file: 0 0 0 5 /home/hongbin/working1/train/model/phrase-table.gz
        ttable-limit: 20
        weight-d: 0.3 0.3 0.3 0.3 0.3 0.3 0.3
        weight-l: 0.5000
        weight-t: 0.20 0.20 0.20 0.20 0.20
        weight-w: -1
Loading lexical distortion models...have 1 models
Creating lexical reordering...
weights: 0.300 0.300 0.300 0.300 0.300 0.300
Loading table into memory...done.
Start loading LanguageModel /home/hongbin/lm/corpus.blm.en : [72.000] seconds
Finished loading LanguageModels : [73.000] seconds
Start loading PhraseTable /home/hongbin/working1/train/model/phrase-table.gz : [73.000] seconds
filePath: /home/hongbin/working1/train/model/phrase-table.gz
Finished loading phrase tables : [73.000] seconds
Start loading phrase table from /home/hongbin/working1/train/model/phrase-table.gz : [73.000] seconds
Reading /home/hongbin/working1/train/model/phrase-table.gz
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
Finished loading phrase tables : [108.000] seconds
IO from STDOUT/STDIN
Created input-output object : [108.000] seconds
Translating line 0  in thread id 140004348253952
Translating: 由于时间因素至关重要

Collecting options took 0.000 seconds
Search took 0.000 seconds
由于时间因素至关重要
BEST TRANSLATION: 由于时间因素至关重要|UNK|UNK|UNK [1]  [total=-104.508] <<0.000, -1.000, -100.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, -11.015, 0.000, 0.000, 0.000, 0.000, 0.000>> 0-0
Translation took 0.000 seconds
Finished translating

It seems that it hasn't even tried to translate from Chinese to English!
What's wrong here? I checked the phrase table and the language model file, and they seem normal.

Could you please help me with this?

Hongbin

On Wed, May 16, 2012 at 1:13 PM, lixianhua <lixianhua-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org> wrote:

There's a clean-corpus-n.perl script in Moses; find it and clean your corpus like this:

 

./clean-corpus-n.perl corpus l1 l2 clean-corpus 1 100
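
In that command, `l1` and `l2` are the language extensions of the two corpus files, and the trailing `1 100` are the minimum and maximum sentence length in tokens. A minimal Python sketch of the length filter, for illustration only: it is not the script's actual code, the real script does more than this, and `keep_pair` and `clean_pairs` are hypothetical names.

```python
def keep_pair(src_line, tgt_line, min_len=1, max_len=100):
    """True if both sides have between min_len and max_len whitespace tokens."""
    src_len = len(src_line.split())
    tgt_len = len(tgt_line.split())
    return min_len <= src_len <= max_len and min_len <= tgt_len <= max_len


def clean_pairs(pairs, min_len=1, max_len=100):
    """Keep only the (source, target) pairs that pass the length bounds."""
    return [(s, t) for s, t in pairs if keep_pair(s, t, min_len, max_len)]
```

Pairs are dropped together, so the two sides stay line-aligned after filtering.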

 

 

From: 马洪宾 [mailto:subuliu-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org]
Sent: 16 May 2012 13:09
To: lixianhua
Subject: Re: Re: [Moses-support] UPDATED: moses training error

 

I think you're right. Do you have a batch script to run the cleaning?

On Wed, May 16, 2012 at 12:10 PM, lixianhua <lixianhua <at> cn.fujitsu.com> wrote:

There must be something wrong with your extract process.

I suggest cleaning your corpus, as well as deleting the | [ ] characters from it.

Then run the training script again.
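
Removing those characters could look something like the sketch below. It is only one way to do it: `strip_special` is a hypothetical helper, and replacing each character with a space (rather than deleting it outright) is my own choice so that neighbouring tokens don't get glued together.

```python
import re

# The three characters mentioned above: |, [ and ]
_SPECIAL = re.compile(r"[|\[\]]")


def strip_special(line):
    """Replace |, [ and ] with spaces, then collapse runs of whitespace."""
    return " ".join(_SPECIAL.sub(" ", line).split())
```

A line that consists only of these characters comes back as an empty string instead of being dropped, so both sides of the corpus keep the same number of lines; the length cleaning can then discard the empty pairs.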

 

From: moses-support-bounces-3s7WtUTddSA@public.gmane.org [mailto:moses-support-bounces-3s7WtUTddSA@public.gmane.org] On Behalf Of 马洪宾
Sent: 16 May 2012 11:28
To: moses-support-3s7WtUTddSA@public.gmane.org

Subject: [Moses-support] UPDATED: moses training error

 

 

Hi,

 

I'm trying out a Chinese-English baseline system using the latest Moses.

I'm running it on a 64-bit Ubuntu server.

Although I strictly followed the tutorial at http://www.statmt.org/moses/?n=Moses.Baseline, when I reach the step "training the translation system" I get the message

"ERROR: train/model/extract.o.sorted.gz does not exist in ~/working/train/model" and the program exits with exit code 2.

However, I do find a file named extract.sorted.gz in ~/working/train/model (slightly different: sorted.gz, not o.sorted.gz).

$ ls -l
-rw-rw-r-- 1 hongbin hongbin 30674272 May 15 16:08 aligned.grow-diag-final-and
-rw-rw-r-- 1 hongbin hongbin       20 May 15 16:10 extract.inv.sorted.gz
-rw-rw-r-- 1 hongbin hongbin       20 May 15 16:10 extract.sorted.gz  (but the size seems far too small)
-rw-rw-r-- 1 hongbin hongbin 61246318 May 15 16:10 lex.e2f
-rw-rw-r-- 1 hongbin hongbin 61246318 May 15 16:10 lex.f2e
-rw-rw-r-- 1 hongbin hongbin        2 May 15 16:10 phrase-table.gz
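
For what it's worth, 20 bytes is about what gzip produces for empty input, so the extract files probably contain nothing at all. A small sketch for confirming that, assuming Python is available on the server; `count_gz_lines` is a hypothetical helper, not a Moses tool.

```python
import gzip
import zlib


def count_gz_lines(path):
    """Return the number of lines in a gzip file, or None if it is not readable as gzip."""
    try:
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
            return sum(1 for _ in handle)
    except (OSError, EOFError, zlib.error):
        # Truncated, corrupt, or not a gzip file at all.
        return None
```

A result of 0 means a valid but empty archive, while None means the file isn't a usable gzip file.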

 

Could you please give me any clue on how to fix this?

 

PS,

I'm running this step by:

nohup nice ~/mosesdecoder/dist/training/train-model.perl  -root-dir train -corpus ~/corpus/corpus-clean  -f ch -e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe -lm 0:3:$HOME/lm/corpus.blm.en:8 >& training.out &

(Any problem with this command?)

 

Thanks!

Hongbin

 

 

 

--
Hongbin MA(马洪宾)

Department of Computer Science and Engineering,
Shanghai Jiao Tong University.

Mobile: (86)188-1755-4825

 

_______________________________________________
Moses-support mailing list
Moses-support <at> mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support



 

