ENGLISH-VIETNAMESE MACHINE TRANSLATION BASED ON LEARNING TRANSFER RULES FROM BILINGUAL CORPORA
ĐINH ĐIỀN
Title page
Table of contents
Chapter 1: Introduction.
Chapter 2: Overview.
Chapter 3: The BTL translation model.
Chapter 4: Problems to be solved.
Chapter 5: Experimental implementation - Results.
Chapter 6: Evaluation of results - Discussion.
Chapter 7: Conclusion.
Published works
References
Appendices
... if the classification is too coarse, it cannot disambiguate some cases in which words share the same semantic label but have different senses.
4.3.1.1 BASIS OF SEMANTIC CLASSIFICATION (LABELING)
We have long been used to ordinary dictionaries (monolingual or bilingual) whose entries are arranged alphabetically. As a result, two entries such as "animals" (động vật) and "zoo" (sở thú), or "aunt" (cô/dì) and "uncle" (chú/bác), are placed far apart, as if they had no semantic relation to each other. An alphabetical dictionary is reasonable and rigorous with respect to form (morphology), but it is unreasonable with respect to content (semantics) and does not match human linguistic thinking.
Psycholinguists have shown experimentally that when the stimulus word "aunt" is given to many different people, most of them report that the first word that comes to mind is "uncle". This shows that in the human "internal lexicon" the words "uncle" and "aunt" are already related. This is also the lexical-semantic foundation on which compilers of conceptual-class dictionaries rely when they build semantic classification systems and assign a semantic label to each class. Several such classification systems exist today, for example: thesaurus dictionaries, LLOCE/LDOCE, the WordNet network, the CoreLex label system, ...
Research on language universals shows that some universals stem from psycholinguistic phenomena and therefore, broadly speaking, depend on the relationship between language and human thought. Other universals are ethnolinguistic phenomena and therefore depend on the relationship between language and culture. Researchers divide language universals into two types:
• Substantive universals: features common to the organization of linguistic entities. For example, every language has the categories noun and verb, which are the basis for expressing the deep structure of sentences in every language.
• Formal universals: for example, generative grammar holds that the base component of syntax is the same in every language.
Besides phonological, grammatical, and semantic universals, which address only one side of the sign (either the signifier or the signified), attention has also been paid to semiotic universals, which address the relation between the signifier and the signified ([29], pp. 273-275).
Long ago, in the "Course in General Linguistics", Ferdinand de Saussure pointed out two kinds of relations: horizontal (linear, syntagmatic) and vertical (paradigmatic). Corresponding to the horizontal relation are the linear semantic field and the associative semantic field; corresponding to the vertical relation are the denotative field and the conceptual (significative) field. The denotative field is the set of words that are synonymous in denotative meaning, and the conceptual field is the set of words that share a conceptual structure ([2], pp. 172-191).
4.3.1.2 REMARKS ON RELATED SEMANTIC LABEL SYSTEMS
From a survey of the semantic label systems of LLOCE (see Appendix 8.1), LDOCE (see Appendix 8.2), WordNet (see Appendix 8.3.4), and CoreLex (see Appendix 8.4), we observe the following:
• The class division of LLOCE is essentially based on the theory of dividing semantic fields along the vertical axis (denotative and conceptual fields). WordNet, besides relying on the denotative and conceptual fields, also relies on the linear and associative fields (through relations of function, part, attribute, ...).
• Because its original goal was a system of concepts common to all human languages, the representation of concepts in WordNet is based on theories from cognitive linguistics, psycholinguistics, and so on. All of these theories aim at the same goal: studying what is common to all languages of the world, that is, language universals.
• The LDOCE label system focuses only on nouns. It covers a fairly large number of words (45,000), but its semantic classes are too coarse (only 32 classes).
• The LLOCE label system has the advantage of simplicity: the hierarchy has only 3 levels (topic - group - class) and the number of labels is not too large (only 2441 labels).
• The WordNet label system is very detailed and complete (for the main parts of speech), so the number of labels is very large (more than 100,000). WordNet's advantages are its detailed hierarchy (dozens of levels) and the many kinds of relations that hold between classes (synsets).
• The CoreLex label system (for nouns) distinguishes homonyms from homographs, whereas WordNet does not.
➢ The focus of the label system in BTL is to disambiguate word senses for the purpose of translation, not for the purpose of understanding (which requires detailed world knowledge), so senses need not be resolved as finely as in WordNet. With as many labels as WordNet has, we could not build enough training data to cover all words (corpora of billions of words would be needed).
➢ The LDOCE label system is too coarse; it is not sufficient to disambiguate words that fall in the same class but have different senses.
➢ The CoreLex label system is built from the basic classes of WordNet, and its label codes are abbreviations, which are easier to remember than the labels of other systems (digits only, or letter-digit codes). However, it comprises only 39 basic labels and 126 derived classes, defined for about 40,000 nouns only, so it is hard to apply to a new system whose nouns are not in that list.
➢ The LLOCE label system has about 2441 labels covering most of the main parts of speech, so it is acceptable as the label system of BTL. However, the LLOCE hierarchy has only 3 levels, so relations between classes remain hard to find. In addition, the LLOCE vocabulary is fairly limited (only 16,000 entries), so it must be extended before it can be applied in a practical system.
Conclusion: use the LLOCE label system with the following improvements:
• Extend the vocabulary (based on LDOCE and WordNet).
• Refine the hierarchy (based on WordNet).
• Add relation types other than the hierarchical relation (based on WordNet).
• Define a set of basic semantic labels to be used when needed; these labels should be mnemonic (as in CoreLex).
4.3.1.3 THE SEMANTIC LABEL SYSTEM IN THE BTL MODEL
Following the survey of semantic label systems above, we decided to use the semantic label system of the LLOCE dictionary, with some improvements, as the official semantic label system of the BTL model. We chose LLOCE because its number of labels is moderate (2441 labels), just enough to disambiguate most common polysemous words. In addition, LLOCE contains other grammatical information that is needed for lexical disambiguation. Its classification by kind, species, and so on also matches the way we think.
LLOCE (see Appendix 8.1) is an English semantic-class dictionary: looking up a class tells us which English words it contains, but the dictionary cannot be consulted in the reverse direction, that is, to find which classes a given word belongs to. In this thesis we treat LLOCE as the original CED (Class-to-English Dictionary). To allow reverse lookup (from an English word to its classes), we used this CED (derived from LLOCE) to build a reverse dictionary called ECD (English-to-Class Dictionary). The construction of the semantic-class dictionaries was presented in 4.1.3.
* Examples of some classes among the 2441 classes of CED:
• Class A1 (V): exist, be, create, animate, ... (existing and causing to exist)
• Class A2 (V): live, live on, exist, die, decay, decompose, survive, ... (living/dying)
• Class A150 (N): apple, apricot, peach, pineapple, pear, plum, papaya, cherry, grape, mango, dates, fig, pomegranate, ... (fruit)
• Class G148 (N): letter, character, capital (letter), lower-case (letter), ... (letters of the alphabet)
• Class G155 (N): letter, epistle, note, envelope, label, ... (correspondence, notes)
* Examples of some entries among the 16,000 entries of ECD:
• apple: A150
• apricot: A150
• exist: A1, A2, N1
• letter: G148, G155
• bank: J104 (money → "ngân hàng"), L99 (natural → "bờ sông"), ...
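Building the ECD from the CED amounts to inverting a class-to-words mapping. The following is a minimal Python sketch of that inversion, assuming the CED is held as a plain dictionary; the names `ced` and `build_ecd` are illustrative, and the entries are a handful of the sample classes listed above, not the actual LLOCE data.

```python
from collections import defaultdict

# Toy fragment of a Class-to-English Dictionary (sample entries only).
ced = {
    "A1": ["exist", "be", "create", "animate"],
    "A2": ["live", "live on", "exist", "die", "decay", "decompose", "survive"],
    "A150": ["apple", "apricot", "peach"],
    "G148": ["letter", "character"],
    "G155": ["letter", "epistle", "note"],
}

def build_ecd(ced):
    """Invert class -> words into word -> sorted list of classes."""
    ecd = defaultdict(set)
    for cls, words in ced.items():
        for word in words:
            ecd[word].add(cls)
    return {word: sorted(classes) for word, classes in ecd.items()}

ecd = build_ecd(ced)
print(ecd["letter"])  # the two classes that contain "letter"
```

A polysemous word such as "letter" ends up with several classes, which is exactly the ambiguity the later tagging steps have to resolve.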
However, the two dictionaries CED and ECD only classify English words. To obtain a comparable lexical-semantic classification for Vietnamese words, we built two further class dictionaries: CVD (Class-to-Vietnamese Dictionary) and VCD (Vietnamese-to-Class Dictionary). CVD is used to look up which Vietnamese words a class contains, and VCD to look up which classes a given Vietnamese word belongs to. Both were built from the Vietnamese translation of the LLOCE dictionary. The criteria for selecting Vietnamese entries for the microstructure of the transfer dictionary follow [14].
* Examples of some classes among the 2441 classes of CVD:
• Class A1 (V): tồn tại, tạo ra, tạo sự sống, ... (existing and causing to exist)
• Class A2 (V): sống, tiếp tục sống, tồn tại, chết, hư, thối, rữa, thối rữa, ... (living/dying)
• Class A150 (N): táo, mơ, đào, thơm, dứa, lê, mận, đu đủ, anh đào, nho, xoài, chà là, vả, lựu, ... (fruit)
• Class G148 (N): chữ cái, mẫu tự, ký tự, chữ hoa, chữ thường, ... (letters of the alphabet)
• Class G155 (N): thư, thư dài và quan trọng, thư ngắn, bản ghi chép, phong bì, bao thư, nhãn, ... (correspondence, notes)
* Examples of some entries among the 16,000 entries of VCD:
• chữ cái: G148
• mơ: A136 (apricot tree/leaf), A150 (apricot fruit), B82 (to dream)
• tồn tại: A1, A2, N1 (to exist)
• thư: G155 (correspondence), L177 (to postpone)
The improvements to LLOCE, such as enlarging the vocabulary and refining the hierarchy, were carried out on the basis of WordNet. The method is presented in section 4.1.3. In addition, to make the tagging results easier to follow, we use the basic semantic labels of CoreLex listed in Table 8.8. The basic hierarchy levels are shown in Figure 4.12:
[Figure 4.12. Basic semantic hierarchy levels in BTL (tree diagram of the basic labels; not recoverable from the extracted text).]
4.3.2 KNOWLEDGE SOURCES FOR SEMANTIC PROCESSING
Semantic processing has to combine many knowledge sources, ranging from linguistic knowledge (morphology, grammar, semantics) to extra-linguistic knowledge (knowledge of the real world) [136]. These sources usually include the following.
4.3.2.1 KNOWLEDGE OF PART OF SPEECH (POS)
When homographs have different senses under different parts of speech, and each part of speech has exactly one sense, the POS information alone determines the sense. For example, the word "can" means "có thể" (auxiliary verb), "cái hộp" (noun), or "đóng hộp" (verb). In such cases, if the POS is known exactly, the word is fully disambiguated. Example: "I/PRO can/AUX can/V a/DET can/NN" (Tôi có thể đóng hộp một cái hộp). POS determination was presented in section 4.2.1.
According to statistics on the LDOCE dictionary, as many as 88% of entries are of the kind described above. A further 7% are entries (sets of homographs) with several parts of speech, where a part of speech may have several senses but at least one part of speech has exactly one sense. In this case the word can be disambiguated whenever its POS in context is the one that has a single sense.
Even with these figures, we cannot claim that POS information alone disambiguates English words with an accuracy above 88%, for two reasons:
• The 88% figure is counted over word types, not word tokens, and in practice the words that fall outside this category are the ones that occur most often.
• The accuracy of the POS tagger is not 100% (about 96-97%).
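POS-based disambiguation as described above can be sketched in a few lines of Python. This is only an illustration: the sense inventory `SENSES` and the function name `sense_by_pos` are assumptions made for the example, and the tags are those used in the "can" sentence.

```python
# Hypothetical sense inventory: word -> {POS tag -> list of Vietnamese senses}.
SENSES = {
    "can": {"AUX": ["có thể"], "NN": ["cái hộp"], "V": ["đóng hộp"]},
    "bank": {"NN": ["ngân hàng", "bờ (sông)", "dãy"]},
}

def sense_by_pos(word, pos):
    """Return the sense if the POS tag alone determines it, else None."""
    candidates = SENSES.get(word, {}).get(pos, [])
    return candidates[0] if len(candidates) == 1 else None

tagged = [("I", "PRO"), ("can", "AUX"), ("can", "V"), ("a", "DET"), ("can", "NN")]
resolved = [(word, sense_by_pos(word, pos)) for word, pos in tagged]
```

The noun "bank" returns `None` here because one POS still has several senses, which is the case the next knowledge sources address.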
4.3.2.2 KNOWLEDGE OF SYNTACTIC RELATIONS (S-V-O-M) AND SEMANTIC CONSTRAINTS (SELECTIONAL RESTRICTIONS)
For words that have more than one sense within the same part of speech, POS information is not enough for disambiguation. For example, the word "bank" (two parts of speech: verb and noun) has, as a noun, the senses "ngân hàng" (financial bank), "bờ (sông)" (river bank), "dãy" (row), ... In this case we additionally need real-world knowledge, expressed as semantic constraints between the syntactic constituents of the sentence. Example: for the sentence "I enter an old bank", syntactic tagging (4.2) yields:
[I/PRO]NP [enter/V [an/DET old/ADJ bank/N]NP]VP
and the syntax tree shown in Figure 4.13 below:
[Figure 4.13. Syntactic relations and semantic constraints (syntax tree of "I enter an old bank" with the D-N and A-N relations marked).]
On this syntax tree we identify the syntactic relations S-V (subject - verb), V-O (verb - object), A-N (adjective - noun), and D-N (determiner - noun). Every content word in the sentence, even with its POS correctly determined, is still semantically ambiguous: the verb "enter" (đi vào / nhập), the noun "bank" (ngân hàng / bờ sông / dãy), the adjective "old" (già / cũ). We therefore have to use semantic constraints such as the following:
Table 4.1. Senses and constraints of the content words in the sentence.

Word                 Constraint / semantic label                    Constraint
I (Tôi)              Type: Person
Enter1 (đi vào)      S: Human                                       O: Closed-SPA (closed space)
Enter2 (nhập)        S: Human                                       O: Data
Bank1 (ngân hàng)    Type: Hou (building, closed space)
Bank2 (bờ sông)      Type: Nat (natural formation, open space)
Old1 (già)           N: Ani (animate)
Old2 (cũ)

[Figure 4.14. Decision tree for selecting the appropriate senses.]
By traversing the tree top-down from the root, which is the verb (enter), we finally select the appropriate senses: enter1 (đi vào), bank1 (ngân hàng), and old2 (cũ). When checking semantic constraints, we must take into account the hierarchical nature of the semantic label system (ontology), in which a child concept inherits the semantic features of its parent concept and adds features of its own. The semantic type of each content word, as well as the constraints, are defined in the LDOCE dictionary and in FrameNet [58].
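The constraint check of Table 4.1 can be sketched as a small search over sense combinations. In this Python sketch the is-a links in `PARENT` are assumptions read off the glosses in the table (for example, that Hou is a kind of Closed-SPA), and all names are illustrative rather than taken from LDOCE or FrameNet.

```python
# Assumed is-a links, read off the glosses in Table 4.1.
PARENT = {"Person": "Human", "Hou": "Closed-SPA", "Nat": "Open-SPA"}

def is_a(sem_type, target):
    """True if sem_type equals target or inherits from it."""
    while sem_type is not None:
        if sem_type == target:
            return True
        sem_type = PARENT.get(sem_type)
    return False

VERB_SENSES = {"enter1": {"S": "Human", "O": "Closed-SPA"},
               "enter2": {"S": "Human", "O": "Data"}}
NOUN_SENSES = {"bank1": "Hou", "bank2": "Nat"}
ADJ_SENSES = {"old1": "Ani", "old2": None}  # None: no restriction on the noun

def disambiguate(subject_type):
    """Return every (verb, noun, adjective) sense triple satisfying all constraints."""
    result = []
    for verb, frame in VERB_SENSES.items():
        if not is_a(subject_type, frame["S"]):
            continue
        for noun, noun_type in NOUN_SENSES.items():
            if not is_a(noun_type, frame["O"]):
                continue
            for adj, required in ADJ_SENSES.items():
                if required is None or is_a(noun_type, required):
                    result.append((verb, noun, adj))
    return result

print(disambiguate("Person"))
```

For the subject "I" (type Person) only one combination survives, matching the senses selected in the text.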
4.3.2.3 KNOWLEDGE OF COLLOCATION
The semantic constraints between syntactic constituents described above cannot always resolve every ambiguity. Some relations are latent: they rest on logic, on semantics, or even on habit, and recognizing them requires real-world knowledge that so far cannot be fully encoded in dictionaries or other machine-readable knowledge bases.
Example: what does the noun "bank" mean in the sentence "I go to the bank..."? "ngân hàng / bờ (sông) / dãy"; the noun "way" can be "đường (đi)" (path) or "cách (thức)" (manner); the noun "letter" can be "bức thư" (a piece of mail) or "chữ cái" (a character); ... If we consider only the semantic constraints above (which are not always fully available), it is hard to determine the exact sense of such ambiguous words.
To disambiguate these cases, one therefore usually looks at the form and meaning of the neighboring words, that is, at collocations. For instance: "bank ... river" → "bờ sông", "bank ... account/money" → "ngân hàng"; "way to" → "đường (đi)", "way of" → "cách thức"; "write ... letter ... to" → "bức thư", "... letter A" → "chữ cái", "... letters, digits, symbols ..." → "chữ cái", "write ... papers, letters, messages, ..." → "bức thư"; ...
Information on words that stand in such semantic relations (that is, words in the same associative field) can be found in thesauri such as Roget's or LLOCE.
The neighborhood of the word to be disambiguated may span 1, 2, or n words to the left and 1, 2, or n words to the right. For example, David Yarowsky [150] used a context window of width [-50, +50], while Mark Stevenson and Yorick Wilks [136] used 10 words neighboring the target word: the first and second words to the left and to the right, and the first noun, verb, and adjective to the left and to the right.
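A collocation lookup over a context window can be sketched as follows. This is a simplified illustration under stated assumptions: the cue table `CUES` is hypothetical and ignores the position of the cue (so "way of" and "of ... way" are not distinguished), and the window widths are arbitrary defaults.

```python
def context_window(tokens, i, left=3, right=3):
    """Words within `left`/`right` positions of tokens[i], excluding tokens[i]."""
    start = max(0, i - left)
    return tokens[start:i] + tokens[i + 1:i + 1 + right]

# Hypothetical cue table: (target word, cue word) -> sense.
CUES = {
    ("bank", "river"): "bờ sông",
    ("bank", "account"): "ngân hàng",
    ("bank", "money"): "ngân hàng",
    ("way", "of"): "cách thức",
    ("way", "to"): "đường (đi)",
}

def sense_by_collocation(tokens, i, left=3, right=3):
    """Return the sense suggested by the first matching cue in the window, else None."""
    for cue in context_window(tokens, i, left, right):
        sense = CUES.get((tokens[i], cue))
        if sense:
            return sense
    return None
```

With this toy table, "I go to the bank" stays unresolved because no cue word occurs in the window, which is the situation in which the other knowledge sources are needed.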
4.3.2.4 KNOWLEDGE OF TOPIC (SUBJECT)
In some ambiguous cases we can determine the correct sense of a word if we know the topic of the text. For example, if the text is about "finance", the word "bank" usually means "ngân hàng"; "driver" → "trình điều khiển" (if the topic is "computing"); "sentence" → "câu" (if the topic is "language / grammar") or "bản án" (if the text is about "law"); "element" → "nguyên tố" (in "chemistry") / "phần tử" (in "mathematics / computing"), ...
To determine the topic of the text being translated, we look for the occurrence of technical terms from that field. For example, if words such as "ellipsis" (tỉnh lược), "bilingual" (song ngữ), "anaphora" (thế đại từ), "phrase" (ngữ), ... occur in the text, we can guess that the text is about "linguistics"; likewise, the words "computer", "memory", "peripherals", "CPU", ... indicate "computing"; ...
For this reason, the LDOCE/LLOCE dictionaries carry subject codes for such technical terms. We can determine the topic automatically by checking which topic the technical terms near the word to be disambiguated belong to (say, within the window [-50, +50]), using Yarowsky's formula from [150]:
$$\underset{SCat}{\mathrm{ARGMAX}} \sum_{w \in W} \log \frac{\Pr(w \mid SCat)\,\Pr(SCat)}{\Pr(w)}$$
where SCat is the subject code and W is the context window containing the words w. Because the probability Pr(SCat) does not depend on w, it is dropped (in effect treating all subject codes as equally likely a priori), and the formula is written as:
$$\underset{SCat}{\mathrm{ARGMAX}} \sum_{w \in W} \log \frac{\Pr(w \mid SCat)}{\Pr(w)} \qquad (4.13)$$
These quantities can be estimated from counts over a large monolingual (English) corpus.
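Formula (4.13) can be sketched directly as a scoring function over subject codes. In this Python sketch the probability tables are made-up numbers for illustration only, the names `best_subject`, `p_w`, and `p_w_given_cat` are assumptions, and context words missing from a table are simply skipped (a real system would need smoothing).

```python
import math

def best_subject(context, p_w_given_cat, p_w):
    """argmax over subject codes of sum_w log(Pr(w|SCat) / Pr(w)), as in (4.13)."""
    scores = {}
    for cat, table in p_w_given_cat.items():
        scores[cat] = sum(
            math.log(table[w] / p_w[w])
            for w in context
            if w in table and w in p_w  # unseen words are ignored in this sketch
        )
    return max(scores, key=scores.get), scores

# Made-up probabilities, for illustration only.
p_w = {"memory": 0.01, "CPU": 0.005, "phrase": 0.01}
p_w_given_cat = {
    "COMPUTING": {"memory": 0.05, "CPU": 0.04, "phrase": 0.002},
    "LINGUISTICS": {"memory": 0.01, "CPU": 0.0005, "phrase": 0.06},
}

topic, scores = best_subject(["the", "memory", "CPU"], p_w_given_cat, p_w)
```

Each term is positive when a word is more likely under the subject code than in general text, so the sum rewards topics whose characteristic terms appear in the window.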
4.3.2.5 KNOWLEDGE OF SENSE FREQUENCY
Not every word belongs to a particular topic (in the LDOCE dictionary, more than 56% of words are of this kind), so the commonness of a sense is also measured by how frequently the word occurs with that specific sense. For example, the most common sense of the noun "pen" is "bút/viết" (writing instrument), alongside less common senses such as "chuồng" (enclosure) and "lông chim" (feather); "ball" more often means "quả banh / hòn bi" (ball, marble) than "buổi khiêu vũ" (dance), ...
The frequency of each sense of each word is counted over very large corpora covering many kinds of text. This is why, in WordNet and LDOCE, senses are listed in order of decreasing frequency (the most common sense is listed first).
4.3.2.6 KNOWLEDGE IN THE DEFINITION OF EACH SENSE (DEFINITION)
In the LDOCE and WordNet dictionaries, each sense comes with a definition and accompanying examples. For example, the word "bank" in LDOCE has senses with definitions such as:
- "land along the side of a river, lake, etc." (đất dọc bên sông / hồ)
- "a place where money is kept and paid ..." (nơi giữ tiền và trả tiền ...)
- "a row, a line of ..." (một hàng, một dãy ...)
By comparing the information in these definitions with the information in the context, we can determine which sense fits the context. To do this, Wilks et al. [147] computed the overlap between all combinations of senses of the content words in the sentence, using the English sentences that define each sense.
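The definition-overlap idea can be sketched with simple set intersection. This Python sketch is a deliberately reduced variant: it scores the senses of a single target word against the context, rather than all sense combinations of all content words as in [147]. The stop-word list `STOP` and the sense identifiers are assumptions made for the example; the definitions are the three quoted above.

```python
STOP = {"a", "the", "of", "and", "is", "etc", "where", "in", "on", "to"}

def content_words(text):
    """Lower-cased words of `text` with punctuation stripped and stop words removed."""
    return {w.strip('.,"').lower() for w in text.split()} - STOP

DEFINITIONS = {
    "bank/river": "land along the side of a river, lake, etc.",
    "bank/money": "a place where money is kept and paid",
    "bank/row": "a row, a line of",
}

def best_sense_by_overlap(context):
    """Pick the sense whose definition shares the most words with the context."""
    ctx = content_words(context)
    overlaps = {s: len(ctx & content_words(d)) for s, d in DEFINITIONS.items()}
    return max(overlaps, key=overlaps.get)
```

Because the score is just a count of shared words, ties and short definitions are common in practice, which is one reason this source is combined with the others rather than used alone.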
4.3.3 SEMANTIC TAGGING OF ENGLISH WORDS
With the semantic label system presented above, the word sense disambiguation problem reduces to a semantic tagging problem. That is, the sense of a polysemous word is determined as soon as its semantic label is known. For example, the noun "bank" means "ngân hàng" if it is tagged "HOU" and "bờ (sông)" if it is tagged "NAT"; the same holds for "letter", "line", ...
Semantic tagging models that follow the knowledge-based approaches above use label sets of different granularity. With a finer label set (hundreds of thousands of labels, as in WordNet), tagging accuracy is lower but the ability to disambiguate is higher (no two different senses share a label). Conversely, with a coarser label set (only 36 labels, as in LDOCE), accuracy is higher but the ability to disambiguate is naturally lower (many different senses share a label).
Semantic tagging also differs in scale: it is applied either to a small number of typical words (for example, Hwee Ng and Hian Lee [89] for the single word "interest", David Yarowsky [151] for 12 words, ...) or to most content words (for example, Mark Stevenson and Yorick Wilks [136], Mona Diab and Philip Resnik [66]).
Which knowledge source to use in each situation is decided by the system through supervised learning on a corpus that has been accurately annotated with semantic labels (the training corpus, also called the gold corpus). The learning algorithm can be a neural network, a decision tree, MBL, TBL, ...; among these, the symbolic learning algorithms turn out to be more accurate [112].
In supervised learning for sense selection, as described above, the training corpus is a major concern: such corpora are scarce (there is only SEMCOR, containing about 250,000 words extracted from the Brown corpus and tagged with WordNet labels) and very costly to build. For this reason, current word sense disambiguators mostly handle only a small number of words. Disambiguating all words (tens of thousands of polysemous words, each of which must occur several hundred times) would require a training corpus of several million words or more. If the number of labels is large (as in WordNet), the corpus must be larger still (billions of words).
To overcome the training-data problem, some authors, for example in [151], [66], [62], have switched to unannotated corpora and unsupervised learning, but the results are naturally lower than with supervised learning.
The most recent trend is to use bilingual corpora (accurately translated by humans) to build training data for WSD. This idea was first proposed by Brown and colleagues in [51]. According to them: "A polysemous word in the source language is usually translated into different words in the target language... This is all the more true for language pairs of different types". For example, the word "bank" has many senses, but in a given context of an accurate human translation it is translated either as "ngân hàng" or as "bờ sông", ..., and this translation is the "correct semantic label" of "bank" in the English corpus. An obstacle to this proposal is the difficulty of building such 1-1 aligned bilingual corpora. Recently (2000-2002), however, as such bilingual corpora have become available, many works around the world have adopted this approach, for example [66], [101], [90], ...
4.3.3.1 SEMANTIC TAGGING OF WORDS IN THE BTL MODEL
In the BTL model we chose the LLOCE label system with some additions, as presented in section 4.3.1.3. With this label system, the WSD problem in BTL reduces to semantic tagging. Indeed, the sense of the word "letter" is determined as soon as its semantic label is known: if it is tagged G155 it means "thư" (mail); G148 → "chữ cái" (character); the same holds for "bank", "way", "line", ... Because the BTL label system has only about 2500 labels, a corpus of about 5,000,000 words does not suffer from data sparseness.
In the BTL model, tagging is performed on all content words, comprising nouns, verbs, adjectives, and pronouns. For function words, we use contextual information to select the appropriate sense. For example, the preposition "in" is translated as "trong" if the following noun denotes space (label SPA), and as "bằng" if that noun belongs to text, language, ... (TXT, LING); ...
In our BTL model, we overcame the training-data obstacle by building the training corpus ourselves from a word-aligned bilingual corpus [67]. Building it ourselves brings many difficulties (finding bilingual text that has been accurately translated by humans) but also considerable benefits (full control over the corpus, and a corpus that matches the domain to be translated). We were fortunate to inherit an electronic English-Vietnamese bilingual corpus named EVC from [13]. EVC contains 5,000,000 words and is specialized in science and technology (computing, telecommunications). The word alignment of EVC was presented in detail in section 4.1.1. The semantic tagging of the training corpus is presented in the next section.
4.3.3.2 SEMANTIC TAGGING OF THE EVC TRAINING CORPUS
The EVC bilingual corpus consists of about 250,000 sentence pairs such as the following:
(E): Jet planes fly about nine miles high.
(V): Các máy bay phản lực bay cao khoảng chín dặm.
Consider the words "plane" (nouns: "máy bay", "mặt phẳng"; verbs: "bay, lượn", "đi máy bay", "bào"; adjective: "bằng phẳng") and "fly" (verbs: "bay", "thả"; noun: "con ruồi", ...) in the English sentence above. Both are ambiguous in part of speech and in sense, but once the word alignment (section 4.1.1) and the parts of speech (section 4.2.1) have been determined correctly, we obtain:
[Jet/NN planes/NNP]NP [fly/V]VP [about/RB miles/NNP high/RB]MP
[Các máy bay phản lực]NP [bay]VP [cao khoảng chín dặm]MP
Here "plane" is a noun and "fly" is a verb; "plane" is aligned with "máy bay" and "fly" with "bay". From this information we determine the semantic labels of the noun "plane" and the verb "fly" as follows.
Let $e_i$ be an English word in the English sentence e, and let $v_j$ be the Vietnamese word in the Vietnamese sentence v that is aligned with $e_i$. Because $e_i$ and $v_j$ may both be polysemous, each may belong to several semantic classes (still within the same part of speech). Let $x_i$ be one of the semantic classes X that contain $e_i$, and $y_j$ one of the semantic classes Y that contain $v_j$. Because $e_i$ and $v_j$ are translations of each other, the class $x_i$ must coincide with, or be closest in meaning to, the class $y_j$. The correct semantic labels of $e_i$ and $v_j$ are therefore the classes $x_i$ and $y_j$ determined by:
$$\underset{x_i \in X,\; y_j \in Y}{\mathrm{ARGMAX}}\; ClassSim(x_i, y_j) \qquad (4.14)$$
where $ClassSim(x_i, y_j)$ is computed by formula (4.3).
Example: for the noun "plane", we find that "plane" belongs to the classes X = {J41, M180} and "máy bay" belongs to the classes Y = {M180}, so evaluating the pairs (X, Y) gives (M180, M180). The correct semantic label of "plane" in this case is therefore M180 (aircraft). Similarly, for the verb "fly" and "bay" we have X = {M19, M28, M29, M30, M35} and Y = {M19, M28, M35}. In this case up to 3 labels could reach the maximum in (4.14), namely M19, M28, and M35, but the similarity computation (formula 4.3) leaves only M19 as the label of "fly". Details of the construction of the training corpus were published in [67].
Using formula (4.14), we simultaneously obtain the correct semantic label of the Vietnamese word in the EVC bilingual corpus. This semantically tagged Vietnamese corpus helps greatly in determining (by statistics) the style, structure, and word order of Vietnamese sentences, and so contributes to producing Vietnamese output that is closer to the natural style and structure used by Vietnamese speakers. In addition, the annotated Vietnamese corpus is a valuable resource for English-Vietnamese contrastive linguists who compare the similarities and differences between the two languages at various levels and from various perspectives. Such studies in turn help in selecting the factors that influence English-to-Vietnamese transfer (expressed in the rule templates of the KFTBL learning algorithm) in the BTL model.
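The label selection of (4.14) can be sketched as a search over class pairs. Formula (4.3) for ClassSim is not reproduced in this section, so the function `class_sim` below is only a stand-in assumed for illustration (identical classes score 1.0, classes sharing a top-level letter score 0.5); ties are broken by input order here, whereas the thesis resolves them through the actual similarity computation.

```python
def class_sim(x, y):
    """Stand-in for ClassSim of formula (4.3), NOT the thesis formula:
    1.0 for identical classes, 0.5 for the same top-level letter, else 0.0."""
    if x == y:
        return 1.0
    return 0.5 if x[0] == y[0] else 0.0

def best_class_pair(X, Y):
    """argmax over x in X, y in Y of class_sim(x, y), as in (4.14).
    max() returns the first maximal pair, so ties follow input order."""
    pairs = [(x, y) for x in X for y in Y]
    return max(pairs, key=lambda pair: class_sim(*pair))

plane = best_class_pair(["J41", "M180"], ["M180"])
fly = best_class_pair(["M19", "M28", "M29", "M30", "M35"], ["M19", "M28", "M35"])
```

With the class sets from the "plane" example, the only pair with maximal similarity is (M180, M180), as in the text.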
4.3.3.3 LEARNING ALGORITHM FOR SEMANTIC TAGGING OF WORDS IN BTL
The learning algorithm used in the BTL model is the supervised algorithm KFTBL (presented in detail in section 3.3.4.6), together with the improved LLOCE label system and the EVC training corpus annotated with syntactic and semantic labels.
Semantic tagging of English words is a typical instance of linguistic tagging with TBL. Compared with other linguistic tagging problems, it requires an input corpus that has already been accurately tagged with parts of speech and syntax.
[Figure 4.15. Training model of the English semantic tagger in BTL. Components shown in the diagram: the EVC training corpus tagged with POS, syntax, and semantics; the corpus carrying syntactic labels only; the dictionary knowledge sources on semantics; the baseline semantic tagger; the currently tagged corpus; the candidate rules; the corpus tagged by the candidate rules; the optimal rules.]
The sequence of transformation rules that is learned is then used to tag English words in new text within the EVT translation system. This text should match the style and domain of the EVC training corpus. The problem involves the following issues.
Baseline tagging:
Baseline tagging is based on the frequency with which label c occurs for words w with part of speech p in the EVC training corpus: P(c | w, p). The part of speech p is the correct POS of w as determined by the BTL POS tagger. For new words (absent from the training corpus), the label frequency is taken from the WordNet knowledge source (taking POS and topic into account, of course). Baseline tagging as described thus uses the knowledge of POS, of topic, and of sense frequency presented in detail in section 4.3.2.
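A frequency-based baseline of this kind can be sketched as follows. This Python sketch assumes a training corpus given as (word, POS, label) triples; the function names and the toy data are illustrative, and the `fallback` argument merely stands in for the WordNet-based lookup that the thesis uses for unseen words.

```python
from collections import Counter, defaultdict

def train_baseline(tagged_corpus):
    """Map each (word, pos) to its most frequent label, i.e. argmax_c P(c | w, p)."""
    counts = defaultdict(Counter)
    for word, pos, label in tagged_corpus:
        counts[(word, pos)][label] += 1
    return {key: ctr.most_common(1)[0][0] for key, ctr in counts.items()}

def tag_baseline(model, word, pos, fallback=None):
    """Look up the baseline label; `fallback` stands in for an external source."""
    return model.get((word, pos), fallback)

# Toy training data, for illustration only.
corpus = [("bank", "NN", "HOU")] * 3 + [("bank", "NN", "NAT"), ("letter", "NN", "G155")]
model = train_baseline(corpus)
```

The baseline ignores context entirely, which is why the transformation rules that follow are needed to correct its output.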
Rule templates:
The rule templates for this problem use features drawn from the knowledge sources relevant to sense disambiguation (other than those already used in the baseline tagging above), specifically:
• Using knowledge of syntactic relations and semantic constraints:
$$\bigwedge_{\exists i \in [-m,+n]} \big( (Syn_i \in Tag_i) \wedge Word_i \wedge (Word_i \in Tag_i) \wedge POS_i \big) \Rightarrow (Word_k / Syn_k \leftarrow Tag_k)$$
Syn is a variable for the syntactic role {Sub, Obj, Noun, Adj, Verb, ...}; Tag is a variable for the semantic label {HUM, ANI, NAT, M1, ...}; Word is a variable for the word form (such as "of", "account", ...); POS is a variable for the part of speech {NN, VB, DT, IN, ...}.
• Using knowledge of collocations:
$$\bigwedge_{\exists i \in [-m,+n]} Word_i,\quad \bigwedge_{\exists j \in [-m,+n]} (Word_j \wedge POS_j) \quad \text{or} \quad \bigwedge_{\exists i \in [-m,+n]} Tag_i \;\Rightarrow\; (Word_k / Syn_k \leftarrow Tag_k)$$
Optimal rule sequence:
After training, the system produces a sequence of optimal rules of the following form:
1. $((SUB_0 \in HUM) \wedge (Word_0 = \text{"enter"}) \wedge (POS_0 = VB) \wedge (Word_0 \in MOV)) \Rightarrow OBJ_0 \leftarrow HOU$
Meaning: if the SUBJECT (tagged HUMAN) belongs to the VERB "enter" carrying the label MOVEMENT, then the OBJECT of that verb is tagged BUILDING. Example: in "I enter the bank", the object "bank" is tagged BUILDING (sense "ngân hàng").
2. $((Word_0 = \text{"way"}) \wedge (Word_{+1} = \text{"of"})) \Rightarrow tag_0 \leftarrow MANN$
Meaning: the word "way" is tagged MANNER if it is immediately followed by the word "of".
3. $((\exists i \in [+1,+3] \mid Word_i = \text{"river"}) \wedge (Word_0 = \text{"bank"}) \wedge (POS_0 = NN)) \Rightarrow tag_0 \leftarrow NAT$
Meaning: the noun "bank" is tagged NATURAL FORMATION (river bank) if the word "river" follows it.
4. $(Word_0 = \text{"in"}) \wedge (POS_0 = IN) \wedge (NOUN_0 \in LING) \Rightarrow tag_0 \leftarrow \text{"bằng"}$
Meaning: the preposition "in" is assigned the sense "bằng" if the head noun of the prepositional phrase "in NP" belongs to the language class. Example: "in English".
5. $(Word_{-1} \in SPORT) \wedge (Word_0 = \text{"match"}) \wedge (POS_0 = NN) \Rightarrow tag_0 \leftarrow SHOW$
Meaning: the noun "match" is tagged PERFORMANCE (sense "trận đấu") if it is immediately preceded by a noun belonging to SPORT. Example: "football match".
Note: the semantic label assigned to a word by these rules must be among the labels that the word can take (given its determined part of speech). The tagging result is evaluated by comparison with the gold corpus, as in TBL (3.3.1.5).
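Applying such a learned rule sequence can be sketched as follows. This Python sketch encodes only rules 2 and 3 above as functions; the token representation (dictionaries with `word`, `pos`, `tag`), the `allowed` table, and the function names are assumptions made for the example, not the KFTBL implementation.

```python
def rule_way_of(sent, i):
    """Rule 2: "way" immediately followed by "of" -> MANN."""
    if sent[i]["word"] == "way" and len(sent) > i + 1 and sent[i + 1]["word"] == "of":
        return "MANN"
    return None

def rule_bank_river(sent, i):
    """Rule 3: noun "bank" with "river" among the next three words -> NAT."""
    if sent[i]["word"] == "bank" and sent[i]["pos"] == "NN":
        if any(tok["word"] == "river" for tok in sent[i + 1:i + 4]):
            return "NAT"
    return None

RULES = [rule_way_of, rule_bank_river]  # applied in the learned order

def apply_rules(sent, allowed):
    """Apply each rule in order; a label is assigned only if the word may take it."""
    for rule in RULES:
        for i, tok in enumerate(sent):
            new_tag = rule(sent, i)
            if new_tag and new_tag in allowed.get(tok["word"], ()):
                tok["tag"] = new_tag
    return sent

sent = [
    {"word": "the", "pos": "DT", "tag": None},
    {"word": "bank", "pos": "NN", "tag": "HOU"},   # baseline label
    {"word": "of", "pos": "IN", "tag": None},
    {"word": "the", "pos": "DT", "tag": None},
    {"word": "river", "pos": "NN", "tag": "NAT"},
]
apply_rules(sent, allowed={"bank": {"HOU", "NAT"}, "way": {"MANN"}})
```

The `allowed` check mirrors the note above: a rule can only move a word to a label that the word can actually carry.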
4.4 SYNTAX TREE TRANSFER IN BTL
In a translation system that follows the syntactic-transfer-based strategy, syntax tree transfer plays a very important role and directly affects translation quality. Syntax tree transfer converts a syntax tree of the source language (here English) into a syntax tree of the target language so that it conforms to the grammar of the target language (Vietnamese). For example, after the English sentence "I enter an old bank" has passed through morphological, syntactic, and semantic analysis, we obtain:
Table 4.2: Linguistic analysis of the English sentence.

Morphology   I       enter   a       old     bank
POS          PRO     V       DET     ADJ     N
Syntax       [I]NP [enter [a old bank]NP]VP
Role         Agent   Act     Qty     Time    Location
Semantics    HUM     MOV     INDEF   TME     HOU
Alignment    1       2       3       5       4

Based on this linguistic information, we transfer the syntax tree from English to Vietnamese as shown in Figure 4.16 below:
[Figure 4.16: Transfer of the source-language syntax tree into the target-language tree. The English NP "det adj noun" (an old bank) becomes the Vietnamese NP "det noun adj", giving "Tôi đi vào một ngân hàng cũ".]
4.4.1 APPROACHES TO SYNTAX TREE TRANSFER
Many different approaches have been proposed for this problem, such as: the rule-based approach, transfer through an internal representation, parallel parsing, word-order statistics, learning from corpora, ... We briefly review these approaches below.
4.4.1.1 RULE-BASED APPROACH
This is the earliest approach (dating from the 1960s-70s). It uses fixed transfer rules that are devised by humans and attached to the production rules of the grammar used to parse the source sentence. For example, to swap the positions of adjective and noun between English and Vietnamese, we use the following transfer rule [6]: "(E): NP → Det Adj Noun ⇒ (V): NP → Det Noun Adj". That is, if rule E is used in the source syntax tree when parsing the English sentence, then the corresponding subtree is rewritten according to rule V when the tree is transferred to Vietnamese. With this kind of mapping we can also handle the "insertion" or "deletion" of nodes when transferring a tree from the source language to the target language. However, this method can only move, insert, or delete constituents within a single production rule (such as R1 to R2 in Figure 4.17). It cannot move a node to a different level or to a different parent (such as R1 to R3):
[Figure 4.17: Syntax tree transfer capabilities of rules (trees R1, R2, and R3 over the nodes A, B, C, B1, B2, C1, C2).]
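The rule "(E): NP → Det Adj Noun ⇒ (V): NP → Det Noun Adj" can be sketched as a table of child permutations applied recursively to a tree. The tree encoding (pairs of a label and either a word or a list of children), the node labels, and the names `TRANSFER`, `transfer`, and `leaves` are assumptions made for this Python sketch; it reorders constituents only and does not translate the words.

```python
# Transfer rules: (parent label, source child labels) -> child order on the target side.
TRANSFER = {("NP", ("Det", "Adj", "Noun")): (0, 2, 1)}

def transfer(tree):
    """Recursively reorder the children of every node whose production matches a rule."""
    label, children = tree
    if isinstance(children, str):            # leaf: (POS, word)
        return tree
    kids = [transfer(child) for child in children]
    order = TRANSFER.get((label, tuple(kid[0] for kid in kids)))
    if order:
        kids = [kids[j] for j in order]
    return (label, kids)

def leaves(tree):
    """Left-to-right list of the words at the leaves."""
    label, children = tree
    if isinstance(children, str):
        return [children]
    return [word for child in children for word in leaves(child)]

tree = ("S", [
    ("NP", [("Pro", "I")]),
    ("VP", [("Verb", "enter"),
            ("NP", [("Det", "an"), ("Adj", "old"), ("Noun", "bank")])]),
])
reordered = leaves(transfer(tree))
```

Because each rule only permutes the children of one production, the sketch also exhibits the limitation noted above: a node can never move to a different parent or level.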
4.4.1.2 APPROACH USING PARALLEL GRAMMAR (ITG)
This approach is based on parallel parsing of the English-Vietnamese sentence pair, following Dekai Wu's work [148] on English-Chinese. Parallel parsing uses an Inversion Transduction Grammar (ITG). The grammar is written $G = (N, W_1, W_2, R, S)$, where:
- $N$: a finite set of nonterminal symbols,
- $W_1$: a finite set of terminal symbols (words) of language 1,
- $W_2$: a finite set of terminal symbols (words) of language 2,
- $R$: a finite set of production rules,
- $S \in N$: the start symbol.
The space of terminal pairs $X = (W_1 \cup \{\epsilon\}) \times (W_2 \cup \{\epsilon\})$ then contains the lexical translation pairs, written $x/y$, and the singletons, written $x/\epsilon$ or $\epsilon/y$, where $x \in W_1$ and $y \in W_2$. Each production has one of two forms: straight order, $A \to [a_1\, a_2 \ldots a_r]$, or inverted order, $A \to \langle a_1\, a_2 \ldots a_r \rangle$, with $a_i \in N \cup X$ and $r$ the rank of the production. The set of transductions generated by $G$ is denoted $T(G)$. We applied this grammar to English-Vietnamese syntax tree transfer on the EVC bilingual corpus (details are presented in [17] and [28]), with results of the following form:
[[[If/Nếu] [a/một ...]NP [controls/điều khiển ... NP]VP]SBAR [,/,]
[you/bạn]NP [should/nên [ask/yêu cầu [the/ε administrator/người quản trị]NP]VP]VP
[to/ε [install/cài đặt [the/ε ⟨proper/đúng đắn [hardware/phần cứng and/và software/phần mềm]⟩]NP]VP]VP [./.]]
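To make the straight and inverted orientations concrete, here is a minimal sketch that reads off both sentences from an ITG-style derivation. The encoding ("pair", "straight", "inverted") and the function name yields are assumptions for illustration only, not the representation used in [17] or [28]; the fragment is taken from the example bracketing.

```python
# Hypothetical encoding of an ITG derivation:
#   ("pair", english, vietnamese)   a terminal pair x/y; None stands for epsilon
#   ("straight", [children])        rule A -> [a1 ... ar]: same order in both languages
#   ("inverted", [children])        rule A -> <a1 ... ar> inverted: reversed on the Vietnamese side
def yields(node):
    """Return (english_words, vietnamese_words) generated by a derivation."""
    if node[0] == "pair":
        _, e, v = node
        return ([e] if e else []), ([v] if v else [])
    orientation, children = node
    parts = [yields(child) for child in children]
    english = [w for e_words, _ in parts for w in e_words]
    viet_parts = [v_words for _, v_words in parts]
    if orientation == "inverted":
        viet_parts = viet_parts[::-1]      # inversion affects only the second language
    vietnamese = [w for v_words in viet_parts for w in v_words]
    return english, vietnamese

# Fragment of the example: the/epsilon followed by an inverted pair of constituents.
fragment = ("straight", [
    ("pair", "the", None),
    ("inverted", [("pair", "proper", "đúng đắn"), ("pair", "software", "phần mềm")]),
])
print(yields(fragment))
```

The English side always keeps the left-to-right order of the rule; only the Vietnamese side changes under an inverted rule, which is how a single derivation encodes the reordering.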
4.4.1.3 STATISTICAL MACHINE TRANSLATION APPROACH (SMT)
The ITG parallel-grammar approach above still depends on a rule-based parser, so its results are not high compared with current approaches that learn from bilingual corpora. One of these newer approaches collects statistics over a word-aligned bilingual corpus in order to estimate the transposition probabilities of words in the target language relative to the source language. This is the approach of statistical machine translation, SMT (introduced in 2.2.2). Under this approach, given an English sentence e to be translated into a Vietnamese sentence v, translation amounts to finding the sentence
\[ \hat{v} = \arg\max_{v} P(v) \cdot P(e \mid v) \]
Here we need the language-model probability P(v) and the translation-model probability P(e | v). We compute P(v) in order to find a grammatical Vietnamese sentence (which of course must have the correct word order). To compute this probability we used the following formulas [19]:
\[ P(v) = P(v_1 v_2 \dots v_k) = P(v_1) \cdot P(v_2 \mid v_1) \cdot P(v_3 \mid v_1 v_2) \cdots P(v_k \mid v_1 v_2 \dots v_{k-1}) \tag{4.15} \]
If we let |E| be the size of the vocabulary in the English dictionary, then the number of pairwise distinct cases of e_1 e_2 ... e_{k-1} for the k-th conditional probability is |E|^k (the same count holds for the Vietnamese vocabulary used in (4.15)). This is a very large number, so we cannot compute a term P(v_k | v_1 v_2 ... v_{k-1}) exactly. For that reason we use a trigram model to approximate each term in (4.15) as follows:
\[ P(v_k \mid v_1 v_2 \dots v_{k-1}) \approx P(v_k \mid v_{k-2} v_{k-1}) \tag{4.16} \]
That is, instead of conditioning on all k-1 words v_1 v_2 ... v_{k-1} that precede the word v_k, we condition only on the 2 preceding words. Each triple (v_{k-2} v_{k-1} v_k) is called a trigram. From (4.16) we can therefore rewrite (4.15) as:
\[ P(v) = P(v_1 v_2 \dots v_k) = P(v_1) \cdot P(v_2 \mid v_1) \cdot \prod_{i=3}^{k} P(v_i \mid v_{i-2} v_{i-1}) \tag{4.17} \]
We see that there can be |E|^3 distinct trigrams, and some trigrams never occur in our training corpus. In that case some P(v_k | v_{k-2} v_{k-1}) = 0, which leads to P(v) = 0. For that reason we estimate P(v_k | v_{k-2} v_{k-1}) by an approximation (smoothing) as follows:
\[ P(v_k \mid v_{k-2} v_{k-1}) = \theta_1 \cdot T(v_k \mid v_{k-2} v_{k-1}) + \theta_2 \cdot B(v_k \mid v_{k-1}) + \theta_3 \cdot U(v_k) + \theta_4 \tag{4.18} \]
where θ1, θ2, θ3, θ4 are coefficients (constants) with θ1 « θ2 « θ3 « θ4, and
T: trigram probability (three consecutive words),
B: bigram probability (two consecutive words),
U: unigram probability (a single word), equal to 1/|E|.
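The smoothing in (4.18) can be sketched as below. This is a minimal illustration under stated assumptions: T and B are taken to be relative-frequency estimates, |E| is taken to be the vocabulary observed in the training sentences, and the θ values must be supplied by the caller because the text does not list them. The function names and the two toy sentences are hypothetical.

```python
from collections import Counter

def train_counts(sentences):
    """Collect unigram, bigram and trigram counts from tokenised sentences."""
    uni, bi, tri = Counter(), Counter(), Counter()
    for s in sentences:
        uni.update(s)
        bi.update(zip(s, s[1:]))
        tri.update(zip(s, s[1:], s[2:]))
    return uni, bi, tri

def smoothed_trigram(w, u, v, counts, thetas):
    """P(w | u v) with the shape of (4.18); thetas = (theta1, theta2, theta3, theta4)."""
    uni, bi, tri = counts
    t1, t2, t3, t4 = thetas
    # Assumption: relative-frequency estimates for the trigram and bigram terms.
    T = tri[(u, v, w)] / bi[(u, v)] if bi[(u, v)] else 0.0
    B = bi[(v, w)] / uni[v] if uni[v] else 0.0
    # The text states U = 1/|E|; here |E| is assumed to be the observed vocabulary size.
    U = 1.0 / len(uni) if uni else 0.0
    return t1 * T + t2 * B + t3 * U + t4

counts = train_counts([["tôi", "đọc", "sách"], ["tôi", "đọc", "báo"]])
weights = (0.25, 0.25, 0.25, 0.25)   # placeholder values, not taken from the thesis
print(smoothed_trigram("sách", "tôi", "đọc", counts, weights))
print(smoothed_trigram("tôi", "sách", "báo", counts, weights))  # unseen trigram, still non-zero
```

The second call shows the point of smoothing: a trigram that never occurs in training still receives a non-zero value, so the product in (4.17) does not collapse to zero.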
We computed formula (4.18) on the Vietnamese side of the English-Vietnamese bilingual corpus EVC to determine the most suitable word order for Vietnamese. However, the results were not as good as expected (only about 60%, worse than the first 2 approaches). This shows that considering only the surface, as this SMT approach does, gives low effectiveness. This is why recent work computes the probability P(v) over syntax-tree structures instead [149].
Although the results are still limited, this method clearly has a methodological advantage: it is fully automatic, and we do not need to build an initial set of transfer rules as in the first 2 approaches (fixed rules and parallel grammar). Moreover, there are position changes that we have not yet been able to generalise into rules (perhaps because they stem from habit or from some other factor). We therefore still use the word-order result of this approach as one parameter in the function that decides the word order of the Vietnamese sentence.
4.4.1.4 CORPUS-BASED APPROACH
This approach uses machine learning over a bilingual corpus that has been aligned (at word and phrase level) and parsed, in order to extract tree-transfer rules between the 2 languages automatically. Within this trend several methods have been used, typically: synchronous TAG, abbreviated STAG (Synchronous Tree-Adjoining Grammar) [131], a transfer method based on the elementary trees of the TAG grammar [92]; the dominance-preserving algorithm with the LCA (Least Common Ancestor) constraint [108]; and others.
With this kind of corpus-based learning, the set of transfer rules (word order, insertion/deletion, ...) is extracted automatically from real data. The more complete and comprehensive the training corpus, the more effective and accurate the extracted rules. However, such a corpus is very hard to obtain, so not every system can use this approach. In addition, some transfers are obligatory because of typological properties. These should be encoded as fixed rules (core phenomena), and the system should be left to learn only the secondary transfer rules (marginal phenomena), in order to improve its effectiveness.
4.4.1.5 INTERMEDIATE-REPRESENTATION APPROACH
Instead of transferring directly as in the 4 approaches above, one can also transfer through an intermediate representation, for example a "case frame" representation [144] of the sentence (type, mood, verb, roles) and of the phrases, or a QLF (Quasi-Logical Form) representation. This approach works well on short, well-formed sentences, but on arbitrary or long sentences the system cannot reduce them to these internal representations.
4.4.2 KNOWLEDGE SOURCES FOR SYNTAX-TREE TRANSFER
Structurally, syntax-tree transfer is the transfer of the deep structure of the sentence. On the surface, it is the problem of changing the word order in the sentence. The notion of word order must be understood broadly as the order of words, of phrases, and of the constituents of the sentence [33]. Because Vietnamese is an isolating language, word order is one of its two important grammatical devices (alongside function words).
Although English and Vietnamese share the S-V-O word-order type (the second most common type after S-O-V [31]), word order in English and Vietnamese still differs in general [36], especially for modifiers within the noun phrase [9]. It is therefore essential to examine the factors that affect word order. These factors are the knowledge sources used to make syntax-tree transfer from English to Vietnamese more accurate.
4.4.2.1 LANGUAGE-TYPOLOGY FACTORS
Greenberg studied the languages of the world and observed that language type and word-order type strongly affect the order and properties of the constituents of a sentence. For example:
- Order of adjective and noun (English: A-N, Vietnamese: N-A)
- Position of the head noun: English: last, Vietnamese: first
- Position of preposition and noun: English and Vietnamese: P-N
- Position of possessor and possessed: John's book → sách của John.
- "I met him yesterday" → "Hôm qua, tôi (đã) gặp anh ấy" (the word "đã" may be dropped)
4.4.2.2 FORM FACTORS
Consider, for example, the order of the direct object (DO) and the indirect object (IO) in Vietnamese:
- "Đút tay vào túi" (put one's hand into the pocket) (+); "Đút vào túi tay" (-);
- "Đút bàn tay đầy dầu mỡ vào túi" (put the greasy hand into the pocket) (+); "Đút vào túi bàn tay đầy dầu mỡ" (+).
That is, the weight (length) of the DO affects its position.
4.4.2.3 SYNTACTIC FACTORS
Vietnamese usually uses the active voice, whereas English prefers the passive voice. This is especially visible in scientific and technical writing. Example: "when the program is activated" → "khi kích hoạt chương trình".
English mainly uses nominalization, whereas Vietnamese prefers verbalization. This is especially visible in scientific writing. Example: "machine translation" → "dịch máy".
4.4.2.4 SEMANTIC FACTORS
A noun that denotes a generic kind or thing is transferred (in order, insertion, and deletion) differently from a noun that denotes a specific one. Examples:
- "Lên ngựa / xuống ngựa" (mount / dismount a horse) (+); "lên ngựa ấy" (-); "lên lưng con ngựa ấy" (get on that horse's back) (+);
- "nhạc vang lên" (music rings out) (+); "vang lên nhạc" (-); "vang lên tiếng nhạc" (+).
4.4.2.5 OTHER FACTORS
There are also pragmatic factors (topic-comment structure), logical and psychological factors, cultural factors, the wish to emphasise or give weight to something, rhetorical devices, and so on.
4.4.3 THE BTL APPROACH TO SYNTAX-TREE TRANSFER
After examining the different approaches and the factors that affect syntax-tree transfer (or word order), we observe that:
- Using only fixed, hand-written rules (as in the rule-based approach) handles the core phenomena but not the marginal phenomena. There are also transfers that we have not yet discovered or have not yet been able to express as rules.
- Relying entirely on a learning approach (statistical or machine learning) over a corpus that is not yet complete does not guarantee high accuracy, and training also takes a long time.
We therefore chose a hybrid approach: fixed rules based on the "language-typology factors" (4.4.2.1), combined with machine learning over a parsed and tree-aligned corpus for the remaining factors. With this approach we transfer the core phenomena accurately and also handle the marginal phenomena for which we have no rules. As the training corpus becomes more complete, the marginal phenomena are handled better and better, and in the end the quality of syntax-tree transfer improves.
4.4.3.1 SYNTAX-TREE TRANSFER WITH THE KFTBL ALGORITHM
To combine the 2 approaches described above, we use the KFTBL learning algorithm, which can take over the output of another system or approach and go on correcting errors in that output to produce a better result (see the properties of the TBL algorithm in section 3.3.1.4). That is, we use the fixed rules to assign baseline labels, and the system then continues to improve on them.
Here, the problem of transferring an English syntax tree to a Vietnamese syntax tree is reduced to a labelling problem: assigning to each syntactic constituent of the Vietnamese sentence a position label relative to its position in the original English sentence.
[Figure: block diagram with the boxes: EVC training corpus labelled with part of speech, syntax, semantics and transfer; corpus without transfer labels; knowledge sources on word-order transfer; baseline labeller; rule templates; currently labelled corpus; candidate rules; corpus labelled by the candidate rules; optimal rules]
Figure 4.18: Training model for assigning syntax-tree transfer labels in BTL.
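The training loop summarised in Figure 4.18 follows the general shape of transformation-based learning: start from baseline labels, try candidate rules, keep the rule that corrects the most labels, and repeat. The sketch below shows that generic greedy loop only; it is not the KFTBL algorithm itself, and the function tbl_train, the toy samples, and the two candidate rules are assumptions made for illustration.

```python
def tbl_train(samples, gold, baseline, candidate_rules, min_gain=1):
    """Generic greedy TBL loop; returns the ordered list of learned rules."""
    current = [baseline(s) for s in samples]          # baseline labelling
    learned = []
    while True:
        best_rule, best_gain = None, 0
        for rule in candidate_rules:
            # gain = labels corrected minus labels broken by this rule
            gain = sum((rule(s, cur) == g) - (cur == g)
                       for s, cur, g in zip(samples, current, gold))
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None or best_gain < min_gain:
            break                                      # no rule improves the corpus
        current = [best_rule(s, cur) for s, cur in zip(samples, current)]
        learned.append(best_rule)
    return learned

# Toy data: each sample is the tag sequence of a two-child NP; the label is its orientation.
samples = [("JJ", "NN"), ("DT", "NN"), ("JJ", "NN")]
gold = ["inverted", "straight", "inverted"]

def baseline(sample):
    return "straight"

def adj_first(sample, current):
    return "inverted" if sample[0] == "JJ" else current

def det_first(sample, current):
    return "inverted" if sample[0] == "DT" else current

learned = tbl_train(samples, gold, baseline, [adj_first, det_first])
print([rule.__name__ for rule in learned])
```

The loop stops as soon as no candidate rule has a positive net gain, so the learned list is ordered from the most useful correction to the least.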
4.4.3.2 TRAINING CORPUS
With a corpus-based machine-learning approach we must prepare a training corpus. Here the corpus must be word-aligned, phrase-aligned, and parsed for both languages. Fortunately we inherit these results from the earlier layers: word alignment (4.1.1), grammatical labelling (4.2), and even semantic labelling (4.3) of the English-Vietnamese bilingual corpus EVC.
4.4.3.3 BASELINE LABELLING
We use fixed transfer rules, each attached to a production rule used in the parsing step. Based on the production rules used in the English sentence, we look up a table to find the mapping to production rules for the Vietnamese sentence, as follows (using the ITG notation of section 4.4.1.2):
Table 4.3: Examples of fixed transfer rules.
No.   English rule          Vietnamese rule
1.    S → [NP VP]           S → [NP VP]
2.    NP → [ADJP NP]        NP → ⟨ADJP NP⟩
3.    NP → [NP1 NP2]        NP → ⟨NP1 NP2⟩
4.    VP → [have VP]        VP → [ε VP]
5.    VP → [already VP]     VP → ⟨already VP⟩
Example: the English sentence "I have already read that interesting book" is transferred (by the fixed rules in Table 4.3) as follows: [[I/Tôi]NP [have/ε ... ADJP book/cuốn sách⟩NP ]VP⟩VP]VP .]
The result is the Vietnamese sentence "Tôi đã đọc cuốn sách thú vị đó rồi".
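4.4.3.4 RULE TEMPLATES
The rule templates for this problem use the word-order factors from the knowledge sources related to position transfer (other than the knowledge sources already used in the baseline labelling above), for example:
\[
N_a \begin{bmatrix} Word \\ LEN \\ POS \\ SYN \\ SEM \end{bmatrix}
\Big[\, N_{a_1} \begin{bmatrix} Word \\ LEN \\ POS \\ SYN \\ SEM \end{bmatrix} - N_{a_2} \begin{bmatrix} Word \\ LEN \\ POS \\ SYN \\ SEM \end{bmatrix} - \dots - N_{a_m} \begin{bmatrix} Word \\ LEN \\ POS \\ SYN \\ SEM \end{bmatrix} \Big]
\;\Rightarrow\; \bigcup_{\substack{i,j,k=0 \\ i \neq j \neq k}}^{m} N_a \big[\, N_{a_i} - N_{a_j} - X - N_{a_k} \,\big]
\]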
where:
- Word is a variable for the word form (such as "of", "account", ...) of a word;
- LEN is a variable for the length of the node (the number of words in the node);
- POS is a variable for the part of speech {NN, VB, DT, IN, ...} of a word or the syntactic label {VP, NP, ...} of a node;
- SYN is a variable for the syntactic role {Sub, Obj, Modifier, Goal, ...} of the node;
- SEM is a variable for the semantic label of the node;
- X is an inserted constituent;
- The subscript gives the relative position of the constituent within its parent node, where m is the farthest position counted from the start. N_a is the lowest node that dominates every transfer in the rule.
Note: the information Word, LEN, POS, SYN and SEM is not all required at the same time. Each child node N_ai ∈ N_a may contain child nodes N_aix whose information is defined (recursively) in the same way as for N_ai. Consider the following example (for simplicity we omit the superscript symbols and show only the positions actually used in the transfer):
\[ N_a\big[\,N_{a1}[\,N_{a11} - N_{a12}\,] - N_{a2}[\,N_{a21} - N_{a22}\,]\,\big] \;\Rightarrow\; N_a\big[\,N_{a22} - N_{a2}[\,N_{a21}\,] - N_{a1}[\,N_{a12} - N_{a11}\,]\,\big] \]
The transfer rule above has the syntax-tree form shown in Figure 4.19 below:
[Figure: source tree N_a with leaves N_a11, N_a12, N_a21, N_a22 and the transferred tree N_a with leaves N_a22, N_a21, N_a12, N_a11]
Figure 4.19: Syntax-tree transfer capability of KFTBL.
4.4.3.5 EXTRACTED SEQUENCE OF TRANSFER RULES
After training, the system extracts an optimal sequence of rules of the following kind (for details see the experiments in chapter 5):
\[ (POS_{N_a} = Qwh) \wedge (POS_{N_{a1}} = Aux) \wedge (POS_{N_{a2}} = SP) \wedge (POS_{N_{a3}} = VP) \;\Rightarrow\; N_{a2} - N_{a1} - N_{a3} - \text{"được không"} \]
This means: if the sentence is a question whose source syntax tree consists of auxiliary verb - subject - predicate, it is transferred into Vietnamese as subject - auxiliary verb - predicate, and the function word "không" is added at the end of the sentence. Example: "Can you speak English" ⇒ "Anh có thể nói tiếng Anh được không?".
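As a small illustration of how such an extracted rule might be applied, the sketch below checks the POS conditions on a node and its three children, then reorders them and appends the inserted element. The dictionary layout, the tag "X" for the inserted constituent, and the pre-translated Vietnamese words are assumptions made here for the example.

```python
def transfer_question(node):
    """Apply the example rule to a node {"pos": ..., "children": [(pos, words), ...]}."""
    children = node["children"]
    pattern = [pos for pos, _ in children]
    # Conditions of the rule: POS(Na) = Qwh, POS(Na1) = Aux, POS(Na2) = SP, POS(Na3) = VP
    if node["pos"] == "Qwh" and pattern == ["Aux", "SP", "VP"]:
        aux, subject, predicate = children
        # Result: Na2 - Na1 - Na3 followed by the inserted element X = "được không"
        return [subject, aux, predicate, ("X", "được không")]
    return children                        # rule does not fire: keep the order

# "Can you speak English", with each constituent already translated word by word.
question = {"pos": "Qwh",
            "children": [("Aux", "có thể"), ("SP", "Anh"), ("VP", "nói tiếng Anh")]}
print(" ".join(words for _, words in transfer_question(question)))
```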
4.4.3.6 EVALUATION FUNCTION FOR TRANSFER RESULTS
The system evaluates the transfer result by comparing the target syntax tree with the Vietnamese syntax tree in the EVC gold corpus. The comparison is made on the bracket symbols and on the nodes. However, the English nodes must first be translated into the corresponding Vietnamese nodes (through the word alignments in EVC) before they can be matched against the Vietnamese sentence. Part of the research results on the syntax-tree transfer module has been published in [72].
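The text does not spell out how brackets and nodes are scored. One common way to compare two bracketings is labelled-bracket precision and recall over (label, start, end) spans, sketched below. This swaps in a standard PARSEVAL-style measure as an assumption; it is not claimed to be the evaluation function actually used in BTL, and the tree encoding and function names are hypothetical.

```python
def spans(tree, start=0):
    """Return (list of (label, start, end) spans, end index) for a (label, children) tree.
    Children are either sub-trees or plain strings (words)."""
    label, children = tree
    position = start
    found = []
    for child in children:
        if isinstance(child, str):
            position += 1                      # a word occupies one position
        else:
            sub_spans, position = spans(child, position)
            found.extend(sub_spans)
    found.append((label, start, position))
    return found, position

def bracket_scores(predicted, gold):
    """Labelled-bracket precision and recall of a predicted tree against a gold tree."""
    p = set(spans(predicted)[0])
    g = set(spans(gold)[0])
    matched = len(p & g)
    return matched / len(p), matched / len(g)

gold = ("NP", [("NP", ["cuốn", "sách"]), ("ADJP", ["thú", "vị"])])
wrong_order = ("NP", [("ADJP", ["thú", "vị"]), ("NP", ["cuốn", "sách"])])
print(bracket_scores(wrong_order, gold))
```

In the example only the outermost NP span agrees, because the two inner constituents sit at different positions when the order is wrong.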
4.5 EVALUATING THE TRANSLATION QUALITY OF THE EVT SYSTEM
The main goal of this thesis is an automatic English-Vietnamese translation system of fair quality whose quality keeps improving. But how can its translation quality be evaluated? Until now, evaluation has usually been done by hand. That is extremely costly and slow, and it may not be objective (if the system's own builders do the evaluation). Moreover, manual evaluation cannot track day-to-day changes and often leads to the flip-flop phenomenon (a change made to fix one error causes another, unexpected error). In contrast, an automatic and immediate evaluation tool lets us tune the system right away and easily choose the optimal parameters for the translation system. Such on-line evaluation results are also a great encouragement for builders of machine translation systems.
To solve this evaluation problem, recall the problems solved in the earlier sections (morphological, grammatical and semantic labelling, and transfer): in all of them we used the Score evaluation function of the TBL algorithm (comparing the assigned label with the correct label in the gold corpus). In the same way, the obvious idea is to have our system translate the English sentences in EVC and then compare the Vietnamese sentence produced by the system with the reference Vietnamese sentence translated by a human in EVC. Indeed, machine translation researchers have done this in recent years. The important question is how to compare. A one-to-one comparison is not possible, because the same English sentence may have many different Vietnamese translations (in syntax, word order, word choice, ...) that are all considered correct.
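Following the idea of Keh-Yih Su and co-authors in [139] on evaluating an English-Chinese translation system, we reduce the comparison between the machine-translated Vietnamese sentence and the human-translated sentence to the problem of finding the shortest route between 2 points: from the starting point R to the destination Q. Here R = {C11, C12, ..., C1m} is the machine-translated Vietnamese sentence with m words (orthographic words) and Q = {C21, C22, ..., C2n} is the human-translated sentence with n words (orthographic words).
[Figure: grid with the words of R (C11, C12, ..., C1m) along the horizontal axis from Start and the words of Q (C21, C22, ..., C2n) along the vertical axis up to End]
Figure 4.20: Route from sentence R to sentence Q.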
The cost is computed as
\[ D = w_d \cdot n_d + w_i \cdot n_i + w_r \cdot n_r + w_s \cdot n_s \]
where n_d, n_i, n_r and n_s are the numbers of deletion (Del), insertion (Ins), replacement (Rep) and swap (Swap) steps respectively (NOP: no change), and w_d, w_i, w_r and w_s are the corresponding weights (these depend on the language and are chosen empirically). Here w_d = 1, w_i = 5, w_r = 5 and w_s = 6.
From R to Q there can be many routes, with different costs, for example:
R:   đây là máy tính của tôi
Q:   máy tính này của tôi
Op1: REP REP REP DEL NOP NOP (dashed line)
so n_d = 1, n_r = 3 and D = 1×1 + 5×3 = 16.
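The cost of a given route follows directly from the weights quoted in the text. The short sketch below recomputes the costs of the two operation sequences given for this example; the names WEIGHTS and route_cost are illustrative only.

```python
# Weights quoted in the text: w_d = 1, w_i = 5, w_r = 5, w_s = 6; NOP costs nothing.
WEIGHTS = {"DEL": 1, "INS": 5, "REP": 5, "SWAP": 6, "NOP": 0}

def route_cost(operations, weights=WEIGHTS):
    """Sum the weights of the edit operations along one route from R to Q."""
    return sum(weights[op] for op in operations)

op1 = ["REP", "REP", "REP", "DEL", "NOP", "NOP"]          # first route of the example
op2 = ["DEL", "DEL", "NOP", "NOP", "INS", "NOP", "NOP"]   # second route of the example
print(route_cost(op1), route_cost(op2))
```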
The two routes of this example, Op1 and Op2 (many other routes are possible), correspond to 2 different paths in Figure 4.20, right, in which a horizontal segment corresponds to DEL, a vertical segment to INS, and a diagonal segment to REP or SWAP. The accuracy of the translated sentence is computed from D_min and m (the formula itself is illegible in the source), where D_min is the shortest route (lowest cost) from R to Q and m is the sentence length. Finding D_min reduces the problem to the "Salesman" problem, familiar from the dynamic programming technique.
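A minimal dynamic-programming sketch for D_min is given below. It is a weighted edit distance over words with the weights quoted earlier. Treating Swap as an exchange of two adjacent words is an assumption made here, since the text does not define the operation precisely, and the function name min_cost is hypothetical.

```python
def min_cost(r, q, wd=1, wi=5, wr=5, ws=6):
    """Lowest total cost of turning word list r into word list q."""
    m, n = len(r), len(q)
    # dp[i][j] = cheapest way to turn the first i words of r into the first j words of q
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = i * wd                       # delete everything
    for j in range(1, n + 1):
        dp[0][j] = j * wi                       # insert everything
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            best = min(
                dp[i - 1][j] + wd,                                        # DEL
                dp[i][j - 1] + wi,                                        # INS
                dp[i - 1][j - 1] + (0 if r[i - 1] == q[j - 1] else wr),   # NOP or REP
            )
            # Assumed meaning of SWAP: two adjacent words exchanged.
            if i > 1 and j > 1 and r[i - 1] == q[j - 2] and r[i - 2] == q[j - 1]:
                best = min(best, dp[i - 2][j - 2] + ws)
            dp[i][j] = best
    return dp[m][n]

R = "đây là máy tính của tôi".split()
Q = "máy tính này của tôi".split()
print(min_cost(R, Q))
```

For the example sentences the cheapest route deletes "đây" and "là" and inserts "này", which matches the lower-cost route listed for this example.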
However, in the solution above Keh-Yih Su considers only the surface symbols (orthographic words) and ignores linguistic factors such as words, phrases, and word meaning. To overcome this weakness, Kishore Papineni, in the work on BLEU (BiLingual Evaluation Understudy) [96], proposed the following idea: instead of comparing only on unigrams (1 orthographic word) as Keh-Yih Su does, compare on n-grams (n orthographic words; with n = 4, "máy tính của tôi" in the example above is treated as one unit), as follows:
\[ p_n = \frac{\sum_{C \in \{Candidates\}} \; \sum_{n\text{-gram} \in C} Count_{clip}(n\text{-gram})}{\sum_{C \in \{Candidates\}} \; \sum_{n\text{-gram} \in C} Count(n\text{-gram})} \tag{4.19} \]
Here Candidates is the set of machine-translated sentences, and Count_clip(n-gram) counts the phrases (of n orthographic words) that occur both in the source sentence and in the target sentence in the gold corpus. This improvement takes into account units larger than the orthographic word, namely words and phrases, but it still ignores word meaning (the same meaning can be expressed with different words). We therefore do not compare for exact matches. Instead we check whether the words belong to the same or a nearby semantic class, as in section 4.1.1.3.
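The standard clipped n-gram precision of (4.19), before the semantic-class relaxation described above, can be sketched as follows. The function names are illustrative, and clipping against the maximum count over the references follows the usual BLEU definition rather than anything specific to this thesis.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidates, references, n):
    """Clipped n-gram precision p_n over a set of candidate sentences.
    candidates: list of token lists; references: list of lists of token lists."""
    clipped = total = 0
    for candidate, refs in zip(candidates, references):
        cand_counts = Counter(ngrams(candidate, n))
        max_ref = Counter()
        for ref in refs:
            for gram, count in Counter(ngrams(ref, n)).items():
                max_ref[gram] = max(max_ref[gram], count)
        # Count_clip: a candidate n-gram is credited at most as often as it occurs in a reference.
        clipped += sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        total += sum(cand_counts.values())
    return clipped / total if total else 0.0

machine = "đây là máy tính của tôi".split()
human = "máy tính này của tôi".split()
print(modified_precision([machine], [[human]], 1), modified_precision([machine], [[human]], 2))
```

Replacing the exact tuple match with a test for "same or nearby semantic class" would be the place to plug in the relaxation the text proposes; that part is not sketched here.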
A second route for the same example:
R:   đây là máy tính của tôi
Q:   máy tính này của tôi
Op2: DEL DEL NOP NOP INS NOP NOP (solid line)
so n_d = 2, n_i = 1 and D = 1×2 + 5×1 = 7.