APPLYING DATA MINING TECHNIQUES TO TELEPHONE BILLING OPERATIONS AT THE NINH THUẬN PROVINCIAL POST OFFICE
HỒ ANH TÀI
Title page
Table of contents
Introduction
Chapter 1: Overview of data mining.
Chapter 2: Association rules.
Chapter 3: Fuzzy association rule mining.
Chapter 4: Results of mining the telephone billing data, and evaluation.
References
Appendix
CHAPTER 2
ASSOCIATION RULES
2.1. The meaning of association rules
Association rules are an important area of data mining. They help uncover relationships among the data items of a database. In telecommunications, the range of services offered to customers keeps growing, so one can look for links between the use of different services in order to support advertising and marketing. For example, to understand customers' habits in using telecommunication services, one often asks: "Which services do customers tend to use together when they register at the customer care center?". The results can be used for marketing, for instance by listing services that customers often use together next to each other, or by offering bundled promotions...
Figure 2.1. Illustration of an association rule (of the transactions that buy a mobile phone, 80% also buy a SIM card; 30% of all transactions buy both items, which is the support)
Association rules have the form "80% of customers who buy a mobile phone also buy a SIM card, and 30% buy both a mobile phone and a SIM card", or "75% of customers who make inter-provincial calls and live in the districts also use the inter-provincial IP 171 telephone service, and 25% of customers make inter-provincial calls, live in the districts and use the inter-provincial IP 171 service". Here "buy a mobile phone" or "make inter-provincial calls and live in the districts" is the left-hand side (antecedent) of the rule, while "buy a SIM card" or "use the inter-provincial IP 171 service" is the right-hand side (consequent). The figures 30% and 25% are the support of the rule (the percentage of transactions that contain both the left-hand and the right-hand side), and 80% and 75% are its confidence (the percentage of transactions satisfying the left-hand side that also satisfy the right-hand side) [13].
Inter-provincial call: yes → IP 171 call: yes (support = 25%, confidence = 75%)
Support and confidence are the two measures of an association rule.
A support of 25% means: "Of all customers who use the telephone, 25% use both inter-provincial calls and the IP 171 service".
A confidence of 75% means: "Of the customers who make inter-provincial calls, 75% use the IP 171 service".
The knowledge delivered by association rules of this form differs fundamentally from the information returned by ordinary data queries in a language such as SQL. It consists of relationships that were not known in advance, are predictive in nature, and lie hidden in the data. Such knowledge is not simply the result of grouping, totalling or sorting; it is the outcome of a fairly complex and time-consuming computation.
Although association rules are a fairly simple kind of rule, they carry a good deal of meaning. The information they provide is significant and gives considerable support to decision making. Finding "rare" and informative association rules in operational databases is one of the main lines of work in data mining.
2.2. Some approaches to association rule mining
Association rule mining has been studied and developed in many different directions. Some proposals aim to speed up the algorithms; others aim to find more meaningful rules.
The main approaches are:
Binary association rules (also called boolean association rules): this is the earliest line of research. Most early work on association rules dealt with binary rules [13]. In this kind of rule, a data item (attribute) is only recorded as present or absent in a transaction of the database; the "degree" of its presence is ignored. This means that making 10 telephone calls and making 1 call are treated as the same. The best-known algorithm for mining rules of this kind is Apriori, together with its variants. This is the simplest kind of rule, and the other kinds can be converted to it by methods such as discretization or fuzzification of the data...
An example of such a rule: "inter-provincial call = 'yes' and mobile call = 'yes' → international call = 'yes' and 108 service call = 'yes', with support 20% and confidence 80%".
Association rules with quantitative and categorical attributes (quantitative and categorical association rules): in practice, the attributes of a database have very diverse types (binary, quantitative, categorical...). To discover association rules over such attributes, researchers have proposed discretization methods that convert this kind of rule to the binary kind, so that existing algorithms can be applied.
An example of such a rule: "call method = 'automatic' and call time ∈ '23:00:39..23:00:59' and call duration ∈ '200..300' → inter-provincial call = 'yes', with support 23.53% and confidence 80%".
Multi-level association rules: this approach additionally looks for rules of the form "buys a computer → buys an operating system and buys office software..." instead of only very specific rules such as "buys an IBM computer → buys the Microsoft Windows operating system and buys the Microsoft Office suite...". The first kind of rule is thus a generalization of the second, and the generalization can be made at several levels.
Fuzzy association rules: because of the limitations that arise when quantitative attributes are discretized, researchers have proposed fuzzy association rules to overcome them and to bring association rules into a form that is more natural and closer to the user [3].
An example of this kind: "private customer and long call duration and intra-provincial call → invalid charge = 'yes', with support 4% and confidence 85%". In this rule, the condition "long call duration" on the left-hand side is a fuzzified attribute.
Association rules with weighted items: in practice the attributes of a database do not all play the same role. Some attributes receive more attention and are more important than others. For example, when studying revenue from customers' use of services, information about call duration, charging zone or customer type matters far more than information about the call method. During rule mining, call time and charging zone are therefore given larger weights than the call-method attribute [1]. This is an interesting line of research, and several researchers have proposed solutions to the problem. With weighted items, one can mine "rare" rules (that is, rules with low support that nevertheless have special significance or carry a great deal of meaning).
Association rules based on rough sets (mining association rules based on rough sets): the search for association rules relies on rough set theory.
Parallel mining of association rules: besides sequential mining, researchers also concentrate on parallel algorithms for discovering association rules. Parallelization and distributed processing are needed because data sizes keep growing, so processing speed and system memory capacity must be guaranteed. Many different parallel algorithms have been proposed [12], designed so that they need not depend on the hardware.
In addition, there are other lines of research on association rule mining, such as online mining and mining that connects directly to multidimensional data stores (multidimensional data, data warehouses) through OLAP (Online Analytical Processing), MOLAP (Multidimensional OLAP), ROLAP (Relational OLAP), ADO (ActiveX Data Objects)...
Besides studying variants of association rules, researchers also focus on algorithms that speed up the search for frequent itemsets in a database, such as the binary Apriori algorithm and the method that finds the complete set of frequent patterns with an FP-tree without generating candidates [7].
2.3. Statement of the association rule mining problem
I = {i1, i2, ..., in}: a set of n items (also called attributes). A set X ⊆ I is called an itemset.
T = {t1, t2, ..., tm}: a set of m transactions (also called records); each transaction is identified by a TID (Transaction Identification).
R is a binary relation on I and T (R ⊆ I×T). If transaction t contains item i, we write (i, t) ∈ R (or iRt).
Formally, a database D is exactly such a binary relation R. In terms of meaning, a database is a set of transactions, and each transaction t is an itemset, t ∈ 2^I (2^I is the set of all subsets of I) [13].
Example of a database: I = {A, B, C, D, E}, T = {1, 2, 3, 4, 5, 6}.
The transactions are given in table 2.1:
Table 2.1. Example of a transaction database D
Let X ⊆ I be an itemset.
s(X) denotes the support of the itemset X: the percentage of transactions in database D that contain X, out of all transactions in D. s(X) = Card(X) / Card(D) %, where Card(X) is the number of transactions that contain X.
Transaction identifier (TID)    Itemset
1                               A B D E
2                               B C E
3                               A B D E
4                               A B C E
5                               A B C D E
6                               B C D
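The support defined above can be checked directly against table 2.1. The following is a minimal Python sketch; the names `D` and `support` are illustrative and do not come from the thesis.

```python
# Table 2.1 as a mapping TID -> set of items (illustrative encoding).
D = {
    1: {"A", "B", "D", "E"},
    2: {"B", "C", "E"},
    3: {"A", "B", "D", "E"},
    4: {"A", "B", "C", "E"},
    5: {"A", "B", "C", "D", "E"},
    6: {"B", "C", "D"},
}

def support(itemset, db):
    """Fraction of the transactions in db that contain every item of itemset."""
    items = set(itemset)
    hits = sum(1 for t in db.values() if items.issubset(t))
    return hits / len(db)

print(support("B", D))    # B occurs in all six transactions
print(support("ABE", D))  # A, B and E occur together in four of six
```

Multiplying the returned fraction by 100 gives the percentage form used in the text.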
Frequent itemset: let X ⊆ I be an itemset and minsupp (Minimum Support) ∈ (0, 1] a minimum support threshold chosen by the user. X is called a frequent itemset with respect to minsupp if and only if its support is greater than or equal to the threshold: s(X) ≥ minsupp [13].
FX(T, I, R, minsupp) denotes the collection of frequent itemsets with respect to minsupp, that is, FX(T, I, R, minsupp) = {X ⊆ I | s(X) ≥ minsupp} [2].
With (T, I, R) taken from the database of table 2.1 and the threshold minsupp = 50%, all frequent itemsets are listed in table 2.2.
Table 2.2. Frequent itemsets of the database in table 2.1 with minsupp = 50%
The support s of an association rule X → Y is the percentage of transactions in D that contain both X and Y: s(X → Y) = Card(X∪Y) / Card(D) %.
An association rule has the form X --c--> Y, where:
X and Y are itemsets satisfying X ∩ Y = ∅.
c is the confidence of the rule.
c = s(X∪Y) / s(X) % (c = Card(X∪Y) / Card(X) %): the percentage of the transactions in D containing X that also contain Y. In probabilistic terms, the confidence c of a rule is the conditional probability that Y occurs given that X has occurred.
Frequent itemsets                           Corresponding support
B                                           100% (6/6)
E, BE                                       83% (5/6)
A, C, D, AB, AE, BC, BD, ABE                67% (4/6)
AD, CE, DE, ABD, ADE, BCE, BDE, ABDE        50% (3/6)
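Table 2.2 can be reproduced by brute force, simply testing every subset of I against the definition of a frequent itemset. This is a sketch for checking the table on a tiny database, not a practical mining algorithm; the function name is illustrative.

```python
from itertools import combinations

# The six transactions of table 2.1.
D = [set("ABDE"), set("BCE"), set("ABDE"), set("ABCE"), set("ABCDE"), set("BCD")]

def frequent_itemsets(db, minsupp):
    """Return {itemset: count} for every itemset with support >= minsupp."""
    items = sorted(set().union(*db))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            count = sum(1 for t in db if set(combo).issubset(t))
            if count / len(db) >= minsupp:
                result["".join(combo)] = count
    return result

fx = frequent_itemsets(D, 0.5)
for name, count in sorted(fx.items(), key=lambda kv: (-kv[1], len(kv[0]), kv[0])):
    print(f"{name}: {count}/6")
```

The 19 itemsets printed fall into the same four support groups as the rows of table 2.2.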
Confident association rule: a rule is considered confident if its confidence c is greater than or equal to a minimum confidence threshold minconf ∈ (0, 1] (c ≥ minconf) [13]. The threshold minconf (Minimum Confidence) reflects how often Y occurs given X.
The association rules sought are those that satisfy given minsupp and minconf thresholds (only rules whose support is at least the minimum support and whose confidence is at least the minimum confidence are of interest).
The association rule mining problem (in its simplest form) is stated as follows:
Given a database D, a minimum support minsupp and a minimum confidence minconf, find all association rules of the form X → Y with support s(X∪Y) ≥ minsupp and confidence c(X → Y) = s(X∪Y)/s(X), c ≥ minconf.
Most algorithms proposed for mining association rules work in two phases [3]:
Phase 1: find all frequent itemsets in the database, that is, all itemsets X with s(X) ≥ minsupp.
Phase 2: generate the confident rules from the frequent itemsets found in phase 1.
If X is a frequent itemset, the association rules generated from X have the form:
X' --c--> X \ X', where:
. X' is a non-empty subset of X (other than X itself).
. X \ X' is the set difference of X and X'.
. c is the confidence of the rule, satisfying c ≥ minconf.
Example: take the frequent itemsets of table 2.2, minimum support minsupp = 50% and minimum confidence minconf = 70%. The frequent itemset ABE has support 67%, and the association rules that can be generated from it are shown in table 2.3.
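Phase 2 for the itemset ABE can be sketched in a few lines of Python: every non-empty proper subset X' is tried as a left-hand side, and the confidence s(X)/s(X') is compared with minconf. The helper names below are illustrative, not taken from the thesis.

```python
from itertools import combinations

# The six transactions of table 2.1.
D = [set("ABDE"), set("BCE"), set("ABDE"), set("ABCE"), set("ABCDE"), set("BCD")]

def count(itemset, db):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in db if set(itemset).issubset(t))

def rules_from(itemset, db, minconf):
    """List (lhs, rhs, confidence, kept) for each rule X' -> X minus X'."""
    items = sorted(itemset)
    out = []
    for k in range(1, len(items)):
        for lhs in combinations(items, k):
            rhs = [i for i in items if i not in lhs]
            conf = count(items, db) / count(lhs, db)
            out.append(("".join(lhs), "".join(rhs), conf, conf >= minconf))
    return out

rules = rules_from("ABE", D, 0.7)
for lhs, rhs, conf, kept in rules:
    print(f"{lhs} -> {rhs}: {conf:.0%} {'kept' if kept else 'rejected'}")
```

With minconf = 70%, five of the six candidate rules are kept and only B → AE is rejected.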
Table 2.3. Association rules generated from the frequent itemset ABE
Association rule    Confidence c                        c ≥ minconf?
A → BE              100% (c = s(ABE)/s(A) = 100%)       Yes
B → AE              67%                                 No
E → AB              80%                                 Yes
AB → E              100%                                Yes
BE → A              80%                                 Yes
AE → B              100%                                Yes
Maximal frequent itemset:
Let M ∈ FX(T, I, R, minsupp). M is called a maximal frequent itemset if there is no X ∈ FX(T, I, R, minsupp) with M ≠ X and M ⊂ X [2].
2.4. The Apriori and binary Apriori algorithms for finding frequent itemsets
The Apriori algorithm finds the frequent itemsets in 3 steps [1]:
. Step 1: count the support of each item and, based on minsupp, select the frequent itemsets with 1 element. A frequent itemset is called a Large Itemset.
. Step 2: from the Large Itemsets generate the candidate itemsets, count the support of these candidates, and select the itemsets whose support is greater than or equal to minsupp to add to the Large Itemsets.
. Step 3: repeat step 2 until no further Large Itemset is found.
Sketch of the Apriori algorithm [1]:
L1 = Tạo_L_1(D, minsupp);
L = L1;
k = 2;
While (Lk-1 ≠ ∅)
{
    Ck = Tạo_C_k(Lk-1);
    Lk = Tính_độ_hỗ_trợ_C_k(Ck, minsupp);
    L = L ∪ Lk;
    k = k + 1;
}
Answer = L;
+ Subroutine Tạo_L_1(D, minsupp) ("create L1"): this function produces L1, the collection of frequent itemsets with one element. These itemsets have support greater than or equal to the minimum support minsupp.
Algorithm:
For all transaction t ∈ D do
    For all item i ∈ t do
        i.support++;
L1 = {i | i.support ≥ minsupp};
+ Function Tạo_C_k(Lk-1) ("create Ck"): joins pairs of (k-1)-itemsets to generate the new candidate k-itemsets.
Algorithm:
INSERT INTO Ck
SELECT P.item_1, P.item_2, ..., P.item_k-1, Q.item_k-1
FROM Lk-1 P, Lk-1 Q
WHERE (P.item_1 = Q.item_1) AND ... AND (P.item_k-2 = Q.item_k-2)
AND (P.item_k-1 < Q.item_k-1)
The condition P.item_k-1 < Q.item_k-1 ensures that no duplicate itemsets are generated.
+ Function Tính_độ_hỗ_trợ_C_k(Ck, minsupp) ("compute the support of Ck"): scans the database D to update the support of the itemsets in Ck and selects those with support greater than or equal to minsupp to put into Lk.
In practice the Apriori algorithm performs well because it progressively reduces the size of the candidate sets. However, the cost of computing the support of the candidates remains high. The binary Apriori algorithm reduces this cost considerably.
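The three subroutines described above can be put together in a short Python sketch. The English function names are stand-ins for Tạo_L_1, Tạo_C_k and Tính_độ_hỗ_trợ_C_k. The subset check inside `create_ck` is the usual Apriori pruning step; it is not part of the join shown in the text and is added here as an assumption.

```python
from itertools import combinations

def create_l1(db, minsupp):
    """Frequent 1-itemsets, as {tuple: count}."""
    counts = {}
    for t in db:
        for i in t:
            counts[(i,)] = counts.get((i,), 0) + 1
    return {c: n for c, n in counts.items() if n / len(db) >= minsupp}

def create_ck(prev):
    """Join (k-1)-itemsets that agree on their first k-2 items."""
    prev_set = set(prev)
    cands = set()
    for p in prev:
        for q in prev:
            if p[:-1] == q[:-1] and p[-1] < q[-1]:
                c = p + (q[-1],)
                # Assumed pruning step: every (k-1)-subset must be frequent.
                if all(s in prev_set for s in combinations(c, len(c) - 1)):
                    cands.add(c)
    return cands

def count_support(cands, db, minsupp):
    """Scan db once and keep the candidates that reach minsupp."""
    counts = {c: sum(1 for t in db if set(c).issubset(t)) for c in cands}
    return {c: n for c, n in counts.items() if n / len(db) >= minsupp}

def apriori(db, minsupp):
    level = create_l1(db, minsupp)
    result = dict(level)
    while level:
        level = count_support(create_ck(sorted(level)), db, minsupp)
        result.update(level)
    return result

D = [set("ABDE"), set("BCE"), set("ABDE"), set("ABCE"), set("ABCDE"), set("BCD")]
L = apriori(D, 0.5)
print(len(L), "frequent itemsets")
```

Run on table 2.1 with minsupp = 50%, the sketch returns the itemsets of table 2.2.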
The binary Apriori algorithm [9] uses bit vectors for the attributes: an n-dimensional binary vector corresponds to the n transactions of the database. The database of table 2.1 is used to illustrate the algorithm. It can be represented as a binary matrix in which row i corresponds to transaction (record) ti and column j corresponds to item (attribute) ij. The matrix representing the original database of table 2.1 is:
TID  A  B  C  D  E
1    1  1  0  1  1
2    0  1  1  0  1
3    1  1  0  1  1
4    1  1  1  0  1
5    1  1  1  1  1
6    0  1  1  1  0
The binary vectors for the 1-attribute sets are:
{A}  {B}  {C}  {D}  {E}
1    1    0    1    1
0    1    1    0    1
1    1    0    1    1
1    1    1    0    1
1    1    1    1    1
0    1    1    1    0
The binary vectors for the 2-attribute sets are:
{A,D}  {A,E}  {B,C}  {B,D}  {B,E}  {C,E}  {D,E}
1      1      0      1      1      0      1
0      0      1      0      1      1      0
1      1      0      1      1      0      1
0      1      1      0      1      1      0
1      1      1      1      1      1      1
0      0      1      1      0      0      0
The vectors show that the sets {A,C} and {C,D} have support 33%, so they are eliminated.
The binary vectors for the 3-attribute sets are:
{A,B,D}  {A,B,E}  {B,C,E}  {B,D,E}
1        1        0        1
0        0        1        0
1        1        0        1
0        1        1        0
1        1        1        1
0        0        0        0
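The binary vectors shown above for the 1- to 3-attribute sets can be reproduced with integer bit masks. The sketch below assumes that the vector of an itemset is the bitwise AND of the vectors of its items, which is the natural reading of the tables; the thesis does not spell this step out, and the helper names are illustrative.

```python
# The six transactions of table 2.1, in TID order.
D = [set("ABDE"), set("BCE"), set("ABDE"), set("ABCE"), set("ABCDE"), set("BCD")]

def bit_vector(item, db):
    """Integer whose i-th bit from the left is 1 when transaction i has item."""
    bits = 0
    for t in db:
        bits = (bits << 1) | int(item in t)
    return bits

vec = {i: bit_vector(i, D) for i in "ABCDE"}

def support_count(itemset):
    """AND the item vectors together and count the remaining 1 bits."""
    acc = (1 << len(D)) - 1
    for i in itemset:
        acc &= vec[i]
    return bin(acc).count("1")

print(format(vec["A"], "06b"))            # column {A} read top to bottom
print(support_count("AE"), support_count("AC"))
```

Counting the support of a candidate then needs no database scan, only one AND per item.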
The binary vectors for the 4-attribute sets are:
{A,B,C,D}  {A,B,C,E}  {A,C,D,E}  {B,C,D,E}
0          0          0          0
0          0          0          0
0          0          0          0
0          1          0          0
1          1          1          1
0          0          0          0
The vectors show that all of these 4-attribute sets have support below the minimum support minsupp = 50%, so the algorithm stops.
The frequent itemsets found are the same as in table 2.2.
2.5. Finding frequent itemsets without generating candidates
Mining the frequent itemsets of transaction databases has been, and still is, widely studied in data mining research. Most earlier work relies on the Apriori principle. The main idea of methods based on Apriori is that if any itemset of length k is not frequent in the database, then none of its supersets of length (k+1) is frequent either [14]. Candidates of length (k+1) are therefore generated only from the frequent itemsets of length k (k ≥ 1), after which the support of those candidates is computed against the database.
In practice Apriori performs very well because it progressively reduces the size of the candidate sets. However, when the database to be mined contains very many frequent itemsets, when the frequent itemsets are long, or when the minimum support minsupp is quite small, Apriori incurs two significant costs:
. The cost of handling a large number of candidate itemsets.
. The cost of repeatedly scanning the database and checking a large number of candidates by pattern matching.
For example, if there are 10^4 frequent itemsets of size 1, Apriori generates more than 10^7 candidates of length 2. Moreover, to discover a frequent itemset of length 100 such as {i1, i2, ..., i100}, Apriori must generate on the order of 2^100 candidates, which makes mining the frequent itemsets very expensive.
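The two figures quoted above follow from simple counting: pairs of 10^4 items, and non-empty subsets of a 100-item set. A quick check in Python (variable names are illustrative):

```python
from math import comb

pairs = comb(10**4, 2)   # candidate 2-itemsets from 10^4 frequent items
subsets = 2**100 - 1     # non-empty subsets of a 100-item set

print(f"{pairs:,}")      # number of 2-item candidates
print(f"{subsets:.3e}")  # order of magnitude of the subset count
```

The first number is about 5 × 10^7, which is indeed more than 10^7.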
The high cost of mining frequent itemsets with Apriori thus lies in generating and testing the candidates. If candidate generation can be restricted, the problem is essentially solved.
The authors of [4], [7] approached the problem by proposing a new data structure called the frequent pattern tree (Frequent Pattern Tree, written FP-tree). The FP-tree is an extended prefix tree that stores, in compact form, the essential information about the frequent patterns. They also developed a method that finds the complete set of frequent patterns from the FP-tree without generating candidates [7].
The FP-tree:
It is a tree whose nodes are frequent itemsets of length 1, arranged so that nodes with higher support are placed in more favourable positions than nodes with lower support.
If several transactions share the same set of frequent items, they are merged into a single transaction.
Building the FP-tree
Example: with minsupp = 60% (3/5) and the database of table 2.4 below:
Table 2.4. The transactions of the database
First, scan the database of table 2.4 and compute the support of each item. Based on the minimum support, this yields the frequent itemsets of length 1, sorted in decreasing order of support: {c:4, f:4, a:3, b:3, m:3, p:3}. The numbers 4 and 3 after the ":" are the frequencies.
Then reorder the transactions of table 2.4 by decreasing frequency; the database after sorting is shown in table 2.5.
Table 2.5. The transactions of the database after sorting
With the sorted database of table 2.5, the FP-tree is built as shown in figure 2.2 below:
TID  Services used
1    c, a, f, d, g, i, m, p
2    a, b, c, f, l, m, o
3    b, c, h, j, o
4    b, f, k, s, p
5    c, a, f, e, l, p, m, n
TID  Services used              Frequent services (sorted)
1    a, b, c, f, l, m, o        c, f, a, b, m
2    c, a, f, d, g, i, m, p     c, f, a, m, p
3    c, a, f, e, l, p, m, n     c, f, a, m, p
4    b, c, h, j, o              c, b
5    b, f, k, s, p              f, b, p
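The first scan and the reordering that lead from table 2.4 to table 2.5 can be sketched as follows. Breaking ties between equally frequent items alphabetically is an assumption; it happens to reproduce the order c, f, a, b, m, p used in the text. Function names are illustrative, and the transactions keep their TIDs from table 2.4.

```python
from collections import Counter

table_2_4 = {
    1: ["c", "a", "f", "d", "g", "i", "m", "p"],
    2: ["a", "b", "c", "f", "l", "m", "o"],
    3: ["b", "c", "h", "j", "o"],
    4: ["b", "f", "k", "s", "p"],
    5: ["c", "a", "f", "e", "l", "p", "m", "n"],
}

def frequent_order(db, min_count):
    """Frequent items sorted by decreasing count (ties: alphabetical)."""
    counts = Counter(item for t in db.values() for item in set(t))
    frequent = [i for i in counts if counts[i] >= min_count]
    return sorted(frequent, key=lambda i: (-counts[i], i)), counts

def reorder(db, order):
    """Drop infrequent items and sort the rest of each transaction by order."""
    rank = {item: r for r, item in enumerate(order)}
    return {tid: sorted((i for i in t if i in rank), key=rank.get)
            for tid, t in db.items()}

order, counts = frequent_order(table_2_4, 3)   # minsupp = 60% of 5 transactions
sorted_db = reorder(table_2_4, order)
print(order)
print(sorted_db)
```

Table 2.5 lists the same sorted transactions, renumbered so that identical ones sit next to each other.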
Figure 2.2. Structure of the FP-tree (table L of the frequent 1-items, with columns item, sp and head_node_links, linked to the nodes of the tree)
The next step creates the root of the tree, named NULL, and scans the sorted database a second time. For the first transaction, the first branch of the tree is built: ((c:1), (f:1), (a:1), (b:1), (m:1)). The second and third transactions are identical; their frequent items are ordered as (c, f, a, m, p) and share the prefix (c, f, a) with the first branch, so the counts of the nodes ((c:1), (f:1), (a:1)) are increased by two to ((c:3), (f:3), (a:3)), a new node (m:2) is created as a child of (a:3), and another new node (p:2) is created as a child of (m:2). The fourth transaction is handled in the same way: the count of node (c:3) is increased by one and a new node (b:1) is created as a child of (c:4). Likewise, the fifth transaction creates a new branch ((f:1), (b:1), (p:1)) as a child of the root. The result is the FP-tree of figure 2.2.
Structure of the FP-tree:
The FP-tree has the structure described below:
+ The FP-tree includes a header table of the frequent items of length 1, sorted in decreasing order of support.
+ Each entry of the header table is described by two attributes: the item name and head_node_link, where head_node_link points to the first node in the FP-tree whose name equals the item name.
+ The FP-tree consists of a root named Null and a set of child nodes. Each node of the tree is described by four attributes: the item name, the occurrence count, the name of the parent node, and the node link (node_link). The parent attribute points either to the node's parent or to the root. The node link points to the node on the nearest branch of the FP-tree that carries the same name as the current node (node_link holds the sequence number of that node).
The FP-tree above (figure 2.2) is stored in the two tables 2.6 and 2.7 as follows
Table 2.6. Table L: the frequent 1-items
Item               SP                    Head_node_link
(attribute name)   (occurrence count)    (link to the first node)
C                  4 (80%)               1
F                  4 (80%)               2
A                  3 (60%)               3
B                  3 (60%)               4
M                  3 (60%)               5
P                  3 (60%)               7
Table 2.7. Data table describing the nodes of the FP-tree
STT          Item          SP        Tên_nút_cha      Liên_kết_nút
(node no.)   (node name)   (count)   (parent node)    (Node_link)
1            C             4
2            F             3         C                9
3            A             3         F
4            B             1         A                8
5            M             1         B                6
6            M             2         A
7            P             2         M                11
8            B             1         C                10
9            F             1
10           B             1         F
11           P             1         B
Algorithm for building the FP-tree:
Input: the database D and the minimum support minsupp.
Output: the frequent pattern tree (FP-tree).
The FP-tree is built in the following two steps:
Step 1: scan the database a first time to compute the support of the items and, based on minsupp, find the frequent 1-itemsets, called the set F. Then sort the items of F in decreasing order of support; the result is the ordered collection of frequent 1-items, called L (table 2.6). Reorder the original database according to the order in L and merge the transactions that have the same frequent items.
Step 2: scan the database a second time. For each transaction in the database, carry out the following two tasks:
Task 1: match the transaction against L, following the order in L, to select the frequent items of each transaction T.
Task 2: call the function Tạo_Cây(T, FP_Tree) to insert the items into the FP-tree.
The function Tạo_Cây(T, FP_Tree) ("create tree") works as follows: for each transaction T, consider its frequent items (the nodes Ti ∈ L). Traverse the nodes of the FP-tree; if the tree has a node N that matches Ti, meaning N.item = Ti.item and N.Tên_nút_cha = Ti.Tên_nút_cha, increase the count of N by one (N.sp = N.sp + 1). Otherwise create a new node named Ti.item with count 1, whose Tên_nút_cha links to the node Tk that is the parent of Ti if such a parent exists (k < i), or to the root (if i = 1), and whose node_link links to the nodes that have the same name as Ti.item. If T still has frequent items Tj (j > i) that have not been considered, repeat the procedure Tạo_Cây(T, FP) for the following nodes.
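The insertion procedure described above can be sketched in Python as follows. This is an illustration under simplifying assumptions: children are kept in a dictionary per node, and the node_link chain is replaced by a plain list of nodes per item (`header`). The class and function names are hypothetical.

```python
class Node:
    """One FP-tree node: item name, count, parent and children."""
    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}

def build_fp_tree(sorted_transactions):
    """Insert each sorted transaction as a path starting at the root."""
    root = Node(None, None)
    header = {}          # item -> nodes carrying that item, in creation order
    for t in sorted_transactions:
        node = root
        for item in t:
            child = node.children.get(item)
            if child is None:                 # no matching node: create one
                child = Node(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1                  # matching node: count + 1
            node = child
    return root, header

# The sorted transactions of table 2.5, in TID order.
table_2_5 = [list("cfabm"), list("cfamp"), list("cfamp"), list("cb"), list("fbp")]
root, header = build_fp_tree(table_2_5)
for item, nodes in header.items():
    print(item, [n.count for n in nodes])
```

The eleven nodes created, and their counts, correspond to the rows of table 2.7.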
Analysis of the FP-tree:
With the construction algorithm above, the transactions of the database must be scanned twice. The first scan finds the frequent 1-itemsets. The second scan completes the construction of the FP-tree. Inserting one transaction T into the tree costs O(|T|), where |T| is the number of frequent items in T.
By construction, each transaction of the database is mapped to one path in the FP-tree, and the path representing a transaction must start at the root of a subtree. The information about the frequent items of each transaction is stored completely in the FP-tree, and one path of the tree can represent the frequent itemsets of many transactions.
The size of the tree is always bounded by the size of the database and by the total number of occurrences of the frequent items in it; the height of the tree is bounded by the largest number of frequent items in any single transaction.
During construction, any transaction T may create one or more nodes in the tree, but it corresponds to a unique path P in the FP-tree, and the length of that path equals the number of frequent items in the transaction.
Transactions commonly share frequent items, so they share a path P entirely or in part, and the nodes of the tree are frequent items only; the tree is therefore usually much smaller than the original database.
Practical implementation: on the customer revenue database for July 2004, with 19,655 transactions of 13 items each and a minimum support of minsupp = 20%, the total number of occurrences of the frequent items is 687,925, while the tree has only 62 nodes.
Mining frequent patterns using the FP-tree
Consider some important properties of the FP-tree structure; they make the mining of frequent patterns more convenient.
The Head_Node_link property: thanks to this property, when a frequent item Litem of table L is examined, the first node of the FP-tree whose name equals Litem can be reached directly.
The Node_link property: thanks to this property, for any frequent item Ik, all frequent itemsets containing Ik can be determined by following the node links of Ik through the tree, starting from the entry for Ik in the header of the tree. This property is fundamental to the construction of the FP-tree. It makes it convenient to compute the support of the patterns related to Ik by traversing the FP-tree once along the node_links of Ik.
Example: looking at figure 2.2 and using the Head_Node_link and Node_link properties, all frequent itemsets in which node Ik takes part can be collected by starting at node Ik and following its Node_links.
Consider node p: using the Head_Node_link and Node_link properties, we obtain one frequent item (p:3) and two paths in the FP-tree: ⟨c:4, f:3, a:3, m:2, p:2⟩ and ⟨f:1, b:1, p:1⟩. The first path shows that transactions containing (c, f, a, m, p) occur 2 times in the database: although (c, f, a) occurs 3 times and (c) occurs 4 times, (c, f, a, m) occurs only 2 times together with p. Therefore, to study the items that occur together with p, only the path ⟨c:2, f:2, a:2, m:2⟩ is counted. Similarly, the second path shows that the transaction (f, b, p) occurs once in the database, and only the path ⟨f:1, b:1⟩ is counted. The two paths ⟨c:2, f:2, a:2, m:2⟩ and ⟨f:1, b:1⟩ form the sub-pattern base of p, called the conditional pattern base of p (that is, the pattern base under the condition that p is present). Building a tree on this conditional pattern base leaves a single common branch leading to node p, (f:3), with support greater than or equal to minsupp, so only the frequent itemset (⟨f, p⟩:3) is obtained. The search for frequent patterns involving p ends here.
Similarly, for node m we obtain one frequent pattern (m:3) and two paths leading to node m in the FP-tree: ⟨c:4, f:3, a:3, m:2⟩ and ⟨c:4, f:3, a:3, b:1, m:1⟩. Note that p occurs together with m, but p need not be included in these two paths, because the frequent itemsets containing p were already examined for node p.
The constraint of node m in the tree is thus the common branch ⟨c:3, f:3, a:3⟩, which we call a single path of the FP-tree. The constraints of node m are written Mine(⟨c:3, f:3, a:3⟩ | m).
Figure 2.3 below describes the constraint Mine(⟨c:3, f:3, a:3⟩ | m), which consists of the three items (c), (f), (a).
Figure 2.3. The constraint of node m (the conditional pattern base of node m)
First, (⟨a, m⟩:3) is obtained, together with Mine(⟨c:3, f:3⟩ | ⟨a, m⟩); this call yields (⟨f, a, m⟩:3) and (⟨c, a, m⟩:3). The recursive call "Mine(⟨c:3⟩ | ⟨f, a, m⟩)" yields the largest frequent itemset (⟨c, f, a, m⟩:3).
Second, the frequent itemset (⟨f, m⟩:3) is obtained, together with Mine(⟨c:3⟩ | ⟨f, m⟩); calling "Mine(⟨c:3⟩ | ⟨f, m⟩)" yields (⟨c, f, m⟩:3).
Third, (⟨c, m⟩:3) is obtained. Hence all frequent itemsets containing item m are: {(m:3), (a,m:3), (f,m:3), (c,m:3), (f,a,m:3), (c,a,m:3), (c,f,m:3), (c,f,a,m:3)}. This shows that the frequent itemsets of a single-path FP-tree can be mined by outputting all combinations of the elements on that path.
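The single-path observation above can be illustrated in a few lines of Python: every combination of the nodes on the path (c:3, f:3, a:3), extended with the suffix m, is a frequent itemset whose count is the smallest count involved. Variable names are illustrative.

```python
from itertools import combinations

path = [("c", 3), ("f", 3), ("a", 3)]   # single path of the conditional tree of m
suffix, suffix_count = ("m",), 3

patterns = {}
for k in range(len(path) + 1):
    for combo in combinations(path, k):
        items = tuple(i for i, _ in combo) + suffix
        # The count of a pattern is the minimum count among its nodes.
        patterns[items] = min([n for _, n in combo] + [suffix_count])

for items, n in patterns.items():
    print(",".join(items), n)
```

The eight patterns produced are exactly the frequent itemsets containing m listed in the text.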
Similarly, for node b we obtain one frequent pattern (b:3) and three paths in the tree: ⟨c:4, f:3, a:3, b:1⟩, ⟨c:4, b:1⟩ and ⟨f:1, b:1⟩. Its conditional pattern base {(c:1, f:1, a:1), (c:1), (f:1)} produces no further frequent itemset containing b (since minsupp = 3), so mining stops here for this node. The remaining nodes (a) and (f) are handled in the same way.
Table 2.8 describes the result of mining the patterns c, f, a, m, p by generating the conditional pattern bases.
Table 2.8. The conditional pattern bases.
Frequent item   Conditional pattern base                     Conditional FP-tree
P:3             {(c:2, f:2, a:2, m:2); (f:1, b:1)}           {(f:3)} | p
M:3             {(c:1, f:1, a:1, b:1); (c:2, f:2, a:2)}      {(c:3, f:3, a:3)} | m
B:3             {(c:1, f:1, a:1); (c:1); (f:1)}              ∅
A:3             {(c:3, f:3)}                                 {(c:3, f:3)} | a
F:4             {(c:3)}                                      {(c:3)} | f
C:4             ∅                                            ∅
The frequent itemsets of an FP-tree can be determined from all the combinations of the sub-paths of a path P, each taken with the smallest frequency of the nodes on that sub-path [7]. Suppose the single path P of the FP-tree is (N1:sp1, N2:sp2, ..., Nk:spk). The frequency spi of node Ni is the number of occurrences of Ni. For this reason any combination of the items on the path, such as (Ni, ..., Nj) with 1 ≤ i, j ≤ k, is a frequent itemset whose frequency is the smallest frequency among those items. Moreover, no frequent pattern is generated outside the FP-tree.
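The conditional pattern bases of table 2.8 can be recomputed directly from the sorted transactions of table 2.5, because the prefix of an item inside a sorted transaction is exactly its path from the root in the FP-tree. The sketch below takes this shortcut instead of walking node links; the function names are illustrative.

```python
from collections import Counter

# The sorted transactions of table 2.5.
sorted_db = [list("cfabm"), list("cfamp"), list("cfamp"), list("cb"), list("fbp")]

def conditional_pattern_base(item, db):
    """Prefix paths that precede item, with the number of times each occurs."""
    base = Counter()
    for t in db:
        if item in t:
            prefix = tuple(t[:t.index(item)])
            if prefix:
                base[prefix] += 1
    return dict(base)

def conditional_frequent_items(base, min_count):
    """Items of the base whose accumulated count reaches min_count."""
    totals = Counter()
    for prefix, n in base.items():
        for i in prefix:
            totals[i] += n
    return {i: n for i, n in totals.items() if n >= min_count}

for item in "pmbafc":
    base = conditional_pattern_base(item, sorted_db)
    print(item, base, conditional_frequent_items(base, 3))
```

For p only f survives with count 3, for m the items c, f and a survive, and for b nothing does.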
Algorithm for mining frequent itemsets in the FP-tree:
Input: the data table describing the FP-tree (table 2.7) and FL, the frequent itemsets of length 1 (table 2.6)
Output: the collection of frequent itemsets.
Algorithm Duyệt_cây(FL, FP_Tree) ("traverse tree"): visit the nodes in order of support from low to high (the reverse of the order in L)
For each mItem ∈ FL
    mNodelink = Head_node_link
    SPk = 0
    While mNodelink > 0
        mNodelink = FP_Tree.nodelink
        P = Tìm_P(FP_Tree, mItem, mNodelink): the conditional pattern base of mItem
        P.sp = Tính_SP(P): compute the support of the nodes on path P
        (support SPk = SPk + min(mItem.sp, P.nodek.sp), where nodek is node k of P)
    Tìm_tập_phổ_biến(Mine(P), mItem, P.sp): find the frequent itemsets from the conditional pattern base of mItem and the support of the nodes in P
Algorithm for mining frequent itemsets in the FP-tree by pattern fragment growth:
Input: the FP-tree.
Output: the collection of frequent itemsets.
Method: call the procedure FP_Growth(FP, Null).
Procedure FP_Growth(FP, α)
If FP contains a single path P then
    for each combination of the nodes on path P (denoted β),
    generate the pattern β ∪ α with support = the smallest support of the nodes in β;
otherwise, for each ai in the header of FP do
    generate the pattern β = ai ∪ α with support = the support of ai;
    build the conditional pattern base of β and then build the conditional FP-tree FPβ of β;
    If FPβ ≠ ∅ then call FP_Growth(FPβ, β) again
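The recursion of FP_Growth can be illustrated with a compact Python sketch. Note the substitution: instead of building an explicit conditional FP-tree at each step, the sketch recurses on the conditional pattern base itself, stored as a list of (items, count) pairs. This keeps the pattern-growth idea but is not the tree-based procedure of [7], and it has no single-path shortcut. All names are illustrative.

```python
from collections import Counter

def fp_growth(db, min_count, suffix=()):
    """db: list of (items, count) pairs. Returns {frozenset(pattern): count}."""
    counts = Counter()
    for items, n in db:
        for i in set(items):
            counts[i] += n
    frequent = {i: c for i, c in counts.items() if c >= min_count}
    order = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {i: r for r, i in enumerate(order)}
    patterns = {}
    for item in reversed(order):              # least frequent item first
        pattern = (item,) + suffix
        patterns[frozenset(pattern)] = frequent[item]
        # Conditional pattern base of `pattern`: prefixes that precede item.
        cond_db = []
        for items, n in db:
            if item in items:
                kept = sorted((i for i in items if i in rank), key=rank.get)
                prefix = kept[:kept.index(item)]
                if prefix:
                    cond_db.append((prefix, n))
        patterns.update(fp_growth(cond_db, min_count, pattern))
    return patterns

table_2_4 = ["cafdgimp", "abcflmo", "bchjo", "bfksp", "cafelpmn"]
result = fp_growth([(list(t), 1) for t in table_2_4], 3)
for p, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(p)), n)
```

On table 2.4 with a minimum count of 3, the sketch finds the eight itemsets containing m and the single itemset {f, p} discussed above, alongside the remaining frequent itemsets.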
2.6. Association rules with quantitative and categorical attributes
The mining of association rules with quantitative and categorical attributes (quantitative and categorical association rules) was proposed in [14].
...
An example database containing binary, quantitative and categorical attributes:
Table 2.9: Detail records of 8 telephone calls
Call time   Call method                   Customer type   Call duration   Inter-provincial call
            (1: automatic, 0: operator)   (1, 2, 3, 4)    (seconds)       (1: yes, 0: no)
23:00:45    1 (automatic)                 1 (private)     206             1
23:00:39    1                             1               239             1
23:00:39    1                             1               286             1
23:00:37    1                             1               255             1
23:00:53    1                             3 (HCSN)        274             1
23:00:39    1                             3               273             0
23:00:39    0 (operator)                  2 (agent)       288             0
23:00:52    0                             3               277             0
In this database, call duration is a quantitative attribute, customer type is a categorical attribute, and call method and inter-provincial call are binary attributes. From the database (table 2.9) the following association rule can be derived: ⟨Call time: 23:00:39..23:00:59⟩ AND ⟨Call method: automatic⟩ AND ⟨Call duration: 200..300⟩ → ⟨Inter-provincial call: yes⟩, with support 62.5% (five of the eight calls satisfy the left-hand side) and confidence 80% (four of those five are inter-provincial).
To find association rules over quantitative and categorical attributes such as the one above, the value domains of those attributes can be partitioned into intervals so that they become binary attributes, after which the algorithms for mining binary association rules can be applied.
2.7. Data discretization methods
The algorithms for mining binary association rules can only be applied to relational databases with binary attributes or to transaction databases such as the one in table 2.1; they cannot be applied directly to databases with quantitative and categorical attributes such as the database of table 2.9 [13]. To overcome this, the quantitative and categorical attributes are discretized and thereby converted to binary attributes.
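The rule quoted for table 2.9 can be checked with a short Python sketch. The tuple layout and names are illustrative. The sketch reports both the share of calls matching the left-hand side and the support in the sense of section 2.3 (calls matching both sides), since the two differ for this rule.

```python
# Rows of table 2.9: (call time, call method, customer type, duration, inter-provincial)
calls = [
    ("23:00:45", 1, 1, 206, 1),
    ("23:00:39", 1, 1, 239, 1),
    ("23:00:39", 1, 1, 286, 1),
    ("23:00:37", 1, 1, 255, 1),
    ("23:00:53", 1, 3, 274, 1),
    ("23:00:39", 1, 3, 273, 0),
    ("23:00:39", 0, 2, 288, 0),
    ("23:00:52", 0, 3, 277, 0),
]

def matches_lhs(call):
    """Left-hand side: time in 23:00:39..23:00:59, automatic, 200..300 seconds."""
    time, method, _, duration, _ = call
    # HH:MM:SS strings of equal length compare correctly as text.
    in_window = "23:00:39" <= time and time <= "23:00:59"
    return in_window and method == 1 and 200 <= duration and duration <= 300

lhs_calls = [c for c in calls if matches_lhs(c)]
both = [c for c in lhs_calls if c[4] == 1]

lhs_share = len(lhs_calls) / len(calls)    # calls matching the left-hand side
confidence = len(both) / len(lhs_calls)    # of those, inter-provincial calls
rule_support = len(both) / len(calls)      # calls matching both sides
print(lhs_share, confidence, rule_support)
```

The 62.5% quoted in the text equals the left-hand-side share; counted over both sides, the rule holds in 4 of the 8 calls.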
Some discretization methods are as follows:
Case 1: if A is a discrete quantitative attribute or a categorical attribute with a finite value domain {V1, V2, ..., Vk} and k is small enough (< 100), A is transformed into k binary attributes A_V1, A_V2, ..., A_Vk. The value of attribute A_Vi is 1 (or True) if the original value of A equals Vi; otherwise A_Vi is 0 (or False).
For example, in table 2.9 the attribute đối_tượng (customer type) is converted into four binary attributes đối_tượng_1, đối_tượng_2, đối_tượng_3, đối_tượng_4.
Đối_tượng    After discretization:
             Đối_tượng_1   Đối_tượng_2   Đối_tượng_3   Đối_tượng_4
1            1             0             0             0
2            0             1             0             0
3            0             0             1             0
4            0             0             0             1
Table 2.10. Discretization of quantitative and categorical attributes
Case 2: if A is a continuous quantitative attribute, or a discrete quantitative or categorical attribute with a finite value domain {V1, V2, ..., Vp} (p large), A is transformed into q binary attributes ⟨A: start1..end1⟩, ⟨A: start2..end2⟩, ..., ⟨A: startq..endq⟩. The value of attribute ⟨A: starti..endi⟩ is 1 (or True) if the original value of A lies in the interval [starti..endi]; otherwise it is 0 (or False).
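Both cases above amount to replacing one attribute by a row of 0/1 flags. A minimal Python sketch follows; the function names are illustrative, and the interval bounds for call duration are example values chosen for this sketch.

```python
def one_hot(value, domain):
    """Case 1: one binary attribute per value of a small finite domain."""
    return [int(value == v) for v in domain]

def interval_flags(value, intervals):
    """Case 2: one binary attribute per closed interval [start..end]."""
    return [int(lo <= value and value <= hi) for lo, hi in intervals]

customer_types = [1, 2, 3, 4]
duration_bins = [(1, 60), (61, 120), (121, 180), (181, 86400)]  # example bounds

print(one_hot(3, customer_types))          # customer type 3
print(interval_flags(206, duration_bins))  # a 206-second call
print(interval_flags(180, duration_bins), interval_flags(181, duration_bins))
```

The last line shows that two durations one second apart already land in different binary attributes.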
For example, the call duration attribute of table 2.9 is converted into binary attributes as in table 2.11:
Duration   ⟨Duration: 1..60⟩   ⟨Duration: 61..120⟩   ⟨Duration: 121..180⟩   ⟨Duration: 181..86400⟩
59         1                   0                     0                      0
117        0                   1                     0                      0
154        0                   0                     1                      0
206        0                   0                     0                      1
Table 2.11. Discretization of the call duration attribute (duration).
This discretization method runs into the "sharp boundary problem" [3]. Consider, for example, the support distribution of an attribute TGDT (thời_gian_đàm_thoại, call duration) whose values range from 1 to 10.
Figure 2.4. Example of the "sharp boundary" problem when discretizing data (horizontal axis: attribute value, 1 to 10)
If TGDT is discretized into the 2 intervals [1..5] and [6..10] with minimum support minsupp = 41%, the values of TGDT in the interval [6..10], with support 40%, do not reach the minimum support, even though a neighbourhood of the left boundary of that interval, such as [4..7], has support 55% and exceeds minsupp. This partition thus creates a "sharp boundary" between the values 5 and 6. With this discretization the algorithms cannot mine any rule that involves call durations in the interval [6..10].
To overcome the "sharp boundary" problem, [14] proposed a new partitioning in which adjacent intervals overlap slightly at their common boundary. This solves the problem above but introduces a new one: values close to a boundary are "valued" more than the other values of the attribute. This is unnatural and somewhat contradictory.
Discretization by intervals also has a semantic problem. The discretization of call duration in table 2.11 shows that 180 seconds and 181 seconds, which differ by only one second, fall into two different intervals. If [1..60] is taken as short, [61..180] as medium and [181..86400] as long, then a 181-second call counts as long while a 180-second call counts as medium. In reality a 181-second call is only very slightly "longer" than a 180-second one, so this is unnatural to the human way of thinking.
To overcome the "sharp boundary" problems, [3] proposed a new partitioning method based on fuzzy sets and a new kind of mined rule: the fuzzy association rule. This kind of rule not only overcomes the weaknesses of interval partitioning but also yields rules that are semantically more natural and closer to the user.