KHAI KHOÁNG DỮ LIỆU GIA TĂNG CÁC MẪU TUẦN TỰ THEO THỜI GIAN
BÙI VĂN THÀNH
Trang nhan đề
Lời cám ơn
Mục lục
Chương_1: Tổng quan.
Chương_2: Khảo sát các thuật giải cho bài toán khai khoáng dữ liệu.
Chương_3: Nghiên cứu thuật giải khai khoáng gia tăng ISE.
Chương 4: Xây dựng hệ thống.
Chương_5: Ứng dụng.
Chương_6: Kết luận và hướng phát triển.
Tài liệu tham khảo
38 trang |
Chia sẻ: maiphuongtl | Lượt xem: 1919 | Lượt tải: 0
Bạn đang xem trước 20 trang tài liệu Luận văn Khai khoáng dữ liệu gia tăng các mẫu tuần tự theo thời gian, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên
10
---------------------------------------
ChU'O'ng2
KHAo SAT cAc THUATGIAICHO BAI ToAN.
KHAI KHoANG DO'LIEU.
2.1 TONG QUAN cAe NGHIENcO'u
Nhfmgnamd~ucuath~pnien90,baitoankhaikhoanglu~tk~thqpd~utien
dugcgi6ithi~uvasaudophatri~nrOngrai.Baitoankhaikhoanglu~tk~thqpban
d~uthuangchid~nnhulabaitoan"khachhang-muGsam",baivi duli~ugiaotic
muabandugcchQnlQctircaccirahangbanIe chiramOtllnhVl,l'CUngdl,mgtieu
bi~uchovi~ckhaiphatri thuc.Khaikhoangduli~u,clingdugcbi~tnhula Imam
phi trithuctrongCSDL,dugcnhinnh~nhulamOtllnhVl,l'Cm6id~yhuah~ncho
vi~cnghienClmCSDL.LinhVl,l'Cnaycoth~dugcxacdinhnhulaSl,Imamphihi~u
quacacm~uthuvi tirCSDL100.M~utu~ntvth~hi~ncacy~ut6quailh~v6inhau
theethaigian.
Baitoankhamphilcaem6utwintl!,dugcgi6ithi~ud~utiennam1995[11],
la bai toanImamphi cacm~uph6bi~ntheethaigian,v6i dOh6 trgt6i thi~u
minsuppdonguaisird\mgxacdinh,madOh6trgcuamOtm~utu~ntvlatYl~ph~
tramcuacacday-duii~uchuatrongmftudo.Baitoannayband~udugcthucdAy
bai cacUngdl,mgtrongcongnghi~pbanIe,baag6mthutindinhkern,buonban
khuy~nmaivath6amanthihi~ucuakhachhang.Saudonhfmgk~tquanayl~i
dugcUngdl,mgtrongnhi~ullnhVl,l'ckhoahQCvathuangm~i.Vi dl,l,tronglinhVl,l'C
Y khoa,chamsacsuckh6e,nocoth~dugcsirdl,mgd~dl,Ideanphatb~nhill mOt
chu6icactri~uchUng,trongcongnghi~ptaichinh-nocoth~dl,IdeanSl,Irtiirov~
d~utudl,IatrenmOtchu6icacSl,Iki~nv~thitwangv6nc6ph~,...
j
11
2.2 cAe THU~T GIAI KHAI KHOANG TiNH
Ph~nnayt6ngquathoabai tmlnkhaikhoangm~utu~ntt,r,ill thu~tgii!i
AprioriAllt6'ithu~tgiciiGSPvaPSP,saudoseneubaitoankhaikhoang iatang
v6'ieachti€p c~nSPADEd€ dit6'imohinhdugcchQncualu~ vannay.
Bili toaDkhaikhmingmftutu§nt1!:
GQi:
8 DB laCSDLband~ubaag6mt~pcacgiaotaccuakhach ang,m6igiaotacT
baag6mmakhachhang,thaigiangiaotacvat~pcacm~thangcotronggiao
taco
8 I={iJ.iz,...,im}:t~pcacm~thangi (items).
8 itemset1at~pm~thangkhacr6ng,ky hi~u:'Sj,j E loon.
8 Dayslat~pdic itemsetdugcs~ptheothutt,rthaigian.
8 Day-k ladaycok m~thanghaydaycodQdailak.Vi d\lmQtkhach angmua
cacm~thang1,2,3,4,5theothaigiannhugall:
s =
trongdom~thang2 va3 dugcmuacimgmQthIe,trongsu6tgiaotacchung,
cacm~thangdugcmuatheotrinhtt,r,v~ys 1amQtday_5.
8 Day 1amQtdayconcuadayn€u t6nt~it~ps6nguyen
i1<iz<...<insao cho S'IC SiJ. S'2C Si2, , S'nC Sin'
Vi du 2.1:days' =1adayconcuas vi (2)C (2,3)va(5)C (5).
Tuynhien,khongladayconcuasvi cacm~thangnaykhongdugcmua
trongclingmQtgiaotaco
if 8 T~teelnhfmggiaotacill clingmQtkhachhangdugcnhoml~iv6'inhauva
dugcs~px€p theothutt,rtangvadugcgQilamQtmJu tudntf!dftli~u(gQit~t
la day-dftli~u).
12
. DQh6tr(J'supp(s)las61anxufithi~nth\IccuadaystrongDB.Noidch khac,
aQh6tr(J'cuamOtday1atYs6cuatoanbOcacdayduli?umachungchuas:
supp(s)=s61anxufithi~n(s)/ IDBI
. Ranhgiai am1at~ptfitcacacdaymachungkh6ngph6bi~nnhungtfitcacac
dayconcuano 1aph6bi~n.Ranhgi6iamtrongCSDL DB duqcghichu
b~ngNBD(DB),v6iNBD(DB)={a I a E CDB- LDB , CDB:cacdaytrng
vien trong DB, LDB: cac day ph6 bi~ntrong DB, va supp(a)>
Min_nbd_supptrongDB},v6iMin_nbd_supp1angu5TIgcuaranhgi6ifun
trongDB.R5rangMin_nbd_supp<minsupp,dodovlingh6trq(NBD(DB))
1a[Min-nbd- supp,minsupp].
. MOtdaydu fi?uchuadaysn~us 1adayconcuadayduli?u do.D~quy~tdinh
daycophai1aph6bi~nhaykh6ng,v6idOh6trqminsuppdonguaisird\mg
xacdinh,n~udi~uki~nsupp(s)~rhinsuppduQ'cthoa.
. Cho mOtdays =vadayconc. c la dayconk~cuas n~uthoamOt
trongnhfmgdi~uki~nsau:
0 c xufitphattirsbkg cachrutramOtm~thangho~cill S1ho~cill Sn'
0 c xufitphatill sb~gcachrutramOtm~thangill thanhphanSimano
coitnhfit2m~thang.
0 cladayconk~cuac',vac' 1adayconk~cuas.
Bcdtoimkhaikhoimgmdutwintl!nhiimm¥ca£chtimttitcanhfcngmdu(ciing
g9i fa dayph6 biin) maaQh6 tr(J'cuachungfan hO1lngufingminsuppchotrnac
trongmrt CSDLcacgiaolaccuakhachhang.
BaitoancodOphuct~plaO(mk2k-1),khi khaosatmOtlugngrfit100nhUngday. . ,
ph6bienv6i dOdaik comthuOctinh.
Tru6ctien, thu~tgiaiAprioriAll [11]duQ'cluQ'tquav6i cacchiti~tchuy~u.
13
---------..---.-.--.-.-----..------.--.
2.2.1 AprioriAl1
a)M6 tathuatgiai:Nhi~uthu~tgiaiduQ'cd~xu~td~ulamQtd~ngbi~nth~cua
AprioriAll,nam1995,[11].AprioriAllIathu~tgiai3buac:
Bmyc1:Timcact~pm~thangph6bi~n.
Tim t~tcacact~pcacm~th.:mgcodQh6trQ'minsupp.E>i~unayclingphuhqp
vai cacm~utu~nt\1'co chi~udaiIa 1 (mQtm~thang).B~tkY thu~tgiainaod~tim
caet~pm~thangph6bi~nclingd~usud\mgduQ'cbuacnay.
Bmyc2:Bi~nd6idfrli~u.
Caet~pm~thangph6bi~nduQ'canhx~vaocaes6nguyen.SaudoCSDL
duQ'cbi~nd6i,vaim6igiaotacduQ'cthayth~b&ngt~pcuat~tcacaet~pm~thang
ph6bi~nduQ'echuatronggiaotacoPhepbi~nd6inayduQ'cth\1'chi~nm6ilk thu~t
giaiduy~ttrenduli~utrongphatu~n1\1',ho~cladUngngayvalUlltrlr.S\1'l\1'achQn
saukh6ngth~lamduQ'ctrongnhi~ulrngd\mgth\1'c,khimadfrli~ubi~nd6icoth~
IanhanCSDLband~u,nh~tlat~icacdQh6trQ'comucth~p.
M6i day-aftli?ubaygialamQtdanhsachcuat~pcacs6nguyen,vaim6is6
nguyenth~hi~nmQt~pm~thangph6bi~n.Cacm~utu~nt\1'baygiacoth~duQ'c
xemnhuladanhsachcaes6nguyen,hanla danhsaehcaet~pm~thang.(B~tkY
thanhph~neuamQtm~utu~n1\1'codQh6trQ'minsupphailamQt~pm~thangph6
bi~n).
Bmyc3:Tim caem~utuk 1\1'.
C~utructinhtoancO"baneuabuaenaybAtd~uvaimQth~tgi6ngcuacacday
duQ'ctimth~ytrongbuacduy~ttruaedo(timth~ytrongbuactimt~pm~thangeua
buacduy~td~utien),thu~tgiait~oracaelrngvien,th\1'chi~nbuaeduy~ttrendfr
li~ud~t1ms6d~mdQh6trQ'euacaelrngvien,vasudl,mgcaclrngviennayvaidQ
h6trQ'minsuppnhulamQt~ph~tgi6ngd~t~ocact~plrngvienti~ptheo.
14
~ ..-.--..-.---.--
Thu~tghiiAprioriAll
Mii f!iii: Ck:Day_kUngvien.
Lk:Day-k ph6bi~n.
1={day_lph6bi~n};
for (k=2;Lk!=0;k++)do
begin
Ck=lJ'ngvient~oratirLk-l;
for eachdaydfrli~uctrongCSDL do
Tangs6UngvientrongCkmachungcotrongC
end
K~tqua=Dayt&id~~iUkLk;
T~o tfugvieD
Buo'ck~tn6i:
../ CkduQ'Ct~orabingeachLk-l tv k~tn6ivai nhau
../ ChenvaoCk
../ Selectp.litemseq,..., p.litemsetk-bq.litemsetk-l
FromLk-Ip,Lk-Iq
Wherep.1itemsetl=q.litemseq,...,
p.litemsetk-2=q.litemsetk-2
BU'(\c!\J'.<I.tbo: Lo~ibedaycon_(k-l) cuadays_kkhongph6bi~n
Vi du 2.2: {1,2,3}X {1,2,4}={1,2,3,4}va{1,2,4,3}.
v~caban,khitrinhbaynhfmgdayph6bi~n,yellc~utruactienlanitratfttca
nhfrngday{mgvien.Be>h6trQ'cuaOOfrngday(mgviennayduQ'ct£OOtoaubing
ca.chduy~tquaCSDL.Nhfrngdaykhongthoadi€u ki~nminsuppsebi lo~ibe.K~t
quaco duQ'cIa t~pcac dayph6bi~n.
15
----.-------------..-.---..-
b)Vi dlJminhhoa:
a
CID=LQ
elO-2@
e
G
L
r:~ Q~
G
I
CO'S6'Dii'Lieu nhap: t
...
(ID
°
60
70
I t.
CID=3a
CID=42
CID=5 ...
(1,4)
(2,4)
?Kh6ng
MinSupp=40o/o,C62khach ang:
Hinh 2.1:DCPIi~umaudU'Q'cbi~nd6i
IT~p~tbi"8IBilndil
Bmycbi~ndBi
t
...
Danhsaeh
tj.pm~thang
CID=2
.t
(JD-' @ G
(\ 40 -
~, 0 .t
CIO=4 n
\90*
yCIO=5
Hinh2.2:DCPIi~umaudU'Q'cbi~nd6i
Ma KHang TG giaome Mthang
1 1 ::0
1 2 9J
2 1 10,20
2 2 ::0
2 3 40,00,70
3 1 ::0,9],70
4 1 30
4 2 40,70
4 3 90
5 1 90
G G tCIO=1 ..
@ g I70 .0,1C!D=2 I t..
.. @!
C!D=3
I ..t
G @ G "
70 I
40,7
CIO=4
C!D=5 e t ..
16------------------------------
Bmycbi~nd8i
Hlnh2.3:DiPli~umaubi~nd6iti~ptheo
D~xayd\l11gnhungday(mgvienvaph6bi~n,thu~tgiftiduy~tquanhi~ubuac
trenCSDL. Khi hoanthanh,nhfrngdayph6bi~n(nghiala,thoaminsupp)duqc
khampha.ChungdugcgQilacacday_1ph6bi~n(daymacomQtm~thang,ehinh
notrathanhsingleton).T~peuanhfrngday- 2(mgviendugcxayd\l11gtheegiftthi~t
gall:nhfrngday_2 (mgviencoth~lamQtc~pnhfrngm~thangph6bi~nbAtkY,duqc
duavaoclingmQtgiaotaehaykhong.Nhfrngday- 2 ph6bi~ndugcxetbfugcach
tinh<1Qh6trg.Tir quaildi~mnay,caeday-k (mgvienduqct~oraill day_(k-1)ph6
bi~ncodugctrongbuGC(k-1).
Y twYngchinhcuavi?ctc;lOt'mgvienla,gifranh~ day_(k-l), lwltb6cijpday
(s,s')saoclIologib6phdntirilducuadayt~acvaphdntircu6i'cuakit quasau.
Nhuth~,khiti~nhanhchoe~p(s,s'),mQtday(mgvienmaidugehinhthanhbfug
(--\ (
(ID-l 0) 0\ IV t tCIO=1 '" '"
tlQ\ )
Q CD .t
Oanhsach
} p mtM.ng---j tcm=2 CTO=2
(I (!)\7 I -+-\. .
CIO=3
I tot CIO=3
I ",t
(\ (\I (\
(0
(f1
\ -:0)
10 .
\ 9/9J 5 j
CIO=4 (' '-T./ t cm=4'"
/-\
0cm=5 () t cm=5 t.. ...
17
cachthemvaom\lCsaucimgcua.s'vaos.SaudotinhdQhe;trqchoOOUngungvien
nayvachungITathanhnhUngdayph6bi~nm6iv6idQh6trq,006OO~t.Ti~ntrioo
nayduqcI~pI~ichot6ikhikhongcondaytmgviennao.
c)Nhfmxet:
- Thu~tgiai nay duqcxemOOula n€n tangcho S1,l'phMtri6ncacthu~tgiai
duqctrinhbaya cacph§nti~ptheogall,chuy~ula d1,l'avaocacky thu~t5i uu'hoa
trongthu~tgiaid6caiti~nhi~usu~tth1,l'chi~nbaitoankhaikhoangnay.
- Cohaivftnd€ theocachti~pc~n ay.Bdulien,coS1,l'tieuphimQtcachdang
k6d6th1,l'chi~nvi~cbi~nd6idangdi€n rasu5tm6ibu6cduy~ttrongkhitimcac
m~utuftnt1,l'.B~ngcachnay,d6bi~nd6iCSDLngayvaluutrUCSDLdabi~nd6i,
sekhOngth1,l'chi~nduqcd5iv6iOOi€utmgdl,mgkhidungluqngiliaphaic§ngftn
nhugftpdoichoCSDL 100.Thuhai,trongkhinguaitacoth6marQngthu~tgiai
nayd6giaiquy~trangbuQcthaigianvaS1,l'phanlo~i,nhungkhongkhathid6k~t
hqpch~tchegifracackhoangthaigiangiaotac(nghlala,coS1,l'diOOnghlac(rng
nh~ccuamQtgiaotac).
2.2.2 GSP
a)Motathuatgiai:
Cftutruccobancuathu~tgiaiGSP (GeneralizedSequentialPattern)[10]d6
timm~utuftnt1,l'duqctrinhbaynhusauday.
Ytliifng,:
. Dftutiencacm~thang(I_day)trongDB duqcxemlalmgvien.
. T~im6imuc(tinhtheochi€udaik cuaday):
0 QuetCSDLd6tiOOminsuppchocacdaylmgvien.
L......
0 Phatsinhcacday(rngviencochi€udai(k+1)ill cacdaycochi€udai
k (sird\lngAprioriAll)
18
.------...----..........---
. U~pl?i chot6ikhikhongtimduQ'cdayph6bi~n aokhac,-
Thu~tgHlit?OOOi~ubu6cduy~tquadftli~u,Bu6ed~utieDxacdiM de>h6trQ'
chom6im~thang,nghlala,s6day-duli~ubaahamm~thang.Cu6ibu6cnay,thu~t
gi,Hchobi~tm~thangnaolaph6bi~n,nghlalacode>h6trQ'006oofitminsupp,M6i
m~thangnhuth~come>tdayph6bi~nI-thaOOph~nbaag6mm~thangd6,M6i
bu6cti~ptheosaub~td~uv6ime>tt~ph?tgi6ng:cacdayph6bi~ntimthfiytrong
bu6ctru6cdo.T~ph?tgi6ngnayd~t?Oracaet~pph6bi~n(mgvienmm,duQ'cgQi
la cacdayzmgvien,M6i day(mgvienconhi~um~thanghalldayh?tgi6ng;OOu
th~tfitcacacday(mgvientrongme>tbu6ccocimgs6m~thang.Be>h6trQ'chocac
day(mgviennayduQ'ctimthfiyquacacbu6eduy~tdftli~u,T?i cu6ibu6cnaythu~t
giaixacdiOOdaynaola dayph6bi~nth~tS1,I,Cacdayph6bi~nnaytrathaOOh?t
gi6ngchocacbu6cti~ptheo.Thu~tgiaik~tthuckhikhongco'dayph6bi~nnao
cu6ibu6cduy~t,ho~ckhongcoday(mgviennaoduQ'ct?Ofa.
Co 2 chiti~tmaGSP thlJchi~n:
1. T~oU-DgVieD:cacday(mgvienduQ'ct?Ora OOuth~naotru6ckhi bu6c
duy~tb~td~utrongkhiduytri tinhtoanVyn.
2. f)~mdQh8 trQ' euaeaell'ngVieD:vi~etiOOtoande>h6 trQ'chocacday
(mgvienduQ'cxacdiOOOOuth~nao,
Vi~ct?O(mgviencuaGSP thiclingtuangtlJ OOucuaAprioriAll OOungv~chi
ti~thicokhac,nhuvi d\l2.3HiOO2.4.
Vi du2,3:
19---------------
Hinh2.4:Vi d...t~o(Pngvien
Hinh2.4chiraL3vaC4saubuack€t n6ivanitgQn.Trongbuack€t n6i,day
n6ivaid~t~o,van6ivai
d~t~o.Cacdayconl~ikhongn6ivaibftt10'mQtdaynaotrongL3'
Vi d\l,khongn6ivaibftt10'daynaokhimakhongcodaynaocod~g
hay.TrongbuacnitgQn,duqcnitb6
vidayk~v&ino1akhongcotrongL3'
vudi~mcuaasp sovaiAprioriAllchinh1aphuangphapchQnlmgvien.
PhU'O11~ chon ITn2vieDciIa GSP:
§ Lamgiams6lingvienki~mtra:
~Luutrfrt~plmgvientrencftutnicciiyhash
~Nuttrai(Leftnode)chliadanhsacht~pm~thangvas6lmgvien
~Nutgiita(Interiornode)chliabanghash
~Hamhasht~pcon:d~timtfttcacaclmgviencotronggiaotaco
§ Ki~mtralmgvienth6adQh6trqminsupp
PhuangphapchQnlmgviencuaGSPduqchi~unhusau:
Trong khith1Jchi~nbuacduy~t,asp dQcquadiiy-dfrli?ut~imQthaidi~m
" - ~
vatangsodemhotrqnhfrnglmgvienchuatrongdiiy-dfrli?u.Dopo,chotruaemQt
t~pnhfrngdaylmgvienC vaday-duli~ud,asp cantimtftteanhfrngdaytrongC
machungduQ'chuatrongd.GSPsird\lng2kythu~td~giaiquy€tvftnd~nay:
Day-3pho Day-4Un vienC4
bin L3 saukhinot saukhirutgQn
«1,2)(3»
20
1. Si'rd\mgcAlitrucdfrli~ucay-hashd€ lamgiiims6(mgvientro~gC ma
chungduQ'cki€m ITatrongday-aili?u,theoHinh2.5trinhbayxaydl,ffig
cAutrUccaylUlltrfrHash.
Giaotac Hamhash
"~6'9
2,5,8
Hinh 2.5: cAu true cayHash
2. Bi~nd6ith€ hi~ncuaday-aili?ud d€ machungtacoth€ timthAymQt
cachhi~uqua(mgvienC\lth€ naoladayconcuad,theoHinh2.6lavi d\l
th€ hi~ncuadayduQ'cbi~nd6i.
21
Hlnh 2.6:Day-dli'li~umAuva th~hi~nthayd6i
DochinhlaIy dot1;lisacGSPthl,fchi~nt6thanAprioriAll:thienhdt,GSPd~m
s6lingvienit hanAprioriAll.AprioriAlllUQ1:b6cacdaylingvienbingcach101;li
b6 thanhphdncodQh6trQ'nh6hanminsupp,trongkhiGSPki~mtracacdayco
duQ'cbingcach101;lib6m(zthangcodQh6trQ'nh6hanminsupp.Dodo,GSPluon
luond~mcaelingvienithanAprioriAll.Sl,fkhacnhautrongs6cac(rngviencoth~
tranenkha1611chocacday(rngvienv6i2 thanhphftn.AprioriAllphaithl,fchi~n
mQtsanphdm-cheocact~pm~thangph6bi~nduQ'ctimthiytrongbu6ctimt~p
m~thang.GSPtru6ctiend~mcacdaycoduQ'cbingcachthl,fchi~nmQtsanphdm-
chiDtrencacm~thangph6bi~n,vat1;lOracaclingvien2-thanhphftnv6inhi€uhan
2 m~thangsaudo.N~us6m~thang1611thinh6hannhi€u s6t~pm~thang1611,
khoangeachhi~usuitcoth~tranent6it~.Thiehai,AprioriAll(phienbankhong-
hIDtrlr)tru6ctienphaitimcact~pm~thangph6bi~nnaGla duQ'cth~hi~ntrong
m6ithanhphftncuaday-duli~utrongsu6tquatrinhbi~nd6iduli~u,vasaudotim
cacdaylingviennaGduQ'.cth~hi~ntrongno.Di€u naychothiymQtcachd~ctrung
lacophftnch~mbanvi~ctimcacdaylingvienmQtcaclTtfl,fcti~pcuaGSP.
b)Nhanxet:
Thaigian-giaotac Mt hang
10 1,2
25 4,6
45 3
50 1,2
65 3
90 2,4
95 6
Mt hang Thaigian
1 1O 50 NULL
2 10 50 90 NULL
3 45 65 NULL
4 25 90 NULL
5 NULL
6
25 95 NULL
7
NULL
22
Hai Iy dochinhgiaithicht?i saoGSP OOaOOhO1lAprioriAll : M9t la,GSP
d€m cacUngvienit hO1lAprioriAll;Hai la,vi~ctimUngvientrongGSPIamQt
cachtrl;l'Ctiip trongkhi AprioriAllconphaith\Ichi~nvi~cchuy€nd6iill timt~p
m~thangsaudom6itimcacUngvienduQ'cth€ hi~ntrongno.GSPbi€n thienmQt
cachtuy€n tiOOtheos6 day-dftIi~u,vaco cactiOOch~ttangcuangr~tt6tphilhQ'P
v6i day-dftIi~ucokichthu6ctrungbinh.
Bi€m y€u cuaGSPIav~ d€ "thlttcdchat':
. S6 IuQ'llgcacUngvien Ian.
. S6lfinquetCSDLIan.
. V~nd€ th~tS\Ixayrakhiphaikhaikhoangcacm~udai:s6cacm~utufin
1\l'Ungvientangtheohammil.
2.2.3 PSP
a)M6tathuatgiai:
Tuang1\l'OOuGSP, cachti€p c~nPSP [4] thitahuemgmQtcachdfiydu cac
nguyenIy chinhcuaGSP. TiOOm6imecuano Ia su d\lngc~utrucphancApkhac
hO1lIatrongGSP d6iv6i vi~ct6chuccacdayUngvien,do Ia cdutruc-cay-tiJnt6,
d€ caiti€n hi~uquachovi~ctimki€m. .
T?i m6ibu6cthuk, CSDL DB duQ'cduy~td€ d€m dQh6trQ'cuacacUngvien
hi~nt?i (thut\lCKiim tra-imgvien).V~ythi t~pdayph6bi~nLk co th€ duQ'cxay
d\l'TIg.Tit t~pnay,cacUngvienm6iduQ'cduarad€ th\Ichi~nbu6ck€ ti€p (thuWC
Tgo-imgvien).Thu~tgiaidUngkhi cacdayph6bi€n dainh~t,daduavaoDB duQ'c
Imamphadodothuwc t?OUngviencoduQ'cmQt~p(c?n)r6ngcacUngvienmm.
C~utruccay,duQ'cthu~tgiaisud\lngIadiy-tiJnt6(prefix-tree).T?ibu6cthu
k,dlYcochi€usauIak.Nob~tgiftdt cacacdiiy-kUngvientheocachsauday:t?i
b~tkYmQtnhanh,titg6Gd€nffiQtIathayth~choffiQtdayUngvien,vakhaosatmQt
OOanhdan,m6inutt?i chi€usauthul (k~l) b~tgiftm~thangthul cuaday.HO1l
23
th€ nua,songsongv6i m~t-hang,nutcu6icungc~pdQh6trQ'cuadayill g6ct6ila
dangkhaosat(baoham).Vi~cc~tgiaotacduQ'cb~tgiftb~ngcachsud\mgcacbien
duQ'cdanhnhan.Chinhxacban,hayquansat2nut,mQtnutlaconcuanutkia.N€u
cacm~thang,duQ'cnhungvaotrongcacnut,xu~thi~nbandAtitrongsu3tcacgiao
tackhacMall, cacnutduQ'cn6ibiennhauduQ'cdanhnhanb~g '-', nguQ'cl~ino
duQ'cdanhnhan'+'(duang/ trongHinh2.7).Giftsur~g chUngtacot~pcacdiiy-
2ph6bi~n:
L2={«~to)(30»,«10) (40»,«30) (20»,«30 40», «40 to»~}.
No duQ'ct6chuctheec~utruccaynhuHinh2.7.M6i mQtnutk€t thucchua
mQtm~thangva giahi d~m.N€u chungtaquansatnutcom~thang40,giahi
tuanglrngla 2,nghiala co2 IAnxu~thi~ncuaday{«10) (40»}duCJctimth~y
ngay.
root
10 40
A /.............
301 402 202 402 103
Hinh2.7:Cc}utrue cay diPli~u
Thu~tghiiKIEM TRA UNG VIEN (CANDIDATE VERIFICATION)
Nh~p:CayT chuat~tcftcacdayph6bi€n valrngvien,day-duli~ud vachi
danhdaycuanoidseq.Bu6cthukcuathu~tgiftit~o.day.
Xuftt:T t~pt~tcftcaedaylrngvienchuatrongd.
J
Thu~tgiftiKIEM TRA-UNGVIEN dungd€ sosanhcaclrngvienvacacday-
du li~u.Day-duli~uduQ'cduy~tmQtcachtangdAnb~tdAtill m~thangdAtitien.
24
NhanthaigiancuanoduQ'clUllgifttrongbi~nfa.Cacm~th~ngk~ti~ptrongdduQ'c
ki~mITavabi~nUalanhan-thaigiancuam~thanghi~nt~i.DI nhien,n~uUa-la =0,
thic~pm~thangchinh(vat~tcacacm~thangcoth~co giftachung)xu~thi~n
trongmQtgiaotacdan.Khi Uatranenkhacla,di~unayconghialam~thangduQ'c
chQnh,ram6'iph\!thuQcvaomQtgiaotackhac.Tuynhien,chungtakhongth~cho
r~ngdi~udothu~tghiidaphathi~nt~pm~thangdAtitiencuaddobaicirasf>truQ't.
Dodo,vi~cki~mtraphaiduQ'cti~pWcchot6'ikhim~thangduQ'cki~mtracachxa
m~thangth~tS\ldAtitiencuad.E>i~uki~nUa- la~wskhongcondungnfta.Luc
nay,chungtaduQ'cclingc~pmQtt~pm~thang(Ip).E>6iv6'im6im~thangphf>bi~n
trongIp (no tuangUngv6'imQtnutt~ichi~usau1)hamFIND SEQUENCE duQ'c
th\lcthi d~till t~tcacacUngvienduQ'che;trQ'bai t~pm~thangduQ'cruttrichdAti
tien.Ti~ntrinhmotasaudoduQ'ctrinhbaychovi~cl~yrat~pm~thangthuhaico
th~co.faduQ'cd~tv6'inhan-thaigiancuat~pm~thangdAtitienduQ'cb~tg~pvamQt
IAnnftaUaduQ'ctangdAntheotoanbQvi~cki~mtra.Ti~ntrinhduQ'cl~pl~ichot6'i
khi t~pm~thangcu6icungcuacacdayduQ'cgiaiquy~t.
Thu~tghiiTIM DAY (FINDSEQUENCE)
Nh~p:2 s6nguyendungchokichthu6'ct~pm~thang,N: nutcuaT, i: m~t
hangtrongd,depth:chi~usauill trencuacay.
Xu§t: T duQ'c~pnh~theothaigianrangbuQc.
HamFINDSEQUENCEduQ'egQiti~pWcbaithu~tgiaitru6'cdochovi~ctim
ki~meaedayUngviendAtitienb~tdAtiv6'it~p-concuam~thangcuad,saudola
m~thangthuhai,vacuth~ti~pWe.Khi g~pnutla,thu~tgiaiseki~mtradQh6trQ'
cuaday-eonUngvienvatanggiatrid€m cuano.
25
Khi t~tca.cacirngviendugcki€m ITadagi::ii-quy€t,caydugclugcbatd€ t6i
thi€u h6akhonggianbOnh6'dugcyellcdu.T~tcacacla khongthOadOhe)trg
minsuppbi lo<;tibe.Khi vi~cx6abohofmthanh,cayseb~tgiftngaycacdayirng
vienthayvi cacdayph6bi~n.
b)Vi duminhboa:
Hinh2.8motac~utrUcdfrli~udugcdungtrongcachti~pc~ cuaPSP (cay
belltrai)vatrongGSP(caybellphai),dugcquanly dmgmOtbu6'ccuathu~tgiai
t<;today.Chinhxachan,ill cacdCiy-2ph6bi~nchotrongvi d\lHinh2.8,cacdCiy-3
irngviencodugc.Do d6,chungtacoC3=«10) (4010» «10) (30)(20» «10)
(3040»«40 10)(30)«40 10)40».
root
10~
f\ 30 40
F, 40
20 '40 10
root
10 40
20 10
1\
30 40
Hinh2.8:CacminhhQacacc~utruccay-hashvacay-ti~nto
c)Nhanxet:
C~utruccaytiJn t6cuaPSPduarac6caethu~19itheengfrnghiacuanod€
tranhSlJtinhtoant6nkernvavoichkhiki€m ITacacirngvien.Hannfra,c~utrUc
naydugctrinhbaydapirngyellcdubOnh6'ithannhutrongGSPd€ lUlltrfrcaeday
irngvienvaph6bi~ntrencayhash.
2.3 cAe THU~T GIAI KHAI KHOANG GIA TANG it
Bai toankhamphacaedayph6bi~ntrongCSDL theethill'gianlamOtbai
toankhaikhoangdfrli~uquantrQng.Hduh~tcaecongvi~chi~n ayd€u ti~nhanh
26
trenCSDL Ia tInh,vavi~cc~pnh~tCSDL seyeticftuti€n hanhI?i tfttcacacmfiu
bfutgcachquetoanbQCSDL cilvamaLKhamphatfttcacacdayph6bi€n trong
CSDL Iandoih6ikh6iIuQ11gdnhtoanIanvat6nkernnhi6uthaigianbOivi kich
thuackhonggiandugcduy~tchuy€u tangtheos6milcuachi6udaidaygiaomc
dainhfttcotrongno.T6nphitinhtoancaonaycoth~chftpnh~dugckhiCSDLla
tlnh,khimaS\lkhamphadugcth\lchi~nchimQtIftn,vamQts6cachti€p c~ v6bai
toannaydadugctrinhbayphftntren.Tuynhien,nhi6uIInhV\fCchkfugh?llnhu
thUO1lgmC;lidi~nill, phandchv6nc6phftn,phfiuthu~tthinghi~m,...phaichillcac
rangbuQcthaigian-th\lctrongti€n trinhkhaikhoang.TrongcacIInhV\fCnhuth€,
maCSDL dugcc~pnh~trencasakhongd6ivaS\ltuO1lgtaccuanguaisird\mg
chinhsiracacthams6nghienciru,ch?ychuO1lgtrinhkhamphaquatoanbQIftnnua
Ia khongth~lamdugc.V~ythi,yeticAlld6ivai cacthu~tgiaiIa duytri nhimg
thongtinkhaikhoimgdanggiaqua:i) c~pnh~tCSDL,vaii) tuO1lgtaccuanguffisir
d\mg(suad6i/rangbuQckhonggiannghienciru).
Bili toaDkhaikhoanggiatangeaemill tuftnt1}':
GQi:
. DB IaCSDLbandAti.
. minsuppIa dQh6trgnh6nhftt~
. dbIa CSDL giatang(b6sungmQtluQ11gnh6)khiconhUng iaotacmm
vanhUngkhachangmmthemvitodb.Giasirrfutgm6igiaotactrendb
dugcs~px€ptheomakhachangvathaigianthaotaco
. U =DB u db: CSDL dugcc~pnh~tchuatfttcanhUngdaycuaDB vadb.
. L DB Ia t~pcacdayph6bi~ntrongDB.
Bid loankhaikhoanggia tangcacmdutwint~rnhamtimnhimgmduph6biin
. trongU,kYhi?uLu,vaidmgaQh6tr<1nh6nhdtminsuppkhiCSDi alr<1Cc(1pnh(1t.I
- 27
- -
Cachti~pc~ giatangnaylQ'id\mgnhUngthu~lqi euavi~cImamphadfrli~u
tru6cdod€ tranhchgy_lgi (re-run)khidfrli~uduQ'c~pnh~t.
Bai toancodQphuct~plaO(mk),khikhaosatmQtlugngrfit16nnhUngday
ph6bi~nv6idQdaik comthuQctinh.Vi th~,d€ giaibaitoannaytmcachti~pc~
SPADE,dlJatrendim(Lattice),t6rarfithi~uqua.
2.3.1 SPADE
a)M6tathuatgiai:
Trongphfmnay,v6i cachti~pc~ SPADE[12],trinhbaymQthu~tgiaieho
vi~cImamphanhanhcacdayph6bi~nmanot~ocasaehothu~tgiaikhaikhoang
giatangti~ptheosau.
../ Danday(SequenceLattice):
DinhnghTa:V6i quailh~thutlJ "~" trent~pcacm~thangph6bi~nLu, thi
(Lu,~)duQ'cgQiladaTIdaycact~pm~thang.
SPADE sud\lngquailh~daycon ~ xacdinhSlJs~px~ptimgphfmtren
t~pcacday,nghlala,n~u~lamQtdayph6bi~n,thitfitcacacdaycon~ ~
clingph6bi~n.Thu~tgiainghienClrumQteachh~th6ngmQtdimdayduQ'c
marQngbai qUailh~daycon,ill caem~thangt6ngquatnhfit(caem~thang
dan)d~ncaedayph6bi~nrarangnhfit(dayt6id~i)theolo~ichi~usau-dau
tien.Vi d\l,trongHinh2.9,duangnetd~ tuanglIngv6idflnchot~pdfrli~u
mati.
-,-"
28
-CSDL Ranhgi61fun
CID I TID
10
2
3
4
Danday
T~pph6bi~n:{A,Ar-7A,BHA, B,AB,AHB, BH, ABHB}
Hjnh2.9:CSDL bandau
./ D€m uQh6 trQ': HAuh~tcacthu~tgiaikhaikhoangdayhi~nt;;lichod.ng
CSDL ngangvi d\l nhuCSDL trongHinh2.9.Theodinhd;;lngngang,
CSDLbaag6mt~pcackhach ang(cid's).M6ikhachhangcomQt~pcac
giaotac(lid's),tuO'llgUngvai cacm~thangduqcchuatronggiaotaco
Nguqcl;;li,chungtasud\lngCSDLd9C,machungtak~thqpvaim6im~t
hangX trongdandaydanhsachchidanhidUstcuano,kY hi~uL(X), l3.
danhsachcuat~tcacacc~pkhachhang(cid)vachidanhgiaotac(tid)
chuam~thang.Vi d\l,idUstchom~thangC trongCSDL bandAti(Hinh
2.9)coth~baag6mcaclo;;li{,}.
Chos~ncacidUstschom6im~thang,chungtacoth~xacdinhmQtcachl~pdi
l~pl;;lidQh6trqcuab~tleYday-kill cacidUstscuab~tleY2 trongcacday-(k-l)cua
no. C\l th~,chungta k~thqp2 daycon-(k-l) dungchungh~ut6 chung(cacday
duqct;;lO)d~tinhtoandQh6trqcuaday-kmal.MQtki~mtradO'llgiantrendQh6trq
cuadanhsachchidanhk~tquaidUstchochungtadaymm1aph6bi~nhaykhong.
20 B
30 AB
20 AC
30 ABC
50 B
10 I A3 B
40 A
30 AB
40 A
50 B
29
--------
- -
N~uchungtakhongdubQnhO'chinh,chungtacoth6d~mtfttcacacdayph6
bi~nb~ngcachduy~tquadim,vath\lchi~ncacdi6mgiaonhaud6coduQ'cacdQ
h6 trQ'day.Trenth\lct~,tuynhien,chungtachi co mQtluQ'llggiO'ih~ bQnhO'
chinh,vatfttcacacdanhsachchidanhtrunggianidlistsekhongviladutrongbQ
nhO'.SPADEchiakhonggiannghienClruIannayThanhnhi€udo~nh6,d€ quanII'
IDacoth6xuII' IDQtcachdQcI~ptrongbQnhO'.Di€u nayduQ'choanthanhnhacac
lOptlfO1lgalfO1lgdvatren-hgut6(ill naygQila lOp).Chungtanoir~g 2 diiy-ka
trongcungmQtlapn~uchungdungchungh~ut6chungchi€udai(k-l).S\lquailsat
chinhm6ilapla IDQtdan-concuadandayband~uvacoth6duQ'cxu II' mQtcach
dQcI~p.M6i mQtlaph~ut6thidQcI~ptheonghialanocothongtinhoanhaocho
vi~ct<;10ratfttcacacdayph6bi~nmachungdungchungcungh~ut6.Vi d\l,n~u
mQtlap[XJcocacthanhph~nY f-7X,vaZH X nhulacacdayduynhftt,hicac
dayph6bi~nchicoth6cot<;1ibuO'ck~ti~pcoth6laY H Z H X, Z H Y H X va
(Y Z) H X R5rang,khongcom~thangQ naokhaccoth6din d~nmQtdayph6
bi~nvO'ih~ut6X,tm(QA)hayQH X thiclingatrong[X].
30 - ----- .-- --.----------------------------------
Ranhgi6iam Dfmday
3 40
~10 '
-120
tit-:
30
/ '"
I 0 '-
\A->A->A->~)' /
...
4
40
50
AA
BB
CG,.,"",
T~pph6biSn:
{A,B, C, AI-)A, B I-)A, AB, ~B, BI-)B, AI-)C, BI-)C,AI-)AI-)B,
ABI-)B, AI-)BI-)B, AI-)AI-)C, AI-)BI-)C}
140
"".."'.~.':" ,""
Hinh 2.10: CSDL bandAu+giatangvaDan
CSDLDL
CIDID I TIDID Mt hang
20 BB
30 AA BB
100 AA,CC
2
L
20 AA CC
30 AA BB CC
50 BB
10 AA I '
30 BB I CSDL giatang
31
SPADE phantieh<l~quicaedayt~im6imuemaithelnhcaelapeh~ndQel~p
nh6ban.Vi dl,l,t~imue1nosudl,lngcaelaph~ut6ehi~udeli1(X y),t~imuc2no
sudl,lngcaelaph~ut6ehi~udeli2(XH Y,XY),V..v.Chungtachiramuccaclap1
h~ut6nhulacaelapchaoCaelaph~ut6nelyduQ'exuly tuAnDJ.Hinh2.8choma-
gia(trinhbelydangian)ehothutl,lcchinheuathu~tgiaiSPADE.Iri nh~pleilap,
tuangtrngvai idUstehom6imQtrongcaethelnhphAneuano.Caedayph6bi€n
, ,!II
duQ'et~orabangeachgiaonhaucaeidUsteuatateacaee~pdayphanbi~trong
m6ilapvelki~mtra<lQh6trQ'euaidUstk€t quad\Iavelominsupp.Caeday<luQ'ctim
th~ytrathelnhph6bi€n t~imuehi~nt~icaelapt~oraehomuek€ ti€p. Ti€n trinh
mue-eaonelyduQ'el~pmQteachd~qui ehotai khi t~teacaedayph6bi€n duQ'c
d€m. Theothu~tngftquailly bQnha,d~delngth~yr&ngchungtacAnbQnhad~lUll
trftcaeidUsttrungianehot~i2mueli~nk~nhaunh~t.Ngaykhit~teacaedayph6
bi€n ehomuek€ ti€p duQ'et~ofa,caedayt~imuehi~nt~icoth~bi xoa.
BEGIN Enumerate-Frequent-Seq([S]):
for all elementsAi E [S]do
[Ai]=0;
for all elementsAj E [S]do
R =Aj u Ai; /* cacdayR ati9'chinhthanhbiingcach
tgocacdayconAi vaAj vaiAi nhtito.m9th(iut6*/
idUst(R)=idUst(Aj) (\ idUst(Ai);
if support(idUst(R))2:min-supthen
[Ai] =[Ai] u {R};
for all [Ai]* 0 doEnumerate-Frequent-Seq({Ai]);
END Enumerate-Frequent-Seq([S]);
..
H1NH2.11: Magiathu~tgiaid~mcaedayphObi~n
32
MQtthu~tghiiv6khaikhOangiatangduQ'ed6xu~thi ISM (Incremental
SequenceMiining)d\Tatrencachti6pc~nSPADE [12],khic~pnh~tnhUngiaotac
m6ivanhfmgkhaehhangm6iduQ'cthemVaGCSDL.D\Tatrenme>tdim(Lattice)
chuacaem~utuftnt1Jtangg6meam~utuftnt1Jph6bi6nvacacm~utufuIt1Jcotren
ranhgiOiam(negativeborder),vi dl,lnhuHinh2.9vaHinh2.10.Ranhgi6'iamIa
t~phqpg6mcaem~ukhOngph6bi6nnhungcacdayphl,lmachungt~oraIaph6
bi6n.Han th6nfta,de>he>trQ'cuadayphl,lclingduQ'cgiftI~itrongdaTInay.Ytuang
chinhcuathu~tgiainayIakhidftIi~udanggiatang,ph~ngiatangduQ'cduy~tqua
ngayd~satnh~pv6inhfmgthongtinm6iVaGdaTI.Saudo,dftli~um6inayduQ'c
k6thqpv6inhfmgdayph6bi6nvaranhgi6iamd~xacdinhcacthanhph~ncua
CSDLband~ue~nduQ'cduy~tl~i.
PHA1:
1. computeD'(i),D "(i)forallitemsi
2. for allitemsi in l'
3. Q.enqueue(S);
4. while(Q :;t.0)
5. p=Q.dequeueO;
Compute- Support(p);
6. if (support(p)~min-sup)
7. k=length(p);
8. if (p ENBD)
9. NB-to-FS[k].enqueue(p);
10. elseif (D'(P);z!:0)
11. forall k+I-sequencesSin ISL that
are
12. generatingascendentsofp
13. Q.enqueut;:(S);
PHA2:
1. foreachitemi inNB-to-FS[1]
2. constructsuffixclass[i];
3. NB-to-FS[2].enqueue([i]);
4. for (k=2to...)
5. foreachclassC inNB-to-FS[k]
6. Enumerate-Frequent-Seq(C);
Compute- Support(p):
1. A =generating_subsequencel(p);
2. B =generating_subsequence2(p);
3. supportD'(p)=intersect(D'(A),D '(B));
4. supportD"(p)=intersect(D"(A),
D "(B));
5. support(p)=support(p)+supportD'(p)
-supportD"(p);
Hlnh2.12:Magiftthu~tgiftiISM
Thu~tgiaiISM baag6m2phatheoHinh2.12.Fhalia d~c~pnh~tde>he>trQ'
cuacaethanhph~ntrongNB (ranhgi6iam)vaFS (t~peuat~teacacdayph6bi6n
trongCSDL c~pnh~t)vapha2 lad~themVaGNBvaFS ngoaicaiduQ'cth\Tchi~n
;~
trongpha1.CSDL D va dayex,£16h6trQ'hayph6biin cuaextrongD, Icyhi~u
'" ,
suppartD(a), Ia sokhachhangtrongD macaedaychuaexnhulame>tdaycon.Be>
33
h6trQ't6ithi~uminsup,Iangu5'ngdonguaisud\mgxacdinhthuemgduQ'cxacdinh
"caedayph6biin":mQtdaylaph6bi€n trongD Iacominsupnh6nh§t.Lu~tA
B lienquaildayA vadayB duQ'choIacoaQtinCl;lYcn€u c%cuakhach imgma
nochuaA clingchuaB. Giasur~ngdftli~um6i8 IaduQ'cthemvaoCSDLD. Thi
chungtagQiD Ia CSDL banaduva8 Ia CSDL gia tang.CSDL duQ'c~pnh~tIaD
+8.V6i m6ik ;;::1,Fk la t~pchQnIQccuat§t cacacdayph6bi€n chi€u daik trong
CSDL c~pnh~t.C', T' val' Ia t~pcuat§tcacid's,tid'svacacm~thang,tuong(mg,
xu§thi~ntrongphfingiatang8.D' Iat~pcuat§tcam~utin(trongD u 8) vaicid
trongC' vaD "=D' \ 6.
b)Vi duminhboa:
Theovi d\ltrongHinh2.9vaHinh2.10,caethanhphfinsaucodQh6trQ'c~p
nh~t:AHA HA, B HA HA, A HA HB, B HA HB, A HB HB vaC.Trong
nhftngthanhphfinnay,cacdaysausedichuy~nill ranhgi6iamsangt~pph6bi€n:
A H A H B,A H B H BvaC.
Fha2 gi6ngnhuFha1,t~icu6iFha1NB-to-FSIamQtdanhsach(ho~cmQt
mang)cuacacbanghashchuacacthanhphfindadichuy~nill NBsangFS.Dayla
caclapchidvatren-h~ut6machungtacfinki~mtra.D6iv6it§tcaday-1machUng
dadi chuy~n,chungtak€t giaonov6icaeday-1ph6bi€n khaccoth~co.Chungta
themt§tcacacday-2ph6bi€n nhuth€ vaohangdQ'iNB-to-FS[2]choti€n trinhti€p
theo.Theovi d\lch~ytrongHinh2.11vaHinh2.12,A He vaB He duQ'cthem
vaobangNB-to-FS[2].ClingIucnayt§tcacacday-2danggiakhaclienqUailt6iC
machungkhongph6bi€n duQ'cd~tvaotrongNBD+o.Dodo,C H A, C H B,AC,
BCvaCH C duQ'cd~trongNBD+o.Bu6ck€ ti€p trongFha2 la,b~tdfiuvai bang
hashchuacacday-2,d~I§y thanhphfinmakhongduQ'cxu Iy vad~t~odanhsach
cact~p h6bi€n,theoclingvaicaeMUstduQ'ck€t hqpill D u 8,.tronglaptuong
duO1lgcuan6~Bu6c k€ ti€p la duy~tlap tuO1lgduO1lgk€t quabkg Enumerate-
Frequent-Set,manothemb§tkYdayph6bi€n haythanhphfinranhgiai amroainao
34
vacacthanhph~nduQ'ck~thqpv6'iISL. Chungtil l~pbu6'cnaychot6'ikhit~tca
cacbangNB-to-FSlading.Nhuvi d\l,haykhaosatlapwangduangduQ'ck~thqp
v6'iA H C . TirHinh2.11vaHinh2.12,chungtath~yr~ngdayph6bi~nduyOO~t
khaccualaph~ut6cuanoIaB H C . Khi cahaidaytrenlaph6bi~n,chungduQ'c
d~tvaoFSD+(j.Me>teachd~quili~tkecact~pm~thangph6bi~ndand~ncacdayA
H A H C vaA H B H C duQ'cthemvaoFSD+o'Tuangtv,cacdayAB H C, B H
A H C, B H B H C,A H A H A H C vaA H A H B H C duQ'cthemvaoNBD+(j
c)Nhfmxet:
V6'icachti~pc~nsir d\lngdimday,thu~tgiai ISM nayto rar~thi~uquav€
thaigianth1,Icthi duQ'cgiamme>tcachdangk~khi tranhvi~cchgy-lc;dchovi~ckhai
khoangdu li~ukhi co giaotacm6'ihaykhachhangm6'iduQ'cthemvaoCSDL ban
d~u.Nhungthu~tgiainayth1,Ichi~nvi~cduytri ranhgi6'i~mthir~tt5nkernbQ006'
vakh6ngthichnghichoCSDL r~tIan.
2.3.2 ISE
a)M6tathuatgiai:
Thu~tgiai ISE '.'Trichmatitu~nt1,Igia tang" (IncrementalSequence
Extraction)[5],duQ'cd€ xu~td~tinhtoancacmautu~ntv ph6bi~ntrongCSDL
duQ'c~pOO~t,khi themcacgiaotacm6'ivakhachhangm6'ivaoCSDLband~u.
ISE lamgiamt5idachiphitiOOtoanb~ngcachsird\lngl~inhUngthongtinkhai
khoangtru6'cdotirnhUngmauph6bi~ncu.Caim6'iOO~tcuaISE lalamgiamdang
k~t?Pcacmau(mgvienc~nduQ'cki~mtra;ISE t~oracac(mgvientrongtoanbe>
CSDLb~ngcachgcincacmautu~ntv giatangvaomautu~ntvph6bi~nband~u.
Nhuth~notraOOduQ'cvi~cgilll~icacraOOgi6'iamvatinhtoanl~icacmaunaykhi
.
duli~utrongCSDL band~uduQ'c~pnh~t.
35
--....----...-.-
Thu~tgiaiISE coth~chiathanh3bu6'c:
0 BmycHipd~utieD:Tir t~pday_1ph6bi~n,thu~tgiaitimduQ'cact~p
day_2 ph6bi~n.V6'icact~plingvienm6-re>ngthu~tgiaitimrat~phC;ttgi6ng
d~ill dotimti~ptheonhUngdayph6bi~ncode>dainh6hO'IlhaybAng(k+1).
0 BmycHipthif i, v6i i ~(k+l):Timcaet~pcacdayph6bi~nt6idC;ticokich
thu6'c~(k+1)ill t~pcaedaylingvienm6-re>ng.
0 BmycHipthif i, v6i i >(k+l):Timcaet~pcaedayph6bi~nt6idC;ticokich
thu6'c>(k+1).
Chi tiStcuathu~tgiaiseduQ'ctrinhbaytrongehuO'Ilg3.
b)Vi duminhboa:
Trongbaitoankhai khoang iatangcaemfiutu~ tv,dftutienkhaosatbai
toankhiconhunggiaotacm6'iduQ'cthemvaonhUngkhachhangdat6ntC;titrong
CSDL r6i.f)~minhhQabaitoannay,haykhaosatCSDLbandftuDB cho6-vi d\l
2.4,Hinh2.13co4 khaehhangchinh.M6i giaotic duQ'cs~pxSptheothutv thai
glan.
Vi du2.4:
M~thang
506070
5060
80100
8090
(db)
(DB)
iHinh2.13:CSDL bandAuDBvaCSDL giatangvai nhCPnQgiaotilemai
.. db.
.
Ma KH Mt hang
C1 1020 20 5070
C2 1020 30 40
C3 1020 40 30
C4 60 90
----------- 36
- -
Vi dl,l,daykhachhangC31a.Giii sirminsupp=50%,do
dodayph6bi~nphiiico it nh~t2 khachhang.T~pnhUngdayph6bi~nduqcdua
vaotrongCSDLnhusau:LDB ={,}.SaumOtvai
thaotacc~pnh~t,hayquansatCSDLgiatangdb(Hinh2.13),a donhUng iaotac
m6iduqcthemvaonhUngkhach angC2vaC3.Giiisirr~ngdOh6trq1anhuMall,
haidaydirli~usauvaITanenph6bi~nsaukhi
CSDLduqcc~pnh~tk~tirkhichungco00h6trqd~yduoHayquansatdayd~u.
Daynaykh6ngph6bi~ntrongDB khiminsuppkh6ngth6a( nochixu~thi~nd6i
v6ikhachhangcu6icung). V6iCSDLgiatang,daynayITathanhph6bi~nkhino
xu~thi~ntrongnhUngdaycuanhUngkhachhangC3vaC4.Daycoth~
duqcphathi~nd6iv6ikhachhangC1,C2vaC3trongCSDLband~u.Theoph~n
gi6ithi~uCSDLgiatang,dayph6bi~nm6iduqckhamphabai
vi nohqpv6i nhUnggiaotacC1va C2.Hanth~nira,nhUnggiaotacmmduqc
khampha:va,<(5060)
(80)>Iadayph6bi~ntrongdbvakhiduy~tDB chungtatimth~yr~ngnhUngday
ph6bi~ntrongLDB Iadayditru6c.
Vi du2.5:
(DB) (db)
J
Hlnh2.14:CSDLband~uDBvaCSDLgiatangv&inhlPnggiaotacmaiva
., nhCPngkhachhangm&idb.
Ma KH Mt hang
C1 1020 20 5070
C2 1020 30 40
C3 1020 40 30
C4 60 90
C5
Mt hang
506070 80100
5060 8090
1040 7080
37
. Baygiachungtakhfwsatbaitoankhinhfmgkhachhangm6'ivanhfmg iao
tacm6'iduQ'cthemVaGCSDLband~u(Hinh2.14).V6'idQh6trQ'minsuppchicon
50%,dodovi~ckhaosatxemnhuIaph6bi~nchomQtdayphaiduQ'ckhaosatit
nhftt3 khachhangkhikhachhangm6'iC5duQ'cthemVaG.Theodi~uki~nnay,t~p
cacdayph6bi€n duQ'cduaVaGCSDLITathanhLDB={}khinhfmgday
va.chixuftthi~nd6iv6'ikhachhangC2vaC3.
Tuynhien,day<(1020» Iaph6bi€n khinoxuftthi~ntrongnhUngdaycuakhach
hangCl, C2,C3.Theogi6'ithi~u,CSDLgiatang,t~pcacdayph6bi€n trongCSDL
duQ'c~pnh~tIaLU= {,,,,
}.Hayquailsatkyday.Daynaycoth~duQ'cphathi~nd6i
v6'ikhachhangCI trongCSDL band~unhungnokhongIa dayph6bi€n. Iuy
nhien,khim\lC50trathanhph6bi€n v6'iCSDL giatangthidaynayclingphilhgp
v6'inhfmggiaotaccuaC2vaC3.MQtcachtuangtl,l',dayITathanh
ph6bi€n v6'isvgiatang,khinoxuftthi~ntrongC1,C3vakhachhangm6'iC5.
Theo[5],thu~tgiaithvchi~ntrencaet~pdfrli~um~usau:
Bang 2.1:Cac t~pdCPIi~um~u
Vi~cdanhgiahi~usufttISE phaithvchi~nv6'inhfmgthVCnghi~msov6'i
nhfmgI~nthvchi~ncuaGSPtrennhUngcaid~tdfrli~ukhacv6'idQh6trQ't6ithi~u
thayd6i.NhUngthvcnghi~mtrenhinh2.15duQ'Ckhaosattrenhaidfrli~uC9-14-
N2K-DI00KvaC20-14-N2K-D800KduQ'caid~tv6'i10%va5%khachhangda
duQ'cb~sungtuanglmg.Nh~thftyISErftthi~uquangaycakhinhfmgkhach ang
duQ'cb6sungvag~n hunhanhbanGSPgftpdoL
Ten tp dii'liu [C] [IJ N [DJ
s5giaotactb kichca tbtp mi;ithang
,
s5KHsomi;ithang
C9-14-N2K-DlOOK 9 4 2,000 100,000
C20-14-N2K-D800K20 4 2,000 800,000
1800
1600- 1400
~ 1200c
.~ 1000
.~ 800
~ 600
I- 400
200
0
18000
16000
14000
~12000
c
.~10000OJ
:;: 8000I-
6000
4000
2000
0
38
..----.-.-------.--
C9-14-N2K-D100K
00 ~ 00 ~ ~ ~ ~ M N ~ ~
a ~ ~ ~ ~ ~ ~ ~ ~ ~ a
a a a <min~PJ1a a a
C20-14-N2K-D800K
I : GPSI!---ISE
or ~n.,~r-v~~ ~ ~~ ~~ ~ ~ minsupp~. ~. \)' \). \).~. ~. \)'9 \)~
Hlnh2.15:Tho; gianth\fcthikhithem100/0va5%KHvao CSDLband§u
2.3.3 IUS- DUS
Thomasvacaec6ngS\1',nam 199ti1,dftutiend~nghimOthu~tghii,duqcgQila
ULI (UpdateLargeItemset),dvatrent~pranhgi6iam.Tuynhien,thu~tgi:iiULI
dakh6ngtinhd~nS\1'c~pnh~tgiatangcaem~utuk 1\1'.
39
SlJ khacnhaugiftaIUS vaULI dirqcmotftnhusailday.Thitnhcit,thu~tgifti
~~~~~~~~~lli~~~~~~~~
mftuk€t hQ'P.Thuhai,haithu~tgiftisird\mgdayranhgi6iamchom\lCdichkhac
nhau.TheoSlJthayd6icuat~pranhgi6iamtrongCSDLbandAti,thu~tgiftiULI
quy€tdinhnghienCUlltoanbQCSDLhaykhong.Thu~tgiftiIUS lamgiamthaigian
nghienCUllCSDLbandAtib~ngcachgiftl~icacdayranhgi6iam.Ranth~nfta,khi
CSDLbandAtibi xoab6,mQtvaidaytrongranhgi6iamsetrathanhdayph6bi~n
trongCSDL c~pnh~t.Do doDDSsird\lngcacdayranhgi6iamkhicacimgvien
m6itrongCSDLbixoa.
SlJ khacnhaugiftathu~tgiftiIUS vathu~tgiftiISM (IncrementalSequence
Mining)duqcgiaithichnhusail:D€ ki€m tradungluqngbQnh6maraOOgi6iam
tieuphi,thu~tgiaiIUS xacdiOOngu5'ngt6ithi€uchoranhgi6iam:Min-nbd_supp.
Chin€u dQhe;trqcuadaytrongCSDLIaIanbanMin- nbd_suppthidaynaysetra
thanhmQt day trong ranh gi6i am. B~g eachsira d6i kich thu6ccua
Min_nbd_supp,chungtacoth~la~ib6mQtvaidayd~ti€t ki~mbQ006.Nhung
thu~tgiaiISM segiftl~itfitcacacdayranhgi6iam,nghialanokhongth€ ki€m
scatduqcSlJhacphibQnh6cuacacdayranhgi6iam.
SlJ khcicnhaugifta thu~tgiaiIUS vathu~tgiaiISE d:uqcgiftithichla:Dt1u
lien,thu~tgiaiIUS marQngcaphfu1dAtilftnphAnduoicuadayph6bi~ntrang
CSDLbandAtid€ t~oracacimgvienm6itrangCSDLc~pOO~t,trangkhithu~tgifti
ISE chikhaasatSlJmarQngphk duoicuadayph6bi€n trangCSDLbandAti.Thit
hai,thu~tgiaiIUS lqid\lllgcacdayranhgiaiam,trangkhithu~tgiaiISEthikhong.
2.3.3.1IUS (IncrementallyUpdatingSequences):
a) Motathuatgiai:
;~
MQtthu~tgiftihi~uquft,duqcgQila IUS [9],d~tinhdayph6bi~nkhidftli~u
maiduqcthen/vaGCSDL bandAti.Thu~tgiftiIUS lamt6ithi€uhoat6nphitinh
taanb~ngeachsird\lngl~idayranhgi6iamvadayph6bi€ntrongCSDLbandAti.
40
Thu~tgiaiIUS, coth~duQ'cphanchia.thanh2 ph~n.Ph~nthunh~tsird\mg
cacdayranhgi&iamvaph6bi~n.trongDB vadbnhula cacUngviend~tinhcac
dayraubgi&iamvacacdayph6bi~nm&itrongCSDL c~pnh~tU. Ph~ thuhai,
IUS k~thqpcacdayph6bi~ntrongLDBmanolaph6bi~ntrongCSDL c~pnh~tU
vakhongduQ'chuatrongLdbvmcacdayph6bi~n!tongLdbmanolaph6bi~n
trongU vakhongduQ'chuatrongLDB,nghlala,LDBXLdbvaLdbxLDB,d~t~oracac
Ungvienm&itrongCSDL c~pnh~tU, vatinhduQ'cdQh6trQ'cilacacUngvienmm
trongCSDLc~pnh~tU (Thu~tgiaiRobust_search).
DB
Db
Bang 2.2:Ghichuvacaedinhnghia
CSDL band~uchuaduli~uclllienqUailt&ithaigian.
dd
U
CSDLgiatangchuaduli~um&ilienquailt&ithaigian.
CSDLgiamtuDB chuaduli~ubixoalienquailtmthaigian.
Support(F,X)
CSDL c~pnh~t.Khi CSDL duQ'c~pnh~tang,toanbQt~p
du li~ubfuIgv&iDB +db.Khi CSDL duQ'c~pnh~tgiam,
toanbQt~pduli~ubfuIgv&iDB - dd.
DQh6trQ'dayX trongCSDL,v&iX E {db,dd,DB,U}.
Min_supp NguanghotrQ'toithieuciladayphobien.
Min_nbd_suppI NguanghotrQ'toithieuciladayraubgimam.
ex I CacdayUngvientrongCSDLX, v&iX E {db,dd,DB,U}.
NBD(X)
Caedayph6bi~ntrongCSDLX, v&iX E {db,dd,DB,U}.
=ex - Lx, v&iNBD(X) baag6mcacdaytrongCSDLx la
cac t~pcon sub_sets
LX
Thu~tgiaiRobust_searchth\Ichi~ntinhdQh6trQ'cilacacdaytrongmQt
daycacS\Iki~nbaadQng,nh~tla,coth~duy~tdQh6trQ'cilacacdayill dayS\Iki~n- it
baadQngchuatrongduli~ut~pnhieu.NhungdochungtanghiencUuchinhlabai
toanc~pnh~Ucacmftutu~ t1,I,thichungtachiqUailtamd~ndi~uki~nladaybaa
dQngkhongchuaduli~ut~pnhi~u.
41
Thu~tgiaik~thgpcacdaytrongLDBv6i cacdaytrongLdbd~coduQ'cranh
gi6'ifunm6'iNDB(U) trongCSDL c~pnh~tU. Thu~tgiaiIUS sechocacdayph6
bi~ntrongCSDL c~pnh~tU, bingcachchftpnh~ncacdaytrongLDBvaLdbnhula
cac(mgvientrongCSDL c~pnh~tU. Thu~tgiaiIUS coth~t~oracac(mgvienmai
b::. , h d;:..h' tu2d- h- 1, LDB Ldb , Ldb LDBangcac traG01t u. ay,ng la a, x va x .
Thu~tgiai xacdinhngu611gcuaranhgiai am:Min-nbd- Stipp.Bing cach
chinhsuagiatri cuaMin-nbd- Stipp,chungtacoth~ki~mtras6cacdayranhgi6'i
amd~gifttrongb(>nh6'.Do do,chungtacoth~ti~tki~mngu6ncuamaytinhbing
cachd~tchodunggiatrichoMin_nbd_supp.
b)Viduminhboa:
Vi d\lnayseminhhQati~ntrinhcuathu~tgiaiIUS. CSDL banddulaDB,
CSDLgiatangladb,vaCSDLc~pnh~tlaU.
1. DB
\:::tang I: 1:1: I~ I: I: I~ I~ I~ 1
Min_count =2, s6d~mt6ithi~ucuadayd~trathanhdayph6bi~ntrongDB.
LIDB ={,,,}
L2DB= {,,}
L3DB= {}+
NBDI(DB)=,,,,}
NBD2(DB)={,,}
NBD3(DB)= {}
2. db
1:::Mng I~ I: I~ I~ I: I: I~ I: I~ I
Min_count=2,s6dsmt6ithi~ucuadayd~trathanhdayph6bi~ntrongdb.
Lldb ={,,}
L2db= {,}
NBDI(db)=,,,,}
NBD2(db)= {,}
42-.---------------
LJ db={}
3. U=DB+db
l~::hang I~ I: I: I: I; I~ I~ I: I~ I
Min_count=4,s5d~mt5ithi~ucuadayd~trathanhdayph6bi~ntrongU.
Ll u={,,,}
L2u={,}
NBD1(U)={,,,,}
NBD2(U)={,,,,}
NBDJ(U) ={}
2.3.3.2DUS (DecreasinglyUpdatingSequences)
a)M6 1<1thuatgiai:
Trongh~uh~tcactinhhu5ng,du li~uxu£thi~nr£tlauseconhUnganhhuang
d~nk~tquacuavi~ckhaikhoangdaytu~ntv. £>~baadamk~tquacuakhaikhoang
dfrli~ucoth~phananhtheethaigianthvct~,mQtvaidfrli~uCllc~nduQ'cxoaill
CSDL band~u.Nhu th~chungt6i d~nghimQtthu~tgiai,duQ'cgQila DUS [9]
(DecreasinglyUpdatingSequences),lienquaild~nbaitoannay.Thu~tgiaiDUS
chQnIvamQtdayill cacdayranhgi6'iamvacacdayph6bi~ntrongCSDLband~u
khicaclmgvientrongCSDL c~pnh~tcoduQ'cacdayph6bi~nvacacdayranh
gi6'iamtrongCSDLc~pnh~t.
£>inhly I chodi~uki~nc~ d~b£tkYdaynitotrongCSDLband~utrathanh
IDQtdayph6bi~ntrongCSDLc~pnh~t.
DinhtV1:BiiitDB litCSDLband~u,dblitCSDL giatang,thiCSDL c~pnh~tU =
DB+db.Chos~nseqrnlitdayb£tkYtrongCSDLband~uDB, di~uc~ d~dayseqrn
laIDQtdayph6bi~ntrongCSDLU lit: ~
support(seqrn,DB)~Min_freq,
43
. IDBI-Idblv6i Min freq =Min Stipp x .
- - IDBI
Chifngminh:
f)~tD =occur(seqrn,DB), d =occur(seqrn,db),thioccur(seqrn,U)=D -d.
N€u dayseqrnla dayph6bi€n trongCSDL DB, thichungtaco:
\DBI-Idbl
support(seqrn,DB)= \DB! ~Min_supp
Tir congthuctIeDvadinhnghiacuasupport(seqrn,DB),chungtaco:
support(seqrn,DB) =~ ~Min Stippx ~ x IDBI-Idbl
IDB I - D - d IDB\'
Tirdinhnghia,d~0,co~ ~1.Dodo,chungtaco:
D -d
. IDBI-Idb\
support(seqrn,DB)~Mm_freqx IDB! 'nghiala,
support(seqrn,DB)~Min_freq.
Chtrngminhxong.
NhUngdayv6idQh6trQ'trongCSDLbandftula100hO1lngu5TIgph6bi€n t6i
thi€u:Min_freq,coth€ tr&thanhdayph6bi€n trongCSDLc~pnh~tU.Nhuth€ n€u
Min-nbd- Stipp:::;Min- freq,thithu~tgiaiDUSchidungchocacdaytrongTanhgim
amva cacdayph6bi€n co de>h6trQ'la 100hO1lMin- freqnhukhi cactrngVieD
trongU. N€u Min_nbd_supp>Min_freq,thithu~tgiaiDUS sird\lIlgtI1Jcti€p cac
daytrongn cacdayTanhgi6ifunvacacdayph6bi~nnhukhicactrngVieDtrong
CSDL U, d€ tlnhcacdayTanhgi6ifunvacacdayph6bi€n mmtrongCSDLc~p
nh~tU.
Toml~i,thu~tgiaiDUS duQ'chiathanh2 trm'mghQ'Pdumday:
. ~ .N€u Min_nbd_supp :::;Min_freq, 'if seqmE NBD(DB) l) LDB va
Support(Seqrn,DB) ~Min+freq,thU?tgiai tiOOsupport(seqrn,U) dequyet
dinh rfuJgdayseqrncoph1,lthuQcNBD(U) hayLu(U =DB - db).
44
. N~uMin_nbd_supp>Min_freq,V seqmE NBD(DB)u LDB,thu~tgicii
tinhsupport(seqrn,U) d€ quy~tdinhdng dayseqrncophl,lthuQcNBD(U)
hayLu(U=DB - db).
Tronglmgdl,lng,chungtacoth€ dgdminsgphanb6cuangu5TIgphi>bi~nt6i
thi€u:Min_freq.Khi chungtachQngiatri choMin_nbd_supp,t6thanIii chungta
chQngiatfi nayb~ngv6iMin_freq.N~uMin_nbd_suppnh6hanMin_freq,chung
tabi~tr~ngTanhgi6iamNBD(U)coth€ chuanhi~udaymachungkhongth€ tra
thanhmQtdayphi>bi~ntrongCSDL c~pnh~t.M~ckhac,n~uMin_nbd_supp100
banMin- freq,theoDinhIy 1,mQtvaidaysebi lo~ib6trongNBD(U),m~cdumQt
trongnhfmgs6daydosetrathanhdayphi>bi~ntrongCSDLc~pnh~t.
2.3.4 PPCT
MQtthu~tgiaidungd€ phanlo~im~udgatrencaydimmdudu(1cphanhor;rch
PPCT (PartitionedPatternCountTree)[8]sirdl,lngtrongkhaikhoangdftli~ugia
tangd€ luutrfrCSDL. Thu~tgiaid~nd~nk~tquala lamgicimkhonggiannghien
cuu,thaigianphanlo~ivadQchinhxacphanlo~imas6m~uduQ"cluutrUtrong
cayPPC mytheodi~uki~nphanlo~im~u.
Vi du 2.6:T~pm~uduQ'chotheoBang2.3du6idayvanoduQ"cphanlo~i
trencayPPCnhuhinh2.16vahinh2.17:
Bang2.3:Cacm&uvi d~
."
Miu Mt himg
1 a b c x y z
2 abdxy.z
3 a e c x y u
4 f b c x y v
45
Hinh2.16:Cay PC
CayPC phanho~eh1 CayPCphanho~eh2
Hinh2.17:Cayphanho~chPC
2.4 NH~NXET
ChuangnayehoIDQts6nh~xetsau:
./ Quatrinhkhaikhoangdftli~uehinhlati€n trinhkhaiphatrithuethgehi~n
quacaebu6enhusau:
1.
Xeicdjnh
m\lcdich L::"
khai P
khoeing
CSDL
3.
Ti€n
L...J xu19 ~
r7f dit li~up
(bi~n
d6i)
2.
Phan1o~i
CSDLva
hinhthuc
khai
khoeing
4.
W
. 5.
W
6.
Chon . Khai Luutrit
thu~t"gilii khoeing mftu,ph6
ditli~u bien
7.
Ung
-?I dl,lngtrithuc
khai
khoeing
if
46
- Bu6'cti~nxu Iy CSDL anhhuemgnhi~ud~nhi~usufitcuathu~tgiai.Qua
trinhbi~nd6inayclingd.nth1,Ichi~ntheophuangthuc11,IcchQntbichhqpnhfttd€
coth€ d~tduqchi~uquacaD.
./ Vi~cnghienCUucacthu~tgiaitrenclingcftpm9tcainhint6ngquailv~ti~n
trinhkhaikhoangdfrli~utheohinh2.18nhugall:
Phanlo~i
CSDL
L1!achQnthu~tgicii
thichhQ'P
L1!achQnthu~tgicii
thichhQ'P
Khaikhming
dii'li~ugiatang
Hinh 2.18:SO'd6 tom~t ti~ntrinh khai khoangd~li~u.
- TUngthu~tgiaicom~tm~ m~ty~ukhacMali,chungconphl,lthu9cvao
d~ctinhcuaCSDLkhikhaikhoang.Vi~cphanlo~iCSDLmytheod~ctinhphan
b6cacm~utu~ntvph6bi~n,coth€ Iii "dayd~c"hay"thuathat"v6'ichi~udaicua
m~uph6bi~nkhacMali.N~utaxacdinhduqcd~ctinhband~ucuaCSDLthico
th€ 11,IachQnm9tthu~tgiaikhaikhoangthichhqp.
47
- vfind€ lUlltril'dfrli~utruacvasaukhikhaikhmingrfitquantrQng.Vi~cc~p
nh~tcacdfrli~uhfruichlavfind€ d.nphaikhaosatnhi€u,khimacaedfrli~ukhai
khoangkhilUlltril'coconthichhgphaykhong.
./ Bang2.4trinhbaydumdaytomtAtcacthu~tgiaiduQ'cnghienctruv~bai
toankhaikhoangdfrli~umalu~ vannayd~c~pd€n:
Bang2.4:BangtomtAtcaethu~tgiai
STT Tenthuat iai
1 MiningSequentialPatterns:
AprioriAll
2 MiningSequentialPatterns:
Generalizationsand
PerfonnanceImprovements:
GSP
ThePSP Approachfor
MiningSequentialPatterns
(SimilartoGSP)
IncrementalndInteractive
SequenceMining:SPADE
3
4
5 IncrementalMiningof
SequentialPatternsinLarge
Databases:ISE
EfficientlyMiningMaximal
FrequentI emsets:GenMax
6
Tacfiii -Nam
R.Srikant&R.
Agrawal-1995
R. Srikant& R.
Agrawal-1996
F.Masseglia,F.
Cathala,andP.
Poncelet1999.
S.Parthasarathy,M. J.
2aki,M. Ogihara,and
S.Dwakadas1999.
F.Masseglia- P.
Poncelet- M.
Teisseire2000
K. GoudaandM. J.
2aki 2001.
DataMining Conceptsand
I
1.Geng,Duke
Techniques University,Cs.
Department-2001.
7
8 TheAlgorithsofUpdating
Sequentialpatterns:IUS -
DUS
9 An efficientincremental
miningalgorithmfor
compactrealizationof
prototypes:PC-tree,PPC-
tree
Q.Zheng,K. Xu,S.
Ma,W.Ly_2002
P. Viswanath,M.)L
Murty- 2003
Dacdiem
Khai khoangdiI li~u
tIOO:thu~tgiailinnCC7sa
d~utien cho cacthu~t
giaisan
Khai khoangdiI li~u
tiOO:sir d\lIlgciiyhash
lUlltrUdiI li~u
Khai khoangdiI li~u
tiOO:sird\lIlgcayti~nt5
lUlltrUdiIli~u
Khai khoimggia tang:
sird\lIlgclan(lattice)luu
tmdiIli~u
Khai khoanggia tang:
trich m~udiI li~ugia
tang
Timt~pph5bient5id~
my theophfu1lo~ d~c
tiOOCSDL
T5ngqUailv~caekhai
ni~mva ky th~t khai
khoangdiIli~u
Khai khoanggia tang
tlOOd~nearanhgiOifun
ehovi~elUll trUvac~p
OO~tCSDL
Luu trUdiI li~utrenciiy
d~mm~udugc phfu1
ho'ilch:PPCT