RESEARCH ON AND APPLICATION OF SELECTED ALGORITHMS AND MODELS FOR DATA MINING
Đỗ Phúc
Title page
Table of contents
Introduction
Chapter 1: Frequent itemsets and association rules.
Chapter 2: Frequent repeated segments.
Chapter 3: Data clustering.
Chapter 4: Some applications.
Conclusion
The author's published works related to the thesis topic
References
Appendix
From the frequent itemset $\{i_1, i_2, i_3\}$, an association rule of the following form can be generated:
50% of customers who buy MANY of $\{i_1, i_2\}$ also buy MANY of $\{i_3\}$.
1.7.4. Finding association rules across fuzzy data mining contexts [7]
Let $FFS(O, I, R_F, \{\mu_i\}, \tau, minsupp)$ denote the collection of frequent itemsets of the fuzzy data mining context determined by the family of membership functions $\{\mu_i\}$, the context conversion threshold $\tau$, and the threshold minsupp. Given three families of membership functions $\{\mu_i^{MANY}\}$, $\{\mu_i^{AVER}\}$, $\{\mu_i^{FEW}\}$ for each item $i \in I$, three different fuzzy data mining contexts can be built. The frequent itemset algorithms presented in the preceding sections can then be used to find the collections:
• $FFS_1 = FFS(O, I, R_F, \{\mu_i^{MANY}\}, \tau, minsupp)$, for the MANY family
• $FFS_2 = FFS(O, I, R_F, \{\mu_i^{AVER}\}, \tau, minsupp)$, for the AVERAGE family
• $FFS_3 = FFS(O, I, R_F, \{\mu_i^{FEW}\}, \tau, minsupp)$, for the FEW family.
Find $SF_1 \in FFS_1$ and $SF_2 \in FFS_2$ such that $SF_{12} = SF_1 \cap SF_2 \neq \emptyset$, and decompose $SF_{12}$ into non-empty subsets $X$, $Y$ with $SF_{12} = X \cup Y$ and $X \cap Y = \emptyset$, in order to form an association rule $X \to Y$ between different contexts. If the confidence of this rule exceeds the threshold minconf, association rules of the following form are obtained:
• 56% of customers who buy MANY of item X will buy FEW of item Y.
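The decomposition step of Section 1.7.4 can be sketched in Python. This is only an illustration under stated assumptions: `split_rules`, `sf1`, and `sf2` are hypothetical names, the two itemsets are made-up data rather than results from the thesis, and the confidence test against minconf is left out.

```python
from itertools import combinations

def split_rules(itemset):
    """Yield every ordered pair (X, Y) with X, Y non-empty,
    X | Y == itemset and X & Y == empty set."""
    items = sorted(itemset)
    whole = frozenset(items)
    for size in range(1, len(items)):
        for left in combinations(items, size):
            x = frozenset(left)
            yield x, whole - x

# Hypothetical frequent itemsets taken from two different fuzzy contexts.
sf1 = {"i1", "i2", "i3"}
sf2 = {"i2", "i3", "i4"}

sf12 = sf1 & sf2                      # shared items: {"i2", "i3"}
candidates = list(split_rules(sf12))  # candidate rules X -> Y
```

Each candidate pair would still have to pass the minconf test before being reported as a cross-context rule.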
1.8. USING ASSOCIATION RULES TO CLASSIFY DATA AND EXTENDING THE ATTRIBUTE DEPENDENCY COEFFICIENT IN ROUGH SET THEORY [9]
1.8.1. Basic concepts
Definition 1.22. Binary decision table
Consider a data mining context $(O, D, R)$, where $O$ is a non-empty set of objects and $D$ is a non-empty set of indicators (binary attributes). Let $H$ and $C$ be non-empty subsets of $D$ such that $D = H \cup C$ and $H \cap C = \emptyset$. The triple $(O, D = H \cup C, R)$ is called a binary decision table.
Table 1.11 is an example of a binary decision table with $H = \{d_1, d_2, d_3, d_4, d_5\}$ and $C = \{c_1, c_2\}$. Attribute $c_1$ identifies the negative class; attribute $c_2$ identifies the positive class.
Definition 1.23. Classification rule on a binary decision table
Given a binary decision table $(O, D = H \cup C, R)$, let $S$ be a non-empty subset of $H$. A classification rule on the binary decision table has the form $S \to \{c\}$ with $c \in C$. The classification function $f$ generated from a classification rule has the form $f = \bigwedge_{d \in H'} d$ with $H' \subset H$.
Example 1.8. Some classification rules on the binary decision table of Table 1.11:
R1: {d3, d4} → {c2}; R2: {d2, d5} → {c1}; R3: {d5} → {c1}
The corresponding classification functions are $f_1 = d_3 \wedge d_4$, $f_2 = d_2 \wedge d_5$, and $f_3 = d_5$. An object $o$ satisfies the classification function $f$ if $o$ contains all the indicators that appear in $H'$.
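A classification function is just a conjunction of indicators, which the following Python sketch makes concrete. The dictionary `TABLE_1_11` encodes Table 1.11 as sets of indicators per object; the names `classifier` and `satisfying` are hypothetical helpers introduced only for this illustration.

```python
# Table 1.11: each object is mapped to the set of indicators it contains.
TABLE_1_11 = {
    "o1": {"d1", "d4", "c1"}, "o2": {"d2", "d4", "c2"},
    "o3": {"d3", "d4", "c2"}, "o4": {"d1", "d5", "c1"},
    "o5": {"d2", "d5", "c1"}, "o6": {"d3", "d5", "c2"},
    "o7": {"d2", "d5", "c1"}, "o8": {"d3", "d4", "c2"},
}

def classifier(indicators):
    """Return f as the conjunction of the given indicators (the set H')."""
    required = frozenset(indicators)
    return lambda object_indicators: required <= object_indicators

def satisfying(f):
    """List the objects of Table 1.11 that satisfy f."""
    return sorted(o for o, ind in TABLE_1_11.items() if f(ind))

f1 = classifier({"d3", "d4"})  # from R1: {d3, d4} -> {c2}
f2 = classifier({"d2", "d5"})  # from R2: {d2, d5} -> {c1}
f3 = classifier({"d5"})        # from R3: {d5} -> {c1}
```

Calling `satisfying(f3)` also returns `o6`, which belongs to class c2; this is why R3 is not a 100% accurate rule.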
1.8.2. Accuracy of a classification function
Consider a binary decision table $(O, D = H \cup C, R)$ whose objects are assigned to two classes. Let $O^+$ be the set of objects of $O$ that belong to class $c_2$, and $O^-$ the set of objects of $O$ that belong to class $c_1$. For a classification function $f$, the following criteria can be used to determine its accuracy [24], [38], [48].
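Let
\[ TP = \{o \in O^+ \mid f(o) \text{ is true}\}, \qquad FP = \{o \in O^+ \mid f(o) \text{ is false}\} \]
\[ TN = \{o \in O^- \mid f(o) \text{ is true}\}, \qquad FN = \{o \in O^- \mid f(o) \text{ is false}\} \]
The accuracy for class $c_1$ is computed by the formula:
\[ \frac{|TN|}{|TP| + |TN|} \tag{1.5} \]
The accuracy for class $c_2$ is computed by the formula:
\[ \frac{|TP|}{|TP| + |TN|} \tag{1.6} \]
Table 1.11: An example of a binary decision table
      d1  d2  d3  d4  d5  c1  c2
o1     1   0   0   1   0   1   0
o2     0   1   0   1   0   0   1
o3     0   0   1   1   0   0   1
o4     1   0   0   0   1   1   0
o5     0   1   0   0   1   1   0
o6     0   0   1   0   1   0   1
o7     0   1   0   0   1   1   0
o8     0   0   1   1   0   0   1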
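Example 1.9. For the binary decision table in Table 1.11:
• Consider the classification rule for $c_1$: {d2, d5} → {c1}, with $f = d_2 \wedge d_5$.
$O^+ = \{o_2, o_3, o_6, o_8\}$ corresponds to $c_2$; $O^- = \{o_1, o_4, o_5, o_7\}$ corresponds to $c_1$.
$TP = \{o \in O^+ \mid f(o) \text{ is true}\} = \emptyset$
$FP = \{o \in O^+ \mid f(o) \text{ is false}\} = \{o_2, o_3, o_6, o_8\}$
$TN = \{o \in O^- \mid f(o) \text{ is true}\} = \{o_5, o_7\}$
$FN = \{o \in O^- \mid f(o) \text{ is false}\} = \{o_1, o_4\}$
Accuracy for class $c_1$:
\[ \frac{|TN|}{|TP| + |TN|} = \frac{|\{o_5, o_7\}|}{|\emptyset| + |\{o_5, o_7\}|} = 1.0 \]
• Consider the classification rule for $c_2$: {d3, d4} → {c2}, with $f = d_3 \wedge d_4$.
$O^+ = \{o_2, o_3, o_6, o_8\}$ corresponds to $c_2$; $O^- = \{o_1, o_4, o_5, o_7\}$ corresponds to $c_1$.
$TP = \{o \in O^+ \mid f(o) \text{ is true}\} = \{o_3, o_8\}$
$FP = \{o \in O^+ \mid f(o) \text{ is false}\} = \{o_2, o_6\}$
$TN = \{o \in O^- \mid f(o) \text{ is true}\} = \emptyset$
$FN = \{o \in O^- \mid f(o) \text{ is false}\} = \{o_1, o_4, o_5, o_7\}$
Accuracy for class $c_2$:
\[ \frac{|TP|}{|TP| + |TN|} = \frac{|\{o_3, o_8\}|}{|\{o_3, o_8\}| + |\emptyset|} = 1.0 \]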
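The computation in Example 1.9 can be reproduced with a few lines of Python. The sketch follows the thesis's own definitions of TP, FP, TN, and FN (which differ from the usual confusion-matrix naming); `confusion`, `accuracy_c1`, and `accuracy_c2` are hypothetical helper names, and the accuracy functions assume that f is satisfied by at least one object.

```python
# Table 1.11: each object is mapped to the set of indicators it contains.
TABLE_1_11 = {
    "o1": {"d1", "d4", "c1"}, "o2": {"d2", "d4", "c2"},
    "o3": {"d3", "d4", "c2"}, "o4": {"d1", "d5", "c1"},
    "o5": {"d2", "d5", "c1"}, "o6": {"d3", "d5", "c2"},
    "o7": {"d2", "d5", "c1"}, "o8": {"d3", "d4", "c2"},
}

def confusion(required, table, positive="c2"):
    """Split the objects into TP, FP, TN, FN as defined in Section 1.8.2.
    `required` is the set H' whose conjunction forms f."""
    tp, fp, tn, fn = set(), set(), set(), set()
    for obj, indicators in table.items():
        holds = required <= indicators          # does f(obj) hold?
        if positive in indicators:              # obj is in O+ (class c2)
            (tp if holds else fp).add(obj)
        else:                                   # obj is in O- (class c1)
            (tn if holds else fn).add(obj)
    return tp, fp, tn, fn

def accuracy_c1(tp, tn):
    return len(tn) / (len(tp) + len(tn))        # formula (1.5)

def accuracy_c2(tp, tn):
    return len(tp) / (len(tp) + len(tn))        # formula (1.6)

tp1, fp1, tn1, fn1 = confusion({"d2", "d5"}, TABLE_1_11)
tp2, fp2, tn2, fn2 = confusion({"d3", "d4"}, TABLE_1_11)
```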
1.8.3. Using association rules as data classification rules
Given a decision table $(O, D = H \cup C, R)$ and thresholds minsupp and minconf, find the association rules of the form $r: S \to \{c\}$ with $c \in C$ and $S \subset H$. These association rules can be used as data classification rules. By definition, the confidence of the association rule $r: S \to \{c\}$ is
\[ CF(r) = \frac{|\rho(S) \cap \rho(\{c\})|}{|\rho(S)|} \]
where $\rho(S)$ is the set of objects that contain the attributes in $S$ and $\rho(\{c\})$ is the set of objects in class $c$. Hence $\rho(S) \cap \rho(\{c\})$ is the set of objects that belong to class $c$ and contain the attributes in $S$. If $c$ is class $c_2$, then $|\rho(S) \cap \rho(\{c_2\})| = |TP|$ and $\rho(S) = TP \cup TN$, so $|\rho(S)| = |TP| + |TN|$ because $TP \cap TN = \emptyset$. In other words:
\[ CF(S \to \{c_1\}) = \frac{|TN|}{|TP| + |TN|} \tag{1.7} \]
\[ CF(S \to \{c_2\}) = \frac{|TP|}{|TP| + |TN|} \tag{1.8} \]
Remark: The confidence of an association rule can be used to evaluate the accuracy of the corresponding classification function.
Example 1.10. For the binary decision table in Table 1.11, with minimum support threshold minsupp = 0.2 and minimum confidence threshold minconf = 0.7, the following association rules are obtained:
r1: {d1} → {c1}; SP = 0.25, CF = 1.00
r2: {d3} → {c2}; SP = 0.38, CF = 1.00
r3: {d4} → {c2}; SP = 0.38, CF = 0.75
r4: {d5} → {c1}; SP = 0.38, CF = 0.75
r5: {d2, d5} → {c1}; SP = 0.25, CF = 1.00
r6: {d3, d4} → {c2}; SP = 0.25, CF = 1.00
Among these, the classification rules that are 100% correct are r1, r2, r5, and r6.
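The rule list of Example 1.10 can be checked by brute force. The sketch below enumerates every non-empty $S \subset H$ on Table 1.11 and keeps the rules $S \to \{c\}$ that reach minsupp = 0.2 and minconf = 0.7 (the thresholds as read from the example). It is an exhaustive illustration, not one of the frequent itemset algorithms developed in the thesis; `rho` and `classification_rules` are hypothetical names.

```python
from itertools import combinations

TABLE_1_11 = {
    "o1": {"d1", "d4", "c1"}, "o2": {"d2", "d4", "c2"},
    "o3": {"d3", "d4", "c2"}, "o4": {"d1", "d5", "c1"},
    "o5": {"d2", "d5", "c1"}, "o6": {"d3", "d5", "c2"},
    "o7": {"d2", "d5", "c1"}, "o8": {"d3", "d4", "c2"},
}
H = ["d1", "d2", "d3", "d4", "d5"]   # condition indicators
C = ["c1", "c2"]                     # class indicators

def rho(indicators, table):
    """Objects that contain every indicator in `indicators`."""
    need = set(indicators)
    return {o for o, ind in table.items() if need <= ind}

def classification_rules(table, minsupp, minconf):
    """Return (S, c, SP, CF) for every rule S -> {c} above both thresholds."""
    n = len(table)
    rules = []
    for size in range(1, len(H) + 1):
        for s in combinations(H, size):
            covered = rho(s, table)
            if not covered:
                continue                     # S never occurs: no rule
            for c in C:
                both = covered & rho([c], table)
                sp = len(both) / n           # support of S -> {c}
                cf = len(both) / len(covered)  # confidence of S -> {c}
                if sp >= minsupp and cf >= minconf:
                    rules.append((s, c, sp, cf))
    return rules

rules = classification_rules(TABLE_1_11, minsupp=0.2, minconf=0.7)
exact = [(s, c) for s, c, sp, cf in rules if cf == 1.0]
```

The supports 3/8 = 0.375 appear rounded to 0.38 in the example.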
1.8.4. Using association rules to extend the attribute dependency coefficient in rough set theory
1.8.4.1. Basic concepts of rough set theory
This section uses the basic definitions of rough set theory as the foundation for building the extended attribute dependency coefficient [33], [79].
Definition 1.24: Information system
Let $O$ be a finite, non-empty set of objects and $A$ a finite, non-empty set of discrete attributes. Let $dom(a_i)$ be the value domain of attribute $a_i \in A$ and $V = \bigcup_{i=1}^{|A|} dom(a_i)$. The function $f_s: O \to A \times V$ determines the values of the objects for the attributes of $A$. An information system is the triple $(O, A, f_s)$.
Table 1.12: An example of an information system
O/A    a   b   c
o1     1   4   6
o2     2   4   7
o3     3   4   7
o4     1   5   6
o5     2   5   6
o6     3   5   7
o7     2   5   6
o8     3   4   7
Table 1.12 is an example of an information system with $O = \{o_1, o_2, o_3, o_4, o_5, o_6, o_7, o_8\}$ and $A = \{a, b, c\}$.
Given an information system $(O, A, f_s)$ and $B \subset A$, let $u(B)$ denote the values of the attributes in $B$ for object $u$. Each object $o \in O$ corresponds to a feature vector of attribute-value pairs ⟨a, v⟩ with $a \in A$ and $v = o(\{a\})$. Object $o_1$ in Table 1.12 corresponds to the feature vector ⟨⟨a,1⟩, ⟨b,4⟩, ⟨c,6⟩⟩.
Definition 1.25. Indiscernibility relation and partition of the object set
Given an information system $(O, A, f_s)$ and $B \subset A$, the indiscernibility relation $ind(B)$ on the object set $O$ is defined as follows:
\[ \forall B \subset A,\ \forall u, v \in O:\quad u \; ind(B) \; v \iff u(B) = v(B) \tag{1.9} \]
Two objects $u$ and $v$ are related by $ind(B)$ when they have the same value for every attribute in $B$, that is, $u(B) = v(B)$.
For $B \subset A$, it is easy to verify that $ind(B)$ is an equivalence relation. The relation $ind(B)$ therefore partitions the object set $O$ into equivalence classes. For $u \in O$, $[u]_{ind(B)}$ denotes the equivalence class of $u$ under $ind(B)$, and $O/B$ denotes the partition induced by $ind(B)$. Each element of the partition $O/B$ is called an elementary set or an equivalence class.
Example 1.11: For the data in Table 1.12 and $B = \{c\}$, the equivalence classes are:
• for ⟨c,6⟩:
$[o_1]_{ind(B)} = [o_4]_{ind(B)} = [o_5]_{ind(B)} = [o_7]_{ind(B)} = \{o_1, o_4, o_5, o_7\}$
• for ⟨c,7⟩:
$[o_2]_{ind(B)} = [o_3]_{ind(B)} = [o_6]_{ind(B)} = [o_8]_{ind(B)} = \{o_2, o_3, o_6, o_8\}$
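The partition $O/B$ of Example 1.11 can be computed by grouping objects on their values for the attributes in $B$. A minimal sketch, assuming Table 1.12 is stored as a dictionary of rows; `partition` is a hypothetical helper name.

```python
from collections import defaultdict

# Table 1.12: attribute values of each object.
TABLE_1_12 = {
    "o1": {"a": 1, "b": 4, "c": 6}, "o2": {"a": 2, "b": 4, "c": 7},
    "o3": {"a": 3, "b": 4, "c": 7}, "o4": {"a": 1, "b": 5, "c": 6},
    "o5": {"a": 2, "b": 5, "c": 6}, "o6": {"a": 3, "b": 5, "c": 7},
    "o7": {"a": 2, "b": 5, "c": 6}, "o8": {"a": 3, "b": 4, "c": 7},
}

def partition(table, attrs):
    """Group objects by their value vector on `attrs`.
    Returns {value tuple: equivalence class} for the relation ind(attrs)."""
    classes = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attrs)
        classes[key].add(obj)
    return dict(classes)

classes_c = partition(TABLE_1_12, ["c"])
classes_ab = partition(TABLE_1_12, ["a", "b"])
```

Two objects fall in the same class exactly when their value tuples are equal, which is the condition $u(B) = v(B)$ of (1.9).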
Definition 1.26: Decision table in rough set theory
Given an information system $(O, A, f_s)$, let $HR$ and $CR$ be non-empty subsets of $A$ such that $A = HR \cup CR$ and $HR \cap CR = \emptyset$. Then $(O, A = HR \cup CR, f_s)$ is called a decision table in rough set theory. The set $HR$ is called the set of condition attributes and $CR$ the set of decision attributes. Table 1.12 is an example of a decision table in rough set theory with $HR = \{a, b\}$ and $CR = \{c\}$.
Definition 1.27. Set approximation
Given an information system $(O, A, f_s)$, let $X$ be a non-empty subset of $O$, $X \subset O$, and $B$ a non-empty subset of $A$, $B \subset A$. To estimate the set of objects $X$ through the set of attributes $B$, Z. Pawlak uses the lower approximation of $X$ by $B$, denoted $B_*(X)$, and the upper approximation of $X$ by $B$, denoted $B^*(X)$ [79]. The lower and upper approximations are defined as follows:
\[ B_*(X) = \{u \in O \mid [u]_{ind(B)} \subset X\} \]
\[ B^*(X) = \{u \in O \mid [u]_{ind(B)} \cap X \neq \emptyset\} \tag{1.10} \]
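The two approximations in (1.10) can be sketched directly from the equivalence classes. In the Python illustration below, the target set `X` and the choice $B = \{a\}$ are made up for demonstration and do not come from the thesis; `partition`, `lower`, and `upper` are hypothetical helper names.

```python
from collections import defaultdict

TABLE_1_12 = {
    "o1": {"a": 1, "b": 4, "c": 6}, "o2": {"a": 2, "b": 4, "c": 7},
    "o3": {"a": 3, "b": 4, "c": 7}, "o4": {"a": 1, "b": 5, "c": 6},
    "o5": {"a": 2, "b": 5, "c": 6}, "o6": {"a": 3, "b": 5, "c": 7},
    "o7": {"a": 2, "b": 5, "c": 6}, "o8": {"a": 3, "b": 4, "c": 7},
}

def partition(table, attrs):
    """Equivalence classes of ind(attrs) as a list of sets."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())

def lower(table, attrs, x):
    """B_*(X): union of the classes entirely contained in X."""
    return set().union(*(cls for cls in partition(table, attrs) if cls <= x))

def upper(table, attrs, x):
    """B^*(X): union of the classes that intersect X."""
    return set().union(*(cls for cls in partition(table, attrs) if cls & x))

X = {"o1", "o4", "o5"}                   # made-up target set
low = lower(TABLE_1_12, ["a"], X)        # only the class {o1, o4} fits inside X
up = upper(TABLE_1_12, ["a"], X)         # the class {o2, o5, o7} also touches X
```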
Definition 1.28. Attribute dependency coefficient
Given two non-empty subsets $U$, $V$ of the attribute set $A$, the attribute dependency coefficient of $V$ on $U$ measures how strongly the attribute set $V$ depends on the attribute set $U$. It is defined as follows:
\[ \gamma(U, V) = \sum_{X \in O/V} \frac{|U_*(X)|}{|O|} \tag{1.11} \]
The dependency of $V$ on $U$ is denoted $U \to_k V$, with $k = \gamma(U, V)$. If $k = 1$, the attribute set $V$ depends totally on the attribute set $U$. If $0 < k < 1$, $V$ depends partially on $U$. If $k = 0$, $V$ does not depend on $U$ at all.
The attribute dependency coefficient $\gamma(U, V)$ is used to reflect the degree of dependency between two attribute sets [79].
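Example 1.12. For the information system in Table 1.12, let $U = \{a, b\}$ and $V = \{c\}$. Compute $\gamma(U, V)$.
a) For $U = \{a, b\}$ the equivalence classes are:
• {⟨a,1⟩, ⟨b,4⟩}: $U_1 = [o_1]_{ind(U)} = \{o_1\}$
• {⟨a,2⟩, ⟨b,4⟩}: $U_2 = [o_2]_{ind(U)} = \{o_2\}$
• {⟨a,3⟩, ⟨b,4⟩}: $U_3 = [o_3]_{ind(U)} = [o_8]_{ind(U)} = \{o_3, o_8\}$
• {⟨a,1⟩, ⟨b,5⟩}: $U_4 = [o_4]_{ind(U)} = \{o_4\}$
• {⟨a,2⟩, ⟨b,5⟩}: $U_5 = [o_5]_{ind(U)} = [o_7]_{ind(U)} = \{o_5, o_7\}$
• {⟨a,3⟩, ⟨b,5⟩}: $U_6 = [o_6]_{ind(U)} = \{o_6\}$
b) For $V = \{c\}$ the equivalence classes are:
• for ⟨c,6⟩:
$X_1 = [o_1]_{ind(V)} = [o_4]_{ind(V)} = [o_5]_{ind(V)} = [o_7]_{ind(V)} = \{o_1, o_4, o_5, o_7\}$
• for ⟨c,7⟩:
$X_2 = [o_2]_{ind(V)} = [o_3]_{ind(V)} = [o_6]_{ind(V)} = [o_8]_{ind(V)} = \{o_2, o_3, o_6, o_8\}$
To compute the attribute dependency coefficient of $V$ on $U$ with formula (1.11), $U_*(X)$ must be computed for each $X \in O/V$:
• For $X_1 = \{o_1, o_4, o_5, o_7\}$: $U_*(X_1) = \{o_1, o_4, o_5, o_7\}$
• For $X_2 = \{o_2, o_3, o_6, o_8\}$: $U_*(X_2) = \{o_2, o_3, o_6, o_8\}$
\[ \gamma(U, V) = \sum_{X \in O/V} \frac{|U_*(X)|}{|O|} = \frac{|U_*(X_1)| + |U_*(X_2)|}{8} = 1.0 \]
Hence the attribute dependency coefficient of $V$ on $U$ is 1.0, that is, $V$ depends totally on $U$.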
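Formula (1.11) translates into a short computation over the two partitions. The sketch below recomputes Example 1.12 on Table 1.12; `partition` and `gamma` are hypothetical helper names, and the size of each lower approximation is obtained by summing the sizes of the $U$-classes contained in $X$.

```python
from collections import defaultdict

TABLE_1_12 = {
    "o1": {"a": 1, "b": 4, "c": 6}, "o2": {"a": 2, "b": 4, "c": 7},
    "o3": {"a": 3, "b": 4, "c": 7}, "o4": {"a": 1, "b": 5, "c": 6},
    "o5": {"a": 2, "b": 5, "c": 6}, "o6": {"a": 3, "b": 5, "c": 7},
    "o7": {"a": 2, "b": 5, "c": 6}, "o8": {"a": 3, "b": 4, "c": 7},
}

def partition(table, attrs):
    """Equivalence classes of ind(attrs) as a list of sets."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())

def gamma(table, u_attrs, v_attrs):
    """Attribute dependency coefficient of V on U, formula (1.11)."""
    u_classes = partition(table, u_attrs)
    total = 0
    for x in partition(table, v_attrs):
        # |U_*(X)| is the total size of the U-classes contained in X.
        total += sum(len(cls) for cls in u_classes if cls <= x)
    return total / len(table)

g_ab = gamma(TABLE_1_12, ["a", "b"], ["c"])   # Example 1.12
g_b = gamma(TABLE_1_12, ["b"], ["c"])         # no b-class lies inside a c-class
```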
1.8.4.2. Extending the attribute dependency coefficient [9]
This section presents the theoretical basis for defining and computing the extended attribute dependency coefficient.
Definition 1.29. Inclusion degree function
Given an inclusion degree threshold $\theta \in [0, 1]$, let $\mu_c(S, T)$ be the function that reflects the degree to which $S$ is included in $T$. It is defined as follows:
\[ \mu_c(S, T) = \frac{|S \cap T|}{|S|} \tag{1.12} \]
If $\mu_c(S, T) \geq \theta$, the set $S$ is said to be included in $T$ with inclusion degree $\theta$. If $\theta = 1.0$, then $S \subset T$.
Definition 1.30. Extended lower approximation
Using the inclusion degree function, the extended lower approximation $B_{**}(X)$ in rough set theory can be defined as follows:
\[ B_{**}(X) = \{u \in O \mid \mu_c([u]_{ind(B)}, X) \geq \theta \ \wedge\ u \in X\} \tag{1.13} \]
Definition 1.31. Extended attribute dependency coefficient
The extended dependency coefficient is defined through the inclusion degree function. Given two attribute sets $U$ and $V$, the extended attribute dependency coefficient of $V$ on $U$ is denoted $\Psi(U, V)$ and defined as follows:
\[ \Psi(U, V) = \sum_{X \in O/V} \frac{|U_{**}(X)|}{|O|} \tag{1.14} \]
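Example 1.13 below illustrates the classification capability of the extended attribute dependency coefficient.
Example 1.13: Consider the decision table in Table 1.12 and let $U = \{b\}$ and $V = \{c\}$. Then:
• For $U = \{b\}$ the equivalence classes are:
$[o_1]_{ind(U)} = [o_2]_{ind(U)} = [o_3]_{ind(U)} = [o_8]_{ind(U)} = \{o_1, o_2, o_3, o_8\}$
$[o_4]_{ind(U)} = [o_5]_{ind(U)} = [o_6]_{ind(U)} = [o_7]_{ind(U)} = \{o_4, o_5, o_6, o_7\}$
• For $V = \{c\}$ the equivalence classes are:
$[o_1]_{ind(V)} = [o_4]_{ind(V)} = [o_5]_{ind(V)} = [o_7]_{ind(V)} = \{o_1, o_4, o_5, o_7\}$
$[o_2]_{ind(V)} = [o_3]_{ind(V)} = [o_6]_{ind(V)} = [o_8]_{ind(V)} = \{o_2, o_3, o_6, o_8\}$
With the traditional attribute dependency coefficient, $\gamma(U, V) = \sum_{X \in O/V} |U_*(X)| / |O| = 0$.
In rough set theory, $\gamma(U, V) = 0$ means that $V$ does not depend on $U$. Under the requirements of approximate classification, however, $V$ can still be inferred from $U$ through two classification rules:
⟨b,4⟩ → ⟨c,7⟩, classification accuracy = 0.75
⟨b,5⟩ → ⟨c,6⟩, classification accuracy = 0.75
Based on this observation, the thesis extends the notion of lower approximation of a rough set in order to define the extended attribute dependency coefficient $\Psi(U, V)$.
For the elementary sets of the partition $O/V$ and inclusion degree $\theta = 0.75$:
For $X_1 = \{o_1, o_4, o_5, o_7\}$: $U_{**}(X_1) = \{o_4, o_5, o_7\}$
For $X_2 = \{o_2, o_3, o_6, o_8\}$: $U_{**}(X_2) = \{o_2, o_3, o_8\}$
\[ \Psi(U, V) = \sum_{X \in O/V} \frac{|U_{**}(X)|}{|O|} = \frac{|\{o_4, o_5, o_7\}| + |\{o_2, o_3, o_8\}|}{|O|} = \frac{6}{8} = 0.75 \]
The extended attribute dependency coefficient therefore has better classification capability than the traditional attribute dependency coefficient, especially for approximate classification [9].
Remark: When the inclusion degree threshold is $\theta = 1.0$, $\Psi(U, V) = \gamma(U, V)$.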
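Definitions 1.29 to 1.31 can be checked numerically on Table 1.12. The following Python sketch computes $\Psi(U, V)$ by adding $|[u]_{ind(U)} \cap X|$ for every $U$-class whose inclusion degree in $X$ reaches $\theta$, which is the size of $U_{**}(X)$ in (1.13); `partition` and `psi` are hypothetical helper names.

```python
from collections import defaultdict

TABLE_1_12 = {
    "o1": {"a": 1, "b": 4, "c": 6}, "o2": {"a": 2, "b": 4, "c": 7},
    "o3": {"a": 3, "b": 4, "c": 7}, "o4": {"a": 1, "b": 5, "c": 6},
    "o5": {"a": 2, "b": 5, "c": 6}, "o6": {"a": 3, "b": 5, "c": 7},
    "o7": {"a": 2, "b": 5, "c": 6}, "o8": {"a": 3, "b": 4, "c": 7},
}

def partition(table, attrs):
    """Equivalence classes of ind(attrs) as a list of sets."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())

def psi(table, u_attrs, v_attrs, theta):
    """Extended attribute dependency coefficient, formula (1.14)."""
    u_classes = partition(table, u_attrs)
    total = 0
    for x in partition(table, v_attrs):
        for cls in u_classes:
            common = cls & x
            # Inclusion degree mu_c(cls, X) = |cls & X| / |cls|, formula (1.12).
            if len(common) / len(cls) >= theta:
                total += len(common)     # members of cls that are also in X
    return total / len(table)

psi_b = psi(TABLE_1_12, ["b"], ["c"], theta=0.75)        # Example 1.13
psi_b_strict = psi(TABLE_1_12, ["b"], ["c"], theta=1.0)  # reduces to gamma
```

With `theta=1.0` only classes fully contained in $X$ contribute, so the value coincides with the traditional coefficient, as the remark states.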
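1.8.4.3. Converting a decision table in rough set theory into a binary decision table
Given an information system $(O, A = HR \cup CR, f_s)$ with $V = \bigcup_{i=1}^{|A|} dom(a_i)$, let $D$ be the set of indicators $d = \langle a, v \rangle \in A \times V$ that satisfy the function $f_s$. From $(O, A = HR \cup CR, f_s)$, build the binary relation $R \subset O \times D$ such that $o \, R \, d \iff o(\{a\}) = v$ and $d = \langle a, v \rangle$.
Table 1.11 is the binary decision table converted from the traditional decision table (Table 1.12) with the following indicators $d$:
d1 = ⟨a,1⟩; d2 = ⟨a,2⟩; d3 = ⟨a,3⟩; d4 = ⟨b,4⟩; d5 = ⟨b,5⟩; c1 = ⟨c,6⟩; c2 = ⟨c,7⟩
Consider the function attributes, defined as follows:
\[ \forall S \subset D:\quad attributes(S) = \{a \in A \mid \langle a, v \rangle \in S\} \tag{1.15} \]
The function attributes returns the names of the attributes that occur in a subset $S$ of the indicators of $D$.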
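The conversion and the function attributes of (1.15) can be sketched as follows. Indicators are represented as (attribute, value) tuples; the mapping `NAMES` assumes the encoding d1 = ⟨a,1⟩, ..., c2 = ⟨c,7⟩, which is the one consistent with Tables 1.11 and 1.12. `to_binary` and `NAMES` are hypothetical names introduced for this illustration.

```python
TABLE_1_12 = {
    "o1": {"a": 1, "b": 4, "c": 6}, "o2": {"a": 2, "b": 4, "c": 7},
    "o3": {"a": 3, "b": 4, "c": 7}, "o4": {"a": 1, "b": 5, "c": 6},
    "o5": {"a": 2, "b": 5, "c": 6}, "o6": {"a": 3, "b": 5, "c": 7},
    "o7": {"a": 2, "b": 5, "c": 6}, "o8": {"a": 3, "b": 4, "c": 7},
}

# Assumed indicator names, chosen to reproduce the columns of Table 1.11.
NAMES = {
    ("a", 1): "d1", ("a", 2): "d2", ("a", 3): "d3",
    ("b", 4): "d4", ("b", 5): "d5",
    ("c", 6): "c1", ("c", 7): "c2",
}

def to_binary(table):
    """Relate each object o to the indicators <a, v> with o({a}) = v."""
    return {obj: set(row.items()) for obj, row in table.items()}

def attributes(indicators):
    """Formula (1.15): attribute names occurring in a set of indicators."""
    return {a for a, _ in indicators}

binary = to_binary(TABLE_1_12)
named = {obj: {NAMES[d] for d in ind} for obj, ind in binary.items()}
```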
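Property 1.6: For the pair of functions $(\rho, \lambda)$ defined above, let $U \subset A$, let $O/U$ be the partition of $O$ induced by the indiscernibility relation $ind(U)$, and let $U_1, U_2, \ldots, U_k$ be the elementary sets of the partition $O/U$. Then $\rho(\lambda(U_j)) = U_j$ for all $j = 1, \ldots, k$.
Example 1.14: For $U = \{a, b\}$, take the elementary set of the partition $O/U$ that corresponds to the equivalence class $U_5 = [o_5]_{ind(U)} = [o_7]_{ind(U)} = \{o_5, o_7\}$.
Under the encoding above, the two corresponding indicators are d2 = ⟨a,2⟩ and d5 = ⟨b,5⟩. Using the pair of functions $\rho$, $\lambda$ defined above:
$\lambda(\{o_5, o_7\}) = \{d_2, d_5, c_1\}$; $\rho(\lambda(\{o_5, o_7\})) = \rho(\{d_2, d_5, c_1\}) = \{o_5, o_7\} = U_5$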
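Property 1.6 can be checked on Table 1.11 with a small Python sketch. The definitions of $\rho$ and $\lambda$ lie outside this excerpt, so the code assumes the usual pair: `rho(S)` returns the objects containing every indicator in S, and `lam(X)` returns the indicators shared by every object in X. Both function names are hypothetical.

```python
TABLE_1_11 = {
    "o1": {"d1", "d4", "c1"}, "o2": {"d2", "d4", "c2"},
    "o3": {"d3", "d4", "c2"}, "o4": {"d1", "d5", "c1"},
    "o5": {"d2", "d5", "c1"}, "o6": {"d3", "d5", "c2"},
    "o7": {"d2", "d5", "c1"}, "o8": {"d3", "d4", "c2"},
}
ALL_INDICATORS = set().union(*TABLE_1_11.values())

def rho(indicators):
    """Assumed definition: objects that contain all the given indicators."""
    need = set(indicators)
    return {o for o, ind in TABLE_1_11.items() if need <= ind}

def lam(objects):
    """Assumed definition: indicators common to all the given objects."""
    objects = list(objects)
    if not objects:                      # by convention, every indicator
        return set(ALL_INDICATORS)
    return set.intersection(*(TABLE_1_11[o] for o in objects))

# Elementary sets of O/U for U = {a, b}, as listed in Example 1.12.
ELEMENTARY_SETS = [{"o1"}, {"o2"}, {"o3", "o8"}, {"o4"}, {"o5", "o7"}, {"o6"}]

u5 = {"o5", "o7"}
closure_ok = all(rho(lam(uj)) == uj for uj in ELEMENTARY_SETS)
```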
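1.8.4.4. Computing the extended attribute dependency coefficient from the confidence and support of association rules
Lemma 1.1: Let $S \subset D$ and $T \subset D$. The degree to which $\rho(S)$ is included in $\rho(T)$ is given by:
\[ \mu_c(\rho(S), \rho(T)) = \frac{|\rho(S) \cap \rho(T)|}{|\rho(S)|} = CF(S \to T) \tag{1.16} \]
Theorem 1.7 ([9]). Let $(O, A = HR \cup CR, f_s)$ be a decision table and $(O, D = H \cup C, R)$ the corresponding converted binary decision table. Let $U$ and $V$ be two subsets of $A$, let $U_j$ be the elementary sets of the partition $O/U$ and $X$ an elementary set of the partition $O/V$, and let $J$ be the set of indices such that $\forall j \in J$, $\mu_c(U_j, X) \geq \theta$. Then:
\[ \Psi(U, V) = \sum_{X \in O/V} \sum_{j \in J} \bigl( CF(\lambda(U_j) \to \lambda(X)) \cdot SP(\lambda(U_j)) \bigr) \tag{1.17} \]
Here $D$ is the set of indicators of the binary decision table $(O, D, R)$ converted from the decision table $(O, A, f_s)$.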
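Proof: Let $J$ be the set of indices such that $\forall j \in J$, $\mu_c(U_j, X) \geq \theta$, where $U_j$ is an elementary set of the partition $O/U$. Then $|U_{**}(X)|$ can be computed as:
\[ |U_{**}(X)| = \sum_{j \in J} |U_j \cap X| \]
Since $\lambda(U_j) \subset D$ and $\lambda(X) \subset D$, the support and confidence of the association rule $\lambda(U_j) \to \lambda(X)$ have already been computed, so
\[ CF(\lambda(U_j) \to \lambda(X)) = \frac{|\rho(\lambda(U_j)) \cap \rho(\lambda(X))|}{|\rho(\lambda(U_j))|} \]
By Property 1.6, because $U_j$ and $X$ are elementary sets of their partitions, $\rho(\lambda(U_j)) = U_j$ and $\rho(\lambda(X)) = X$. Therefore
\[ |\rho(\lambda(U_j)) \cap \rho(\lambda(X))| = |U_j \cap X| = CF(\lambda(U_j) \to \lambda(X)) \cdot |U_j| \]
In addition, the support of the set $\lambda(U_j)$ is $SP(\lambda(U_j)) = |\rho(\lambda(U_j))| / |O| = |U_j| / |O|$, so $|U_j| = SP(\lambda(U_j)) \cdot |O|$. In summary:
\[ |U_j \cap X| = CF(\lambda(U_j) \to \lambda(X)) \cdot SP(\lambda(U_j)) \cdot |O| \]
If $\lambda(U_j)$ is a frequent itemset and $\lambda(U_j) \to \lambda(X)$ is an association rule, the extended attribute dependency coefficient can be computed as follows:
\[ \Psi(U, V) = \sum_{X \in O/V} \sum_{j \in J} \bigl( CF(\lambda(U_j) \to \lambda(X)) \cdot SP(\lambda(U_j)) \bigr) \]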
1.8.4.5. Building an algorithm based on the extended attribute dependency coefficient
Given a decision table $(O, A = HR \cup CR, f_s)$ and a classification accuracy threshold minprecision $\in [0, 1]$, find the classification rules $S \to T$ with $S \subset HR$ and $T \subset CR$ such that the accuracy of the classification rule $S \to T$ is greater than or equal to minprecision. Let $(O, D = H \cup C, R)$ be the binary decision table converted from the decision table $(O, A = HR \cup CR, f_s)$, and let the thresholds minsupp, minconf, and minprecision be given. Let $FS(O, D = H \cup C, R, minsupp)$ be the collection of frequent itemsets of $(O, D = H \cup C, R)$, and let $R(O, D = H \cup C, R, minsupp, minconf)$ be the set of association rules that have the form of a classification rule $S \to T$ with $S \subset H$, $T \subset C$, and $D = H \cup C$.
Algorithm 1.11 below uses the extended attribute dependency coefficient to find data classification rules.
Algorithm 1.11: Finding classification rules based on the extended dependency coefficient
Input: Decision table $(O, A = HR \cup CR, f_s)$
Thresholds minsupp, minconf, minprecision
Output: The set of classification rules $S \to T$ such that $S \subset H$, $T \subset C$, $D = H \cup C$, with classification threshold minprecision.
Step 1: Convert the decision table $(O, A = HR \cup CR, f_s)$ into the binary decision table $(O, D = H \cup C, R)$.
Step 2: Compute $FS(O, D = H \cup C, R, minsupp)$ and $R(O, D = H \cup C, R, minsupp, minconf)$ with the algorithms for finding frequent itemsets and association rules.
Step 3: Partition the set $R(O, D = H \cup C, R, minsupp, minconf)$ into groups of classification rules $S \to T$ whose sets $S$ involve the same attributes and whose sets $T$ involve the same attributes. Let $\mathcal{G} = \{G_1, G_2, \ldots, G_k\}$ be the resulting rule groups.
Step 4: For each group, carry out the following steps:
    for each G in $\mathcal{G}$ do
        take a rule r in G, with r = S → T
        U = Attributes(S); V = Attributes(T)
        // compute Ψ(U, V)
        Psi = 0
        for each r: S → T with r in G do
            compute CF(S → T) and SP(S)    // using the association rule mining algorithm
            Psi = Psi + CF(S → T) * SP(S)
        end for    // r
        if Psi ≥ minprecision then
            write (U, V) to the result set KetQua
        end if
    end for    // G
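The four steps of Algorithm 1.11 can be sketched in Python as follows. This is a brute-force illustration under stated assumptions, not the thesis's implementation: rules are mined by enumerating all subsets of indicators instead of using the frequent itemset algorithms of the earlier sections, indicators are (attribute, value) tuples, and `nonempty_subsets` and `find_dependencies` are hypothetical names. The function returns the pairs (U, V) that pass minprecision together with their Ψ value.

```python
from collections import defaultdict
from itertools import combinations

TABLE_1_12 = {
    "o1": {"a": 1, "b": 4, "c": 6}, "o2": {"a": 2, "b": 4, "c": 7},
    "o3": {"a": 3, "b": 4, "c": 7}, "o4": {"a": 1, "b": 5, "c": 6},
    "o5": {"a": 2, "b": 5, "c": 6}, "o6": {"a": 3, "b": 5, "c": 7},
    "o7": {"a": 2, "b": 5, "c": 6}, "o8": {"a": 3, "b": 4, "c": 7},
}

def nonempty_subsets(items):
    items = sorted(items)
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)

def find_dependencies(table, condition, decision,
                      minsupp, minconf, minprecision):
    n = len(table)
    # Step 1: binary decision table; indicators are (attribute, value) pairs.
    binary = {o: set(row.items()) for o, row in table.items()}
    h = {d for ind in binary.values() for d in ind if d[0] in condition}
    c = {d for ind in binary.values() for d in ind if d[0] in decision}

    def rho(indicators):
        need = set(indicators)
        return {o for o, ind in binary.items() if need <= ind}

    # Steps 2 and 3: mine rules S -> T and group them by attribute names.
    groups = defaultdict(list)
    for s in nonempty_subsets(h):
        covered = rho(s)
        if not covered:
            continue
        for t in nonempty_subsets(c):
            both = covered & rho(t)
            sp_rule = len(both) / n
            cf = len(both) / len(covered)
            if sp_rule >= minsupp and cf >= minconf:
                u = tuple(sorted({a for a, _ in s}))
                v = tuple(sorted({a for a, _ in t}))
                groups[(u, v)].append((cf, len(covered) / n))  # CF, SP(S)

    # Step 4: accumulate Psi per group and keep groups above the threshold.
    result = {}
    for key, rules in groups.items():
        psi = sum(cf * sp_lhs for cf, sp_lhs in rules)
        if psi >= minprecision:
            result[key] = psi
    return result

result = find_dependencies(TABLE_1_12, {"a", "b"}, {"c"},
                           minsupp=0.1, minconf=0.75, minprecision=0.75)
```

With these thresholds the group for ({a}, {c}) accumulates 0.625 and is discarded, while ({b}, {c}) and ({a, b}, {c}) are kept.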
Example illustrating Algorithm 1.11
Consider the decision table in Table 1.12 (binary form in Table 1.11) with minimum support threshold minsupp = 0.1, minimum confidence threshold minconf = 0.75, and minimum accuracy threshold minprecision = 0.75. Applying the algorithms that derive classification rules from association rules yields the following classification rules:
Group G1 (left-hand side attribute a, right-hand side attribute c):
• Association rule {d1} → {c1}
r1: ⟨a,1⟩ → ⟨c,6⟩
SP(r1) = 0.25, CF(r1) = 1.00, SP({d1}) = 0.25
• Association rule {d3} → {c2}
r2: ⟨a,3⟩ → ⟨c,7⟩
SP(r2) = 0.38, CF(r2) = 1.00, SP({d3}) = 0.38
Compute Ψ({a},{c}) = CF(r1)*SP({d1}) + CF(r2)*SP({d3}) = 0.63
Group G2 (left-hand side attribute b, right-hand side attribute c):
• Association rule {d4} → {c2}
r3: ⟨b,4⟩ → ⟨c,7⟩
SP(r3) = 0.38, CF(r3) = 0.75, SP({d4}) = 0.5
• Association rule {d5} → {c1}
r4: ⟨b,5⟩ → ⟨c,6⟩
SP(r4) = 0.38, CF(r4) = 0.75, SP({d5}) = 0.5
Compute Ψ({b},{c}) = CF(r3)*SP({d4}) + CF(r4)*SP({d5}) = 0.75*0.5 + 0.75*0.5 = 0.75
Group G3 (left-hand side attributes a, b; right-hand side attribute c):
• Association rule {d1,d4} → {c1}
r5: ⟨a,1⟩, ⟨b,4⟩ → ⟨c,6⟩
SP(r5) = 0.13, CF(r5) = 1.00, SP({d1,d4}) = 0.125
• Association rule {d1,d5} → {c1}
r6: ⟨a,1⟩, ⟨b,5⟩ → ⟨c,6⟩
SP(r6) = 0.13, CF(r6) = 1.00, SP({d1,d5}) = 0.125
• Association rule {d2,d4} → {c2}
r7: ⟨a,2⟩, ⟨b,4⟩ → ⟨c,7⟩
SP(r7) = 0.13, CF(r7) = 1.00, SP({d2,d4}) = 0.125
• Association rule {d2,d5} → {c1}
r8: ⟨a,2⟩, ⟨b,5⟩ → ⟨c,6⟩
SP(r8) = 0.25, CF(r8) = 1.00, SP({d2,d5}) = 0.25
• Association rule {d3,d4} → {c2}
r9: ⟨a,3⟩, ⟨b,4⟩ → ⟨c,7⟩
SP(r9) = 0.25, CF(r9) = 1.00, SP({d3,d4}) = 0.25
• Association rule {d3,d5} → {c2}
r10: ⟨a,3⟩, ⟨b,5⟩ → ⟨c,7⟩
SP(r10) = 0.13, CF(r10) = 1.00, SP({d3,d5}) = 0.125
Compute Ψ({a,b},{c}) = CF(r5)*SP({d1,d4}) + CF(r6)*SP({d1,d5}) + CF(r7)*SP({d2,d4}) + CF(r8)*SP({d2,d5}) + CF(r9)*SP({d3,d4}) + CF(r10)*SP({d3,d5}) = 1.0
1.9. CONCLUSION
This chapter develops efficient algorithms for finding frequent itemsets and association rules in databases by reducing computational complexity and the number of database accesses. Two kinds of algorithms are developed: non-incremental algorithms and incremental algorithms.
For the non-incremental algorithms, a vector model for representing itemsets and closures is proposed. It represents the database as a binary context held in main memory and reduces the number of candidate sets whose support must be computed, which improves the performance of the algorithm.
For the incremental algorithms, R. Godin's algorithm for building a concept lattice is adapted to find frequent itemsets from the formal concepts in the concept lattice. Besides being incremental, the lattice-based algorithm has the advantage that a single pass over the database is enough to build the concept lattice.
Next, the chapter presents studies that extend traditional association rules to negative association rules and fuzzy association rules.
Finally, the chapter presents studies on using association rules as data classification rules and on building the extended attribute dependency coefficient in rough set theory, in order to improve the ability to examine the degree of dependency between attribute sets in approximate data classification problems.