Data Mining and the k-means Algorithm
The k-means clustering algorithm is a data mining and machine learning tool used to group observations into clusters of related observations without any prior knowledge of those relationships. Given a sample of data, the algorithm attempts to show which category, or cluster, each data point belongs to, with the number of clusters set by the value k.
The k-means algorithm is one of the simplest clustering techniques and is commonly used in medical imaging, biometrics, and related fields. The advantage of k-means clustering is that it tells you about your data (it is an unsupervised method) rather than requiring you to teach the algorithm about the data up front (as a supervised method would).
It is sometimes called Lloyd's algorithm, particularly in computer science circles, because the standard algorithm was first proposed by Stuart Lloyd in 1957. The term "k-means" was coined in 1967 by James MacQueen.
How the k-means Algorithm Works
The k-means algorithm is an iterative algorithm that takes its name from its method of operation. It clusters observations into k groups, where k is supplied as an input parameter. It assigns each observation to a cluster based on the observation's proximity to the cluster's mean. Each cluster's mean is then recomputed and the process begins again. Here is how the algorithm works:
- The algorithm arbitrarily selects k points as the initial cluster centers (the means).
- Each point in the dataset is assigned to the closest cluster, based on the Euclidean distance between the point and each cluster center.
- Each cluster center is recomputed as the average of the points in that cluster.
- The second and third steps repeat until the clusters converge. Convergence may be defined differently depending on the implementation, but it normally means either that no observations change clusters when the second and third steps are repeated, or that the changes make no material difference to the definition of the clusters.
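The four steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `kmeans`, the `max_iters` cap, the fixed `seed`, and the decision to keep the old center when a cluster ends up empty are all choices made for this example. Convergence is taken here to mean that no point changes cluster between passes.

```python
import math
import random


def kmeans(points, k, max_iters=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Returns (centers, labels). Names and defaults are illustrative only.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    # Step 1: arbitrarily pick k of the data points as the initial centers.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iters):
        # Step 2: assign each point to the closest center (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Step 4: stop once no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each center as the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


# Made-up data: two well-separated groups of 2-D points.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, labels = kmeans(data, k=2)
print(centers)
print(labels)
```

Because the starting centers are chosen at random, a different `seed` can number the clusters differently, and on less cleanly separated data it can also lead to a different final grouping.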
Choosing the Number of Clusters
One of the main disadvantages of k-means clustering is that you must specify the number of clusters as an input to the algorithm. As designed, the algorithm cannot determine the appropriate number of clusters and depends on the user to identify it in advance.
For example, if you had a group of people to be clustered by binary gender identity as male or female, calling the k-means algorithm with k = 3 would force the people into three clusters when only two exist; an input of k = 2 would provide a more natural fit.
Similarly, if a group of individuals could easily be clustered by home state and you called the k-means algorithm with k = 20, the results might be too generalized to be useful.
For this reason, it is often a good idea to experiment with different values of k to find the value that best suits your data. You may also wish to explore other data mining algorithms in your quest for machine-learned knowledge.
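One way to experiment with different values of k is to run the algorithm for each candidate and compare how tightly the resulting clusters fit. The sketch below is an assumption-laden illustration rather than a prescribed method: it uses the within-cluster sum of squared distances as the measure of fit (a common choice, though not one the text above names), and the helpers `kmeans`, `within_cluster_ss`, and `best_of_restarts`, as well as the data, are invented for this example.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Compact Lloyd-style k-means; returns (centers, labels)."""
    centers = random.Random(seed).sample(points, k)
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new == labels:  # converged: no point changed cluster
            break
        labels = new
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


def within_cluster_ss(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))


def best_of_restarts(points, k, restarts=10):
    """Run k-means from several seeds and keep the lowest cost found."""
    return min(
        within_cluster_ss(points, *kmeans(points, k, seed=s))
        for s in range(restarts)
    )


# Made-up data: three loose groups of 2-D points.
data = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8), (1, 8), (2, 9), (1, 9)]
for k in range(1, 6):
    print(k, round(best_of_restarts(data, k), 2))
```

On data like this, the cost should fall steeply until k reaches the number of natural groups and then level off; the k at which the curve flattens is a reasonable candidate. The cost keeps shrinking as k grows, so the smallest cost alone is not a useful criterion.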