What Is k-Means Clustering?

Data Mining With the k-Means Algorithm

The k-means clustering algorithm is a data mining and machine learning tool used to group observations into clusters of related observations without any prior knowledge of those relationships. Given a set of observations, the algorithm tries to determine which category, or cluster, each data point belongs to, with the number of clusters defined by the value k.

The k-means algorithm is one of the simplest clustering techniques and is widely used in medical imaging, biometrics, and related fields. The advantage of k-means clustering is that, as an unsupervised method, it tells you about the structure in your data, rather than requiring you to teach the algorithm about the data up front, as a supervised method would.

It is sometimes called Lloyd's algorithm, particularly in computer science circles, because the standard algorithm was first proposed by Stuart Lloyd in 1957. The term "k-means" was coined in 1967 by James MacQueen.

How the k-Means Algorithm Works

The k-means algorithm is an iterative algorithm that takes its name from its method of operation. It groups observations into k clusters, where k is provided as an input parameter. It then assigns each observation to a cluster based on the observation's proximity to the cluster's mean. Each cluster's mean is then recomputed and the process begins again. Here is how the algorithm works:

  1. The algorithm arbitrarily selects k points as the initial cluster centers (the means).
  2. Each point in the dataset is assigned to the closest cluster, based on the Euclidean distance between the point and each cluster center.
  3. Each cluster center is recomputed as the average of the points in that cluster.
  4. Steps 2 and 3 repeat until the clusters converge. Convergence may be defined differently depending on the implementation, but it normally means either that no observations change clusters when steps 2 and 3 are repeated, or that the changes make no material difference to the definition of the clusters.
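The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production implementation: the function name `kmeans`, its parameters `max_iter` and `seed`, and the toy data are all hypothetical, initial centers are drawn from the data points themselves, and convergence is taken to mean "no point changed cluster".

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # Step 1: arbitrarily pick k distinct data points as the initial centers.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each point to its nearest center (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Step 4: stop once no point changes cluster (assumed convergence test).
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each center as the mean of its member points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


# Toy data (made up for illustration): two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centers, labels = kmeans(points, k=2)
print(centers)
print(labels)
```

On data this cleanly separated, the first three points end up in one cluster and the last three in the other. Which cluster receives index 0 depends on the random starting centers.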

Choosing the Number of Clusters

One of the main disadvantages of k-means clustering is that you must specify the number of clusters as an input to the algorithm. As designed, the algorithm cannot determine the appropriate number of clusters and depends on the user to identify it in advance.

For example, if you had a group of people to be clustered by binary gender identity as male or female, calling the k-means algorithm with the input k = 3 would force the people into three clusters when only two exist; an input of k = 2 would provide a more natural fit.

Similarly, if a group of individuals clustered easily by home state and you called the k-means algorithm with the input k = 20, the results might be too generalized to be useful whenever more than 20 states are represented.

For this reason, it is often a good idea to experiment with different values of k to identify the value that best suits your data. You may also wish to explore other data mining algorithms in your quest for machine-learned knowledge.
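One common way to experiment with k, offered here as an illustrative sketch rather than a method prescribed by this article, is to run the algorithm for several values and compare the within-cluster sum of squared distances (WCSS). The helper `kmeans_wcss`, its parameters, and the toy data below are hypothetical, and the helper uses the same simple assumptions as before: random data points as starting centers, and stopping when assignments no longer change.

```python
import math
import random


def kmeans_wcss(points, k, max_iter=100, seed=0):
    """Run a basic k-means and return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct data points as starting centers
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable: stop
            break
        labels = new_labels
        # Move each center to the mean of its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Sum of squared distances from each point to its own cluster center.
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))


# Toy data (made up for illustration) containing two natural groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
wcss_by_k = {k: kmeans_wcss(points, k) for k in range(1, 5)}
for k, wcss in wcss_by_k.items():
    print(f"k={k}: WCSS={wcss:.3f}")
```

On this data the WCSS falls sharply between k = 1 and k = 2, which matches the two groups built into the toy points. A large drop followed by much smaller gains is one hint about a suitable k, though results can vary with the random starting centers and the choice ultimately depends on what the clusters are meant to represent.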