
K-means algorithm basics in Python

Date: 2022-06-29 10:40:40

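The best way to learn programming is to master basic usage first; concepts, however many, ultimately serve practice. The K-means algorithm (K-均值) is a clustering algorithm in unsupervised learning. Its formula involves three elements: N is the number of elements, x denotes an element, and c(j) denotes the centroid of the j-th cluster. The examples below briefly show how to use it.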
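The formula those symbols belong to does not appear in the text. Assuming it is the usual K-means objective, that is, the sum of squared distances from every element x to the centroid c(j) of the cluster it is assigned to, a minimal NumPy sketch could look like this (the function name `kmeans_objective` and the toy data are illustrative, not from the article):

```python
import numpy as np

def kmeans_objective(X, labels, centers):
    """Sum of squared distances from each element to its cluster centroid.

    Assumption: this is the standard K-means objective; the article's own
    formula is not shown, so its form is inferred from the symbols N, x, c(j).
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    labels = np.asarray(labels)
    diffs = X - centers[labels]  # x - c(j) for every one of the N elements
    return float(np.sum(diffs ** 2))

# Toy data: N = 4 elements split into two clusters
X = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
labels = [0, 0, 1, 1]
centers = [[0.0, 1.0], [10.0, 1.0]]
print(kmeans_objective(X, labels, centers))  # 4.0
```

With scikit-learn, a fitted `KMeans` estimator reports this quantity as its `inertia_` attribute.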

Cluster analysis with the K-Means algorithm

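from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Assumption: X is the first two Iris features (sepal length, sepal width), y the species labels
X, y = load_iris(return_X_y=True)
X = X[:, :2]

km = KMeans(n_clusters=3)
km.fit(X)
centers = km.cluster_centers_
print(centers)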

The coordinates of the three cluster centers are:

[[5.006      3.428     ]
 [6.81276596 3.07446809]
 [5.77358491 2.69245283]]

Compare the K-Means clustering result with the actual sample labels:

import matplotlib.pyplot as plt

predicted_labels = km.labels_

# Left panel: actual classes; right panel: K-Means cluster assignments
fig, axes = plt.subplots(1, 2, figsize=(16, 8))
axes[0].scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Set1, edgecolor='k', s=150)
axes[1].scatter(X[:, 0], X[:, 1], c=predicted_labels, cmap=plt.cm.Set1, edgecolor='k', s=150)
axes[0].set_xlabel('Sepal length', fontsize=16)
axes[0].set_ylabel('Sepal width', fontsize=16)
axes[1].set_xlabel('Sepal length', fontsize=16)
axes[1].set_ylabel('Sepal width', fontsize=16)
axes[0].tick_params(direction='in', length=10, width=5, colors='k', labelsize=20)
axes[1].tick_params(direction='in', length=10, width=5, colors='k', labelsize=20)
axes[0].set_title('Actual', fontsize=18)
axes[1].set_title('Predicted', fontsize=18)
plt.show()
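The two scatter plots give a visual comparison. For a numeric one, a contingency table counts how many samples of each actual class land in each predicted cluster. The sketch below uses only NumPy; `contingency_table` is a hypothetical helper and the label arrays are made-up toy values, not the article's data. Cluster numbers are arbitrary, so a good clustering shows one dominant cell per row rather than a filled diagonal.

```python
import numpy as np

def contingency_table(true_labels, predicted_labels):
    """Rows are actual classes, columns are predicted clusters, cells are counts."""
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    classes = np.unique(true_labels)
    clusters = np.unique(predicted_labels)
    table = np.zeros((len(classes), len(clusters)), dtype=int)
    for i, cls in enumerate(classes):
        for j, clu in enumerate(clusters):
            # Count samples that belong to class cls and were put in cluster clu
            table[i, j] = np.sum((true_labels == cls) & (predicted_labels == clu))
    return table

# Toy labels (illustrative only): one sample of class 1 ends up in cluster 2
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [1, 1, 1, 0, 0, 2, 2, 2, 2]
table = contingency_table(y_true, y_pred)
print(table)
```

In the article's setting the call would be `contingency_table(y, predicted_labels)`.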

Extended k-means example:

# -*- coding: utf-8 -*-
'''Exercise 9.4'''
import random
import sys

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

data = pd.read_csv('../dataset/watermelon4.0.csv', sep=',')[['密度', '含糖率']].values

########## K-means ##########
k = int(sys.argv[1])

# Randomly choose k samples from data as the initial mean vectors
mean_vectors = [data[i] for i in random.sample(range(len(data)), k)]

def dist(p1, p2):
    # Euclidean distance between two points
    return np.sqrt(np.sum((p1 - p2) ** 2))

while True:
    print(mean_vectors)
    # Start with empty clusters and assign each sample to its nearest mean vector
    clusters = [[] for _ in mean_vectors]
    for sample in data:
        distances = [dist(sample, m) for m in mean_vectors]
        min_index = distances.index(min(distances))
        clusters[min_index].append(sample)
    new_mean_vectors = []
    for c, v in zip(clusters, mean_vectors):
        # An empty cluster keeps its old mean vector
        new_mean_vector = np.mean(c, axis=0) if c else v
        # If the relative change between the new and the old mean vector
        # is less than 0.0001, do not update the mean vector
        if np.all(np.abs((new_mean_vector - v) / v) < 0.0001):
            new_mean_vectors.append(v)
        else:
            new_mean_vectors.append(new_mean_vector)
    # Stop once no mean vector was updated in this pass
    if np.array_equal(mean_vectors, new_mean_vectors):
        break
    mean_vectors = new_mean_vectors

# Show the clustering result, one color per cluster
total_colors = ['r', 'y', 'g', 'b', 'c', 'm', 'k']
colors = random.sample(total_colors, k)
for cluster, color in zip(clusters, colors):
    density = [arr[0] for arr in cluster]
    sugar_content = [arr[1] for arr in cluster]
    plt.scatter(density, sugar_content, c=color)
plt.show()
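The loop above alternates two steps: assign each sample to its nearest mean vector, then recompute each mean vector from its cluster. A compact NumPy sketch of the same two steps follows. It runs on a small made-up array instead of watermelon4.0.csv, takes the first k rows as the initial mean vectors so the run is repeatable, and stops when the mean vectors stop changing. The function name `kmeans` and these choices are assumptions for illustration, not the article's code.

```python
import numpy as np

def kmeans(data, k, max_iter=100):
    """Plain K-means: returns (labels, mean vectors)."""
    data = np.asarray(data, dtype=float)
    means = data[:k].copy()  # assumption: first k samples as initial mean vectors
    for _ in range(max_iter):
        # Assignment step: distance from every sample to every mean vector
        distances = np.linalg.norm(data[:, None, :] - means[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: mean of each cluster; an empty cluster keeps its old mean
        new_means = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else means[j]
            for j in range(k)
        ])
        if np.allclose(new_means, means):
            break
        means = new_means
    return labels, means

# Made-up points forming two well-separated groups
points = [[0.0, 0.0], [5.0, 5.0], [0.2, 0.1], [5.1, 4.9], [0.1, 0.3], [4.8, 5.2]]
labels, means = kmeans(points, k=2)
print(labels)
print(means)
```

Broadcasting computes all sample-to-mean distances in one call, which replaces the inner Python loops of the script above.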
