Implementations of some well-known clustering algorithms and their analysis.
- football.csv has information about 18K football players and their features, abilities, and skills in the game, along with other attributes such as club, nationality, and height.
- 1_data_visualization.ipynb contains visualizations of the information in the CSV files.
- 2_KMeans.ipynb contains an implementation of the K-means clustering algorithm from scratch, without the use of any inbuilt libraries.
- 3_Agglomerative.ipynb contains an implementation of agglomerative hierarchical clustering.
- 3_Divisive.ipynb contains an implementation of divisive hierarchical clustering.
- 4_DBSCAN.ipynb contains an implementation of the DBSCAN clustering algorithm.
- Report.pdf contains our detailed analysis of all the tasks and their comparison.
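
For readers unfamiliar with what a from-scratch K-means involves, the following is a minimal sketch using only the Python standard library. It is not the code from 2_KMeans.ipynb; the function names (`kmeans`, `squared_distance`), the parameters (`max_iters`, `seed`), and the choice of random initial centroids are assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iters=100, seed=0):
    """Illustrative K-means (Lloyd's algorithm); returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct points picked at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)

    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    return centroids, labels


if __name__ == "__main__":
    toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels = kmeans(toy, k=2)
    print(centroids, labels)
```

On real data such as football.csv, the numeric features would typically be scaled first so that no single attribute dominates the distance.
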
All the implementations share the same initial data-cleaning code. After each cell, some print statements are added to show the progress of the code.
To visualize the clusters in 2D, PCA was used for dimensionality reduction.
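
To make the PCA step concrete, here is a small sketch of projecting data onto its first two principal components using only the standard library. The notebooks may well use a library routine instead; the function name `pca_2d`, the power-iteration approach, and the `iters` and `seed` parameters are assumptions for illustration, not the repository's code.

```python
import math
import random


def pca_2d(rows, iters=200, seed=0):
    """Project rows (n samples x d features) onto the top two principal axes.

    Sketch only: uses power iteration with Gram-Schmidt deflation and
    assumes the data has at least two directions with nonzero variance.
    """
    n, d = len(rows), len(rows[0])

    # Center each feature at zero mean.
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]

    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

    rng = random.Random(seed)
    components = []
    for _ in range(2):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            # Multiply by the covariance matrix.
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            # Remove any part lying along components already found.
            for c in components:
                dot = sum(a * b for a, b in zip(w, c))
                w = [a - dot * b for a, b in zip(w, c)]
            norm = math.sqrt(sum(a * a for a in w))
            v = [a / norm for a in w]
        components.append(v)

    # Coordinates of each centered sample along the two components.
    return [
        [sum(a * b for a, b in zip(row, c)) for c in components]
        for row in centered
    ]
```

The two returned columns can be used as the x and y coordinates of a scatter plot, with each point colored by its cluster label.
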