Hello World, this is Saumya, and I am here to help you understand and implement Linear Regression in more detail. We will discuss various problems we may encounter while training our model, along with some techniques to solve them. There won't be any more programming in this post, although you can try out whatever is discussed here yourself. First of all, let's recall what we studied about Linear Regression in our previous blog. We first discussed certain notations used in machine learning in general, then the hypothesis function, $h_\theta(x^{(i)}) = \theta_0 x_0 + \theta_1 x_1$. We then discussed training the model on the training set by running the gradient descent algorithm over it. We also discussed the cost function. Now, before we begin, I want to talk about the cost function in brief. The cost function, as we defined it, is $J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$. If we define cost function, we can define it as t
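For readers who want to try the cost function out themselves, here is a minimal sketch in plain Python. The function names `hypothesis` and `cost` and the toy data are assumptions made for illustration; they are not taken from the earlier post. Each input row carries the bias feature `x_0 = 1` as its first entry.

```python
def hypothesis(theta, x):
    """h_theta(x) = theta_0 * x_0 + theta_1 * x_1 (+ more terms if given)."""
    return sum(t * xj for t, xj in zip(theta, x))


def cost(theta, X, y):
    """J(theta) = sum over i of (h_theta(x_i) - y_i)^2, divided by 2 * m."""
    m = len(X)
    squared_errors = sum(
        (hypothesis(theta, xi) - yi) ** 2 for xi, yi in zip(X, y)
    )
    return squared_errors / (2 * m)


# Toy data (hypothetical): y = 2 * x_1, with x_0 = 1 as the bias feature.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [2.0, 4.0, 6.0]

perfect_fit = cost([0.0, 2.0], X, y)  # parameters that match the data exactly
poor_fit = cost([0.0, 0.0], X, y)     # parameters that always predict 0
```

A lower value of `cost` means the line described by `theta` sits closer to the training points, which is why gradient descent tries to minimise it.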
Hello World, this is Saumya, and I am here to help you understand and implement the K-Means Clustering algorithm from scratch without using any machine learning libraries. We will then use this algorithm to compress an image. I will implement the code in Python, but you can implement the algorithm in any other programming language of your choice by developing 4-5 simple functions. So, first of all, what exactly is clustering, and K-Means in particular? As discussed in my blog on Machine Learning, clustering is a type of unsupervised machine learning problem in which we find clusters of similar data. K-Means is one of the most widely used clustering algorithms. Our task is to find the centers of the clusters around which our data points are grouped. These cluster centers are called centroids, and K is the number of them. Note that these centroids may or may not belong to our dataset itself. Since our problem is to choose these
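As a rough illustration of how K-Means can be split into a handful of small functions, here is a minimal sketch in plain Python with no machine learning libraries. It is not the implementation from the full post, which this excerpt does not show. The function names, the fixed iteration count, the seeded random initialisation and the toy points are all assumptions made for this example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def init_centroids(points, k, seed=0):
    """Pick k distinct data points as the starting centroids (one common choice)."""
    rng = random.Random(seed)
    return [list(p) for p in rng.sample(points, k)]


def assign_clusters(points, centroids):
    """For each point, return the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]


def update_centroids(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            new_centroids.append(list(old))  # empty cluster: keep the old centroid
            continue
        dims = len(members[0])
        new_centroids.append(
            [sum(p[d] for p in members) / len(members) for d in range(dims)]
        )
    return new_centroids


def k_means(points, k, iterations=10, seed=0):
    """Alternate the assignment and update steps for a fixed number of iterations."""
    centroids = init_centroids(points, k, seed)
    for _ in range(iterations):
        labels = assign_clusters(points, centroids)
        centroids = update_centroids(points, labels, centroids)
    return centroids, assign_clusters(points, centroids)


# Toy data (hypothetical): two loose groups of 2-D points.
points = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.1], [8.0, 8.0], [8.2, 7.9], [7.9, 8.3]]
centroids, labels = k_means(points, k=2)
```

Because each updated centroid is the mean of its members, it usually does not coincide with any point in the dataset. For image compression, the same functions could be applied with each pixel's colour values as a point, under the assumption that pixels are represented as lists of numbers.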