Machine Learning Challenges and Trends
Machine learning problems come in many shapes and sizes, and they can be classified in several ways. We'll go through the most common ones here.
1. Based on the nature of the learning "signal" or "feedback" available to the learning system
• Supervised learning: A "teacher" presents the computer with sample inputs and the desired outputs, and the goal is to learn a general rule that maps inputs to outputs. The model is trained until it reaches an acceptable level of accuracy on the training data. Here are some real-life examples (a short code sketch follows this list):
- Image Classification: The model is trained on images together with their labels. Later, you give it a fresh image and expect it to recognise the object in it.
- Market Prediction/Regression: You train the model on historical market data and then ask it to forecast future prices.
• Unsupervised learning: The learning algorithm is given no labels and is left to find structure in its input on its own. It is often used to divide data into distinct groups (for example, segmenting customers). Unsupervised learning can also be a goal in itself: discovering hidden patterns in data.
- Clustering: You ask the computer to group similar data points into clusters, which is useful in research and exploratory analysis.
- High-Dimensional Data Visualisation: Use the computer to help visualise high-dimensional data, typically by projecting it down to two or three dimensions.
- Generative Models: Once a model captures the probability distribution of your input data, it can generate new data, which can also help improve the robustness of your classifier.
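To make the contrast concrete, here is a minimal sketch, assuming scikit-learn and NumPy are installed; the toy data points are made up for illustration:

```python
# A minimal sketch contrasting supervised and unsupervised learning.
# Assumes scikit-learn and NumPy; the toy data is invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Toy dataset: two features per sample, two classes (0 and 1).
X = np.array([[1.0, 1.2], [0.9, 1.1], [1.1, 0.8],   # class 0
              [3.0, 3.2], [3.1, 2.9], [2.8, 3.1]])  # class 1
y = np.array([0, 0, 0, 1, 1, 1])                    # labels: the "teacher" signal

# Supervised: learn a general rule mapping inputs to the given labels.
clf = LogisticRegression().fit(X, y)
print(clf.predict([[1.0, 1.0], [3.0, 3.0]]))        # -> [0 1]

# Unsupervised: same inputs, no labels; the algorithm finds the structure itself.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)                                   # cluster ids, e.g. [0 0 0 1 1 1]
```

The supervised model needs the labels y to learn; the unsupervised one recovers the same two groups from the inputs alone, but it cannot name them.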
Below is a simple graphic that explains the difference between supervised and unsupervised learning:
As you can see, supervised learning uses labelled data, whereas unsupervised learning uses unlabelled data.
• Semi-supervised learning: These are situations in which you have a large amount of input data but only a portion of it is labelled. Such problems sit between supervised and unsupervised learning. Think of a photo archive in which only a few photographs are labelled (e.g. dog, cat, person) and the rest are unlabelled.
• Reinforcement learning: A computer program interacts with a dynamic environment in order to achieve a certain goal (such as driving a vehicle or playing a game against an opponent). As it navigates its problem space, the program receives feedback in the form of rewards and penalties.
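To give a rough feel for that reward-and-penalty loop, below is a minimal tabular Q-learning sketch on a made-up five-cell corridor (the agent earns a reward for reaching the rightmost cell); the environment, rewards, and hyperparameters are invented purely for illustration:

```python
# Minimal tabular Q-learning on a made-up 5-cell corridor.
# The agent starts in cell 0, gets +1 for reaching cell 4 and -0.01 per step otherwise.
# The environment and all numbers here are illustrative only.
import numpy as np

n_states, n_actions = 5, 2                # actions: 0 = move left, 1 = move right
Q = np.zeros((n_states, n_actions))       # value estimates for each (state, action) pair
alpha, gamma, epsilon = 0.5, 0.9, 0.1     # learning rate, discount, exploration rate
rng = np.random.default_rng(0)

for episode in range(200):
    state = 0
    while state != 4:                                     # episode ends at the goal cell
        # Epsilon-greedy: mostly exploit the current estimates, occasionally explore.
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))
        else:
            action = int(np.argmax(Q[state]))
        next_state = max(0, state - 1) if action == 0 else min(4, state + 1)
        reward = 1.0 if next_state == 4 else -0.01        # penalty for wandering around
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(np.argmax(Q, axis=1))   # learned policy: typically "go right" (1) in every cell before the goal
```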
2. Based on the desired "output" of a machine-learned system
- Classification: Inputs are divided into two or more classes, and the learner must produce a model that assigns unseen inputs to one of these classes (or, in multi-label classification, to several). This is typically tackled in a supervised manner.
- Spam filtering is an example of classification, with email (or other) messages as inputs and classifications of “spam” and “not spam.”
- Regression: A supervised learning problem in which the outputs are continuous rather than discrete, for example predicting stock prices from past data (see the sketch after this list).
- Below is an example of classification and regression using two separate datasets.
- Clustering: A set of inputs is divided into groups. Unlike classification, the groups are not known ahead of time, so this is usually an unsupervised task.
- In the example below, the data points have been split into groups marked by the colours red, green, and blue.
- Density estimation: The goal is to figure out how inputs are distributed in a given space.
- Dimensionality reduction: Simplifies inputs by mapping them into a lower-dimensional space. A related task is topic modelling, in which a program is given a set of human-language documents and asked to find documents that cover similar topics.
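For the output types not already illustrated above, here is a short sketch, again assuming scikit-learn and NumPy; the synthetic data and parameter choices are only for illustration:

```python
# Regression, density estimation and dimensionality reduction on synthetic data.
# Assumes scikit-learn and NumPy; all numbers are illustrative only.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KernelDensity
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Regression: continuous output (y is roughly 3x + 2 plus noise).
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X.ravel() + 2 + rng.normal(0, 0.5, size=100)
reg = LinearRegression().fit(X, y)
print(reg.coef_, reg.intercept_)          # roughly [3.] and 2

# Density estimation: how are the inputs distributed over the space?
kde = KernelDensity(bandwidth=0.5).fit(X)
print(kde.score_samples([[5.0]]))         # log-density of the inputs near x = 5

# Dimensionality reduction: project 5-D points down to 2-D.
Z = rng.normal(size=(100, 5))
Z2 = PCA(n_components=2).fit_transform(Z)
print(Z2.shape)                           # (100, 2)
```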
A variety of algorithms are used to solve these machine learning tasks. Some of the most widely used are Linear Regression, Logistic Regression, Decision Tree, SVM (Support Vector Machines), Naive Bayes, KNN (K-Nearest Neighbours), K-Means, and Random Forest.
Note: All of these algorithms will be discussed in detail in future posts.
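Although those posts will cover the algorithms in detail, it may help to see that, in scikit-learn at least, most of them are used through the same fit/predict interface. A rough sketch on toy data with default hyperparameters:

```python
# Several of the algorithms listed above behind one common fit/predict interface.
# Assumes scikit-learn; the toy data and default hyperparameters are for illustration only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "Random Forest": RandomForestClassifier(random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)                              # same training call everywhere
    print(f"{name}: {model.score(X_test, y_test):.2f}")      # accuracy on held-out data
```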
Machine Learning Terminologies
- Model
- A model is a particular representation that is learned from data using a machine learning method. Hypothesis is another term for a model.
- Feature
- A feature is an individual measurable property of our data. A set of numeric features can be conveniently described by a feature vector, which is what the model takes as input. For example, colour, smell, and taste could be used as features to predict the type of a fruit.
- Note: Choosing informative, discriminating, and independent features is crucial for effective algorithms. We usually employ a feature extractor to pull the relevant features out of the raw data.
- Target (Label)
- A target variable, also known as a label, is the value that our model is supposed to predict. In the fruit example described in the features section, the label for each input would be the name of the fruit, such as apple, orange, or banana.
- Training
- The idea is to give the model a set of inputs (features) together with their expected outputs (labels), so that after training we have a model (hypothesis) that maps new data to one of the categories it was trained on.
- Prediction
- Once our model is trained, we can feed it a set of inputs and it will give us a predicted output (label).
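Tying these terms together, here is a toy sketch in the spirit of the fruit example above; the numeric encoding of colour, smell, and taste (and the tiny dataset) is entirely made up for illustration:

```python
# Features, labels, model, training and prediction in one toy example.
# The fruit data and its numeric encoding are invented purely for illustration.
from sklearn.tree import DecisionTreeClassifier

# Feature vectors: [colour, smell, taste], each encoded as a small integer.
X = [
    [0, 1, 2],   # red,    sweet smell,  sweet taste
    [1, 2, 1],   # orange, citrus smell, tangy taste
    [2, 0, 2],   # yellow, mild smell,   sweet taste
]
# Target / label: the name of each fruit.
y = ["apple", "orange", "banana"]

# Training: the model (hypothesis) learns a mapping from feature vectors to labels.
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Prediction: feed in new feature vectors and get predicted labels back.
print(model.predict([[0, 1, 2], [1, 2, 1]]))   # -> ['apple' 'orange']
```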