No-code ML-Lab
Create a training
Start the process from the AI module by clicking the “Create” button. Assign a name and a description to the training so you can easily identify it among multiple sessions.

📊 Select the dataset or dataset view
Choose the dataset that will serve as the basis for the model. It must have been previously uploaded to the Datasets section of the TOKII workspace (see the Datasets section). You can also select a derived view of the original dataset if you have defined filters or transformations.

🧠 Define the type of learning
🧩 Supervised Learning
It is based on a dataset with a known target variable, that is, a column whose value we want to predict.
Classification: predicts discrete categories, such as the status of the equipment (normal/failure).
Regression: estimates continuous numerical values, such as energy consumption or future temperature.
🧭 Unsupervised Learning
Does not require a target variable. The system automatically detects structures or groups within the data.
Clustering: groups similar records without prior labels. Useful for detecting usage patterns, behavior segments, or trends in sensors.
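
TOKII handles all of this without writing code, but the difference between the two settings is easy to see in a short sketch. The following illustrative example uses scikit-learn with hypothetical sensor data (TOKII's internal implementation may differ): the supervised model learns from a known target column y, while the clustering model receives only the features.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Hypothetical sensor records: each row is one reading, each column a feature.
X = np.array([[20.1, 0.4], [21.3, 0.5], [35.8, 2.1], [36.2, 2.3]])

# Supervised: a known target column (0 = normal, 1 = failure) guides training.
y = np.array([0, 0, 1, 1])
clf = LogisticRegression().fit(X, y)   # learns from (features, target) pairs
print(clf.predict([[34.9, 2.0]]))      # predicts the label of a new record

# Unsupervised: no target column; the algorithm finds groups on its own.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)                      # group assigned to each record
```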

🧮 Select the specific algorithm
Depending on the type of learning chosen, TOKII offers a set of algorithms ready for training:
🧾 Classification
🔹 Logistic Regression
It is one of the simplest and most effective algorithms for binary classification tasks, that is, when the goal is to predict between two classes (such as active/inactive, failure/no failure). Although its name includes “regression,” it is used for classification, as it estimates the probability of a data point belonging to one class or another. Its main advantages are speed and ease of interpretation.
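
As an illustration of the probability-based behavior described above, here is a minimal scikit-learn sketch with made-up data (not TOKII's actual code):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical readings labeled 0 (no failure) or 1 (failure).
X = np.array([[0.2, 1.1], [0.4, 0.9], [2.5, 3.1], [2.8, 3.4]])
y = np.array([0, 0, 1, 1])

model = LogisticRegression().fit(X, y)
# The model estimates a probability per class and picks the likeliest one.
print(model.predict_proba([[2.0, 2.9]]))  # probabilities for class 0 and 1
print(model.predict([[2.0, 2.9]]))        # the chosen class
```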
🔹 Decision Tree
This model builds a set of “if-then” rules to make decisions, dividing the data into branches according to their characteristics. It is intuitive, easy to visualize, and useful when you want to understand why a particular class has been assigned. It adapts well to structured datasets with categorical or numerical variables.
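
The “if-then” interpretability can be inspected directly in code. A small sketch, again using scikit-learn with hypothetical readings:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical data: [temperature, vibration] -> equipment status (0/1).
X = [[20, 0.1], [22, 0.2], [45, 0.9], [48, 1.1]]
y = [0, 0, 1, 1]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
# export_text prints the learned "if-then" rules, which is exactly what
# makes decision trees easy to inspect and explain.
print(export_text(tree, feature_names=["temperature", "vibration"]))
```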
🔹 Random Forest
It is an extension of the decision tree. Instead of using a single tree, it builds multiple trees and combines their results (voting). This improves accuracy and reduces the risk of the model overfitting the training data. It is very robust and works well in environments with noise or many attributes.
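
A minimal sketch of the voting idea on hypothetical data; the n_estimators parameter controls how many trees take part (illustrative only):

```python
from sklearn.ensemble import RandomForestClassifier

X = [[20, 0.1], [22, 0.2], [45, 0.9], [48, 1.1], [21, 0.15], [46, 1.0]]
y = [0, 0, 1, 1, 0, 1]

# n_estimators sets how many trees vote; more trees usually means a more
# robust prediction at the cost of longer training.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(forest.predict([[44, 0.95]]))  # the majority vote of the trees
```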
🔹 Gradient Boosting Classifier
This algorithm trains several simple models sequentially, with each iteration correcting the errors made by the previous models. It is one of the most powerful and accurate methods currently available, although it may require more training time. It is ideal for complex tasks where maximum predictive performance is sought.
🔹 K-Nearest Neighbors (K-NN)
Classifies each new data point based on the data points most similar to it (its nearest neighbors) according to a distance metric. It is straightforward and has no training phase as such. Its effectiveness depends on having well-distributed data that is not too noisy. It is useful for cases where relationships between variables are clear.
🔹 Naive Bayes
This model applies probability principles (Bayes' theorem) and assumes that input variables are independent of each other. Although this assumption is rarely fully met, it works surprisingly well in practice, especially in text classification, categorized alerts, and quick processes.
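
A small illustrative sketch of the text-classification use case, assuming hypothetical alert messages (scikit-learn shown as a stand-in, not TOKII's implementation):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical alert messages labeled by category.
texts = ["pressure high in pump", "temperature sensor offline",
         "pressure drop detected", "sensor battery low"]
labels = ["pressure", "sensor", "pressure", "sensor"]

# Turn each message into word counts, then apply Bayes' theorem per class.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)
print(model.predict(["pump pressure rising"]))  # -> ['pressure']
```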
🔹 Neural Network (MLP)
This algorithm is loosely inspired by how biological neurons connect and allows modeling complex, nonlinear relationships. It is very flexible and can adapt to a wide variety of problems, although it requires more data and is less transparent (a kind of “black box”). It is useful for sophisticated tasks where other models fail to perform well.
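
A hedged sketch of a small network on hypothetical data; note that neural networks generally need their inputs scaled, which the pipeline below handles:

```python
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = [[20, 0.1], [22, 0.2], [45, 0.9], [48, 1.1], [21, 0.15], [46, 1.0]]
y = [0, 0, 1, 1, 0, 1]

# hidden_layer_sizes defines the network: here two hidden layers of 16 neurons.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=5000, random_state=0),
)
model.fit(X, y)
print(model.predict([[44, 0.95]]))
```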

📉 Regression
🔹 Linear Regression
It is the simplest regression model. It fits a straight line that describes the relationship between an input variable and an output variable. For example, if higher temperatures lead to higher energy consumption, this model will capture that linear trend. It is easy to interpret and quick to train, but its accuracy is limited to problems where the relationship is approximately linear and direct.
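
For example, fitting that temperature-to-consumption trend in scikit-learn (illustrative values only) exposes the line's slope and intercept directly:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical: outside temperature (°C) vs. energy consumption (kWh).
X = np.array([[18], [21], [25], [30], [34]])
y = np.array([120, 135, 155, 180, 200])

model = LinearRegression().fit(X, y)
# The fitted line is y = coef_ * x + intercept_.
print(model.coef_, model.intercept_)
print(model.predict([[28]]))  # estimated consumption at 28 °C
```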
🔹 Ridge Regression
This variant of linear regression introduces a penalty to prevent the model from overfitting the data, which is especially useful when there are many input variables that may be correlated. Ridge maintains a good balance between simplicity and predictive power, improving the model's generalization (see the comparison sketch after Elastic Net).
🔹 Lasso Regression
Like Ridge, Lasso adds a penalty, but with the added ability to eliminate variables that add no value to the model by driving their coefficients to exactly zero. This makes it an effective tool for simplifying models when there are many input variables. It is ideal when the goal is not just to predict well but also to identify which variables are truly relevant.
🔹 Elastic Net
Combines the penalties of Ridge and Lasso, getting the best of both worlds: controlling overfitting and performing variable selection. It is especially useful when there are many predictive variables, and some of them are correlated. This flexibility makes it a solid option for complex problems without losing interpretability.
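
To make the three penalties tangible, here is an illustrative comparison on synthetic data where only two of five variables actually matter; Lasso and Elastic Net zero out the irrelevant coefficients, while Ridge only shrinks them (a sketch, not TOKII's internals):

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
# Only the first two features drive this synthetic target.
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

for model in (Ridge(alpha=1.0), Lasso(alpha=0.1),
              ElasticNet(alpha=0.1, l1_ratio=0.5)):
    model.fit(X, y)
    # Ridge shrinks all coefficients; Lasso drives irrelevant ones to
    # exactly zero; Elastic Net does a mix of both.
    print(type(model).__name__, np.round(model.coef_, 2))
```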
🔹 Decision Tree Regressor
This algorithm divides the data into branches, making simple “if... then...” decisions until reaching an estimated value. It can model nonlinear relationships and handle categorical or numerical variables. Its hierarchical structure makes it easy to visualize, but it can overfit if its depth is not properly constrained.
🔹 Random Forest Regressor
It is a combination of multiple decision trees trained on different subsets of data. By averaging their predictions, a more robust and precise model is obtained. It is very useful for noisy data, multiple variables, or complex behaviors, and requires very little data preparation in advance.
🔹 Gradient Boosting Regressor
It is one of the most powerful regression models. It works by building decision trees sequentially, where each new tree corrects the errors of the previous ones. Although slower to train, it usually provides highly accurate results, making it ideal for demanding tasks such as time-series prediction or fine-grained energy-consumption forecasting.
🔹 K-Nearest Neighbors Regressor (K-NN)
This model predicts the output value by finding the k most similar data points to the new observation and averaging their values. It does not require a training phase as such, but its performance depends on having well-distributed data. It is intuitive and effective when there are clear local relationships in the data.
🔹 Neural Network Regressor (MLP)
Inspired by the functioning of the human brain, this model connects multiple layers of artificial “neurons” that allow learning complex and nonlinear relationships between the data. Although less interpretable than other models, its ability to detect hidden patterns makes it very useful in industrial environments with multiple interrelated factors.
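
All of the regressors above share the same train-then-predict workflow, which is what allows a no-code platform to offer them as interchangeable options. A minimal sketch with hypothetical data, using scikit-learn as a stand-in:

```python
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Hypothetical: outside temperature (°C) -> energy consumption (kWh).
X = [[18], [21], [25], [30], [34], [38]]
y = [120, 135, 155, 180, 200, 225]

# Every estimator exposes the same fit/predict interface, so switching
# algorithms is just switching the object (the neural network also gets
# input scaling, which it needs to converge).
models = [
    DecisionTreeRegressor(max_depth=3),
    RandomForestRegressor(n_estimators=100, random_state=0),
    GradientBoostingRegressor(random_state=0),
    KNeighborsRegressor(n_neighbors=3),
    make_pipeline(StandardScaler(),
                  MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000,
                               random_state=0)),
]
for model in models:
    model.fit(X, y)
    print(type(model).__name__, model.predict([[28]]))
```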

🧩 Clustering
🔹 K-Means
K-Means is one of the most well-known and widely used clustering algorithms. It divides the data into a fixed number of groups (k) so that the elements within each group are as close as possible to their “centroid” (the midpoint of the group). It is very fast and efficient for large volumes of data and ideal for simple, well-defined segmentations, as long as the clusters are roughly round and of similar size; it may not work well when they have irregular shapes or very unequal sizes.
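
An illustrative K-Means sketch on hypothetical records with two obvious concentrations (scikit-learn, not TOKII's internals):

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical sensor records with two natural concentrations.
X = np.array([[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],
              [5.0, 5.2], [5.1, 4.9], [4.8, 5.1]])

# k must be chosen in advance; here we ask for 2 groups.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)           # cluster assigned to each record
print(km.cluster_centers_)  # the centroid (midpoint) of each group
```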
🔹 DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
This algorithm groups data based on density, meaning it creates clusters when there is sufficient concentration of points in a region, and treats isolated points as “noise.” It is particularly useful when working with data containing groups of irregular shapes or outliers. DBSCAN does not require specifying the number of groups in advance and is robust against extreme values, making it very effective for tasks such as anomaly detection or grouping in industrial environments with operational noise.
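
A short sketch of the noise handling, with a deliberately isolated point in hypothetical data; DBSCAN labels it -1 instead of forcing it into a cluster:

```python
import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],
              [5.0, 5.2], [5.1, 4.9],
              [9.0, 0.5]])  # an isolated point, far from both groups

# eps = neighborhood radius, min_samples = density threshold.
db = DBSCAN(eps=0.8, min_samples=2).fit(X)
# Points labeled -1 are "noise": they belong to no cluster.
print(db.labels_)  # -> [ 0  0  0  1  1 -1]
```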
🔹 Hierarchical Clustering
This method builds a tree-like structure (called a dendrogram) that represents how the data progressively cluster together. It can proceed bottom-up (progressively merging elements into larger groups) or top-down (splitting a group into subgroups). It is very useful when you want to understand the relationships between the data and do not know the number of clusters in advance, since it lets you visualize the levels of grouping and select the most appropriate one for the analysis.
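
A minimal sketch of the build-then-cut workflow using SciPy's hierarchy tools on hypothetical points (illustrative only):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

X = np.array([[1.0, 1.1], [1.2, 0.9], [5.0, 5.2], [5.1, 4.9], [9.0, 9.1]])

# linkage builds the full merge tree (the data behind the dendrogram);
# fcluster then cuts it at a chosen level to get concrete groups.
Z = linkage(X, method="ward")
print(fcluster(Z, t=2, criterion="maxclust"))  # cut the tree into 2 clusters
```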
🔹 Spectral Clustering
The spectral algorithm transforms the data into a different mathematical space using linear algebra techniques (matrices and graphs), where it becomes easier to identify complex structures not visible in the original space. It then applies a clustering algorithm such as K-Means on that new representation. This method is very effective for problems where the data has nonlinear shapes or more subtle relationships, such as curves or interconnected groupings.
🔹 Mean Shift
This algorithm does not require specifying the number of clusters in advance. Instead, it detects areas of high point density and iteratively shifts each data point towards the nearest density peak. It is very useful when the data have multiple natural concentrations of information, although it can be more computationally expensive.
🔹 Gaussian Mixture Model (GMM)
This algorithm models the data as a combination of Gaussian distributions (bell-shaped curves). Unlike K-Means, it assigns a probability to each data point of belonging to each group, allowing for more flexible and smooth groupings. It is ideal for situations where clusters may overlap or have different shapes and sizes.
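
A final illustrative sketch of the soft assignments, on the same kind of hypothetical two-group data (scikit-learn as a stand-in):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

X = np.array([[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],
              [5.0, 5.2], [5.1, 4.9], [4.8, 5.1]])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
# Unlike K-Means, each point gets a probability of belonging to each group,
# which allows soft, overlapping assignments.
print(np.round(gmm.predict_proba(X), 2))
print(gmm.predict([[3.0, 3.0]]))  # hard assignment of an in-between point
```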
