MODELLING AND VISUALISATION
Modelling
Modelling in machine learning is the process of building a mathematical or computational model that learns patterns from data, so that it can make predictions or decisions without being explicitly programmed for the task.
Steps:
- Choose a model architecture (e.g., linear regression, decision tree, neural network).
- Train the model on data (feed in input data and, for supervised learning, the known outcomes, so the model can learn the relationship between them).
- Evaluate the model to see how well it performs.
- Use the model to make predictions on new/unseen data.
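The four steps above can be sketched with scikit-learn. This is a minimal illustration rather than a recommended setup: the synthetic data (roughly `y = 3x + 2` plus noise), the choice of `LinearRegression`, and the 25% hold-out size are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data (assumed for illustration): y is roughly 3*x + 2 plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.5, size=200)

# Hold out 25% of the rows so the evaluation uses data the model has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 1: choose a model architecture.
model = LinearRegression()

# Step 2: train the model on the training data.
model.fit(X_train, y_train)

# Step 3: evaluate the model on the held-out test data.
mse = mean_squared_error(y_test, model.predict(X_test))
print("Test MSE:", mse)

# Step 4: use the model to make predictions on new inputs.
new_points = np.array([[4.0], [12.0]])
predictions = model.predict(new_points)
print("Predictions:", predictions)
```

Most scikit-learn estimators follow the same `fit()` / `predict()` pattern, so swapping `LinearRegression` for another estimator usually leaves the rest of the code unchanged.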
Popular Python Modeling Libraries:
| Library | Purpose | Key Features |
|---|---|---|
| scikit-learn | General ML | Wide variety of classical ML algorithms (classification, regression, clustering), easy API |
| XGBoost | Gradient Boosting | Fast and accurate gradient boosting implementation |
| LightGBM | Gradient Boosting | Fast, scales to large datasets, native handling of categorical features |
| CatBoost | Gradient Boosting | Handles categorical features well automatically |
| TensorFlow | Deep Learning | Powerful, production-ready deep learning library |
| PyTorch | Deep Learning | Popular for research and development, flexible |
| Keras | Deep Learning | High-level API (now integrated with TensorFlow) for building deep learning models easily |
| Statsmodels | Statistical Modeling | Great for linear regression, time series analysis, econometrics |
Interfacing between Pandas and Model Code
The interface between Pandas and machine learning model code refers to how data stored and manipulated in Pandas objects (DataFrames and Series) is passed to machine learning libraries such as scikit-learn, TensorFlow, or PyTorch.
1. Data Preparation with Pandas
Pandas is widely used in the data preparation stage of the machine learning pipeline, typically to load, clean, and transform the data before it reaches the model.
2. Splitting Features (X) and Target (y)
Before training a machine learning model, the dataset is typically divided into two parts: the features (X), which are the input columns, and the target (y), which is the column the model should predict. This is usually done by selecting the appropriate columns from the DataFrame.
3. Splitting Data Using train_test_split()
scikit-learn's train_test_split() function then divides X and y into training and test sets.
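As a minimal sketch of the three steps above, the hand-off from Pandas to scikit-learn might look like the following. The dataset is made up for illustration, and the column names `area_sqm`, `rooms`, and `price` are hypothetical; the median fill is just one possible way to handle a missing value.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy dataset; the column names and values are illustrative only.
df = pd.DataFrame({
    "area_sqm": [50.0, 65.0, None, 80.0, 120.0, 95.0, 70.0, 110.0],
    "rooms": [2, 3, 3, 4, 5, 4, 3, 5],
    "price": [150, 200, 210, 260, 400, 310, 220, 370],
})

# 1. Data preparation: fill the missing area with the column median.
df["area_sqm"] = df["area_sqm"].fillna(df["area_sqm"].median())

# 2. Split features (X) and target (y) by selecting columns.
X = df[["area_sqm", "rooms"]]  # DataFrame of input columns
y = df["price"]                # Series holding the column to predict

# 3. Split into training and test sets; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

print(X_train.shape, X_test.shape)
```

`train_test_split()` returns Pandas objects when given Pandas objects and keeps the original index, so the rows of `X_train` and `y_train` stay aligned. scikit-learn estimators accept DataFrames directly; for libraries that expect plain arrays, `X_train.to_numpy()` is one way to convert.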