Plant stage type prediction
- amandathying
- Jan 2
Updated: Jan 9
Context: The task was to take a dataset provided in .db format, perform exploratory data analysis (EDA), and build a model to predict a plant's stage type using sensor data and the optimal temperature for each plant stage type.
Exploratory Data Analysis
The main objectives of the EDA were to:
- Understand the data provided and get an idea of which attributes might contribute to an accurate prediction model
- Clean and process any abnormalities or missing values
- Convert data types so that they are suitable for machine learning models
- Experiment with various models to narrow down the final selection
The EDA process can be viewed here.
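As a rough illustration of the cleaning step, here is a minimal sketch of how missing values could be counted directly from a .db file, assuming it is a SQLite database. The table name sensor_readings, its columns, and the sample rows are hypothetical stand-ins built in memory; they are not the actual assessment data.

```python
import sqlite3


def count_missing(conn, table):
    """Return {column_name: number_of_NULL_values} for one table."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    missing = {}
    for col in columns:
        query = f'SELECT COUNT(*) FROM "{table}" WHERE "{col}" IS NULL'
        missing[col] = conn.execute(query).fetchone()[0]
    return missing


# Hypothetical stand-in for the real .db file, built in memory for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sensor_readings (temperature REAL, humidity REAL, stage TEXT)"
)
conn.executemany(
    "INSERT INTO sensor_readings VALUES (?, ?, ?)",
    [(24.5, 60.0, "seedling"), (None, 62.5, "vegetative"), (26.1, None, None)],
)

missing_counts = count_missing(conn, "sensor_readings")
print(missing_counts)
```

With a real file, the connection would point at the file path instead of ":memory:", and the counts would show which columns need imputation or row removal before modelling.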
Through the EDA, I found that XGBoost performed best at predicting plant stage type from sensor data, so this was the model used in developing the app.


End to end machine learning pipeline
Using XGBoost, I created an app package, which has been uploaded to GitHub.
The files can be pulled and run locally. Below is a screenshot of the web app.
Afterthoughts:
This was done as part of an assessment within a short time frame. After reviewing it, there are some things that I would do to improve it.
EDA:
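I should have checked how skewed the various numerical variables were before testing for their association. Instead, I assumed that because it is a large data set, the variables would be normally distributed, hence the use of ANOVA. A large sample does not by itself make the data normal, so this assumption needed to be checked.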
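A skewness check of this kind could look roughly like the sketch below. It computes the Fisher-Pearson skewness coefficient in plain Python; the two sample lists are made up for illustration and are not taken from the sensor data.

```python
def skewness(values):
    """Fisher-Pearson skewness coefficient: g1 = m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant column: no spread, so no skew to report
    return m3 / m2 ** 1.5


# Made-up samples for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

skew_symmetric = skewness(symmetric)
skew_right = skewness(right_skewed)
print(skew_symmetric, skew_right)
```

A value near zero suggests a roughly symmetric distribution, while a clearly positive or negative value suggests that a transformation or a rank-based test may be more appropriate than ANOVA for that variable.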
For numerical data, I used Spearman correlation because it is less sensitive to outliers than Pearson correlation. The results show which attributes have a monotonic relationship with one another. I could take it a step further and check for non-monotonic relationships.
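To show why a further check would help, here is a small sketch of Spearman correlation written in plain Python (Pearson correlation applied to average ranks). The data is invented for illustration: one relationship is monotonic and the other is U-shaped, and Spearman only picks up the first.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented data: a monotonic relationship and a U-shaped one.
x = [-3, -2, -1, 0, 1, 2, 3]
monotonic = [v ** 3 for v in x]
u_shaped = [v ** 2 for v in x]

rho_monotonic = spearman(x, monotonic)
rho_u_shaped = spearman(x, u_shaped)
print(rho_monotonic, rho_u_shaped)
```

The U-shaped pair is fully dependent yet scores close to zero, which is the kind of relationship a Spearman-only analysis would miss.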

