Model name: Crank validity model
Goal: Use features of the cranking voltage profile, derived from the time series voltage data sent by GO devices, to determine whether a cranking event is valid or invalid.
Base model: Gradient boosting classifier
Model type: Supervised, binary classification
Model version: 14
Developed by: Vehicle maintenance analytics team
How can I use this product? This model runs in our Google BigQuery environment to remove invalid events before they reach downstream use cases in the Maintenance center. Because it is a data quality check, end users do not interact with the model or see any direct outcomes in their MyGeotab account.
Primary intended uses:
Out-of-scope uses: The following devices do not transmit voltage data during an event or do not provide sufficient data quality for accurate predictions. Therefore, the model should not be used to classify cranking events from the following devices:
Targeted users/user groups: Users who work with the voltage profile of cranking events, especially for predictive maintenance that identifies potential battery or electrical system failures, can use the model's output to filter out erroneous data.
Factors impacting model performance: The accuracy of the model varies by vehicle type (make, model, fuel type). An evaluation on the test dataset showed that some vehicle types have lower accuracy than others. These categorical attributes are not included as features in the model.
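The per-vehicle-type comparison described above can be sketched as a grouped accuracy calculation. This is only an illustration: the record layout, the field names (`make`, `model`, `fuel_type`, `label`, `prediction`), the helper `accuracy_by_group`, and all values are hypothetical and are not taken from the actual evaluation.

```python
from collections import defaultdict

# Hypothetical evaluation records (1 = valid crank, 0 = invalid crank).
# All values are made up for illustration.
records = [
    {"make": "A", "model": "X", "fuel_type": "gas", "label": 1, "prediction": 1},
    {"make": "A", "model": "X", "fuel_type": "gas", "label": 0, "prediction": 0},
    {"make": "A", "model": "X", "fuel_type": "gas", "label": 1, "prediction": 0},
    {"make": "A", "model": "X", "fuel_type": "gas", "label": 1, "prediction": 1},
    {"make": "B", "model": "Y", "fuel_type": "diesel", "label": 1, "prediction": 0},
    {"make": "B", "model": "Y", "fuel_type": "diesel", "label": 0, "prediction": 0},
]


def accuracy_by_group(rows, keys=("make", "model", "fuel_type")):
    """Return {group: (accuracy, sample_count)} for each combination of keys."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group] += 1
        correct[group] += int(row["label"] == row["prediction"])
    return {group: (correct[group] / totals[group], totals[group]) for group in totals}


result = accuracy_by_group(records)
for group, (accuracy, count) in sorted(result.items()):
    print(group, f"accuracy={accuracy:.2f}", f"n={count}")
```

Reporting the sample count next to each accuracy helps separate groups that truly perform worse from groups that simply have too few samples to judge.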
In this section, we highlight the ethical challenges encountered during model development, including bias and fairness considerations, and present how we addressed them. We also state the assumptions and constraints of the model, including any limitations in the data or the model's scope that could affect its performance. Understanding the model's strengths and limitations is crucial for stakeholders to use the model responsibly and interpret its results correctly.
The model assumes that Make, Model, and Fuel Type are the key categorical features. If another relevant categorical feature, such as Year, should also have been considered, its exclusion may have led to an inaccurate count of the number of samples per group in the training and evaluation datasets.
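To illustrate why the choice of grouping features matters, the sketch below counts samples per group with and without a year field. The data, the field names, and the helper `group_sizes` are hypothetical and only demonstrate the effect, not the actual datasets.

```python
from collections import Counter

# Hypothetical samples; values are made up for illustration.
samples = [
    {"make": "A", "model": "X", "fuel_type": "gas", "year": 2015},
    {"make": "A", "model": "X", "fuel_type": "gas", "year": 2015},
    {"make": "A", "model": "X", "fuel_type": "gas", "year": 2021},
]


def group_sizes(rows, keys):
    """Count how many samples fall into each combination of the given keys."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


without_year = group_sizes(samples, ("make", "model", "fuel_type"))
with_year = group_sizes(samples, ("make", "model", "fuel_type", "year"))

print(dict(without_year))
print(dict(with_year))
```

In this toy data, a group that looks like it has three samples splits into groups of two and one once year is included, so a group that appears adequately represented may not be.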
The model was evaluated on unseen test data using three performance metrics: accuracy, precision, and recall.
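For reference, the three metrics can be computed from true and predicted labels as in the minimal sketch below. It assumes that "valid" is encoded as 1 and treated as the positive class, which the model card does not specify. The function name `binary_metrics` and the example labels are illustrative only.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision and recall, with label 1 as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Fall back to 0.0 when a denominator is zero (an assumed convention).
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels: 1 = valid crank, 0 = invalid crank.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.75, precision 0.8, recall 0.8
```

Under this encoding, precision is the share of events predicted valid that really are valid, and recall is the share of truly valid events that the model keeps.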