Best way to represent a version feature based on percentiles

We're training a binary classifier in AutoML, and one of the features consists of browser versions. Currently, these versions are passed to the model in normalized form, based on which percentiles of that browser's version distribution the current observation has passed. For example, if the version percentiles for a specific browser are:

percentile  version
p25         34
p50         45
p75         53
p99         70

then an observation with said browser and version=54 would be represented as:

p25  p50  p75  p99
1    1    1    0
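
To make the current encoding concrete, here is a minimal sketch that reproduces the row above. The thresholds come from the example table; the name `encode_thresholds` is hypothetical, and the strict "greater than" comparison is an assumption about how ties at a threshold are handled.

```python
# Percentile thresholds for one browser, copied from the example table.
THRESHOLDS = {"p25": 34, "p50": 45, "p75": 53, "p99": 70}

def encode_thresholds(version, thresholds=THRESHOLDS):
    """Return one 0/1 flag per percentile.

    A flag is 1 when the version is strictly greater than that
    percentile's threshold (assumed tie rule), else 0.
    """
    return {name: int(version > cutoff) for name, cutoff in thresholds.items()}

print(encode_thresholds(54))
```

With version=54, the flags for p25, p50 and p75 are set and the flag for p99 is not.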

My question is: wouldn't it be better to provide a single integer feature, called percentile_version, that holds the index of the highest percentile the version has passed? The previous example would then be represented as:

percentile_version
3

This is because the observation's version is greater than the first 3 percentiles. The set of percentiles to check would be fixed, of course.
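
A minimal sketch of the proposed single-integer feature, assuming the same thresholds as in the example and a strict "greater than" comparison; the function name `percentile_version` mirrors the proposed feature name and `CUTOFFS` is a hypothetical constant.

```python
from bisect import bisect_left

# Sorted thresholds for p25, p50, p75, p99 from the example table.
CUTOFFS = [34, 45, 53, 70]

def percentile_version(version, cutoffs=CUTOFFS):
    """Count how many thresholds the version is strictly greater than.

    bisect_left returns the number of cutoffs that are < version,
    which is the index of the highest percentile passed (0 if none).
    """
    return bisect_left(cutoffs, version)

print(percentile_version(54))  # 3
```

Because the per-percentile flags are cumulative, this integer is simply their sum, so both representations carry the same information; they differ only in whether the model receives it as one ordered column or as several binary columns.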
