Learning the Average of a 0/1 Dependent Variable

Suppose I have a matrix X and a dependent vector y whose entries are each

  1. in {0,1}

  2. dependent on the corresponding row of X

Given this dataset, I'd like to learn a model so that, given some other dataset X', I could predict average(y') of its dependent-variable vector y'. Note that I'm only interested in the response at the aggregate level of the entire dataset.

One way of doing so would be to train a calibrated binary classifier X → y, apply it to X', and average the predictions (see the sketch below). However, the first step addresses a more difficult problem, namely predicting each of the dependent variables individually, which I then essentially reduce to their average. I'm wondering whether there is an alternative approach that is somehow better (e.g., less prone to overfitting), or a theoretical explanation of why no such alternative exists.
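For concreteness, here is a minimal sketch of that baseline, assuming Python with scikit-learn, a logistic-regression base model, and synthetic data standing in for X and y (all of these are illustrative choices, not part of the question itself):

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for X (feature matrix) and y (0/1 dependent vector).
X = rng.normal(size=(1000, 5))
y = (X @ rng.normal(size=5) + rng.normal(scale=0.5, size=1000) > 0).astype(int)

# Pretend the held-out part is the "other dataset" X' with unseen labels y'.
X_train, X_new, y_train, y_new = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Step 1: train a calibrated binary classifier x -> P(y = 1 | x).
clf = CalibratedClassifierCV(
    LogisticRegression(max_iter=1000), method="isotonic", cv=5
)
clf.fit(X_train, y_train)

# Step 2: average the predicted probabilities over X' to estimate average(y').
estimated_mean = clf.predict_proba(X_new)[:, 1].mean()
print(f"estimated average(y'): {estimated_mean:.3f}")
print(f"actual    average(y'): {y_new.mean():.3f}")
```

The averaging step is the only place the aggregate target enters; everything before it solves the harder per-row prediction problem, which is exactly the detour the question asks whether one can avoid.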

Topic: theory, aggregation, classification

Category: Data Science
