Going from voting per district + census data, to voting per age?

I have some voting or polling data listed by voting district, and I also have detailed demographics for each district. How can I combine the two to estimate how the different demographic groups voted? I want to be able to chart percent "yes" against age or income bracket. (In the end, I want to use these relationships to predict the outcome in a place with different demographics.)

One approach I've seen is to treat it as a classification problem: give each district input variables such as % male, % young and so on, train a classifier such as a boosted decision tree (BDT), and use it to predict voting outcomes for other demographic mixes. The problem I see is that this treats the whole district as one data point, so I can only indirectly get distributions of how each demographic voted. (For an example, see here: https://towardsdatascience.com/understanding-voting-outcomes-through-data-science-5d257b51ae5c)
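To make the district-as-one-data-point setup concrete, here is a minimal sketch. It swaps the BDT for plain linear least squares so it needs only NumPy, and it runs on synthetic stand-in data: the feature names, the coefficients used to generate the data, and the helper `predict_yes_share` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (hypothetical): 200 districts, 3 demographic shares.
# Columns: share_male, share_under_30, share_high_income.
n_districts = 200
X = rng.uniform(0.1, 0.9, size=(n_districts, 3))

# Made-up "true" relationship, used only to generate example outcomes.
true_w = np.array([0.10, 0.35, -0.20])
yes_share = 0.4 + X @ true_w + rng.normal(0, 0.02, n_districts)

# One row per district: each district is a single data point.
A = np.column_stack([np.ones(n_districts), X])  # prepend an intercept column
coef, *_ = np.linalg.lstsq(A, yes_share, rcond=None)

def predict_yes_share(shares):
    """Predict the yes share of a district from its demographic shares."""
    shares = np.asarray(shares, dtype=float)
    return float(coef[0] + shares @ coef[1:])

# A district with a different demographic mix (hypothetical values).
new_district = [0.5, 0.6, 0.3]
pred = predict_yes_share(new_district)
print("fitted coefficients:", np.round(coef, 3))
print("predicted yes share:", round(pred, 3))
```

The fitted coefficients describe how the district-level yes share moves with district composition. They do not say directly how individuals in each group voted, which is the limitation described above.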

I guess another approach would be to randomly generate data points in the form of pseudo voters. The benefit would be that I could get not only single distributions (vote vs. age) but also multidimensional ones (vote vs. age for different ethnicities). But I don't think the source data gives me that much information, and I wouldn't even know how to create the pseudo data. It would probably also be prohibitively expensive computationally.
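One possible reading of the pseudo-voter idea, as a rough sketch only: draw each pseudo voter's age group from the district's age shares, and draw the vote from the district's overall yes share. That second step assumes everyone in a district votes "yes" with the same probability regardless of age, which is a strong assumption made here only because the district totals give nothing finer. All numbers and names below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

age_groups = ["18-29", "30-49", "50+"]

# Hypothetical inputs: per-district age shares (rows sum to 1),
# per-district yes share, and number of pseudo voters to draw.
age_shares = np.array([
    [0.5, 0.3, 0.2],
    [0.2, 0.4, 0.4],
    [0.1, 0.3, 0.6],
])
yes_share = np.array([0.70, 0.50, 0.35])
n_voters = np.array([1000, 1000, 1000])

ages, votes = [], []
for shares, p_yes, n in zip(age_shares, yes_share, n_voters):
    # Age group index drawn from the district's census marginal.
    ages.append(rng.choice(len(age_groups), size=n, p=shares))
    # ASSUMPTION: within a district, the vote is independent of age.
    votes.append(rng.random(n) < p_yes)

ages = np.concatenate(ages)
votes = np.concatenate(votes)

# Pool all pseudo voters and compute the yes fraction per age group.
yes_by_age = {g: float(votes[ages == i].mean()) for i, g in enumerate(age_groups)}
print(yes_by_age)
```

Any age trend in `yes_by_age` comes purely from differences between districts, so this does not recover within-district differences between groups. Drawing a second attribute such as ethnicity the same way would additionally require assuming how it is jointly distributed with age inside each district.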

This seems like a very standard thing to want to do, but I can't recall what the go-to technique is. (The reverse, combining census data with voting per demographic to predict results in a given place, seems straightforward.) Any suggestions?

