Summarize events per ID
Data: Each row corresponds to an event (a person's visit to the hospital, for example). Each event has a set of associated attributes (duration of visit, motive, etc.).
Objective: Summarize the above information in a per-person data set (meaning that the new data set should have only one row per person and capture as much information about their history as possible).
My initial solutions: 1 - The most obvious, and potentially useful, is to create relevant variables by hand. For instance, if the objective is to predict the duration of the next visit, the average duration of past visits is relevant. However, this is very problem-specific, and I feel there should be other options to complement it (not replace it).
2 - Recurrent neural networks. As visits form a time sequence, it seems logical to apply a recurrent autoencoder in order to summarize this data. (If this approach is not sound, can you point out why?)
3 - What if I don't have a time sequence?
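To make option 1 concrete, here is a minimal sketch of hand-made per-person aggregates, using only the Python standard library. The field names (`person_id`, `duration_min`, `motive`), the sample records, and the helper `summarize_per_person` are all hypothetical and chosen for illustration only; they are not taken from my actual data.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical event log: one dict per visit (field names are assumptions).
events = [
    {"person_id": "A", "duration_min": 30, "motive": "checkup"},
    {"person_id": "A", "duration_min": 50, "motive": "injury"},
    {"person_id": "B", "duration_min": 20, "motive": "checkup"},
    {"person_id": "A", "duration_min": 40, "motive": "checkup"},
]


def summarize_per_person(events):
    """Collapse an event log into one summary row (dict) per person."""
    # Group the visits by person.
    by_person = defaultdict(list)
    for event in events:
        by_person[event["person_id"]].append(event)

    # Compute a few hand-picked aggregates for each person.
    summary = {}
    for person_id, visits in by_person.items():
        durations = [v["duration_min"] for v in visits]
        motive_counts = Counter(v["motive"] for v in visits)
        summary[person_id] = {
            "n_visits": len(visits),
            "mean_duration_min": mean(durations),
            "max_duration_min": max(durations),
            "n_distinct_motives": len(motive_counts),
            "top_motive": motive_counts.most_common(1)[0][0],
        }
    return summary


if __name__ == "__main__":
    for person_id, row in summarize_per_person(events).items():
        print(person_id, row)
```

Note that the aggregates in this sketch (counts, means, maxima, most frequent value) do not depend on the order of the visits, so the same kind of summary still applies in the case of point 3, where there is no time sequence.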