Can longitudinal studies be completely anonymous?
A regular digital questionnaire can be completely anonymous: you send out a non-personalized URL and neither ask for nor store identifying information (such as the user's IP address or date of birth). By this I mean that, as the researcher, I am unable to later identify who filled out a questionnaire, even if I wanted to.
I now have a longitudinal study with 4 waves of questionnaires, each one year apart. Each consecutive wave must only be sent to participants who filled out the previous wave(s). Ideally, the study should be completely anonymous to both the data collector and the data evaluator, as it is about mental health. However, I feel like this is not possible, because identifying which person filled out a previous questionnaire requires attaching some kind of identifying information to the answer set. I figure that without this information, I would need to send every wave to every possible participant, every year.
I found a paper by Audette et al. (2019) that discusses different methods for anonymizing longitudinal studies. Interestingly, I feel like none of the four described methods (nonanonymous data later de-identified, preexisting unique identifiers, electronic anonymizing system, self-generated identification codes) is truly anonymous, as the person who filled out a questionnaire can theoretically be identified by someone with access to enough data about the questionnaires and their answers.
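To make my concern concrete, here is a minimal sketch of how I understand a self-generated identification code to work. The code elements (mother's initial, day of birth, street initial), the function name `sgic`, and the use of a truncated SHA-256 hash are my own assumptions for illustration, not something taken from the paper.

```python
import hashlib

def sgic(mother_initial: str, birth_day: int, street_initial: str) -> str:
    """Hypothetical self-generated identification code built from stable personal answers."""
    # Normalize the answers so small typing differences do not break the link.
    raw = f"{mother_initial.strip().upper()}{birth_day:02d}{street_initial.strip().upper()}"
    # Hashing hides the raw answers, but the same inputs always yield the same
    # code, which is exactly what keeps answer sets linkable across waves.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]

# Answer sets keyed by the code the participant generated in each wave (made-up data).
wave1 = {sgic("m", 7, "b"): {"score": 14}}
wave2 = {sgic("M", 7, "B "): {"score": 11}}

# Link the two waves on the shared code.
linked = {code: (wave1[code], wave2[code]) for code in wave1.keys() & wave2.keys()}
```

The link works without a name or e-mail address, but anyone who knows the three answers for a given person can recompute the code, and with so few possible input combinations the hash could also be reversed by simply trying them all.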
I have a hunch that there is some underlying concept in data science or computer science that captures my feeling that a longitudinal study can never be truly anonymous, but I am not 100% sure. Is this the case?
Topic anonymization
Category Data Science