There are many counter-examples to using temporal data from the year before to infer the missing temporal values of the year after. I suggest you take a look at the Darts package, which is tailored to time series.
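As a rough illustration of what that looks like in Darts (not your actual pipeline: the toy monthly data, the column names `date`/`value`, the gap length `m = 6`, and the `ExponentialSmoothing` model are all placeholder choices), you can fit a forecasting model on the known history and ask it to forecast the gap:

```python
# Placeholder data and model choices, for illustration only.
import numpy as np
import pandas as pd
from darts import TimeSeries
from darts.models import ExponentialSmoothing

# Toy monthly history standing in for the known values before the gap.
rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", periods=48, freq="MS")
values = 10 + np.sin(np.arange(48) * 2 * np.pi / 12) + rng.normal(0, 0.1, 48)
df = pd.DataFrame({"date": dates, "value": values})

# Wrap the known history in a Darts TimeSeries and fit a forecasting model.
series = TimeSeries.from_dataframe(df, time_col="date", value_cols="value")
model = ExponentialSmoothing()
model.fit(series)

# Forecast the m missing values that follow the known history.
m = 6
imputed = model.predict(m)
print(imputed.values().flatten())
```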
As a suggestion, say that you have to infer $m$ consecutive missing values; you can proceed as follows. Suppose that you have trained a forecasting model $f(\cdot)$ that forecasts the $(n+1)$-th value, say $\hat{v}$, from a generic sequence of $n$ values, say $\langle v_1,v_2,\ldots,v_n \rangle$, that is:
$$
\hat{v} = f(\langle v_1,v_2,\ldots,v_n \rangle).
$$
To predict the first missing value, say $\hat{v}_1$, out of $m$, call:
$$
\hat{v}_1 = f(\langle v_1,v_2,\ldots,v_n \rangle)
$$
where the sequence $\langle v_1,v_2,\ldots,v_n \rangle$ represents the last $n$ known values before the first missing value. Now, recursively, having the predicted sequence $\langle \hat{v}_1, \ldots, \hat{v}_{i-1} \rangle$, one can predict the $i$-th missing value out of $m$, for $1 < i \le m$, by calling:
$$
\hat{v}_i = f(\langle v_i,v_{i+1},\ldots,v_n,\hat{v}_1,\hat{v}_2,\ldots,\hat{v}_{i-1} \rangle).
$$
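To make the recursion concrete, here is a small Python sketch of the loop. The `one_step_forecast` function is only a stand-in for your trained model $f(\cdot)$ (a naive window mean, just so the example runs); the window size $n$, the gap length $m$, and the toy data are placeholder choices.

```python
import numpy as np

def one_step_forecast(window):
    """Stand-in for a trained model f(.): here just the mean of the last n values."""
    return float(np.mean(window))

def recursive_impute(known, n, m, forecast=one_step_forecast):
    """Fill m consecutive missing values that follow `known`, feeding each new
    prediction back into the n-value input window of the next step."""
    history = list(known[-n:])           # last n known values before the gap
    predictions = []
    for _ in range(m):
        v_hat = forecast(history[-n:])   # predict the next value from the last n
        predictions.append(v_hat)
        history.append(v_hat)            # the prediction joins the window
    return predictions

# Example: 24 observed values followed by a gap of m = 5 missing values.
observed = np.sin(np.arange(24) / 3.0)
print(recursive_impute(observed, n=12, m=5))
```

Note that many forecasting models (including several in Darts) perform this kind of rolling multi-step prediction internally, so you often get the same behaviour simply by calling `predict(m)` with a horizon larger than one.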
There are pros and cons to this approach. An advantage is that we do not need to assume any explicit model of the missing data and infer from it; we only reuse the forecasting model. A disadvantage is that the error accumulates as we incrementally infer the missing values, since each new prediction is conditioned on earlier predictions (i.e., on inferred missing values) rather than on observed data; that is, the larger $m$ is, the larger the error on the last imputed values tends to be.