One major difference is that the F1-score does not depend at all on how many negative examples you classified correctly, or on how many negative examples are in the dataset; the balanced accuracy metric, on the other hand, gives half its weight to how many positives you labeled correctly and half to how many negatives you labeled correctly.
When you are working on a heavily imbalanced dataset AND you care more about detecting positives than detecting negatives (as in outlier detection / anomaly detection), you would prefer the F1-score.
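For reference, writing TP/FP/TN/FN for the confusion-matrix counts, the two metrics are defined as

$F_1 = 2 * \frac{Precision * Recall}{Precision + Recall}$ with $Precision = \frac{TP}{TP+FP}$ and $Recall = \frac{TP}{TP+FN}$

$Balanced\ Acc = \frac{1}{2}(\frac{TP}{TP+FN} + \frac{TN}{TN+FP})$

Notice that TN appears only in the balanced accuracy formula.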
Let's say, for example, you have a validation set that contains 1000 negative samples and 10 positive samples. Suppose a model labels 15 examples as positive (5 of them truly positive and 10 of them actually negative) and labels the rest as negative, so that
TP=5; FP=10; TN=990; FN=5
Then its F1-score and balanced accuracy will be
$Precision = \frac{5}{15}=0.33...$
$Recall = \frac{5}{10}= 0.5$
$F_1 = 2 * \frac{0.5*0.33}{0.5+0.33} = 0.4$
$Balanced\ Acc = \frac{1}{2}(\frac{5}{10} + \frac{990}{1000}) = 0.745$
You can see that balanced accuracy still cares about the negative datapoints unlike the F1 score.
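If you want to verify these numbers yourself, here is a minimal Python sketch that computes both metrics straight from the confusion-matrix counts (the helper name `f1_and_balanced_acc` is just for illustration):

```python
def f1_and_balanced_acc(tp, fp, tn, fn):
    """Compute F1-score and balanced accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    f1 = 2 * precision * recall / (precision + recall)
    balanced_acc = (recall + specificity) / 2
    return f1, balanced_acc

# First scenario: 1000 negatives, 10 positives, 15 examples predicted positive
f1, bal_acc = f1_and_balanced_acc(tp=5, fp=10, tn=990, fn=5)
print(f"F1 = {f1:.2f}, balanced accuracy = {bal_acc:.3f}")
# F1 = 0.40, balanced accuracy = 0.745
```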
For even more analysis we can see what the change is when the model gets exactly one extra positive example correctly and one negative sample incorrectly:
TP=6; FP=9; TN=989; FN=4
$Precision = \frac{6}{15}=0.4$
$Recall = \frac{6}{10}= 0.6$
$F_1 = 2 * \frac{0.6*0.4}{0.6+0.4} = 0.48$
$Balanced\ Acc = \frac{1}{2}(\frac{6}{10} + \frac{989}{1000}) = 0.795$
Correctly classifying one extra positive example increased the F1-score by 0.08 (from 0.40 to 0.48) but the balanced accuracy by only 0.05 (from 0.745 to 0.795), so the F1-score rewards the extra true positive a bit more.
Finally, let's look at what happens when the model still labels 15 examples as positive (5 truly positive and 10 actually negative), but this time the dataset is balanced and contains exactly 10 positive and 10 negative examples:
TP=5; FP=10; TN=0; FN=5
$Precision = \frac{5}{15}=0.33...$
$Recall = \frac{5}{10}= 0.5$
$F_1 = 2 * \frac{0.5*0.33}{0.5+0.33} = 0.4$
$Balanced\ Acc = \frac{1}{2}(\frac{5}{10} + \frac{0}{10}) = 0.25$
You can see that the F1-score did not change at all compared to the first example, while the balanced accuracy took a massive hit (dropping from 0.745 to 0.25).
This shows that the F1-score depends only on the points the model labeled as positive and the points that actually are positive, and does not care at all about the points that are negative.
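As a sanity check, one could also reproduce the first and third scenarios with scikit-learn's built-in metrics; the `make_labels` helper below is just a hypothetical convenience for turning counts into label vectors:

```python
import numpy as np
from sklearn.metrics import f1_score, balanced_accuracy_score

def make_labels(tp, fp, tn, fn):
    """Build y_true / y_pred vectors that realize the given confusion-matrix counts."""
    y_true = np.array([1] * (tp + fn) + [0] * (fp + tn))
    y_pred = np.array([1] * tp + [0] * fn + [1] * fp + [0] * tn)
    return y_true, y_pred

# Imbalanced dataset (1000 negatives) vs balanced dataset (10 negatives),
# with the same positive predictions in both cases.
for name, counts in [("imbalanced", dict(tp=5, fp=10, tn=990, fn=5)),
                     ("balanced",   dict(tp=5, fp=10, tn=0,   fn=5))]:
    y_true, y_pred = make_labels(**counts)
    print(name,
          "F1 =", round(f1_score(y_true, y_pred), 3),
          "balanced acc =", round(balanced_accuracy_score(y_true, y_pred), 3))
# imbalanced F1 = 0.4 balanced acc = 0.745
# balanced   F1 = 0.4 balanced acc = 0.25
```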