Use more than one query, but prevent duplicates - at scale!

Setup

I have a Gutenberg block (built with ACF) that lets users define conditions (e.g. a tag, a date range, and a number of posts) and displays the posts matching them. For example: "Show the newest 5 posts" or "Show posts between June 1st and June 20th with the tag 'car'". Users can create as many of these blocks as they want; I then parse each block's conditions into a WP_Query and output the posts.

Problem

It can happen, though, that one or more posts are displayed more than once in such a setup, especially if a lot of these blocks are used. For example, the 5 newest posts and the aforementioned posts with the tag 'car' might have 3 posts in common. I would like to prevent the same post from being shown more than once on a page.

Solution 1: post__not_in + transient or option

This is no problem if I use post__not_in in every WP_Query after the first and save the IDs returned by each WP_Query into a transient or an option. But not only does this not feel like the right way to do it, it also does not scale. With a lot of posts that can be displayed under varying conditions on a single page, performance is an issue, and post__not_in seems to be a no-go when it comes to performance.

Solution 2: post__not_in + cached results

Of course, there is always the option to use post__not_in regardless: push the IDs of all queries into some custom object, save it in a transient, and use that on subsequent calls. Here, too, it feels as if I am working around a problem for which there is a better solution that I am not seeing. (Also, pity the soul that requests the page when the transient has expired.)

Question

Can anyone help out? (No code needed, although it often helps; I am just interested in the way to do this/to solve this problem.)


I don't see any problem with passing post__not_in to the query. Yes, it might not look like clever code, but sometimes the only way to implement some functionality is with ugly code.
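To make the post__not_in route concrete, here is a minimal sketch. It assumes that all blocks of a page are rendered within the same PHP request, so the list of shown IDs can live in a static property instead of a transient or an option. The names `Shown_Posts` and `block_query_args` are made up for illustration; they are not WordPress APIs.

```php
<?php
// Hypothetical helper (not part of WordPress): remembers which post IDs
// have already been output during the current request.
final class Shown_Posts {
    /** @var array<int, true> IDs stored as keys, so duplicates collapse. */
    private static $ids = array();

    public static function add( array $post_ids ) {
        foreach ( $post_ids as $id ) {
            self::$ids[ (int) $id ] = true;
        }
    }

    public static function all() {
        return array_keys( self::$ids );
    }
}

// Hypothetical helper: merges the IDs shown so far into the block's
// query arguments, keeping any exclusions the block already defined.
function block_query_args( array $args ) {
    $existing = isset( $args['post__not_in'] ) ? (array) $args['post__not_in'] : array();

    $args['post__not_in'] = array_values(
        array_unique( array_merge( $existing, Shown_Posts::all() ) )
    );

    return $args;
}

// Inside the block's render callback (WordPress context, not executed here):
//   $query = new WP_Query( block_query_args( $args_from_block ) );
//   Shown_Posts::add( wp_list_pluck( $query->posts, 'ID' ) );
```

Nothing is persisted between requests in this sketch, so there is no expired transient to rebuild; whether the resulting NOT IN clause is fast enough on your data is something you would still have to measure.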

Regarding the VIP page: meh, it is a specific edge case which might not fit your situation (are you building a widget that appears on every page?). Even if core WordPress caching does not handle this well, you can (and probably should!) implement your own caching around those queries. In any case, since you might have several queries to handle, it is not clear how applicable the solution suggested in the VIP article is. To implement such a solution (filtering posts on the PHP side), each query would have to be prepared to fetch as many posts as all the queries together display, and this brings a performance penalty of its own.
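For comparison, the PHP-side filtering could look roughly like the sketch below. The function name `pick_unseen_ids` is hypothetical, and the over-fetch shown in the trailing comment (asking for the wanted number of posts plus the number already shown) is my reading of the approach, not code taken from the VIP article.

```php
<?php
/**
 * Hypothetical helper: returns up to $wanted IDs from $candidate_ids
 * that are not in $already_shown, keeping the candidates' order.
 */
function pick_unseen_ids( array $candidate_ids, array $already_shown, $wanted ) {
    if ( $wanted <= 0 ) {
        return array();
    }

    // Flip to an ID => index map so each lookup is a cheap isset().
    $seen   = array_flip( array_map( 'intval', $already_shown ) );
    $picked = array();

    foreach ( $candidate_ids as $id ) {
        $id = (int) $id;
        if ( isset( $seen[ $id ] ) ) {
            continue; // Already shown elsewhere, or a repeat within this list.
        }
        $seen[ $id ] = true;
        $picked[]    = $id;
        if ( count( $picked ) >= $wanted ) {
            break;
        }
    }

    return $picked;
}

// In WordPress (not executed here) the candidates would come from an
// over-fetching query without post__not_in, for example:
//   $query = new WP_Query( array(
//       'fields'         => 'ids',
//       'posts_per_page' => $wanted + count( $already_shown ),
//       'no_found_rows'  => true,
//   ) + $args_from_block );
//   $ids = pick_unseen_ids( $query->posts, $already_shown, $wanted );
```

The over-fetch is where the penalty comes from: the more posts earlier blocks have shown, the more rows every later query has to pull just to throw most of them away.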

PS: I would ask again whether the posts actually need to be unique and what harm is caused if they are not. Does the added code complexity actually serve any useful purpose?
