Hi wp-hackers,
thank you so much for your availability. I will contact you in private shortly.
With Dinos permission Im sharing the email he sent me and my reply -
Hi Dino,
thank you again for your time. Attached you can find the definition for the tables used by my plugin, for your reference.
It's basically a real time analytics tool for WordPress. The main table is wp_slim_stats, which contains all the pageviews.
In a MU environment, the plugin will create its tables for each new site (wp_1_slim_stats, wp_2_slim_stats, etc). Site admins can access reports for their own blogs.
SELECT t1.*, tb.*,tci.*
FROM wp_slim_stats t1
INNER JOIN wp_slim_browsers tb ON t1.browser_id = tb.browser_id
INNER JOIN wp_slim_content_info tci ON t1.content_info_id = tci.content_info_id
ORDER BY t1.dt desc
LIMIT 0, 50
The corresponding explain is attached as explain-select-1.png.
So here is how this is working:
MySQL examines table t1 (wp_slim_stats) first and applies a sort on dt (will be expensive if there are a lot of rows).
From the largest dt value until 50 is reached, it will very efficiently do a primary key lookup on tb (wp_slim_browsers) and then tci (wp_slim_content_info). As long as there is full participation (INNER JOIN always finds rows in this table, the joining is efficient).
Short story:
I think you need an index on wp_slim_stats.dt to avoid the sort - but otherwise its an efficient query.
Now, some users have requested to get a Network View of their reports, which basically combines all the pageviews to see what post is the most popular network-wide, etc.
In order to do that, I was thinking about two approaches: 1) do a UNION ALL of all those selects or 2) replace all the blog-specific tables with a global wp_slim_stats that includes a new column blog_id.
The latter seems to be easier to implement, but what about performances? Think about a network with 100 blogs that make 100k pageviews/month. The new global table would have 100x100k = 10M records.
You would not be UNIONing ALL millions of rows though, youd be UNIONing top 50 from all tables right? As long as you can use the indexes (like the query you specified), then the number of rows in the table doesnt matter. I think as long as it is efficient enough, this is probably the more desired option?
In MySQL until 5.7 (currently in developer preview) UNION ALL needs to creates a temporary table to buffer results before sending them to the client (so there is a small level of inefficiency), so make sure that each query being unioned doesnt have too many rows. More information on this optimization on my blog here:
http://www.tocker.ca/2013/12/10/testing-the-union-all-optimization-in-mysql-5.7-dmr3.html