Discussion:
get_users() exhausts memory with 'all_with_meta' argument
Mike Walsh
2014-01-03 15:10:27 UTC
Permalink
I am working with a couple users who use the Email Users plugin where
memory is being exhausted when calling get_users() with the 'fields'
argument set to 'all_with_meta'. In both cases, the sites have a large
number of users (one has 13k, the other 4.5k). In both cases a call to
get_users() without any arguments succeeds but fails when the
'all_with_meta' option is used.

Anyone run into this before? If so, how did you resolve it?

So this works:

$u = get_users();

But this fails:

$u = get_users(array('fields' => 'all_with_meta'));

I've asked one user to add the following to their wp-config.php to see
if it makes a difference:

define('WP_MEMORY_LIMIT', '512M');

It may work for this site; however, I am not sure whether it actually fixes
the problem or simply delays a future failure (if it helps at all).

Mike
--
Mike Walsh - ***@gmail.com
Nikola Nikolov
2014-01-03 16:01:47 UTC
Permalink
Do you/they really need all of the meta fields for all users? Since WP_User
can load meta field values as properties of the user object (or by using
$user->get( $meta_key )), I don't see why you would need to load all of
the meta fields into memory.

The problem is that they have lots of users (each one with their own meta
information), and creating an array of objects with all of this data simply
depletes the memory allocated to the PHP process.

So, while loading the meta values as properties of the object might end up
slower (since, I believe, you have to make a separate request to get each
field from the DB), it should in theory work.

They might want to get rid of the user objects that they no longer need -
so something like this:

foreach ( $users as $i => $user ) {
    // Do something with the user object here
    // Now get rid of the user object
    unset( $users[ $i ] );
}

I'm saying they *might*, because they might have enough memory to handle
all user objects when they don't load all of the fields, but instead just
some of them.


Also, it might be a better alternative to create a script that does
what they are trying to do in incremental steps (maybe 100, or 500, or
however many users work without exhausting the memory limit) instead of in
one big chunk.
_______________________________________________
wp-hackers mailing list
http://lists.automattic.com/mailman/listinfo/wp-hackers
J.D. Grimes
2014-01-03 16:17:30 UTC
Permalink
Post by Nikola Nikolov
Do you/they really need all of the meta fields for all users? Since WP_User
can load meta field values as properties of the user object( or by using
$user->get( $meta_key ) ), I don't see why you would need to load all of
the meta fields into the memory.
The problem here is because of the fact that they have lots of users(each
one with their own meta information), creating an array of objects with all
of this data just depletes the allocated memory for the PHP process.
So, while loading the meta values as properties of the object might end-up
slower(since I believe, you have to make a separate request to get each
field from the DB), it should in theory work.
Actually, the meta fields are never retrieved until called for explicitly. All the ‘all_with_meta’ field does is return WP_User objects that give easy access for retrieving the meta if needed, rather than dumb objects like it does otherwise.
Post by Nikola Nikolov
Also, it might be a better alternative to create a script that would do
what they are trying to do in incremental steps(maybe 100, or 500, or
however many users work without exhausting the memory limit) instead of in
one big chunk.
Yeah, you should pull out like 500 or so at a time. You can use get_users() to do that, just use the number and offset parameters.
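
A minimal sketch of that paging loop, for illustration only. The helper name each_user_in_batches() and its callable parameters are hypothetical; 'number' and 'offset' are the get_users() arguments mentioned above. The fetcher is passed in so the loop can be tried outside WordPress; inside WordPress you would pass 'get_users'.

```php
<?php
/**
 * Walk through all users in fixed-size batches.
 *
 * $fetch  receives query args (including 'number' and 'offset') and returns
 *         an array of users; inside WordPress this would be 'get_users'.
 * $handle is called once per user.
 * Returns the number of users handled.
 */
function each_user_in_batches( callable $fetch, callable $handle, $batch_size = 500, array $args = array() ) {
    $offset = 0;
    $total  = 0;

    do {
        $batch = $fetch( array_merge( $args, array(
            'number' => $batch_size,
            'offset' => $offset,
        ) ) );

        foreach ( $batch as $user ) {
            $handle( $user );
            $total++;
        }

        $offset += $batch_size;
        // A short (or empty) batch means we have reached the end.
    } while ( count( $batch ) === $batch_size );

    return $total;
}
```

Inside a plugin this might be called as each_user_in_batches( 'get_users', function ( $user ) { /* send mail */ } ). Note that batching only bounds the size of each result array; anything the queries put into the object cache still accumulates within one request.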

-J.D.
Otto
2014-01-03 16:45:23 UTC
Permalink
Post by J.D. Grimes
Actually, the meta fields are never retrieved until called for explicitly. All the ‘all_with_meta’ field does is return WP_User objects that give easy access for retrieving the meta if needed, rather than dumb objects like it does otherwise.
No, I think it does actually get all the data from the DB.

Looking at trunk, if you examine wp-includes/user.php, lines 577-578,
you find this:

if ( 'all_with_meta' == $qv['fields'] ) {
    cache_users( $this->results );

The cache_users function is over in wp-includes/pluggable.php. It
first does a "SELECT * FROM $wpdb->users" for the relevant IDs, thus
getting those user fields into memory. However, it then goes on to
call the update_meta_cache() function with the IDs of the users.

The update_meta_cache function (over in wp-includes/meta.php) will do
a SELECT of all the meta info from that user meta table for those IDs
and store them in the object cache.

So yes, it is loading all that meta data into memory (the object
cache) in advance. If you have a lot of users and are not using a
persistent object cache, this will eat up lots and lots of PHP memory.

Best to avoid using all_with_meta for large result sets, unless
wp_using_ext_object_cache() returns true.
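
One way that rule of thumb could be written down, as a sketch only. pick_user_fields() is a hypothetical helper, and the 1000-user threshold is an arbitrary assumption, not a measured value. The fallback value 'all' skips the up-front meta priming.

```php
<?php
/**
 * Pick the 'fields' value for a user query.
 *
 * Hypothetical helper. $threshold (1000) is an arbitrary assumption about
 * how many users are "few enough" to prime all meta in one request.
 */
function pick_user_fields( $expected_users, $has_persistent_cache, $threshold = 1000 ) {
    if ( $has_persistent_cache || $expected_users <= $threshold ) {
        // Meta for every returned user is fetched and cached up front.
        return 'all_with_meta';
    }

    // No up-front meta priming; meta is looked up when it is asked for.
    return 'all';
}
```

In WordPress the two inputs could come from count_users() (its 'total_users' entry) and wp_using_ext_object_cache().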

-Otto
J.D. Grimes
2014-01-03 17:21:56 UTC
Permalink
Ah, OK, I didn’t look at that function. So it would be better to just set fields to ‘all’ then, as that returns WP_User objects too. I didn’t know that, I guess because actually it didn’t prior to 3.5. I’ve updated the codex: http://codex.wordpress.org/Function_Reference/get_users#Parameters

-J.D.
Jeremy Clarke
2014-01-03 18:15:17 UTC
Permalink
FWIW the line where you increase memory limits isn't a good solution, as
each affected user will need to get in touch with you after your plugin
breaks their site, and the RAM usage will hurt everyone on their server if
it's shared hosting (assuming their host even tolerates increasing the
memory usage beyond a certain point).

The question is interesting because it seems like it really applies to any
plugin that might want to fetch all users with meta. Unless we know 100%
that there will never be more than $x users on the site a plugin runs on we
should probably all have some kind of inherent limiting built into "all
users" queries.

I'd approach this in one of two ways:

*If one actually only needs to operate on some of the users:*

Ideally you should find a way to limit the list of users you are fetching
BEFORE you retrieve them with their full meta contents. Maybe you can add
other query parameters to get_users that will reduce the total number of
users returned, which would mean less bloat from filling up the meta cache.

The big concern with that solution is that it's easy to end up swapping a
memory-consuming meta cache with a time-consuming SQL query (I bet user
queries based on meta fields are as slow as post queries based on meta
fields).

Another way of doing this would be to get all users without meta, loop
through them and test something non-meta-related to filter out the ones
that aren't necessary, then pass the remaining user ids as the 'include'
parameter of a second get_users call that fetches all meta. Queries with an
'include' parameter full of IDs are usually so fast they're basically
irrelevant, so the SQL penalty of doing two queries shouldn't be a concern.
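
A rough sketch of that two-step approach. users_with_meta_for() and the $keep callback are hypothetical names; 'include' and 'fields' are real get_users() arguments. The fetcher is injected so the logic can be exercised without WordPress; in a plugin you would pass 'get_users'.

```php
<?php
/**
 * Two-step fetch: a light query first, then meta only for the users we keep.
 *
 * $fetch takes an args array and returns user objects (e.g. 'get_users').
 * $keep  takes a user object and returns true if that user is needed.
 */
function users_with_meta_for( callable $fetch, callable $keep ) {
    // Step 1: light query, no meta cache priming.
    $ids = array();
    foreach ( $fetch( array( 'fields' => 'all' ) ) as $user ) {
        if ( $keep( $user ) ) {
            $ids[] = $user->ID;
        }
    }

    // An empty 'include' would not restrict the query, so stop here.
    if ( empty( $ids ) ) {
        return array();
    }

    // Step 2: fetch meta only for the users that passed the filter.
    return $fetch( array(
        'include' => $ids,
        'fields'  => 'all_with_meta',
    ) );
}
```

The first query still loads every user row, so this only helps when the meta is the expensive part and the filter removes a good share of the users.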

Of course all that might not be an option for your plugin, but generally
speaking it's probably a best practice when dealing with "all users" on a
WP site.


*If one truly needs to do an operation on the meta of all users:*

Do as Nikola suggested and make this a batch process that does ~500 users
at a time by executing a loop that limits the query to 500 users and uses
an offset that increases by 500 each round.

Note that because the data is being saved to the WP object cache these
loops will add up over the course of the pageload, so you'll probably want
to use AJAX requests or something to handle the batches as separate
requests (i.e. so that after each loop the object cache gets cleared and
you don't end up with one process that contains the full objects of all
users).

Another option is to investigate the possibility of clearing each user from
the object cache after they've been "processed" by your script. Not sure
how you'd do that or if it's possible/feasible, but if you could do that
then you could skip the AJAX step, as each round of the 500-user-loop would
free up memory for the next loop.

A quick look at the siblings of wp_cache_get (/wp-includes/cache.php)
implies that wp_cache_delete() might do the trick.
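
An untested sketch of that idea. forget_user() is a hypothetical helper, and the cache group names 'users' and 'user_meta' are my assumption about where core stores the user row and its meta; check them against the core version you target. The delete function is passed in (in WordPress: 'wp_cache_delete', which takes a key and a group).

```php
<?php
/**
 * Drop one user's cached row and cached meta from the object cache.
 *
 * $cache_delete is called as $cache_delete( $key, $group ); inside WordPress
 * pass 'wp_cache_delete'. The group names below are assumptions.
 */
function forget_user( $user_id, callable $cache_delete ) {
    $cache_delete( $user_id, 'users' );     // cached row from the users table
    $cache_delete( $user_id, 'user_meta' ); // cached meta for that user
}
```

Called after each user is processed, e.g. forget_user( $user->ID, 'wp_cache_delete' ), together with unset() on the local variable, so that PHP can actually release the memory.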

Good luck :)
--
Jeremy Clarke • jeremyclarke.org
Code and Design • globalvoicesonline.org
Otto
2014-01-03 18:36:17 UTC
Permalink
Post by Jeremy Clarke
The question is interesting because it seems like it really applies to any
plugin that might want to fetch all users with meta.
Realistically, there's no good reason to attempt to load all users
with all the meta all at once anyway. You should limit your queries to
something more sane.

Examine your specific case, and take action accordingly. The original
post mentioned the Email Users plugin. So unless you're actually
retrieving all the users to email them all at once, then don't call
get_users() without some limiting parameters.

The get_users() function is basically a wrapper for WP_User_Query,
which can take a number of limiting parameters. If you're needing to
page through users for display or selection, then it has a "number"
and "offset" field that can be used for paging, like it does for the
users list. Or, if you need to retrieve users that only have a
specific meta_key to allow them to be emailed (like an opt-in), then
you can send it a combination of meta_key/meta_value/meta_compare
parameters to limit the query to only return those users.

I can't think of any real case where you actually need not just all
the users, but all their meta too. You will certainly need some subset
of users, or all the users but without the need for the meta, or you
might need to page through them and get some of the meta about them
too. As long as you're careful to limit your requests to only that
which is needed, you will have far fewer problems.

-Otto
Mike Walsh
2014-01-03 19:29:19 UTC
Permalink
Since I originally posted this I've been experimenting with limits, fetching
500 users at a time. This worked better but eventually failed as well on
the site with 13K users. After reading all of the replies so far, I
think it is due to the growing cache. I added a check to watch the memory
increase with each query for 500 users and it steadily increases until
memory is exhausted.
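
For anyone who wants to watch the same thing, a check like this can be as small as the sketch below. The helper names and the HTML-comment output format are illustrative assumptions; memory_get_usage() is standard PHP.

```php
<?php
/**
 * Format a byte count as megabytes, e.g. 36175872 -> "34.5M".
 */
function format_memory( $bytes ) {
    return round( $bytes / 1048576, 2 ) . 'M';
}

/**
 * Build one log line for a given query number, as an HTML comment so it
 * shows up in the page source without affecting the rendered page.
 */
function memory_log_line( $label, $query_number, $bytes ) {
    return sprintf(
        '<!-- %s Query #%d Memory Usage: %s -->',
        $label,
        $query_number,
        format_memory( $bytes )
    );
}
```

Inside the batch loop you would echo memory_log_line( 'my-plugin.php', $i, memory_get_usage( true ) ) after each query; a figure that climbs steadily from batch to batch points at something being retained between queries.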

To answer some of the questions - the opt-in aspect is exactly what is stored
as meta data. I took over this plugin a while back and this query model
has been part of it long before I started working on it. Basically it does
a query for all users which have meta data and processes that. The Codex
indicates the meta query has been available since 3.5.0 and this plugin
dates back to the 2.x days.

It probably makes sense to re-implement this portion of the code as it
appears pretty fragile. One of those things that has been this way for
years and wasn't an issue because it never broke until now.

Mike
--
Mike Walsh - ***@gmail.com
Otto
2014-01-03 19:33:23 UTC
Permalink
Post by Mike Walsh
To answer some of questions - the opt-in aspect is exactly what is stored
as meta data. I took over this plugin a while back and this query model
has been part of it long before I started working on it. Basically it does
a query for all users which have meta data and processes that. The Codex
indicates the meta query has been available since 3.5.0 and this plugin
dates back to the 2.x days.
Cool. So instead of doing get_users(array('fields' =>
'all_with_meta')), change it to:

get_users( array(
    'meta_key'   => 'your-opt-in-key',
    'meta_value' => 'your-opt-in-value',
) )

And you should be good to go, most likely. This will only get the
users opted in and it won't try to cache all the meta data for all the
users at once.

-Otto
Mike Walsh
2014-01-03 20:33:31 UTC
Permalink
It turns out the 'all_with_meta' wasn't necessary at all. In the pre-3.x
days, before the advent of the magic methods on the WP_User objects that
get_users() returns [see this: http://scribu.net/wordpress/the-magic-of-wp_user.html],
'all_with_meta' was used to retrieve the first and last names for each user,
which are stored in the meta table. Now that the meta data is available
through the 'magic' of WP_User, I no longer need 'all_with_meta', so I simply
eliminated the argument. Now the memory usage for this site of 13k users
never goes above 46M-47M.

<!-- email-users.php::1084 Query #0 Memory Usage: 34.5M -->
<!-- email-users.php::1084 Query #1 Memory Usage: 34.75M -->
<!-- email-users.php::1084 Query #2 Memory Usage: 35.25M -->
<!-- email-users.php::1084 Query #3 Memory Usage: 35.75M -->
<!-- email-users.php::1084 Query #4 Memory Usage: 36.5M -->
<!-- email-users.php::1084 Query #5 Memory Usage: 37M -->
<!-- email-users.php::1084 Query #6 Memory Usage: 37.5M -->
<!-- email-users.php::1084 Query #7 Memory Usage: 37.75M -->
<!-- email-users.php::1084 Query #8 Memory Usage: 38.5M -->
<!-- email-users.php::1084 Query #9 Memory Usage: 39M -->
<!-- email-users.php::1084 Query #10 Memory Usage: 39.5M -->
<!-- email-users.php::1084 Query #11 Memory Usage: 39.75M -->
<!-- email-users.php::1084 Query #12 Memory Usage: 40.25M -->
<!-- email-users.php::1084 Query #13 Memory Usage: 40.75M -->
<!-- email-users.php::1084 Query #14 Memory Usage: 41.25M -->
<!-- email-users.php::1084 Query #15 Memory Usage: 41.5M -->
<!-- email-users.php::1084 Query #16 Memory Usage: 42.5M -->
<!-- email-users.php::1084 Query #17 Memory Usage: 43M -->
<!-- email-users.php::1084 Query #18 Memory Usage: 43.5M -->
<!-- email-users.php::1084 Query #19 Memory Usage: 44M -->
<!-- email-users.php::1084 Query #20 Memory Usage: 44.5M -->
<!-- email-users.php::1084 Query #21 Memory Usage: 44.5M -->
<!-- email-users.php::1084 Query #22 Memory Usage: 45.25M -->
<!-- email-users.php::1084 Query #23 Memory Usage: 45.75M -->
<!-- email-users.php::1084 Query #24 Memory Usage: 46M -->
<!-- email-users.php::1084 Query #25 Memory Usage: 46.5M -->
<!-- email-users.php::1084 Query #26 Memory Usage: 47M -->

I am not sure I need to break the query up into chunks of 500 any more
although I suppose it doesn't hurt anything if I do. I could also add
some sort of a fail safe that if memory approaches the max limit WP
defines, I could terminate the loop.
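
A sketch of what such a fail-safe might look like. Both helper names and the 80% cut-off are assumptions; the limit is read from PHP's own memory_limit setting (which is what WP_MEMORY_LIMIT ends up adjusting) rather than from a WordPress constant.

```php
<?php
/**
 * Convert a php.ini style size ("512M", "1G", "65536", "-1") to bytes.
 * Returns -1 when there is no limit.
 */
function memory_limit_bytes( $limit ) {
    $limit = trim( $limit );
    if ( '' === $limit || (int) $limit < 0 ) {
        return -1; // "-1" means unlimited
    }

    $value = (int) $limit; // leading digits only, e.g. "512M" -> 512
    switch ( strtolower( substr( $limit, -1 ) ) ) {
        case 'g':
            $value *= 1024; // fall through
        case 'm':
            $value *= 1024; // fall through
        case 'k':
            $value *= 1024;
    }

    return $value;
}

/**
 * True once usage reaches $ratio of the limit. The 0.8 default is arbitrary.
 */
function memory_nearly_exhausted( $used, $limit, $ratio = 0.8 ) {
    return $limit > 0 && $used >= $limit * $ratio;
}
```

In the batch loop the check would be something like: if ( memory_nearly_exhausted( memory_get_usage( true ), memory_limit_bytes( ini_get( 'memory_limit' ) ) ) ) { break; } so the loop stops cleanly instead of dying with a fatal error partway through.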

As always, thanks for the help on wp-hackers.

Mike
--
Mike Walsh - ***@gmail.com