Skip to content

Fix bug #81156 unset() on ArrayObject inside foreach skips next index. #7553

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: master
Choose a base branch
from
Open

Fix bug #81156 unset() on ArrayObject inside foreach skips next index. #7553

wants to merge 15 commits into from

Conversation

NathanFreeman
Copy link
Contributor

@NathanFreeman NathanFreeman commented Oct 4, 2021

https://bugs.php.net/bug.php?id=81156

$ao = new ArrayObject([true, false, false, true]);

foreach ($ao as $key => $value) {
    echo $key, "\n";
    if (!$value) {
        unset($ao[$key]); // or $ao->offsetUnset($key);
    }
}

The function _zend_hash_iterators_update() and zend_hash_move_forward_ex() will modify the iterator position when unseting value circularly.

For example, _zend_hash_iterators_update() change iterator position 0 to 1 and zend_hash_move_forward_ex() change 1 to 2 when unset $ao[0]. So the program will unset $ao[2] in the next cycle.

I made the following changes to solve such problem.

First, recording iterator position changing by using has_updated_iterator field.

// zend_types.h : 472
typedef struct _HashTableIterator {
	HashTable    *ht;
	HashPosition  pos;
	unsigned       is_delete:1;
} HashTableIterator;

// zend_hash.c : 1313
static zend_always_inline void zend_hash_set_is_delete(HashTable *ht, HashPosition from, HashPosition to)
{
	if (UNEXPECTED(HT_HAS_ITERATORS(ht))) {
		HashTableIterator *iter = EG(ht_iterators);
		HashTableIterator *end  = iter + EG(ht_iterators_used);

		while (iter != end) {
			if (iter->ht == ht) {
				if (iter->pos == from) {
					iter->is_delete = 1;
				} else {
					iter->is_delete = 0;
				}
			}
			iter++;
		}
	}
}

Secondly, It means that the iterator position has changed by function _zend_hash_iterators_update() and the current position is valid if the result of function sql_check_update_iterator() is SUCCESS.

The function spl_array_it_move_forward can will execute spl_array_next_ex if the result of function sql_check_update_iterator() is FAILURE.

// sql_arrar.c : 960
static int spl_array_check_if_delete(HashTable *ht) /* {{{ */
{
	int result = FAILURE;
	HashTableIterator *iter = EG(ht_iterators);
	HashTableIterator *end  = iter + EG(ht_iterators_used);

	while (iter != end) {
		/* The iterator has unset value in foreach, so the current position is valid. */
		if (iter->ht == ht && iter->is_delete == 1) {
			result = SUCCESS;
			iter->is_delete = 0;
		}
		iter++;
	}

	return result;
}

// sql_arrar.c : 979
static void spl_array_it_move_forward(zend_object_iterator *iter) /* {{{ */
{
	spl_array_object *object = Z_SPLARRAY_P(&iter->data);
	HashTable *aht = spl_array_get_hash_table(object);

	if (spl_array_check_if_delete(aht) != SUCCESS) {
		spl_array_next_ex(object, aht);
	} else {
		/* execute the common part */
		if (spl_array_is_object(object)) {
			spl_array_skip_protected(object, aht);
		} else {
			uint32_t *pos_ptr = spl_array_get_pos_ptr(aht, object);
			zend_hash_has_more_elements_ex(aht, pos_ptr);
		}
	}
}

To be honest, this pr can not solve the following bug. The results are always weird.
In my branch, the result is {"0":true,"2":true}.
In the main version of php , the result is {"0":true,"2":true,"3":false}.
Can we accept this change? 😃

$ao = new ArrayIterator([true, false, false, true]);

foreach ($ao as $key => $value) {
    $ao->offsetUnset($key);
    $ao->next();
}
echo json_encode($ao), "\n";

Welcome to comment the pr if something wrong with my pr and it will help me to perfect it. Thanks.

@NathanFreeman
Copy link
Contributor Author

I write a new function zend_hash_set_is_delete() to set is_delete = 1.
Because function zend_hash_iterators_update() was used by many other function.

@NathanFreeman
Copy link
Contributor Author

NathanFreeman commented Oct 8, 2021

@nikic
It would be greatly appreciated if you could give me some suggestions for my pr. Thanks.

@nikic
Copy link
Member

nikic commented Nov 3, 2021

I don't like this approach. The core problem here is that the HT iterator mechanism (which exists specifically to update iterators on unsets and similar changes) is designed for use with foreach on arrays, which performs the sequence:

loop:
get current key + value
next
run foreach body
goto loop

That is, when the loop body executes, the iterator already points to the next element, and the update mechanism is written with that in mind. If the element to which the iterator points is deleted, then we move to the next one, rather than the previous one (which would make sense if the iterator pointer to the current element).

Now, for Iterators including ArrayIterator the sequence is different:

loop:
if not first element { next }
get current key + value
run foreach body
goto loop

In this case the iterator points to the current element, which is why the update mechanism fails.

Addressing this using an additional "is deleted" flag is not correct. I see generally two approaches on how to fix it:

  • Change foreach on array to keep the iterator on the current element, adjust iterator update code to move to previous element instead of next.
  • Add a flag to iterators whether they point to current/next element and account for it in iterator update code.

I'm not sure what issues either of those options would encounter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 participants