I want to delete all the lists with length 1 at the beginning and end, but not in the middle, in Python

Given a list of lists (i.e. a nested list), I want to delete all the inner lists with length 1 from the beginning and the end of the outer list. For example

d = [[3], [1,5], [3], [3,2,4], [8], [3]] should become [[1,5], [3], [3,2,4]].

For the start of the list, I am using the following code:

i = 0
while len(d[i]) == 1:
    d.remove(d[i])
    n = n -1

and for the end, I use this code:

while len(d[n-1]) == 1:
    d.remove(d[n-1])
    n = n - 1

where n = number of lists.

When I run this I get:

[[1, 5], [3, 2, 4]]

So it also removes the lists in the middle with length 1. How can I change the code so it does not do that?

So, for any number of lists, I want to remove the lists at the beginning that have length 1, stopping at the first list that does not have length 1. I want to do the same from the end of the list.



Solution 1:[1]

Most of the current answers involve repeated popping from the front of a list, which is slow: each pop(0) shifts every remaining element, so it costs O(n) per pop.

Using the answers here to find the first and last index where a condition does not hold, we can find the start and end of the desired sublist in linear time. Then, we construct the slice using the found start and end indices. If the list contains only sublists of length 1, then our generators will raise a StopIteration -- to avoid this, we check ahead of time whether all of the lists are of length 1:

if all(len(item) == 1 for item in data):
    result = []
else:
    start = next(i for i, v in enumerate(data) if len(v) != 1)
    end = len(data) - next(i for i, v in enumerate(reversed(data)) if len(v) != 1)
    
    result = data[start:end]

print(result)

This outputs:

[[1, 5], [3], [3, 2, 4]]
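
As an alternative to the up-front all() check, the built-in next() accepts a default value that is returned when the generator is exhausted, so no StopIteration is raised. The following is a minimal sketch under that assumption; the helper name trim_ends is made up for illustration and is not part of the original answer:

```python
def trim_ends(data):
    """Return data without leading/trailing sublists of length 1 (hypothetical helper)."""
    # Index of the first sublist whose length is not 1; None if there is none.
    start = next((i for i, v in enumerate(data) if len(v) != 1), None)
    if start is None:
        return []
    # Distance from the end to the last such sublist; it must exist at this point.
    back = next(i for i, v in enumerate(reversed(data)) if len(v) != 1)
    return data[start:len(data) - back]


data = [[3], [1, 5], [3], [3, 2, 4], [8], [3]]
print(trim_ends(data))  # [[1, 5], [3], [3, 2, 4]]
```

This keeps the generator approach but needs only the two boundary scans, without the extra pass over the list.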

If you really need speed, you can skip the generators and the first pass check (this code has been improved from a suggestion by juanpa.arrivillaga):

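start = len(data)  # stays len(data) if every sublist has length 1
for idx in range(len(data)):
    if len(data[idx]) != 1:
        start = idx
        break

end = -1  # stays -1 if every sublist has length 1, so the slice below is empty
for idx in reversed(range(len(data))):
    if len(data[idx]) != 1:
        end = idx
        break

result = data[start:end + 1]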

print(result)

which seems to perform better on my machine.

Benchmark code:

import timeit

def bb1(data):
  if all(len(item) == 1 for item in data):
      result = []
  else:
      start = next(i for i, v in enumerate(data) if len(v) != 1)
      end = len(data) - next(i for i, v in enumerate(reversed(data)) if len(v) != 1)

      result = data[start:end]
  return result

def bb2(data):
  start = len(data)  # empty slice if every sublist has length 1
  for idx in range(len(data)):
      if len(data[idx]) != 1:
          start = idx
          break

  end = -1
  for idx in reversed(range(len(data))):
      if len(data[idx]) != 1:
          end = idx
          break

  result = data[start:end + 1]
  return result


def fb(d):
  while len(d[0]) == 1:
      d.pop(0) # remove from head

  while len(d[-1]) == 1:
      d.pop() # remove from tail

Then,

data = ([[3] for _ in range(100000)]) + [[3], [1,5], [3], [3,2,4], [8], [3]]

%timeit -n7 bb1(data[:])
%timeit -n7 bb2(data[:])
%timeit -n7 fb(data[:])

Output:

7 loops, best of 5: 23.7 ms per loop
7 loops, best of 5: 13.1 ms per loop
7 loops, best of 5: 900 ms per loop
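
The %timeit lines above are IPython magics and will not run in a plain Python script. A rough equivalent using only the standard timeit module might look like the sketch below. The function names trim_slice and trim_pop and the input size are assumptions chosen for illustration, and the timings will differ from machine to machine:

```python
import timeit


def trim_slice(data):
    # Index-based version: find both boundaries, then slice once.
    start = len(data)
    for idx in range(len(data)):
        if len(data[idx]) != 1:
            start = idx
            break
    end = -1
    for idx in reversed(range(len(data))):
        if len(data[idx]) != 1:
            end = idx
            break
    return data[start:end + 1]


def trim_pop(d):
    # pop(0)-based version: each pop(0) shifts every remaining element.
    while d and len(d[0]) == 1:
        d.pop(0)
    while d and len(d[-1]) == 1:
        d.pop()
    return d


data = [[3] for _ in range(5000)] + [[1, 5], [3], [3, 2, 4], [8], [3]]

for func in (trim_slice, trim_pop):
    # Copy the list on every call so trim_pop always starts from the full input.
    best = min(timeit.repeat(lambda: func(data[:]), number=5, repeat=3))
    print(f"{func.__name__}: {best / 5 * 1000:.3f} ms per call")
```

Both functions receive a fresh copy on every call, so the cost of copying is included in both measurements, as it is in the %timeit runs above.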

Solution 2:[2]

Here is a simple solution:

def removeFunction(d):
    # Defaults give an empty slice when every inner list has length 1
    l = len(d)
    r = -1

    # find the first element whose length is not 1
    for i in range(len(d)):
        if len(d[i]) != 1:
            l = i
            break

    # find the last element whose length is not 1
    for i in range(len(d)-1, -1, -1):
        if len(d[i]) != 1:
            r = i
            break

    return d[l:r+1]

Solution 3:[3]

You can simply keep track of the indices where the condition is met and then slice your list accordingly, instead of modifying the list.

def filter_list(list_of_list, len_to_filter):
    n = len(list_of_list)
    l, r = 0, -1  # left, right indices (r counts from the end)
    # The bounds checks avoid an IndexError when every list has len_to_filter
    while l < n and len(list_of_list[l]) == len_to_filter:
        l += 1
    while r >= -n and len(list_of_list[r]) == len_to_filter:
        r -= 1
    if r == -1:  # avoid case where r + 1 is 0
        return list_of_list[l:]
    else:
        return list_of_list[l : r + 1]

filter_list(d, 1) 

This doesn't mutate the original list and does not require any sorting or reversal. You just traverse the list from each side until a different size than the expected one is reached.
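
The non-mutating behaviour can be checked directly. The sketch below uses a small stand-in function (trim_by_len, a name invented here, not the answer's exact code) and shows that the outer list is left unchanged while the returned slice is a shallow copy:

```python
def trim_by_len(list_of_list, len_to_filter):
    # Walk inward from both sides, then slice once; the input is never modified.
    l, r = 0, len(list_of_list) - 1
    while l <= r and len(list_of_list[l]) == len_to_filter:
        l += 1
    while r >= l and len(list_of_list[r]) == len_to_filter:
        r -= 1
    return list_of_list[l:r + 1]


d = [[3], [1, 5], [3], [3, 2, 4], [8], [3]]
trimmed = trim_by_len(d, 1)

print(trimmed)  # [[1, 5], [3], [3, 2, 4]]
print(len(d))   # 6, the original outer list is untouched
# The slice is a shallow copy: the inner lists are shared with the original.
print(trimmed[0] is d[1])  # True
```

Because the inner lists are shared, modifying trimmed[0] in place would also change d[1]; only the outer list is new.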

Here is a benchmark comparing it to pop. If there are few lists of the given length to remove, the pop version and the indices version perform about the same. If there are many such lists, the pop operations become non-negligible and the indices version is faster.

import random

small = [[3], [1, 5], [3], [3, 2, 4], [8], [3]]
# create 10_000 lists of length 1 or 2
ll = [
    [random.randint(0, 1000) for _ in range(random.randint(1, 2))]
    for _ in range(10_000)
]

worst_case = [
    [random.randint(0, 1000) for _ in range(random.randint(1, 1))]
    for _ in range(10_000)
]

worst_case[5000] = [1, 2, 3]


def filter_pop(list_of_list, len_to_filter):
    while len(list_of_list[0]) == len_to_filter:
        list_of_list.pop(0)

    while len(list_of_list[-1]) == len_to_filter:
        list_of_list.pop()

    return list_of_list


def filter_list(list_of_list, len_to_filter):
    l, r = 0, -1  # left, right indices
    while len(list_of_list[l]) == len_to_filter:
        l += 1
    while len(list_of_list[r]) == len_to_filter:
        r -= 1
    if r == -1:  # avoid case where r + 1 is 0
        return list_of_list[l:]
    else:
        return list_of_list[l : r + 1]

In [208]: %timeit -n100 filter_pop(ll[:], 1)
47.6 µs ± 9.77 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [209]: %timeit -n100 filter_pop(worst_case[:], 1)
7.46 ms ± 71.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [210]: %timeit -n100 filter_list(ll, 1)
43.7 µs ± 11.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [211]: %timeit -n100 filter_list(worst_case, 1)
1 ms ± 59 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Solution 4:[4]

There's no need for i or n since you're modifying the list, so its first and last elements change, not their indices. You can do something along these lines:

while d and len(d[0]) == 1:
    d.pop(0) # remove from head

while d and len(d[-1]) == 1:
    d.pop() # remove from tail
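
If many elements have to be removed from the head, list.pop(0) becomes costly because it shifts the rest of the list each time. One possible variation, sketched here as an assumption rather than as part of the answer above, is to do the same trimming on a collections.deque, whose popleft() runs in constant time:

```python
from collections import deque

d = [[3], [1, 5], [3], [3, 2, 4], [8], [3]]

dq = deque(d)  # copying into a deque is O(n), done once
while dq and len(dq[0]) == 1:
    dq.popleft()  # O(1), unlike list.pop(0)
while dq and len(dq[-1]) == 1:
    dq.pop()

result = list(dq)
print(result)  # [[1, 5], [3], [3, 2, 4]]
```

Unlike the in-place version, this leaves d itself unchanged and builds a new list at the end.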

Solution 5:[5]

remove() removes the first element in the list that is equal to its argument.

So the code that is meant to remove the last element actually searches from the beginning of the list and removes the first [3] it finds, which is the one in the middle.

For that part, you can use pop() or del, which remove the element at a given index:

while len(d[n-1]) == 1:
    d.pop()
    n = n - 1

Note: You can use pop() without an argument in this case, because it removes the last item of the list when no index is given.
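
The difference between removing by value and removing by position can be seen in a short example. This is an illustrative sketch with a shortened version of the list from the question:

```python
d = [[1, 5], [3], [3, 2, 4], [8], [3]]

# remove() searches from the front for the first element equal to its argument.
# d[-1] is [3], and the first element equal to [3] is at index 1, in the middle.
d.remove(d[-1])
print(d)  # [[1, 5], [3, 2, 4], [8], [3]]

# del (or pop) acts on the position instead of the value.
del d[-1]
print(d)  # [[1, 5], [3, 2, 4], [8]]
```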

Solution 6:[6]

To solve this problem, I will scan the list only once to find the start and end of the sublist:

def remove_ends(li):
    start = end = len(li)
    for index, element in enumerate(li):
        if len(element) != 1:
            start = min(start, index)
            end = index + 1
    return li[start:end]

This code works against the following test cases:

>>> remove_ends([[1]])
[]

>>> remove_ends([[1], [2]])
[]

>>> remove_ends([[1,2], [3,4], [5], [6,7]])
[[1, 2], [3, 4], [5], [6, 7]]

>>> remove_ends([[1], [2], [3], [1, 5], [3], [3, 2, 4], [8], [3]])
[[1, 5], [3], [3, 2, 4]]

>>> remove_ends([])
[]

>>> remove_ends([[1,2]])
[[1, 2]]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3
Solution 4
Solution 5
Solution 6