How to convert a list of arrays into a 1D scalar array to subset a loom file

I am working with a loom file (via loompy), and the metadata I would like to subset it by is stored in an external metadata file. This is my attempt so far:

sub_meta = []

for i in marrow_meta2['annotated_cell_identity.ontology'].unique():
    # gets the indices of marrow_meta2['annotated_cell_identity.ontology'] where i is found and assigns
    # it to q, for each i (that is, each unique value in annotated_cell_identity.ontology)
    q = np.where(marrow_meta2['annotated_cell_identity.ontology'] == i) 
    # if the length of q is less than 1000, append q to sub_meta
    if len(q) < 1000:
        sub_meta.extend(q)
    elif len(q) > 1000:
        sub_meta.extend(random.sample(q, 1000))

marrow_meta2[sub_meta]

No matter what I try, I get a list of arrays instead of a list of integers corresponding to the rows of the loom object's matrix that I would like to subset.
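A likely cause, shown as a minimal sketch: np.where called with only a condition returns a tuple containing one index array, not the index array itself. That means len(q) is always 1 and sub_meta.extend(q) adds whole arrays to the list. The small DataFrame here is a hypothetical stand-in for marrow_meta2; only the column name is taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for marrow_meta2 (assumption: it is a pandas DataFrame).
meta = pd.DataFrame({"annotated_cell_identity.ontology": ["a", "b", "a", "c", "a"]})
col = meta["annotated_cell_identity.ontology"]

# With a single argument, np.where returns a tuple holding one index array.
q = np.where(col == "a")
print(type(q), len(q))  # the length is 1 no matter how many rows match

# Taking element 0 gives the integer row positions themselves.
idx = q[0]
print(idx, len(idx))

# A list of per-label index arrays can be flattened into one 1D integer array.
chunks = [np.where(col == value)[0] for value in col.unique()]
flat = np.concatenate(chunks)
print(flat, flat.ndim)
```

Indexing with [0] (or using np.flatnonzero, which returns the array directly) is what makes the length check and the sampling operate on row positions rather than on the wrapping tuple.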

I have the R equivalent code to what I want to do. I am using Python because I would like to use scanpy. I am not ruling out using reticulate or simply saving the relevant objects to file and loading them into Python downstream to use scanpy, but I'd like to crack the issue first:

# only select up to 1000 cells (if possible) of each cell type
sub_meta <- c()
for(c in unique(marrow_meta$annotated_cell_identity.ontology_label)){
  i <- which(marrow_meta$annotated_cell_identity.ontology_label==c)
  if(length(i) < 1000){
    sub_meta <- append(sub_meta, i)
  }
  else{
    sub_meta <- append(sub_meta, sample(i, 1000))
  }
}
sub_meta <- marrow_meta[sub_meta,]
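One way to mirror the R loop in Python is sketched below. It is an illustration under assumptions, not a tested loompy recipe: subsample_indices, max_per_group, and seed are hypothetical names, the metadata is assumed to be a pandas DataFrame whose rows are in the same order as the cells in the loom file, and the toy DataFrame only borrows the column name from the question. Unlike the if/elif in the Python attempt, a group of exactly 1000 rows is kept rather than silently dropped.

```python
import numpy as np
import pandas as pd

def subsample_indices(labels, max_per_group=1000, seed=0):
    """Return sorted integer row positions, at most max_per_group per label.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    picked = []
    for value in pd.unique(labels):
        # Integer positions of the rows carrying this label (a 1D array).
        idx = np.flatnonzero(labels == value)
        if len(idx) > max_per_group:
            # Sample without replacement, like sample(i, 1000) in R.
            idx = rng.choice(idx, size=max_per_group, replace=False)
        picked.append(idx)
    if not picked:
        return np.array([], dtype=int)
    # One flat, sorted 1D integer array instead of a list of arrays.
    return np.sort(np.concatenate(picked))

# Toy metadata standing in for marrow_meta2 (assumption).
meta = pd.DataFrame({"annotated_cell_identity.ontology": ["a"] * 5 + ["b"] * 2})
rows = subsample_indices(meta["annotated_cell_identity.ontology"], max_per_group=3)
sub_meta = meta.iloc[rows]  # positional indexing, the analogue of marrow_meta[sub_meta, ]
print(rows)
print(sub_meta["annotated_cell_identity.ontology"].value_counts())
```

The result is sorted because HDF5-backed arrays generally expect index lists in increasing order, which is likely to matter when the same positions are used to select cells from the loom matrix. Note that the positions are 0-based, whereas which() in R returns 1-based indices.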


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
