Cython: wrap a string vector with PyArray_SimpleNewFromData

[This question has been edited because the code originally presented contained issues unrelated to the actual problem.]

I have implemented a generic function that converts C++ pointers to NumPy arrays:

cdef np.ndarray pointer_to_array(void *ptr, np.npy_intp N, int np_type):
    cdef np.ndarray arr = np.PyArray_SimpleNewFromData(1, &N, np_type, ptr)
    return arr

Here, ptr is a pointer to the underlying data, N is the number of elements in the array, and np_type is the NumPy type code (e.g. np.NPY_DOUBLE). The function works well for fixed-size types such as double.
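
To make the intended behaviour concrete, here is a minimal pure-Python analogue of what the wrapper does for double. It is only a sketch: it uses np.frombuffer on an array.array as a stand-in for the C++ buffer (an assumption for illustration, not part of my module), and shows that the NumPy array shares memory with the buffer instead of copying it.

```python
import array

import numpy as np

# Stand-in for the C++ buffer of doubles (hypothetical data for illustration).
buf = array.array("d", [1.0, 2.0, 3.0])

# Wrap the existing memory without copying, similar in spirit to
# PyArray_SimpleNewFromData(1, &N, NPY_DOUBLE, ptr).
arr = np.frombuffer(buf, dtype=np.float64)

# A write through the original buffer is visible through the array,
# because both refer to the same memory.
buf[0] = 42.0
print(arr)
```

As with the pointer version, the array does not manage the lifetime of the data in the real Cython case, so the underlying buffer has to outlive it.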

However, I would like to apply the function to some kind of string array (e.g. dtype('<U10')). What I tried is:

# mymodule.pyx

cdef get_v():
    # `v` is a std::vector[std::string] defined in an 
    # external header file and is assumed to persist over the
    # lifetime of the module
    return pointer_to_array(v.data(), v.size(), np.NPY_STRING)

However, I obtain a ValueError: data type must provide an itemsize. This makes sense, since NumPy needs to know the size of each string. How can I pass this information to PyArray_SimpleNewFromData? Or is there another way to wrap C++ arrays of strings with NumPy arrays?
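
For reference, the itemsize problem can be reproduced in pure NumPy, without any Cython. The sketch below shows that the flexible byte-string dtype carries no length until a width is given, and that a fixed-width string array is one contiguous block of N * itemsize bytes. The list `strings` is a hypothetical stand-in for the contents of v, and building the array this way copies the data; it is shown only to illustrate the memory layout NumPy expects, not as a way to wrap the vector in place.

```python
import numpy as np

# The flexible byte-string dtype has no length of its own ...
print(np.dtype("S").itemsize)    # 0
# ... whereas a fixed-width one does.
print(np.dtype("S10").itemsize)  # 10

# Hypothetical stand-in for the contents of the C++ vector `v`.
strings = [b"Hello."] * 10

# Choose a width large enough for the longest string, then copy the
# data into a contiguous fixed-width array.
width = max(len(s) for s in strings)
arr = np.array(strings, dtype=f"S{width}")

print(arr.dtype.itemsize)  # bytes per element
print(arr.nbytes)          # total bytes: number of elements * itemsize
```

Note that a std::vector<std::string> stores std::string objects rather than fixed-width character cells, so v.data() does not point at memory laid out like this.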


Further code needed for the example

# mymodule.pxd

from libcpp.vector cimport vector
from libcpp.string cimport string 

cdef extern from "mymodule_cpp.h":
    vector[string] v

# mymodule_cpp.h

#include <vector>
#include <string>

std::vector<std::string> v(10, "Hello.");


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow