'Azure Cognitive Search: How to get collection of strings from a view

I have an index definition on the Azure Cognitive Search and inside this index definition, there is a field called CustomerNames which of type collection (Collection(Edm.String)). Basically, this means my collection result will look like this: CustomerNames: ["Homer Simpson", "Henry Griffin", "Jane Doe"]

Here is the snapshot of the index definition: enter image description here

Now I have a view that will act as a data source in this cognitive search and this view is:

CREATE VIEW [dbo].[vCog_CustomerActivityAttachmentSearchDocuments]
AS
SELECT
    caa.Id,
    caa.CustomerActivityId,
    caa.AmsAttachmentId,
    caa.OriginalFileName,
    caa.[FileName],
    caa.FileExtension,
    caa.ContentType,
    caa.[Description],
    caa.DocumentTypeId,
    caa.AdditionalInfo,
    caa.Comments,
    ca.CustomerId,
    GuidCustomerId = ca.CustomerId,
    CustomerNames = 
    (    
    SELECT
    STRING_AGG(CONVERT(nvarchar(max),cp.FirstName + ' ' + cp.LastName), ', ')
    FROM dbo.CustomerPersons AS cp with (nolock)
    WHERE cp.CustomerId = c.Id AND cp.LegalEntityName IS NULL
    ),
    CustomerPersons = 
    (
        Select
            cp.Id
            , cp.CustomerId
            , cp.NamePrefix
            , cp.FirstName
            , cp.MiddleName
            , cp.LastName
            , cp.DateOfBirth
            , cp.HomePhone
            , cp.WorkPhone
            , cp.CellPhone
            , cp.FaxPhone
            , cp.Email
            , cp.Email2
            , cp.LegalEntityName
        from
            CustomerPersons cp with (nolock)
        where
             cp.CustomerId = c.Id FOR JSON AUTO
    ),
    c.FirmName,
    ca.PolicyId,
    PolicyNumber = p.TrimmedPolicyNumber,
    caa.CreatedAtUtc,
    caa.[RowVersion]
FROM dbo.CustomerActivityAttachments caa with (nolock)
JOIN dbo.CustomerActivities ca with (nolock) on ca.Id = caa.CustomerActivityId
JOIN dbo.Customers c with (nolock) on c.Id = ca.CustomerId
LEFT JOIN dbo.Policies p with (nolock) on p.Id = ca.PolicyId
where IsNull(caa.UpdatedAtUtc, caa.CreatedAtUtc) > GetDate() - 1200
GO

So when I run the indexer on the cognitive search, it is unable to recognize the CustomerNames and throwing the following error:

The data field 'CustomerNames' in the document with key 'XXXXXXXX' has an invalid value of type 'Edm.String' (String maps to Edm.String). The expected type was 'Collection(Edm.String)'.

So my question is how do I change my query in a way that I am getting a JSON array of strings and making it recognizable by the search function?

What I have tried so far: I have tried using this query inside the customerNames:

SELECT  
        JSON_QUERY('[' + STUFF(( SELECT ',' + '"' + STRING_AGG(CONVERT(nvarchar(max),cp.FirstName + ' ' + cp.LastName), ', ') + '"' 
        FROM dbo.CustomerPersons AS cp with (nolock) 
        WHERE cp.CustomerId = c.Id AND cp.LegalEntityName IS NULL 
        FOR XML PATH('')),1,1,'') + ']')  as [CustomerNames]
        FOR JSON PATH , WITHOUT_ARRAY_WRAPPER

But this is not solving the problem. Any ideas will be greatly appreciated. Thanks in advance.



Solution 1:[1]

you can use Field mapping function - jsonArrayToStringCollection to convert the json array to Index collection

So let your query generate a column with name CustomerNamesJArray and map it to CustomerNames and have a jsonArrayToStringCollection mapping function.

https://docs.microsoft.com/en-us/azure/search/search-indexer-field-mappings#field-mapping-functions

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 m4X