Finding the three longest substrings in a string using SPARQL on the Wikidata Query Service, and ranking them across strings
I'm trying to identify the three longest substrings (space-delimited words) in a string using SPARQL on the Wikidata Query Service, and then rank:
- the substrings within a string by length
- the strings by the lengths of any of those longest substrings.
I managed to extract the first and second substrings of a string and could of course add similar lines for each further substring, but that seems ugly and inefficient, so I am wondering whether anyone here knows a better way to get there.
This is a simplified version of the code, though I have left some auxiliary variables in that I am using for tracking progress on the way. You can try it here.
Clarification in response to this comment: if it is necessary to treat this query as a subquery and to feed it with results from another subquery, that's fine with me. To get an idea of the kinds of use I have in mind, see this demo.
SELECT * WHERE {
  {
    VALUES (?title) {
      ("What are the longest three words in this string?")
      ("A really complicated title")
      ("OneWordTitleInCamelCase")
      ("Thanks for your help!")
    }
  }
  BIND(STRLEN(REPLACE(?title, " ", "")) AS ?titlelength)
  BIND(STRBEFORE(?title, " ") AS ?substring1)
  BIND(STRLEN(REPLACE(?substring1, " ", "")) AS ?substring1length)
  BIND(STRAFTER(?title, " ") AS ?postfix)
  BIND(STRLEN(REPLACE(?postfix, " ", "")) AS ?postfixlength)
  BIND(STRBEFORE(?postfix, " ") AS ?substring2)
  BIND(STRLEN(REPLACE(?substring2, " ", "")) AS ?substring2length)
}
ORDER BY DESC(?substring1length)
Expected results:

| longsubstring | substringlength |
|---|---|
| OneWordTitleInCamelCase | 23 |
| complicated | 11 |
| longest | 7 |
| really | 6 |
| string | 6 |
| Thanks | 6 |
| title | 5 |
| three | 5 |
| your | 4 |
| help | 4 |

Actual results:

| title | titlelength | substring1 | substring1length | postfix | postfixlength | substring2 | substring2length |
|---|---|---|---|---|---|---|---|
| Thanks for your help! | 18 | Thanks | 6 | for your help! | 12 | for | 3 |
| What are the longest three words in this string? | 40 | What | 4 | are the longest three words in this string? | 36 | are | 3 |
| A really complicated title | 23 | A | 1 | really complicated title | 22 | really | 6 |
| OneWordTitleInCamelCase | 23 | | 0 | | 0 | | 0 |

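
One possible way to avoid adding a pair of STRBEFORE/STRAFTER lines per word is to pair each title with a list of word positions and pull out the word at each position with a regular expression built on the fly. The following is only a sketch of that idea and has not been tested against the live service. The variable names ?position, ?clean, ?pattern, ?word and ?wordlength are made up for illustration, and the sketch assumes that no title has more than ten words, that words are separated by spaces, and that punctuation should not count towards the length (as the expected results suggest).

```sparql
SELECT ?title ?position ?word ?wordlength WHERE {
  VALUES (?title) {
    ("What are the longest three words in this string?")
    ("A really complicated title")
    ("OneWordTitleInCamelCase")
    ("Thanks for your help!")
  }
  # Assumption: at most ten words per title; extend this list for longer titles.
  VALUES (?position) { (0) (1) (2) (3) (4) (5) (6) (7) (8) (9) }
  # Assumption: only ASCII letters, digits and spaces are kept, so "help!" becomes "help".
  BIND(REPLACE(?title, "[^A-Za-z0-9 ]", "") AS ?clean)
  # Pattern that skips ?position words and captures the next word in group 2.
  BIND(CONCAT("^([^ ]+ +){", STR(?position), "}([^ ]+).*$") AS ?pattern)
  # Positions past the last word cannot match the pattern and are dropped.
  FILTER(REGEX(?clean, ?pattern))
  BIND(REPLACE(?clean, ?pattern, "$2") AS ?word)
  BIND(STRLEN(?word) AS ?wordlength)
}
ORDER BY DESC(?wordlength) ?title ?position
```

This returns one row per word rather than only the three longest per title, so cutting each title down to its top three would still need an additional ranking step, for example by counting, per word, how many words of the same title are longer.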
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow