I have a Spark question. For each entity k, the input is a sequence of probabilities p_i, each with an associated value v_i. For example, the data can look like: