I have a Spark question. For the input, each entity k has a sequence of probabilities p_i, each with an associated value v_i. For example, the data can look like: