I have a Spark question, so for the input for each entity k I have a sequence of probability p_i with a value associated v_i, for example the data can look like
gliffy
google-oauth-java-client
haneke
subtitle
scalafx
catalyst-optimizer
ondemand
ecmascript-harmony
bluebird
babeljs
gwttestcase
camel-ftp
pcapplusplus
scp
vod
autofixture
pytest
google-analytics-firebase
redux-saga
methodinfo
autoit-c#-wrapper
systrace
angular2-mdl
qregularexpression
rfc5322
sublime-text-plugin
measureoverride
vue-filter
template-meta-programming
glassfish