I am using nutch-2.3.1 with HBase-0.98.8-hadoop2. The crawl runs fine for HTML pages, but when I run it for PDF URLs, only some of them seem t