Accessing an RSS feed with URL works, but UrlResource gives a 403 Forbidden error
I am trying to fetch an RSS feed, but the request fails when the feed is accessed through the org.springframework.core.io.UrlResource class. I am using Spring Boot 2.6.3, the latest version. Sample code and a link to the GitHub repo are included below.
@EnableIntegration
@Configuration
public class IntegrationRssFetch {

    @Value("https://www.reutersagency.com/feed/?post_type=reuters-best")
    private UrlResource urlResource;

    @Value("https://www.reutersagency.com/feed/?post_type=reuters-best")
    private URL url;

    @Bean
    public MetadataStore metadataStore() {
        PropertiesPersistingMetadataStore metadataStore = new PropertiesPersistingMetadataStore();
        metadataStore.setBaseDirectory("src/main/resources");
        return metadataStore;
    }

    @Bean
    public MessageChannel rssOutputChannel() {
        return MessageChannels.direct("rss_feed_flow").get();
    }

    @Bean
    public IntegrationFlow feedFlow() {
        return IntegrationFlows
                .from(Feed.inboundAdapter(this.urlResource, "feedTest")
                                .metadataStore(metadataStore()),
                        e -> e.poller(p -> p.fixedDelay(100)))
                .channel("rss_feed_flow")
                .get();
    }

    @Bean
    public IntegrationFlow rssReadFlow() {
        return IntegrationFlows
                .from("rss_feed_flow")
                .handle(message -> {
                    SyndEntry entry = (SyndEntry) message.getPayload();
                    System.out.println(entry.getTitle());
                })
                .get();
    }
}
The stack trace of the error follows:
2022-02-23 15:46:51.181 ERROR 14352 --- [ scheduling-1] o.s.integration.handler.LoggingHandler : org.springframework.messaging.MessagingException: Failed to retrieve feed for 'FeedEntryMessageSource{feedUrl=null, feedResource=URL [https://www.reutersagency.com/feed/?post_type=reuters-best], metadataKey='feedTest', lastTime=-1}'; nested exception is java.io.IOException: Server returned HTTP response code: 403 for URL: https://www.reutersagency.com/feed/?post_type=reuters-best
at org.springframework.integration.feed.inbound.FeedEntryMessageSource.getFeed(FeedEntryMessageSource.java:234)
at org.springframework.integration.feed.inbound.FeedEntryMessageSource.populateEntryList(FeedEntryMessageSource.java:201)
at org.springframework.integration.feed.inbound.FeedEntryMessageSource.doReceive(FeedEntryMessageSource.java:176)
at org.springframework.integration.feed.inbound.FeedEntryMessageSource.doReceive(FeedEntryMessageSource.java:57)
at org.springframework.integration.endpoint.AbstractMessageSource.receive(AbstractMessageSource.java:142)
at org.springframework.integration.endpoint.SourcePollingChannelAdapter.receiveMessage(SourcePollingChannelAdapter.java:212)
at org.springframework.integration.endpoint.AbstractPollingEndpoint.doPoll(AbstractPollingEndpoint.java:444)
at org.springframework.integration.endpoint.AbstractPollingEndpoint.pollForMessage(AbstractPollingEndpoint.java:413)
at org.springframework.integration.endpoint.AbstractPollingEndpoint.lambda$createPoller$4(AbstractPollingEndpoint.java:348)
at org.springframework.integration.util.ErrorHandlingTaskExecutor.lambda$execute$0(ErrorHandlingTaskExecutor.java:57)
at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:50)
at org.springframework.integration.util.ErrorHandlingTaskExecutor.execute(ErrorHandlingTaskExecutor.java:55)
at org.springframework.integration.endpoint.AbstractPollingEndpoint.lambda$createPoller$5(AbstractPollingEndpoint.java:341)
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:95)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Server returned HTTP response code: 403 for URL: https://www.reutersagency.com/feed/?post_type=reuters-best
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1894)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:263)
at org.springframework.core.io.UrlResource.getInputStream(UrlResource.java:186)
at org.springframework.integration.feed.inbound.FeedEntryMessageSource.getFeed(FeedEntryMessageSource.java:224)
... 21 more
Link to the GitHub repo: https://github.com/pinkeshsagar-harptec/code-sample/tree/main/rssfeedissue
Solution 1:[1]
It looks like www.reutersagency.com rejects some values of the User-Agent HTTP header. For example, it returns 403 for my default Java/1.8.0_251, but it accepts Java/17.0.1, Java/8, Java/11 and Java/20. It still fails for Java/1.8, 1.7, 1.6 and so on.
So I suggest upgrading to a newer Java anyway. It looks like Java 8 is no longer supported by that RSS server.
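To check the User-Agent explanation without changing the JDK, one option is to send the request with an explicit header value and compare the status codes. The following is only a minimal sketch using plain JDK classes. The class name UserAgentProbe and the helper openWithUserAgent are hypothetical names introduced for illustration, and the statement that the JDK sends "Java/" plus the java.version property by default is an assumption based on the header value quoted above.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

public class UserAgentProbe {

    // Hypothetical helper: opens a connection (without connecting yet)
    // and sets an explicit User-Agent request header on it.
    public static URLConnection openWithUserAgent(URL url, String userAgent) throws IOException {
        URLConnection con = url.openConnection();
        con.setRequestProperty("User-Agent", userAgent);
        return con;
    }

    public static void main(String[] args) throws IOException {
        URL url = new URL("https://www.reutersagency.com/feed/?post_type=reuters-best");

        // Assumption: without an explicit header, the JDK identifies itself this way.
        System.out.println("Default agent is probably: Java/" + System.getProperty("java.version"));

        // "Java/11" is one of the values reported above as accepted by the server.
        HttpURLConnection con = (HttpURLConnection) openWithUserAgent(url, "Java/11");
        System.out.println("HTTP status with Java/11: " + con.getResponseCode());
        con.disconnect();
    }
}
```

If the status changes with the header value, the same setRequestProperty call could be placed in a UrlResource subclass that overrides getInputStream(), although upgrading the JDK remains the simpler fix.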
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Artem Bilan |
