Understanding Netflix Eureka Leases - Service Keeps Unregistering

I keep having an intermittent issue with one of my Spring Cloud Eureka clients, but none of the others. When I start my "NCLSEARCHSERVICE", it registers itself with Eureka, shows "UP" in Eureka for about 30 seconds to a minute, and then unregisters itself. Here are the logs from the Eureka server; they show it registering all of my other services, but then cancelling the lease for my search service with "cancel failed because Lease is not registered".

2015-10-01 09:48:55.144  WARN 6470 --- [nio-8761-exec-4] com.netflix.eureka.InstanceRegistry      : DS: Registry: lease doesn't exist, registering resource: CONFIGSERVICE - 172.18.100.120
2015-10-01 09:48:55.145  WARN 6470 --- [nio-8761-exec-4] c.n.eureka.resources.InstanceResource    : Not Found (Renew): CONFIGSERVICE - 172.18.100.120
2015-10-01 09:48:55.147  WARN 6470 --- [nio-8761-exec-2] com.netflix.eureka.InstanceRegistry      : DS: Registry: lease doesn't exist, registering resource: ZUULSERVER - eXploit-Zuul
2015-10-01 09:48:55.147  WARN 6470 --- [nio-8761-exec-2] c.n.eureka.resources.InstanceResource    : Not Found (Renew): ZUULSERVER - eXploit-Zuul
2015-10-01 09:48:55.419  INFO 6470 --- [nio-8761-exec-7] com.netflix.eureka.InstanceRegistry      : Registered instance id 172.18.100.120 with status UP
2015-10-01 09:48:55.423  INFO 6470 --- [nio-8761-exec-8] com.netflix.eureka.InstanceRegistry      : Registered instance id eXploit-Zuul with status UP
2015-10-01 09:48:55.778  INFO 6470 --- [nio-8761-exec-9] com.netflix.eureka.InstanceRegistry      : Registered instance id 172.18.100.120 with status UP
2015-10-01 09:48:55.809  INFO 6470 --- [io-8761-exec-10] com.netflix.eureka.InstanceRegistry      : Registered instance id eXploit-Zuul with status UP
2015-10-01 09:49:05.416  WARN 6470 --- [nio-8761-exec-3] com.netflix.eureka.InstanceRegistry      : DS: Registry: lease doesn't exist, registering resource: GEOSERVER - 172.18.100.155
2015-10-01 09:49:05.418  WARN 6470 --- [nio-8761-exec-3] c.n.eureka.resources.InstanceResource    : Not Found (Renew): GEOSERVER - 172.18.100.155
2015-10-01 09:49:05.441  INFO 6470 --- [nio-8761-exec-5] com.netflix.eureka.InstanceRegistry      : Registered instance id 172.18.100.155 with status STARTING
2015-10-01 09:49:05.607  INFO 6470 --- [nio-8761-exec-4] com.netflix.eureka.InstanceRegistry      : Registered instance id 172.18.100.155 with status STARTING
2015-10-01 09:49:05.948  INFO 6470 --- [nio-8761-exec-8] com.netflix.eureka.InstanceRegistry      : Registered instance id 172.18.100.197 with status UP
2015-10-01 09:49:06.107  INFO 6470 --- [nio-8761-exec-9] com.netflix.eureka.InstanceRegistry      : Registered instance id 172.18.100.197 with status UP
2015-10-01 09:49:11.175  WARN 6470 --- [io-8761-exec-10] c.n.eureka.resources.InstanceResource    : Time to sync, since the last dirty timestamp differs - ReplicationInstance id : 172.18.100.155,Registry : 1443651207024 Incoming: 1443651408538 Replication: false
2015-10-01 09:49:11.197  INFO 6470 --- [nio-8761-exec-3] com.netflix.eureka.InstanceRegistry      : Registered instance id 172.18.100.155 with status UP
2015-10-01 09:49:11.562  WARN 6470 --- [nio-8761-exec-5] c.n.eureka.resources.InstanceResource    : Time to sync, since the last dirty timestamp differs - ReplicationInstance id : 172.18.100.155,Registry : 1443651408538 Incoming: 1443651207024 Replication: true
2015-10-01 09:49:11.607  INFO 6470 --- [nio-8761-exec-4] com.netflix.eureka.InstanceRegistry      : Registered instance id 172.18.100.155 with status UP
2015-10-01 09:50:03.859  INFO 6470 --- [nio-8761-exec-6] c.n.eureka.resources.InstanceResource    : Found (Cancel): NCLSEARCHSERVICE - 172.18.100.197
2015-10-01 09:50:03.870  INFO 6470 --- [io-8761-exec-10] com.netflix.eureka.InstanceRegistry      : Registered instance id 172.18.100.197 with status UP
2015-10-01 09:50:04.120  INFO 6470 --- [nio-8761-exec-2] com.netflix.eureka.InstanceRegistry      : Registered instance id 172.18.100.197 with status UP
2015-10-01 09:50:04.130  INFO 6470 --- [nio-8761-exec-5] c.n.eureka.resources.InstanceResource    : Found (Cancel): NCLSEARCHSERVICE - 172.18.100.197
2015-10-01 09:50:04.745  WARN 6470 --- [nio-8761-exec-3] com.netflix.eureka.InstanceRegistry      : DS: Registry: cancel failed because Lease is not registered for: NCLSEARCHSERVICE:172.18.100.197
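For context on the "lease doesn't exist" and "Not Found (Renew)" warnings: each Eureka client heartbeats on a fixed interval, and the server only evicts an instance after the lease expiration window passes without a renewal. A minimal sketch of the two knobs involved (the values shown are the Eureka defaults, for illustration, not a recommendation):

```yaml
eureka:
  instance:
    # How often this client sends a heartbeat (lease renewal) to the server.
    leaseRenewalIntervalInSeconds: 30
    # How long the server waits without a heartbeat before evicting the lease.
    # Should comfortably exceed the renewal interval.
    leaseExpirationDurationInSeconds: 90
```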

The logs for my search service don't seem to show anything interesting, only that it saw in Eureka that it was previously down and now it's changing the status to up:

2015-09-30 17:59:14.164  INFO 9056 --- [           main] c.n.e.EurekaDiscoveryClientConfiguration : Unregistering application nclSearchService with eureka with status DOWN
2015-09-30 17:59:14.164  INFO 9056 --- [           main] com.netflix.discovery.DiscoveryClient    : Saw local status change event StatusChangeEvent [current=DOWN, previous=UP]
2015-09-30 17:59:14.208  INFO 9056 --- [nfoReplicator-0] com.netflix.discovery.DiscoveryClient    : DiscoveryClient_NCLSEARCHSERVICE/172.18.100.197: registering service...
2015-09-30 17:59:14.217  INFO 9056 --- [nfoReplicator-0] com.netflix.discovery.DiscoveryClient    : DiscoveryClient_NCLSEARCHSERVICE/172.18.100.197 - registration status: 204
2015-09-30 17:59:14.218  INFO 9056 --- [           main] com.netflix.discovery.DiscoveryClient    : DiscoveryClient_NCLSEARCHSERVICE/172.18.100.197 - deregister status: 200
2015-09-30 17:59:14.218  INFO 9056 --- [           main] c.n.e.EurekaDiscoveryClientConfiguration : Registering application nclSearchService with eureka with status UP
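That DOWN -> UP flip in the client log is worth noting: when the Eureka health check bridge is enabled, the client mirrors the Spring Boot actuator health status into Eureka, so a transient DOWN (for example, a health indicator failing briefly at startup) produces exactly this unregister/re-register cycle. As an experiment, you could pin the bridge off explicitly rather than leaving it commented out; a sketch, assuming the standard Spring Cloud Netflix property:

```yaml
eureka:
  client:
    healthcheck:
      # When true, the actuator /health status is propagated to Eureka, and a
      # DOWN health status deregisters the instance. Pin it off to rule this out.
      enabled: false
```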

My config is a mess right now, because I keep experimenting with different values and configurations, but here are the basic values I'm working with:

spring:
  cloud:
    config:
#      uri: http://${CONFIG_SERVER_URL:172.18.100.120:8888}
      discovery:
        enabled: true
        serviceId: CONFIGSERVICE
  application:
    name: nclSearchService


eureka:
  client:
    enable-self-preservation: false
#    registerWithEureka: true
#    fetchRegistry: true
    serviceUrl:
      defaultZone: http://172.18.100.120:8761/eureka/
#    healthcheck:
#      enabled: true
  instance:
    leaseRenewalIntervalInSeconds: 30

# items below are listed in the config server
service:
  name: nclSearchService

security:
  sessions: NEVER

#instance:
#  preferIpAddress: true
#server:
#  port: 8080
#  contextPath: / 

You can see that I'm also loading config dynamically from a config service, but that just loads application-specific variables.
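One thing worth double-checking with discovery-based config lookup: in Spring Cloud, the `spring.cloud.config.discovery.*` settings and `spring.application.name` need to live in bootstrap.yml (the bootstrap context), not application.yml, or the config server lookup happens too late in startup. A sketch of the split, assuming the usual file layout:

```yaml
# bootstrap.yml -- read before the application context starts
spring:
  application:
    name: nclSearchService
  cloud:
    config:
      discovery:
        enabled: true
        serviceId: CONFIGSERVICE

# Eureka must also be reachable during bootstrap for the discovery lookup to work.
eureka:
  client:
    serviceUrl:
      defaultZone: http://172.18.100.120:8761/eureka/
```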

In the past, if I kept randomly tweaking the config on my search service, I could get it to stabilize in Eureka. But then the problem reappears, and I can never pinpoint what is actually causing it. It seems to be an issue with Eureka and its instance registry, but I can't quite figure out what to do to fix it.

UPDATE 1: The NCLSEARCHSERVICE is using spring-boot-dependencies 1.2.5.RELEASE and spring-cloud-netflix 1.1.0.M1. My Eureka server is using the same version of the spring boot dependencies, but spring-cloud-netflix 1.0.3.RELEASE. I tried updating the search service to see if that had any effect on my issue. If it is recommended that all services use the same version, I can move either up/down a version.

Only once in my troubleshooting did I attempt to call /refresh on my search service. If I remember correctly, it went through the same process of showing up temporarily and then being removed from the registry again.

UPDATE 2: I may have narrowed down the issue... I had a config server running on port 8888 on the same VM as the Eureka server running on port 8761. I disabled the dynamic configuration in my search service and moved the contents of the nclSearchService.yml from the config server to application.yml on my search service. The search service has registered with Eureka and has been stable for about 10 minutes.

This is odd because even though my search service is no longer configured to use the config service, the config service is still deployed and running. Also, even when the search service was deregistering, you could see in the logs that it was successfully pulling the nclSearchService.yml file from the config service, and it would boot all the way up and be usable with the properties that were pulled from the config service.
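Since Update 2 points at services sharing a VM: by default, the Eureka instance id in Spring Cloud Netflix is derived from the hostname, so two applications colocated on one box can register under the same id and keep cancelling each other's leases, which would look just like the "cancel failed because Lease is not registered" entries in the server log. Giving each application an explicitly unique id rules this out; a sketch using the commonly documented pattern:

```yaml
eureka:
  instance:
    # Make the registration id unique per application and per process,
    # instead of the default hostname-based id.
    instanceId: ${spring.application.name}:${spring.application.instance_id:${random.value}}
```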

Not sure what any of that means :)



Solution 1:[1]

Increasing the Docker memory from 2 GB to 4 GB fixed this for me. Maybe increasing the VM memory will fix this for you as well.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1: Marina