Google CloudSearch CSV Connector hits a top limit when indexing

We are using Google's CSV connector to index a CSV file with 600k+ records. In the Test datasource, the number of indexed records tops out at 8k. The Prod datasource shows a different ceiling, at 130k. The connector keeps running, but no additional records are indexed. Is there a datasource limit or some other limiting factor? Below are some of the tuning params from our config file:

connector.runOnce=false
traverse.threadPoolSize=1000
traverse.partitionSize=4000
batch.batchSize=20

batch.maxQueueLength=8000
batch.maxActiveBatches=250
batch.maxBatchDelaySeconds=20
batch.readTimeoutSeconds=120
batch.connectTimeoutSeconds=300
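
One thing worth ruling out before tuning further is the data itself. Below is a minimal sketch, assuming the file has a header row and that a column named `id` (hypothetical here) is what the connector is configured to use as the item key. It compares the total row count against the number of distinct key values. If the distinct count is close to the number that actually gets indexed, rows sharing a key may be updating the same item rather than adding new ones. This is only a diagnostic guess, not a confirmed cause.

```python
import csv
import io


def count_keys(csv_file, key_columns):
    """Return (total_rows, distinct_keys) for the given key columns."""
    reader = csv.DictReader(csv_file)
    seen = set()
    total = 0
    for row in reader:
        total += 1
        # Build the key from the chosen columns, in order.
        seen.add(tuple(row[col] for col in key_columns))
    return total, len(seen)


# Tiny in-memory sample; the "id" column name is an assumption.
# For real data, pass open("records.csv", newline="") instead.
sample = io.StringIO("id,title\n1,a\n2,b\n2,c\n")
total, distinct = count_keys(sample, ["id"])
print(f"rows={total} distinct_keys={distinct}")
```

If the two numbers match on the real file, duplicate keys are not the explanation and the limit lies elsewhere.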

google-cloud-search

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
