Timeout issue in Spark while scaling for more requests

We are running an inference pipeline for one of our models in PySpark. To make the model scale, we partition the test data and send each batch of records for prediction. Within a batch, however, we iterate through every row and run the prediction at the row level by calling a function through the df.apply method. That call receives a bulky object (a loaded embedder model), so the heavy object is passed into the function once per row, i.e. as many times as there are rows. As a result, when a huge volume of requests comes in, we hit timeouts. Is there a better way to get predictions for an entire batch of records instead of predicting row by row?
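For context, the current row-level pattern looks roughly like the sketch below. This is a reconstruction from the description above, not the actual pipeline: `load_embedder`, `DummyEmbedder`, and the scoring logic are hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical stand-in for the heavy embedder model described above.
class DummyEmbedder:
    def encode(self, text):
        # a real embedder would return an embedding vector for the text
        return [float(len(text))]

def load_embedder():
    return DummyEmbedder()

embedder = load_embedder()  # heavy object, loaded on the driver

def predict_row(row, embedder):
    # embeds and scores a single record
    vector = embedder.encode(row["text"])
    return sum(vector)  # placeholder for the real model score

df = pd.DataFrame({"text": ["a", "bb", "ccc"]})

# The function (and the embedder it closes over) is invoked once per row,
# which is the per-row overhead the question is about.
df["prediction"] = df.apply(lambda row: predict_row(row, embedder), axis=1)
```

The kind of batch-level alternative being asked about is commonly handled in PySpark with a pandas UDF, which receives whole Arrow batches of rows instead of single records. A minimal sketch, assuming Spark 3.0+ and reusing the hypothetical `load_embedder` from above:

```python
from typing import Iterator

import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import DoubleType

spark = SparkSession.builder.getOrCreate()

@pandas_udf(DoubleType())
def predict_batch(batches: Iterator[pd.Series]) -> Iterator[pd.Series]:
    # The iterator form lets us load the embedder once per task,
    # not once per row.
    embedder = load_embedder()
    for texts in batches:
        # a real embedder could encode the whole batch in one call
        vectors = [embedder.encode(t) for t in texts]
        yield pd.Series([float(sum(v)) for v in vectors])

sdf = spark.createDataFrame([("a",), ("bb",), ("ccc",)], ["text"])
sdf = sdf.withColumn("prediction", predict_batch("text"))
sdf.show()
```

Broadcasting the model with spark.sparkContext.broadcast is another common way to avoid re-shipping it, though for very large models loading it once per task or per executor, as sketched above, is often preferred.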



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source