Keras fit settings in TensorFlow Extended (TFX)

I am trying to build a TFX pipeline with a Trainer component whose Keras model is trained like this:

    def run_fn(fn_args: components.FnArgs):
        # Load the output of the Transform component.
        transform_output = TFTransformOutput(fn_args.transform_output)

        # Build the training and evaluation datasets from the files TFX provides.
        train_dataset = input_fn(fn_args.train_files,
                                 fn_args.data_accessor,
                                 transform_output,
                                 num_batch)

        eval_dataset = input_fn(fn_args.eval_files,
                                fn_args.data_accessor,
                                transform_output,
                                num_batch)

        # The step counts are passed in through fn_args.
        history = model.fit(train_dataset,
                            epochs=num_epochs,
                            steps_per_epoch=fn_args.train_steps,
                            validation_data=eval_dataset,
                            validation_steps=fn_args.eval_steps)

This works. However, if I change the fit call to the following, it no longer works:

    history = model.fit(train_dataset,
                        epochs=num_epochs,
                        batch_size=num_batch,
                        validation_split=0.1)

Now, I have two questions:

  1. Why does fitting work only with steps_per_epoch? I couldn't find any explicit statement about this, but it is the only way I got it to work. My guess is that it is something TFX-specific (does TFX handle input data only in a generator-like way?).

  2. Let's say my train_dataset contains 100 instances and steps_per_epoch=1000 (with epochs=1). Does that mean that each of my 100 input instances is fed 10 times in order to reach the defined 1000 steps? Isn't that counterproductive from a training perspective?
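
To make the arithmetic behind question 2 concrete, here is a minimal sketch in plain Python. It assumes that one Keras step consumes one batch and that input_fn returns a dataset that repeats indefinitely. The helper names passes_over_data and steps_for_one_pass are hypothetical, and the batch size of 32 is an assumed value, since num_batch is not given above.

```python
import math


def passes_over_data(num_examples: int, batch_size: int, steps_per_epoch: int) -> float:
    """Average number of times each example is seen in one epoch.

    Assumes the dataset repeats indefinitely, so every step gets a full batch.
    """
    examples_consumed = steps_per_epoch * batch_size
    return examples_consumed / num_examples


def steps_for_one_pass(num_examples: int, batch_size: int) -> int:
    """Number of steps needed to see every example once (last batch may be partial)."""
    return math.ceil(num_examples / batch_size)


# The numbers from question 2: 100 instances, steps_per_epoch=1000.
print(passes_over_data(100, batch_size=1, steps_per_epoch=1000))   # batch size 1
print(passes_over_data(100, batch_size=32, steps_per_epoch=1000))  # assumed batch size 32
print(steps_for_one_pass(100, batch_size=32))
```

Under these assumptions, the "10x" figure in question 2 only holds for a batch size of 1. With larger batches each instance would be seen proportionally more often within the same 1000 steps.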



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
