'Optimise the creation of a record with many records that belong to it

Description

I'm writing full stack web application using ReactJS, Laravel and MySQL that allows users to create quizzes.

My database structure:

Quizzes table

create table quizzes (
    id bigint unsigned auto_increment primary key,
    title varchar(255) not null,
    description text null,
    duration smallint unsigned not null,
    is_open tinyint(1) default 0 not null,
    shuffle_questions tinyint(1) default 0 not null,
    user_id bigint unsigned not null,
    lesson_id bigint unsigned not null,
    created_at timestamp null,
    updated_at timestamp null,
    constraint quizzes_lesson_id_foreign foreign key (lesson_id) references lessons (id) on delete cascade,
    constraint quizzes_user_id_foreign foreign key (user_id) references users (id) on delete cascade
) collate = utf8mb4_unicode_ci;

Questions table

create table questions (
    id bigint unsigned auto_increment primary key,
    title text not null,
    description text null,
    image varchar(255) null,
    type enum ('radio', 'checkbox', 'text', 'image') not null,
    is_required tinyint(1) default 0 not null,
    points tinyint unsigned default '0' not null,
    quiz_id bigint unsigned not null,
    created_at timestamp null,
    updated_at timestamp null,
    constraint questions_quiz_id_foreign foreign key (quiz_id) references webagu_24082021.quizzes (id) on delete cascade
) collate = utf8mb4_unicode_ci;

Answers table

create table answers (
    id bigint unsigned auto_increment primary key,
    value varchar(1024) null,
    is_correct tinyint(1) default 0 not null,
    quiz_id bigint unsigned not null,
    question_id bigint unsigned not null,
    created_at timestamp null,
    updated_at timestamp null,
    constraint answers_question_id_foreign foreign key (question_id) references questions (id) on delete cascade,
    constraint answers_quiz_id_foreign foreign key (quiz_id) references quizzes (id) on delete cascade
) collate = utf8mb4_unicode_ci;

Data that comes from UI when user press "Save quiz" button

//....
axios
    .post('/quizzes', { "quiz": QuizData, "questions": QuestionsData, "answers": AnswersData })
    .then(res => {
        if(201 === res.status) alert('Quiz saved!');
        console.log(res.data)
    });
//....

Quiz controller store method

public function store(Request $request): JsonResponse
{
    $quizData = $request->input('quiz');
    $questions = $request->input('questions');
    $answers = $request->input('answers');

    $groupedAnswers = Utils::groupBy('_question_id', $answers);

    //DB::beginTransaction();

    $quizData['user_id'] = \auth('api')->user()->id;
    $quiz = Quiz::create($quizData);

    $new_questions = [];
    $new_answers = [];

    foreach ($questions as $question) {
        $question['quiz_id'] = $quiz->id;
        $new_question = Question::create($question);
        $new_questions[] = $new_question;

        $qid = $question['_question_id'];

        if (isset($groupedAnswers[$qid])) {
            $question_answers = $groupedAnswers[$qid];

            foreach ($question_answers as $answer) {
                $answer['quiz_id'] = $quiz->id;
                $answer['question_id'] = $new_question->id;

                $new_answer = Answer::create($answer);
                $new_answers[] = $new_answer;
            }
        }
    }

    //DB::commit();

    $resData = ['quiz' => $quiz, 'questions' => $new_questions, 'answer' => $new_answers];

    return response()->json($resData, 201);
}

My current code algorithm:

  1. Creates Quiz object
  2. In foreach loop assign Quiz::id to Question object quiz_id foreign key column and creates
  3. In inner foreach loop assign Question::id to Answer object question_id foreign key column and creates

Problem

This algorithm creates Q (questions count) * A (answers count) SQL queries - and this this very slow.

For example, if quiz contains 50 questions, each have 4 answers variants, query will contain 50 * 4 = 200 SQL queries.

So, how to change this bad solution to make it work faster?



Solution 1:[1]

The following solution will result in:

  • One query to insert the Quiz.
  • One query for each Question.
  • One query per Question to insert its answers if it has any.

So a maximun of 1 + (questions_count)*2 queries.

If your answers didn't rely on the question_id everything could be done in 4 queries

    public function store(Request $request): JsonResponse
    {
        /* ******************************************* */
        //  GETTING AND INSERTING QUIZ
        /* ******************************************* */
        $quizData = $request->input('quiz');
        $quizData['user_id'] = \auth('api')->user()->id;
        $quiz = Quiz::create($quizData);

        /* ******************************************* */
        //  GETTING QUESTIONS AND THEIR ANSWERS
        /* ******************************************* */
        $questions = $request->input('questions');

        $answers = $request->input('answers');
        $answersByQuestion = Utils::groupBy('_question_id', $answers);

        // ***********************************************
        // ***********************************************
        $new_questions = [];
        $new_answers = [];

        foreach ($questions as $question) {
            // $question['quiz_id'] = $quiz->id; $new_question = Question::create($question);
            $new_question = $quiz->questions()->create($question);
            $new_questions[] = $new_question; // FOR THE RESPONSE

            if (isset($answersByQuestion[$question['_question_id']])) {

                // PREPARING ANSWERS FOR BULK INSERT
                foreach ($answersByQuestion[$question['_question_id']] as $answer) {

                    $answer['quiz_id'] = $quiz->id;
                    $answer['question_id'] = $new_question->id;
                    $new_answers[] = $answer;
                }
            }
            DB::table('answers')->insert($new_answers);
        }

        $resData = ['quiz' => $quiz, 'questions' => $new_questions, 'answer' => $new_answers];

        return response()->json($resData, 201);
    }

Idea for 4 queries (Ultimate optimization)

I see that:

  • you use _question_id on the Answers to do the link with their Question.
  • A Quiz has-many questions and answers.
  • An Answer belong to one question and its one quiz.

enter image description here

So do this.

  1. Create the Quiz.
  2. Take the quiz_id and bulk insert all the questions including the new field _question_id.
  3. Select all the Quiz questions.
  • Prepare a nested array that contains all Answers and add to each answer the actual question_id by the help of _question_id.
  1. Bulk insert all the answers.

Solution 2:[2]

One Quiz has many Questions. This 1-to-many relationship is properly handled by questions.quiz_id.

Similarly, one Question has many Answers. This 1-to-many relationship is properly handled by Answers.question_id.

But there is a minor no-no -- Answers.quiz_id is "redundant" since it can be found via Questions. In proper schema design, that column should not exist. For only 200 Answers (or even for a million answers) you cannot make a "performance argument".

Round trips and lots of statements

If practical, all the quizzes could be inserted in one INSERT statement; all the questions in another statement; and all the answers in a third. (No, I don't know how to do that in Laravel; but I can explain for MySQL.)

Meanwhile, a hundred, perhaps a thousand, queries per second is nothing to worry about.

(As a bonus, "batching" inserts runs about 10 times as fast.)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2