'How to figure out what is causing high CPU usage spikes for laravel project

I have built a relatively complicated system (CMS) which consists of two projects - admin site and user site. On admin site we create content and add it to pages. User site reflects those pages. The problem I'm facing is that something is causing 100% cpu usage spikes and I have no idea what that is. I've implemented data caching on crucial parts of the system which in my opinion are most resource-hungry but that only helped a little bit, It's still hitting that 100% cpu usage and the whole site goes down as it is unable to handle requests. We have around 300 users daily, which isn't that many IMO that Laravel app couldn't handle. I'm using blade templates so it's server-side rendering (thinking that a rework to a full Vue frontend might solve it but not sure). Perhaps someone could guide me to a right direction of what steps to take to figure out which function(s) in my code are causing these problems. I've tried using Laravel debugbar but I don't know what parameters are considered "satisfy-able". I've tried enabling long query logging in mysql, that provided no results. We're hosting on a dedicated server

CPU Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz (4 core(s)).

OS AlmaLinux 8.4.

8GB RAM

I have a full control over the server but I'm not a qualified system administrator so I'm not sure if there's anything I can do to figure this out by running some terminal commands. I've tried running the "top" command and it just shows mysql and php-fpm as the processes that are consuming all of the CPU resources but that doesn't help me.

I'm suspecting there might be a memory leak somewhere but how may I find it?

enter image description here



Solution 1:[1]

Rate Per Second = RPS

Suggestions to consider for your my.cnf [mysqld] section (use of SET GLOBAL variable_name=value; may SET these dynamic variables and not require stop/start of instance)

read_rnd_buffer_size=64*1024  # from 256K to reduce handler_read_rnd_next RPS of 284,967
max_write_lock_count=16  # from ~ 4 billion to allow ReadData after nn locks
innodb_io_capacity=400  # from 200 to use more of available IOPS for your HDD
innodb_buffer_pool_size=1*1024*1024*1024  # from 128M to reduce innodb_data_reads RPS of 3
innodb_adaptive_max_sleep_delay=10000  # from 150000 for 1 second vs 15 second delay when busy

If any questions, please make contact. CPU stress should be significantly reduced. Many more opportunities exist to improve performance.

Observations, A) handler_rollback averages 1 every 220 seconds - expensive operation for instance. B) innodb_secondary_index_triggered_cluster_reads in status report shows 251 RPS and is likely caused by frequent use of a table with no PRIMARY KEY. C) select_scan averages 5 RPS - indicates index(s) missing that would improve performance of queries. D) slow_query_log is ONLY useful when ON and when long_query_time is less than 10 seconds.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Wilson Hauck