Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible Performance Issues at index.php?option=com_content&view=articles SQL __workflow_associations #43701

Open
SmikeSix2 opened this issue Jun 25, 2024 · 8 comments

Comments

@SmikeSix2
Copy link

We have a few joomla new portals with lots of articles, in the 20k+ count.
The Article index.php?option=com_content&view=articles felt really slow and a lot of editors were feeling bad about the new version.
I had time to look into it and the SQL produced for selecting the articles (at least in my case), with the inner joins takes 3s on a rly big site.
TLDR go into debug mode to the articles view and look at the sql query, then see how long it takes or do an explain, it may be only in our systems with some error while migrating or some indexes missing. MariaDB latest version

I fixed it by changing two inner joins to left joins and forcing an index of the __workflow_associations table. Without the forcing of the index, it does some weird shit ->
idx_extension,202,const,21516,Using index condition; Using temporary; Using filesort

My final querying looks like that:

->join('LEFT', $db->quoteName('#__languages', 'l'), $db->quoteName('l.lang_code') . ' = ' . $db->quoteName('a.language'))
->join('LEFT', $db->quoteName('#__content_frontpage', 'fp'), $db->quoteName('fp.content_id') . ' = ' . $db->quoteName('a.id'))
->join('LEFT', $db->quoteName('#__users', 'uc'), $db->quoteName('uc.id') . ' = ' . $db->quoteName('a.checked_out'))
->join('LEFT', $db->quoteName('#__viewlevels', 'ag'), $db->quoteName('ag.id') . ' = ' . $db->quoteName('a.access'))
->join('LEFT', $db->quoteName('#__categories', 'c'), $db->quoteName('c.id') . ' = ' . $db->quoteName('a.catid'))
->join('LEFT', $db->quoteName('#__categories', 'parent'), $db->quoteName('parent.id') . ' = ' . $db->quoteName('c.parent_id'))
->join('LEFT', $db->quoteName('#__users', 'ua'), $db->quoteName('ua.id') . ' = ' . $db->quoteName('a.created_by'))
->join('INNER', $db->quoteName('#__workflow_associations', 'wa') . ' FORCE INDEX (PRIMARY)', $db->quoteName('wa.item_id') . ' = ' . $db->quoteName('a.id'))
->join('LEFT', $db->quoteName('#__workflow_stages', 'ws'), $db->quoteName('ws.id') . ' = ' . $db->quoteName('wa.stage_id'))
->join('LEFT', $db->quoteName('#__workflows', 'w'), $db->quoteName('w.id') . ' = ' . $db->quoteName('ws.workflow_id'));

(copy from forum post)
INNER JOIN leads to a temporary table with 22k entries for our smallest portal.

With INNER JOIN 500ms, LEFT JOIN its 1.5ms. In bigger portals that up to 5s and makes it quite unusable. Not sure if the DB is missing an index, i tried to optimize, but for now ill change that behaviour

EXPLAIN SELECT a.id,a.asset_id,a.title,a.alias,a.checked_out,a.checked_out_time,a.catid,a.state,a.access,a.created,a.created_by,a.created_by_alias,a.modified,a.ordering,a.featured,a.language,a.hits,a.publish_up,a.publish_down,a.introtext,a.fulltext,a.note,a.images,a.metakey,a.metadesc,a.metadata,a.version,fp.featured_up,fp.featured_down,l.title AS language_title,l.image AS language_image,uc.name AS editor,ag.title AS access_level,c.title AS category_title,c.created_user_id AS category_uid,c.level AS category_level,c.published AS category_published,parent.title AS parent_category_title,parent.id AS parent_category_id,parent.created_user_id AS parent_category_uid,parent.level AS parent_category_level,ua.name AS author_name,wa.stage_id AS stage_id,ws.title AS stage_title,ws.workflow_id AS workflow_id,w.title AS workflow_title,COALESCE(NULLIF(ROUND(v.rating_sum / v.rating_count, 0), 0), 0) AS rating,COALESCE(NULLIF(v.rating_count, 0), 0) AS rating_count
FROM hycwr_content AS a
LEFT JOIN hycwr_languages AS l ON l.lang_code = a.language
LEFT JOIN hycwr_content_frontpage AS fp ON fp.content_id = a.id
LEFT JOIN hycwr_users AS uc ON uc.id = a.checked_out
LEFT JOIN hycwr_viewlevels AS ag ON ag.id = a.access
LEFT JOIN hycwr_categories AS c ON c.id = a.catid
LEFT JOIN hycwr_categories AS parent ON parent.id = c.parent_id
LEFT JOIN hycwr_users AS ua ON ua.id = a.created_by
INNER JOIN hycwr_workflow_associations AS wa ON wa.item_id = a.id
INNER JOIN hycwr_workflow_stages AS ws ON ws.id = wa.stage_id
INNER JOIN hycwr_workflows AS w ON w.id = ws.workflow_id
LEFT JOIN hycwr_content_rating AS v ON a.id = v.content_id
WHERE wa.extension = 'com_content.article' AND a.state IN (1,0)
ORDER BY a.id desc LIMIT 20;

EXPLAIN SELECT a.id,a.asset_id,a.title,a.alias,a.checked_out,a.checked_out_time,a.catid,a.state,a.access,a.created,a.created_by,a.created_by_alias,a.modified,a.ordering,a.featured,a.language,a.hits,a.publish_up,a.publish_down,a.introtext,a.fulltext,a.note,a.images,a.metakey,a.metadesc,a.metadata,a.version,fp.featured_up,fp.featured_down,l.title AS language_title,l.image AS language_image,uc.name AS editor,ag.title AS access_level,c.title AS category_title,c.created_user_id AS category_uid,c.level AS category_level,c.published AS category_published,parent.title AS parent_category_title,parent.id AS parent_category_id,parent.created_user_id AS parent_category_uid,parent.level AS parent_category_level,ua.name AS author_name,COALESCE(NULLIF(ROUND(v.rating_sum / v.rating_count, 0), 0), 0) AS rating,COALESCE(NULLIF(v.rating_count, 0), 0) AS rating_count
FROM hycwr_content AS a
LEFT JOIN hycwr_languages AS l ON l.lang_code = a.language
LEFT JOIN hycwr_content_frontpage AS fp ON fp.content_id = a.id
LEFT JOIN hycwr_users AS uc ON uc.id = a.checked_out
LEFT JOIN hycwr_viewlevels AS ag ON ag.id = a.access
LEFT JOIN hycwr_categories AS c ON c.id = a.catid
LEFT JOIN hycwr_categories AS parent ON parent.id = c.parent_id
LEFT JOIN hycwr_users AS ua ON ua.id = a.created_by
LEFT JOIN hycwr_content_rating AS v ON a.id = v.content_id
WHERE (a.state = 0 OR a.state = 1)
ORDER BY a.id desc LIMIT 20;

EXPLAIN SELECT a.id, a.title, a.alias, a.checked_out, a.checked_out_time, a.catid, a.state, a.access, a.created, a.created_by, a.created_by_alias, a.modified, a.ordering, a.featured, a.language, a.hits, a.publish_up, a.publish_down, a.note,l.title AS language_title, l.image AS language_image,uc.name AS editor,ag.title AS access_level,c.title AS category_title, c.created_user_id AS category_uid, c.level AS category_level,parent.title AS parent_category_title, parent.id AS parent_category_id,
parent.created_user_id AS parent_category_uid, parent.level AS parent_category_level,ua.name AS author_name,COALESCE(NULLIF(ROUND(v.rating_sum / v.rating_count, 0), 0), 0) AS rating,
COALESCE(NULLIF(v.rating_count, 0), 0) as rating_count

FROM hycwr_content AS a

LEFT JOIN hycwr_languages AS l
ON l.lang_code = a.language

LEFT JOIN hycwr_users AS uc
ON uc.id=a.checked_out

LEFT JOIN hycwr_viewlevels AS ag
ON ag.id = a.access

LEFT JOIN hycwr_categories AS c
ON c.id = a.catid

LEFT JOIN hycwr_categories AS parent
ON parent.id = c.parent_id

LEFT JOIN hycwr_users AS ua
ON ua.id = a.created_by

LEFT JOIN hycwr_content_rating AS v
ON a.id = v.content_id

WHERE (a.state = 0 OR a.state = 1)

ORDER BY a.id desc
LIMIT 20;
Top
Smike
Joomla! Apprentice
Joomla! Apprentice
Posts: 10
Joined: Wed Mar 06, 2019 3:45 pm
Re: We have a news backend with few thousand articles, the workflow leads to slow loading times...
Post by Smike » Tue Jun 25, 2024 3:31 pm

Also in the live system the index primary needed to be enforced, or it did use also a slow query. im not sure why though, it had the same db.

->join('LEFT', $db->quoteName('#__workflow_associations', 'wa'), $db->quoteName('wa.item_id') . ' = ' . $db->quoteName('a.id'))

LEFT JOIN hycwr_workflow_associations AS wa FORCE INDEX (PRIMARY) ON wa.item_id = a.id

System information (as much as possible)

Debian.

Additional comments

@richard67
Copy link
Member

Not sure if we can use a „FORCE INDEX“ clause in the join in our code which produces the SQL, as it also has to work with PostgreSQL, and the database framework classes as far as I know don’t provide an API method for that, so we would have to use plain SQL for that clause.

@richard67
Copy link
Member

Regarding the joins of workflow tables this PR might help and solve parts of this issue: #40176

@carlitorweb
Copy link
Member

carlitorweb commented Jun 25, 2024

About performance when you list all the articles, that query select the fulltext of each article and that increase a lot the load time.
A workaround I found when the site is big, is use somehow the archive state, in this way at leats the query skip a good amount of articles who do not need be listed

@SmikeSix2
Copy link
Author

Not sure if we can use a „FORCE INDEX“ clause in the join in our code which produces the SQL, as it also has to work with PostgreSQL, and the database framework classes as far as I know don’t provide an API method for that, so we would have to use plain SQL for that clause.

yeah its odd, I actually first changed them to left joins, which worked locally.
on the servers, for some reason, it used some other __workflow_associations index and created a temporarily sorted 22k table.
so in both cases, LEFT or INNER join there were some performance issues. the #40176 probably would fix this

(all LEFT JOINS)
It takes workflow_associations as first index
LEFT JOIN hycwr_workflow_associations AS wa ON wa.item_id = a.id

1,SIMPLE,wa,ref,"PRIMARY,idx_item_stage_extension,idx_item_id,idx_extension",idx_extension,202,const,21516,Using index condition; Using temporary; Using filesort

VS
LEFT JOIN hycwr_workflow_associations AS wa FORCE INDEX (PRIMARY) ON wa.item_id = a.id

1,SIMPLE,a,index,"PRIMARY,idx_state,idx_id_desc,idx_content_main,idx_hycwr_content_state,idx_state_id",PRIMARY,4,,39,Using where

Oh yeah it uses that index because the WHERE has:

wa.extension = 'com_content.article' ...

(we are on joomla 4 and have 16 portals with 1million+ articles, i just upgraded them to 4 from 3, i just cant do that agian)

@SmikeSix2
Copy link
Author

SmikeSix2 commented Jun 26, 2024

Also in the bigger Websites you make a COUNT(*) on all articles, with all JOINS. Thats really not good, takes 2s on a large website.

I worked myself around and just got rid of workflows conditions and i added a $info to DatabaseQuery and in BaseDatabaseModel i added:

if($query->info == "articles_com_content"){ $query->clear('select')->clear('order')->clear('limit')->clear('offset')->clear('join')->select('COUNT(*)'); }else{ $query->clear('select')->clear('order')->clear('limit')->clear('offset')->select('COUNT(*)'); }
Very hackish, but it just needs to work.

You should extend the DatabaseQuery and look if it has an optimized COUNT(*) for the paginations.

SELECT COUNT(*)
FROM hycwr_content AS a
WHERE a.state IN (0,1)

2.4ms
vs
2s
SELECT COUNT(*)
FROM kog80_content AS a
LEFT JOIN kog80_languages AS l ON l.lang_code = a.language
LEFT JOIN kog80_content_frontpage AS fp ON fp.content_id = a.id
LEFT JOIN kog80_users AS uc ON uc.id = a.checked_out
LEFT JOIN kog80_viewlevels AS ag ON ag.id = a.access
LEFT JOIN kog80_categories AS c ON c.id = a.catid
LEFT JOIN kog80_categories AS parent ON parent.id = c.parent_id
LEFT JOIN kog80_users AS ua ON ua.id = a.created_by
LEFT JOIN kog80_workflow_associations AS wa FORCE INDEX (PRIMARY) ON wa.item_id = a.id
LEFT JOIN kog80_workflow_stages AS ws ON ws.id = wa.stage_id
LEFT JOIN kog80_workflows AS w ON w.id = ws.workflow_id
LEFT JOIN kog80_content_rating AS v ON a.id = v.content_id
WHERE a.state IN (0,1)

Edit: Sorry for the bad English and unformatted text. I was in a rush.

@SmikeSix2
Copy link
Author

SmikeSix2 commented Aug 6, 2024

Had holidays and optimized our backend, I'm writing those down so you guys can see where performance is bad. Its not a good solution and absolutely hackish, but we have part of these portals live and our administrators are sad.

I'm adding the changes I made. The list view of the articles for normal administrator users loaded in 12 seconds. I rewrote central pieces; if anyone needs to fix this in the future, here's what I did.

There are 2 main operations: first, the query for the content with all its subjoins, and second, the COUNT(*) query for pagination, which uses the same query but replaces its fields. It's extremely inefficient in large portals, and you should do tests with portals in the 100k articles range.

The COUNT(*) for the pagination now looks like this:

SELECT STRAIGHT_JOIN COUNT(*)
FROM `hycwr_content` AS `a`
LEFT JOIN `hycwr_categories` AS `c` ON `c`.`id` = `a`.`catid`
LEFT JOIN `hycwr_categories` AS `parent` ON `parent`.`id` = `c`.`parent_id`
WHERE `a`.`access` IN (:preparedArray1,:preparedArray2,:preparedArray3) AND `c`.`access` IN (:preparedArray4,:preparedArray5,:preparedArray6) AND `a`.`state` IN (:preparedArray7,:preparedArray8)

This now takes 9.2ms instead of 6s.

The normal query looks like this:

SELECT STRAIGHT_JOIN `a`.`id`,`a`.`asset_id`,`a`.`title`,`a`.`alias`,`a`.`checked_out`,`a`.`checked_out_time`,`a`.`catid`,`a`.`state`,`a`.`access`,`a`.`created`,`a`.`created_by`,`a`.`created_by_alias`,`a`.`modified`,`a`.`ordering`,`a`.`featured`,`a`.`language`,`a`.`hits`,`a`.`publish_up`,`a`.`publish_down`,`a`.`introtext`,`a`.`fulltext`,`a`.`note`,`a`.`images`,`a`.`metakey`,`a`.`metadesc`,`a`.`metadata`,`a`.`version`,`fp`.`featured_up`,`fp`.`featured_down`,`l`.`title` AS `language_title`,`l`.`image` AS `language_image`,`uc`.`name` AS `editor`,`ag`.`title` AS `access_level`,`c`.`title` AS `category_title`,`c`.`created_user_id` AS `category_uid`,`c`.`level` AS `category_level`,`c`.`published` AS `category_published`,`parent`.`title` AS `parent_category_title`,`parent`.`id` AS `parent_category_id`,`parent`.`created_user_id` AS `parent_category_uid`,`parent`.`level` AS `parent_category_level`,`ua`.`name` AS `author_name`,`wa`.`stage_id` AS `stage_id`,`ws`.`title` AS `stage_title`,`ws`.`workflow_id` AS `workflow_id`,`w`.`title` AS `workflow_title`,COALESCE(NULLIF(ROUND(`v`.`rating_sum` / `v`.`rating_count`, 0), 0), 0) AS `rating`,COALESCE(NULLIF(`v`.`rating_count`, 0), 0) AS `rating_count`
FROM `hycwr_content` AS `a`
LEFT JOIN `hycwr_languages` AS `l` ON `l`.`lang_code` = `a`.`language`
LEFT JOIN `hycwr_content_frontpage` AS `fp` ON `fp`.`content_id` = `a`.`id`
LEFT JOIN `hycwr_users` AS `uc` ON `uc`.`id` = `a`.`checked_out`
LEFT JOIN `hycwr_viewlevels` AS `ag` ON `ag`.`id` = `a`.`access`
LEFT JOIN `hycwr_categories` AS `c` ON `c`.`id` = `a`.`catid`
LEFT JOIN `hycwr_categories` AS `parent` ON `parent`.`id` = `c`.`parent_id`
LEFT JOIN `hycwr_users` AS `ua` ON `ua`.`id` = `a`.`created_by`
LEFT JOIN `hycwr_workflow_associations` AS `wa` FORCE INDEX (PRIMARY) ON `wa`.`item_id` = `a`.`id`
LEFT JOIN `hycwr_workflow_stages` AS `ws` ON `ws`.`id` = `wa`.`stage_id`
LEFT JOIN `hycwr_workflows` AS `w` ON `w`.`id` = `ws`.`workflow_id`
LEFT JOIN `hycwr_content_rating` AS `v` ON `a`.`id` = `v`.`content_id`
WHERE `a`.`access` IN (:preparedArray1,:preparedArray2,:preparedArray3) AND `c`.`access` IN (:preparedArray4,:preparedArray5,:preparedArray6) AND `a`.`state` IN (:preparedArray7,:preparedArray8)
ORDER BY a.id desc LIMIT 20

This now takes 885μs instead of 7s.

I'll write the steps to fix it. It was trial and error, but I managed to get it down for all users while keeping most functionality.

CREATE INDEX idx_access_state_catid ON hycwr_content (access, state, catid); //needed for the COUNT(*)
ALTER TABLE `kog80_content` ADD INDEX `idx_created_state_access` (`created`, `state`, `access`); //i think thats needed too

I've made changes to 3 core classes. We don't plan to update Joomla anymore, but since it needs to work with a lot of portals, it's necessary. It's not nice and works only with MariaDB, and you need to really take care of the DB or it gets slow.

C:\xampp_8\htdocs_\libraries\vendor\joomla\database\src\DatabaseQuery.php

abstract class DatabaseQuery implements QueryInterface
{
    public $info; //added field info`
...
public function select($columns, $useStraighJoin = false)
	{
	...
	if(!$useStraighJoin)
		$this->select = new Query\QueryElement('SELECT', $columns);
	else
		$this->select = new Query\QueryElement('SELECT STRAIGHT_JOIN', $columns); //needed cause db does not know whats best
...

public function clear($clause = null, $subType = null)
	{
	
	....
		case 'join':
		//reason is because the workflow_associations in the COUNT(*) are really slow. it works in the general query, but not with COUNT(*)
				if ($subType !== null && is_string($subType) && !empty($this->join)) {
					$newJoins = [];
					foreach ($this->join as $join) {
						
						if ($join instanceof Joomla\Database\Query\QueryElement) {
							$elements = $join->getElements();
							
							if (!empty($elements) && is_array($elements)) {
								$firstElement = reset($elements);
								if (strpos($firstElement, $subType) !== false) {
									$newJoins[] = $join;
								}
							}
						}else{
							if (stripos($join, $subType) !== false) {
								$newJoins[] = $join;
							}
						}
					}
					$this->join = $newJoins;
				} else {
					$this->join = null;
				}
	//joins can be filtered by a subtype

C:\xampp_8\htdocs_\libraries\src\MVC\Model\BaseDatabaseModel.php

protected function _getListCount($query)
    {
    ....
    //this is the COUNT(*) query
    //with the hackish info
    
      if($query->info == "articles_com_content"){
                $query->clear('select')->clear('order')->clear('limit')->clear('offset')->clear('join', '#__categories')->select('COUNT(*)', true);
            }else{
                 $query->clear('select')->clear('order')->clear('limit')->clear('offset')->select('COUNT(*)');
            }
    

C:\xampp_8\htdocs_\administrator\components\com_content\src\Model\ArticlesModel.php

protected function getListQuery()
    {
        // Create a new query object.
        $db    = $this->getDatabase();
        $query = $db->getQuery(true);
        $user  = $this->getCurrentUser();

        $params = ComponentHelper::getParams('com_content');

        // Select the required fields from the table.
        $query->select(
            $this->getState(
                'list.select',
                [
                    $db->quoteName('a.id'),
                    $db->quoteName('a.asset_id'),
                    $db->quoteName('a.title'),
                    $db->quoteName('a.alias'),
                    $db->quoteName('a.checked_out'),
                    $db->quoteName('a.checked_out_time'),
                    $db->quoteName('a.catid'),
                    $db->quoteName('a.state'),
                    $db->quoteName('a.access'),
                    $db->quoteName('a.created'),
                    $db->quoteName('a.created_by'),
                    $db->quoteName('a.created_by_alias'),
                    $db->quoteName('a.modified'),
                    $db->quoteName('a.ordering'),
                    $db->quoteName('a.featured'),
                    $db->quoteName('a.language'),
                    $db->quoteName('a.hits'),
                    $db->quoteName('a.publish_up'),
                    $db->quoteName('a.publish_down'),
                    $db->quoteName('a.introtext'),
                    $db->quoteName('a.fulltext'),
                    $db->quoteName('a.note'),
                    $db->quoteName('a.images'),
                    $db->quoteName('a.metakey'),
                    $db->quoteName('a.metadesc'),
                    $db->quoteName('a.metadata'),
                    $db->quoteName('a.version'),
                ]
                ), true
        )
            ->select(
                [
                    $db->quoteName('fp.featured_up'),
                    $db->quoteName('fp.featured_down'),
                    $db->quoteName('l.title', 'language_title'),
                    $db->quoteName('l.image', 'language_image'),
                    $db->quoteName('uc.name', 'editor'),
                    $db->quoteName('ag.title', 'access_level'),
                    $db->quoteName('c.title', 'category_title'),
                    $db->quoteName('c.created_user_id', 'category_uid'),
                    $db->quoteName('c.level', 'category_level'),
                    $db->quoteName('c.published', 'category_published'),
                    $db->quoteName('parent.title', 'parent_category_title'),
                    $db->quoteName('parent.id', 'parent_category_id'),
                    $db->quoteName('parent.created_user_id', 'parent_category_uid'),
                    $db->quoteName('parent.level', 'parent_category_level'),
                    $db->quoteName('ua.name', 'author_name'),
                    $db->quoteName('wa.stage_id', 'stage_id'),
                    $db->quoteName('ws.title', 'stage_title'),
                    $db->quoteName('ws.workflow_id', 'workflow_id'),
                    $db->quoteName('w.title', 'workflow_title'),
                ]
            )
            ->from($db->quoteName('#__content', 'a'))
            ->join('LEFT', $db->quoteName('#__languages', 'l'), $db->quoteName('l.lang_code') . ' = ' . $db->quoteName('a.language'))
            ->join('LEFT', $db->quoteName('#__content_frontpage', 'fp'), $db->quoteName('fp.content_id') . ' = ' . $db->quoteName('a.id'))
            ->join('LEFT', $db->quoteName('#__users', 'uc'), $db->quoteName('uc.id') . ' = ' . $db->quoteName('a.checked_out'))
            ->join('LEFT', $db->quoteName('#__viewlevels', 'ag'), $db->quoteName('ag.id') . ' = ' . $db->quoteName('a.access'))
            ->join('LEFT', $db->quoteName('#__categories', 'c'), $db->quoteName('c.id') . ' = ' . $db->quoteName('a.catid'))
            ->join('LEFT', $db->quoteName('#__categories', 'parent'), $db->quoteName('parent.id') . ' = ' . $db->quoteName('c.parent_id'))
            ->join('LEFT', $db->quoteName('#__users', 'ua'), $db->quoteName('ua.id') . ' = ' . $db->quoteName('a.created_by'))
            ->join('LEFT', $db->quoteName('#__workflow_associations', 'wa') . ' FORCE INDEX (PRIMARY)', $db->quoteName('wa.item_id') . ' = ' . $db->quoteName('a.id'))
            ->join('LEFT', $db->quoteName('#__workflow_stages', 'ws'), $db->quoteName('ws.id') . ' = ' . $db->quoteName('wa.stage_id'))
            ->join('LEFT', $db->quoteName('#__workflows', 'w'), $db->quoteName('w.id') . ' = ' . $db->quoteName('ws.workflow_id'));

        $query->info ="articles_com_content";

This affects the frontend:
I also added a cache for the subquery or I execute the subquery on its own. If the subquery is in the main query, it also slows down to about 2s. So instead of WHERE bla IN (SELECT id FROM bla), it becomes WHERE bla IN (1,2,3). The subquery seems to cause the DB to select slow indexes.

C:\xampp_8\htdocs_\components\com_content\src\Model\ArticlesModel.php

private $categoryIdsCache = [];
 public function getCategoryIds($categoryId, $levels = 1)
    {
        $cacheKey = "category_ids_{$categoryId}_{$levels}";

        // Check in-memory cache first
        if (isset($this->categoryIdsCache[$cacheKey])) {
            return $this->categoryIdsCache[$cacheKey];
        }

        // Try APCu if available
        if (function_exists('apcu_fetch')) {
            $categoryIds = apcu_fetch($cacheKey);
            if ($categoryIds !== false) {
                $this->categoryIdsCache[$cacheKey] = $categoryIds;
                return $categoryIds;
            }
        }

        // Cache miss, execute the query
        $db = Factory::getDbo();
        $query = $db->getQuery(true)
            ->select($db->quoteName('sub.id'))
            ->from($db->quoteName('#__categories', 'sub'))
            ->join(
                'INNER',
                $db->quoteName('#__categories', 'this'),
                $db->quoteName('sub.lft') . ' > ' . $db->quoteName('this.lft')
                    . ' AND ' . $db->quoteName('sub.rgt') . ' < ' . $db->quoteName('this.rgt')
            )
            ->where($db->quoteName('this.id') . ' = :categoryId');

        $query->bind(':categoryId', $categoryId, ParameterType::INTEGER);

        if ($levels >= 0) {
            $query->where($db->quoteName('sub.level') . ' <= ' . $db->quoteName('this.level') . ' + :levels');
            $query->bind(':levels', $levels, ParameterType::INTEGER);
        }

        $db->setQuery($query);
        $categoryIds = $db->loadColumn();

        // Add the main category ID to the array
        array_unshift($categoryIds, $categoryId);

        // Store in memory cache
        $this->categoryIdsCache[$cacheKey] = $categoryIds;

        // Store in APCu if available
        if (function_exists('apcu_store')) {
            apcu_store($cacheKey, $categoryIds, 3600);
        }

        return $categoryIds;
    }
    protected function getListQuery()
    {
    ....
$categoryIds  = $this->getCategoryIds($categoryId, $levels);
               


                if (empty($categoryIds)) {
                    $subQuery = $db->getQuery(true)
                        ->select($db->quoteName('sub.id'))
                        ->from($db->quoteName('#__categories', 'sub'))
                        ->join(
                            'INNER',
                            $db->quoteName('#__categories', 'this'),
                            $db->quoteName('sub.lft') . ' > ' . $db->quoteName('this.lft')
                                . ' AND ' . $db->quoteName('sub.rgt') . ' < ' . $db->quoteName('this.rgt')
                        )
                        ->where($db->quoteName('this.id') . ' = :subCategoryId');

                    $query->bind(':subCategoryId', $categoryId, ParameterType::INTEGER);

                    if ($levels >= 0) {
                        $subQuery->where($db->quoteName('sub.level') . ' <= ' . $db->quoteName('this.level') . ' + :levels');
                        $query->bind(':levels', $levels, ParameterType::INTEGER);
                    }
                    $query->bind(':categoryId', $categoryId, ParameterType::INTEGER);
                    $query->where(
                        '(' . $db->quoteName('a.catid') . $type . ':categoryId OR ' . $db->quoteName('a.catid') . ' IN (' . $subQuery . '))'
                    );
                }else{
                    $inClause = implode(',', $categoryIds);
                    $query->bind(':categoryId', $categoryId, ParameterType::INTEGER);
                    $query->where(
                        '(' . $db->quoteName('a.catid') . $type . ':categoryId OR ' . $db->quoteName('a.catid') . ' IN (' . $inClause . '))'
                    );
                }

best mike

@Hackwar
Copy link
Member

Hackwar commented Aug 6, 2024

Thank you for this extensive report from you. This is valuable information and we indeed have to do performance improvements. I'm not entirely sure if we can implement everything you did here, but I will bring this up with the team and keep this issue on my list.

While there is a tool to generate giant sites, that tool unfortunately still generates artificial sites. I would like to have a site like yours to have real life data. Would it be possible to share a copy of one of your portals for development purposes? If this would be possible, could you contact me via hannes.papenberg@community.joomls.org ?

@Fedik
Copy link
Member

Fedik commented Aug 6, 2024

If the site Authors have non super user rights, there also can be an issue with permission performance

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants