Database insertion error due to invalid character (statistics / keywords) #4072

Open
tobiasorth opened this issue May 17, 2024 · 1 comment

tobiasorth commented May 17, 2024

Keywords for the internal statistics are truncated to 128 bytes using substr:

$keywords = substr($keywords, 0, 128);

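However, this is not safe for multibyte or mixed-byte strings, because substr counts bytes rather than characters. In some cases the cut falls in the middle of the last character in the $keywords variable, leaving an invalid byte sequence and resulting in an insertion error in the database: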
Incorrect string value: '\xE8' for column `xyz`.`tx_solr_statistics`.`keywords` at row 1

Replacing $keywords = substr($keywords, 0, 128); with $keywords = mb_substr($keywords, 0, 128, "utf-8"); resolves this issue.

The following search phrase triggers the error:
欲に弱風イどた模最タネシル)少ルキヘ社影供セス時69九ヱネ転根クルニス割員ク薦推テコケ転楽フ申任化そ能愛ほさぎ集検ち
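
The difference between the two calls can be reproduced outside the extension with a minimal sketch. The input below is a hypothetical string (127 ASCII characters followed by one 3-byte UTF-8 character), not the search phrase above, and the snippet assumes the mbstring extension is loaded:

```php
<?php
// Hypothetical input: 127 single-byte characters plus '転' (bytes E8 BB A2 in UTF-8).
$keywords = str_repeat('a', 127) . '転';

// Byte-based cut: keeps 127 'a' bytes plus only the lead byte 0xE8 of '転'.
$byteCut = substr($keywords, 0, 128);

// Character-based cut: keeps up to 128 whole characters.
$charCut = mb_substr($keywords, 0, 128, 'UTF-8');

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false)
var_dump(bin2hex(substr($byteCut, -1)));        // string(2) "e8"
var_dump(mb_check_encoding($charCut, 'UTF-8')); // bool(true)
var_dump(mb_strlen($charCut, 'UTF-8'));         // int(128)
```

Note that mb_substr limits the number of characters, so the result here is 130 bytes long. If the column limit were defined in bytes rather than characters (an assumption, the schema is not shown in this issue), mb_strcut might be the closer fit, since it cuts by byte length without splitting a character.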


Maintainers notes:

Target Versions:

  • release-11.5.x
  • release-12.0.x
  • main
@dkd-kaehm
Collaborator

@tobiasorth
Thanks for reporting.
Please submit your suggestion as a pull request.
