Crashes with out of memory when sites contain large video files #29

Open
scottlaird opened this issue Aug 20, 2024 · 0 comments · May be fixed by #30

It looks like hugo-algolia tries to index every file it finds, when it should probably skip large non-text files. I added a few large video files to my site, and now I can't reindex anything because the indexer runs out of memory:

$ npm run index

> index
> hugo-algolia -i 'content/posts/**'


<--- Last few GCs --->

[1092906:0x573744fd3100]    27459 ms: Mark-sweep 2102.8 (2136.5) -> 2102.7 (2105.5) MB, 4.3 / 0.0 ms  (average mu = 0.955, current mu = 0.004) last resort GC in old space requested
[1092906:0x573744fd3100]    27464 ms: Mark-sweep 2102.7 (2105.5) -> 2102.7 (2105.5) MB, 5.3 / 0.0 ms  (average mu = 0.937, current mu = 0.003) last resort GC in old space requested


<--- JS stacktrace --->

==== JS stack trace =========================================

    0: ExitFrame [pc: 0x7c31af77ecb9]
    1: StubFrame [pc: 0x7c31af7bbe79]
Security context: 0x3b208f8db9a1 <JSObject>
    2: replace [0x3b208f8ca221](this=0x22835feb3e19 <Very long string[156308060]>,0x22835feb3a09 <JSRegExp <String[#20]: \s{0,2}\[.*?\]: .*?$>>,0x018b31640731 <String[#0]: >)
    3: /* anonymous */ [0x345b8145879] [/home/scott/hugo/scottstuff/node_modules/remove-markdown/index.js:35] [bytecode=0x2e86ca337d79 offset=373](...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: 0x7c31ae8bc31c node::Abort() [/lib/x86_64-linux-gnu/libnode.so.72]
 2: 0x7c31ae7ec67c  [/lib/x86_64-linux-gnu/libnode.so.72]
 3: 0x7c31aec91e2a v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/lib/x86_64-linux-gnu/libnode.so.72]
 4: 0x7c31aec921e4 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/lib/x86_64-linux-gnu/libnode.so.72]
 5: 0x7c31aee445d9  [/lib/x86_64-linux-gnu/libnode.so.72]
 6: 0x7c31aee581df v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/lib/x86_64-linux-gnu/libnode.so.72]
 7: 0x7c31aee1cae1 v8::internal::Factory::AllocateRawWithImmortalMap(int, v8::internal::AllocationType, v8::internal::Map, v8::internal::AllocationAlignment) [/lib/x86_64-linux-gnu/libnode.so.72]
 8: 0x7c31aee25ad8 v8::internal::Factory::NewRawTwoByteString(int, v8::internal::AllocationType) [/lib/x86_64-linux-gnu/libnode.so.72]
 9: 0x7c31af0829dd v8::internal::String::SlowFlatten(v8::internal::Isolate*, v8::internal::Handle<v8::internal::ConsString>, v8::internal::AllocationType) [/lib/x86_64-linux-gnu/libnode.so.72]
10: 0x7c31af142a7b v8::internal::RegExpImpl::IrregexpExec(v8::internal::Isolate*, v8::internal::Handle<v8::internal::JSRegExp>, v8::internal::Handle<v8::internal::String>, int, v8::internal::Handle<v8::internal::RegExpMatchInfo>) [/lib/x86_64-linux-gnu/libnode.so.72]
11: 0x7c31af19eb00 v8::internal::Runtime_RegExpExec(int, unsigned long*, v8::internal::Isolate*) [/lib/x86_64-linux-gnu/libnode.so.72]
12: 0x7c31af77ecb9  [/lib/x86_64-linux-gnu/libnode.so.72]
Aborted (core dumped)

I can try raising Node's heap limit (obviously not the best workaround, but easy), but that isn't enough either. The run then fails on a string-length limit instead:

$ NODE_OPTIONS=--max_old_space_size=8000 npm run index

> index
> hugo-algolia -i 'content/posts/**'

buffer.js:605
    slice: (buf, start, end) => buf.utf8Slice(start, end),
                                    ^

Error: Cannot create a string longer than 0x3fffffe7 characters
    at Object.slice (buffer.js:605:37)
    at Buffer.toString (buffer.js:802:14)
    at Object.readFileSync (fs.js:408:41)
    at Function.matter.read (/home/scott/hugo/scottstuff/node_modules/gray-matter/index.js:161:16)
    at HugoAlgolia.HugoAlgolia.readFile (/home/scott/hugo/scottstuff/node_modules/hugo-algolia/lib/index.js:278:25)
    at HugoAlgolia.HugoAlgolia.readDirectory (/home/scott/hugo/scottstuff/node_modules/hugo-algolia/lib/index.js:218:14)
    at HugoAlgolia.HugoAlgolia.index (/home/scott/hugo/scottstuff/node_modules/hugo-algolia/lib/index.js:102:10)
    at Object.<anonymous> (/home/scott/hugo/scottstuff/node_modules/hugo-algolia/bin/index.js:23:26)
    at Module._compile (internal/modules/cjs/loader.js:999:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1027:10) {
  code: 'ERR_STRING_TOO_LONG'
}
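The ERR_STRING_TOO_LONG failure does not depend on heap size: Node exposes its maximum string length as `buffer.constants.MAX_STRING_LENGTH`, and reading a file as a string fails once the decoded text would exceed it. A rough sketch of a pre-check follows. The helper name `couldOverflowString` is hypothetical, and comparing the byte size against the limit is only a conservative approximation, on the assumption that decoded UTF-8 text has no more characters than the file has bytes.

```javascript
const { constants } = require("buffer");

// Hypothetical helper: true when a file of this many bytes might not fit
// in a single JavaScript string on the running Node version.
// Assumption: UTF-8 decoding yields at most one character per input byte,
// so the byte size is a conservative upper bound on the string length.
function couldOverflowString(sizeBytes) {
  return sizeBytes > constants.MAX_STRING_LENGTH;
}

// The limit varies by Node/V8 version, so print it rather than hard-code it.
console.log("max string length:", "0x" + constants.MAX_STRING_LENGTH.toString(16));
console.log("2 GiB file at risk:", couldOverflowString(2 * 1024 ** 3));
```

A check like this only avoids the hard string limit. Files below it can still exhaust the heap during the regex passes, as in the first trace.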

There should really be an option to skip any file larger than a configurable size, maybe 1 MB or so by default.
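A minimal sketch of what such a size-based skip could look like, not the actual hugo-algolia code or the change in the linked pull request. The names `filterBySize` and `DEFAULT_MAX_BYTES` are made up for illustration, and the 1 MiB default just follows the suggestion above. It uses `fs.statSync`, which reads only file metadata, so oversized files are never loaded into memory.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical default limit of 1 MiB, following the suggestion above.
const DEFAULT_MAX_BYTES = 1024 * 1024;

// Keep only regular files whose on-disk size is at most maxBytes.
// statSync inspects metadata only; file contents are not read here.
function filterBySize(filePaths, maxBytes = DEFAULT_MAX_BYTES) {
  return filePaths.filter((filePath) => {
    const stats = fs.statSync(filePath);
    return stats.isFile() && stats.size <= maxBytes;
  });
}

// Usage: create one small post and one 2 MiB stand-in for a video file.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "size-filter-"));
const small = path.join(dir, "post.md");
const large = path.join(dir, "video.mp4");
fs.writeFileSync(small, "---\ntitle: Hello\n---\nBody text\n");
fs.writeFileSync(large, Buffer.alloc(2 * 1024 * 1024));

const kept = filterBySize([small, large]);
console.log(kept.map((p) => path.basename(p)));
```

Filtering before any read keeps the indexer's memory use bounded by the limit instead of by the largest file on the site.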

scottlaird added a commit to scottlaird/hugo-algolia that referenced this issue Aug 21, 2024
This should fix replicatedhq#29, which causes hugo-algolia to crash when indexing sites with large files, such as video content.

Signed-off-by: Scott Laird <scott@sigkill.org>
@scottlaird scottlaird linked a pull request Aug 21, 2024 that will close this issue