Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Large Page Support for Code Issue: 16198 #21064

Closed
Closed
Show file tree
Hide file tree
Changes from 24 commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
0a76c7e
Initial Large Page (2M) support for node.js
suresh-srinivas Mar 23, 2018
fa0e324
Add support for checking if explicitHugePages and transparentHugePage…
suresh-srinivas Mar 26, 2018
fdb45cd
Added License headers
suresh-srinivas Mar 30, 2018
b7791c2
Get rid of waring messages & code cleanup
suresh-srinivas Mar 30, 2018
6631eea
Protect the large pages under #ifdef NODE_ENABLE_LARGE_CODE_PAGES and…
suresh-srinivas Apr 4, 2018
843089c
Added configure option to enable huge pages
uttampawar Apr 9, 2018
077bc01
Added else clause to set node_use_large_pages=false by default
uttampawar Apr 9, 2018
7bdd9fc
Merge branch 'configure_fixes' into 2M-Pages-For-Code-16198
uttampawar Apr 9, 2018
b91de5a
Finished adding checks at appropriate places to handle possible error…
uttampawar May 8, 2018
7ef956a
Changed return type for two function from void to int
uttampawar May 9, 2018
b02e7a9
Fixed lint errors.
uttampawar May 24, 2018
2af82b1
Added one more condition check for verify the start address of newly …
uttampawar May 26, 2018
cae9285
Fixed syntax error due to wring data type. int64 to int64_t
uttampawar May 30, 2018
30114b6
Removed explictHugePages code
suresh-srinivas May 30, 2018
39e1f0d
Merge branch 'add-checks' into 2M-Pages-For-Code-16198
suresh-srinivas May 30, 2018
49cd0de
Merge branch '2M-Pages-For-Code-16198' of https://github.intel.com/DS…
suresh-srinivas May 30, 2018
2dd9e8c
Added Large Page Support
suresh-srinivas May 30, 2018
51d0f02
Merge branch 'intel-large-pages' of https://github.com/suresh-sriniva…
suresh-srinivas May 31, 2018
2f672ee
Update PR based on feedback
suresh-srinivas Jun 1, 2018
31504cc
Addressing the additional PR feedback
suresh-srinivas Jun 3, 2018
f998c58
Fix the gypi style issue
suresh-srinivas Jun 3, 2018
9f15cfc
Update with stylistic changes (eg char* instead of char *, IsLargePag…
suresh-srinivas Jun 4, 2018
d6de361
Add additional guard so large pages is only under Linux and target_ar…
suresh-srinivas Jun 4, 2018
29c7d13
Style fixes according to the Node C++ Style Guide
suresh-srinivas Jun 4, 2018
9828036
Eliminate ld.script and use implicit script
suresh-srinivas Jun 7, 2018
600cf54
Add #if defined(NODE_WANT_INTERNALS) && NODE_WANT_INTERNALS
suresh-srinivas Jun 7, 2018
bf259e2
Fix the test failures
suresh-srinivas Jun 7, 2018
f0a6dcb
Add detailed help message to configure --with-largepages
suresh-srinivas Jun 8, 2018
4610793
Fix style issues and created inline functions instead of macros for a…
suresh-srinivas Jun 12, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions configure
Original file line number Diff line number Diff line change
Expand Up @@ -392,6 +392,11 @@ parser.add_option('--with-etw',
dest='with_etw',
help='build with ETW (default is true on Windows)')

parser.add_option('--use-largepages',
action='store_true',
dest='node_use_large_pages',
help='build with Large Pages support (enabled only for Linux).')

intl_optgroup.add_option('--with-intl',
action='store',
dest='with_intl',
Expand Down Expand Up @@ -936,6 +941,9 @@ def configure_node(o):
else:
o['variables']['node_use_dtrace'] = 'false'

use_large_pages = (flavor == 'linux' and options.node_use_large_pages)
o['variables']['node_use_large_pages'] = b(use_large_pages)

if options.no_ifaddrs:
o['defines'] += ['SUNOS_NO_IFADDRS']

Expand Down
233 changes: 233 additions & 0 deletions ld.script
Original file line number Diff line number Diff line change
@@ -0,0 +1,233 @@
/* Script for -z combreloc: combine and sort reloc sections */
/* Copyright (C) 2014-2015 Free Software Foundation, Inc.
Copying and distribution of this script, with or without modification,
are permitted in any medium without royalty provided the copyright
notice and this notice are preserved. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

too low level for me to review, how do we make sure its correctness? are there documentations around this?

Copy link
Contributor Author

@suresh-srinivas suresh-srinivas Jun 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were 3 line modification to the default ld.script to create a variable name(s) for the start of actual node text segment.

 PROVIDE (__nodetext = .);
 PROVIDE (_nodetext = .);
 PROVIDE (nodetext = .);

OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64",
"elf64-x86-64")
OUTPUT_ARCH(i386:x86-64)
ENTRY(_start)
SEARCH_DIR("=/usr/local/lib/x86_64-linux-gnu"); SEARCH_DIR("=/lib/x86_64-linux-gnu"); SEARCH_DIR("=/usr/lib/x86_64-linux-gnu"); SEARCH_DIR("=/usr/local/lib64"); SEARCH_DIR("=/lib64"); SEARCH_DIR("=/usr/lib64"); SEARCH_DIR("=/usr/local/lib"); SEARCH_DIR("=/lib"); SEARCH_DIR("=/usr/lib"); SEARCH_DIR("=/usr/x86_64-linux-gnu/lib64"); SEARCH_DIR("=/usr/x86_64-linux-gnu/lib");
SECTIONS
{
/* Read-only sections, merged into text segment: */
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
.interp : { *(.interp) }
.note.gnu.build-id : { *(.note.gnu.build-id) }
.hash : { *(.hash) }
.gnu.hash : { *(.gnu.hash) }
.dynsym : { *(.dynsym) }
.dynstr : { *(.dynstr) }
.gnu.version : { *(.gnu.version) }
.gnu.version_d : { *(.gnu.version_d) }
.gnu.version_r : { *(.gnu.version_r) }
.rela.dyn :
{
*(.rela.init)
*(.rela.text .rela.text.* .rela.gnu.linkonce.t.*)
*(.rela.fini)
*(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*)
*(.rela.data .rela.data.* .rela.gnu.linkonce.d.*)
*(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*)
*(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*)
*(.rela.ctors)
*(.rela.dtors)
*(.rela.got)
*(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*)
*(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*)
*(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*)
*(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*)
*(.rela.ifunc)
}
.rela.plt :
{
*(.rela.plt)
PROVIDE_HIDDEN (__rela_iplt_start = .);
*(.rela.iplt)
PROVIDE_HIDDEN (__rela_iplt_end = .);
}
.init :
{
KEEP (*(SORT_NONE(.init)))
}
.plt : { *(.plt) *(.iplt) }
.plt.got : { *(.plt.got) }
.plt.bnd : { *(.plt.bnd) }
PROVIDE (__nodetext = .);
PROVIDE (_nodetext = .);
PROVIDE (nodetext = .);
.text :
{
*(.text.unlikely .text.*_unlikely .text.unlikely.*)
*(.text.exit .text.exit.*)
*(.text.startup .text.startup.*)
*(.text.hot .text.hot.*)
*(.text .stub .text.* .gnu.linkonce.t.*)
/* .gnu.warning sections are handled specially by elf32.em. */
*(.gnu.warning)
}
.fini :
{
KEEP (*(SORT_NONE(.fini)))
}
PROVIDE (__etext = .);
PROVIDE (_etext = .);
PROVIDE (etext = .);
.rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) }
.rodata1 : { *(.rodata1) }
.eh_frame_hdr : { *(.eh_frame_hdr) *(.eh_frame_entry .eh_frame_entry.*) }
.eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) *(.eh_frame.*) }
.gcc_except_table : ONLY_IF_RO { *(.gcc_except_table
.gcc_except_table.*) }
.gnu_extab : ONLY_IF_RO { *(.gnu_extab*) }
/* These sections are generated by the Sun/Oracle C++ compiler. */
.exception_ranges : ONLY_IF_RO { *(.exception_ranges
.exception_ranges*) }
/* Adjust the address for the data segment. We want to adjust up to
the same address within the page on the next page up. */
. = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE));
/* Exception handling */
.eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) *(.eh_frame.*) }
.gnu_extab : ONLY_IF_RW { *(.gnu_extab) }
.gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) }
.exception_ranges : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) }
/* Thread Local Storage sections */
.tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) }
.tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) }
.preinit_array :
{
PROVIDE_HIDDEN (__preinit_array_start = .);
KEEP (*(.preinit_array))
PROVIDE_HIDDEN (__preinit_array_end = .);
}
.init_array :
{
PROVIDE_HIDDEN (__init_array_start = .);
KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*)))
KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .ctors))
PROVIDE_HIDDEN (__init_array_end = .);
}
.fini_array :
{
PROVIDE_HIDDEN (__fini_array_start = .);
KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*)))
KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .dtors))
PROVIDE_HIDDEN (__fini_array_end = .);
}
.ctors :
{
/* gcc uses crtbegin.o to find the start of
the constructors, so we make sure it is
first. Because this is a wildcard, it
doesn't matter if the user does not
actually link against crtbegin.o; the
linker won't look for a file to match a
wildcard. The wildcard also means that it
doesn't matter which directory crtbegin.o
is in. */
KEEP (*crtbegin.o(.ctors))
KEEP (*crtbegin?.o(.ctors))
/* We don't want to include the .ctor section from
the crtend.o file until after the sorted ctors.
The .ctor section from the crtend file contains the
end of ctors marker and it must be last */
KEEP (*(EXCLUDE_FILE (*crtend.o *crtend?.o ) .ctors))
KEEP (*(SORT(.ctors.*)))
KEEP (*(.ctors))
}
.dtors :
{
KEEP (*crtbegin.o(.dtors))
KEEP (*crtbegin?.o(.dtors))
KEEP (*(EXCLUDE_FILE (*crtend.o *crtend?.o ) .dtors))
KEEP (*(SORT(.dtors.*)))
KEEP (*(.dtors))
}
.jcr : { KEEP (*(.jcr)) }
.data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) }
.dynamic : { *(.dynamic) }
.got : { *(.got) *(.igot) }
. = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .);
.got.plt : { *(.got.plt) *(.igot.plt) }
.data :
{
*(.data .data.* .gnu.linkonce.d.*)
SORT(CONSTRUCTORS)
}
.data1 : { *(.data1) }
_edata = .; PROVIDE (edata = .);
. = .;
__bss_start = .;
.bss :
{
*(.dynbss)
*(.bss .bss.* .gnu.linkonce.b.*)
*(COMMON)
/* Align here to ensure that the .bss section occupies space up to
_end. Align after .bss to ensure correct alignment even if the
.bss section disappears because there are no input sections.
FIXME: Why do we need it? When there is no .bss section, we don't
pad the .data section. */
. = ALIGN(. != 0 ? 64 / 8 : 1);
}
.lbss :
{
*(.dynlbss)
*(.lbss .lbss.* .gnu.linkonce.lb.*)
*(LARGE_COMMON)
}
. = ALIGN(64 / 8);
. = SEGMENT_START("ldata-segment", .);
.lrodata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) :
{
*(.lrodata .lrodata.* .gnu.linkonce.lr.*)
}
.ldata ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) :
{
*(.ldata .ldata.* .gnu.linkonce.l.*)
. = ALIGN(. != 0 ? 64 / 8 : 1);
}
. = ALIGN(64 / 8);
_end = .; PROVIDE (end = .);
. = DATA_SEGMENT_END (.);
/* Stabs debugging sections. */
.stab 0 : { *(.stab) }
.stabstr 0 : { *(.stabstr) }
.stab.excl 0 : { *(.stab.excl) }
.stab.exclstr 0 : { *(.stab.exclstr) }
.stab.index 0 : { *(.stab.index) }
.stab.indexstr 0 : { *(.stab.indexstr) }
.comment 0 : { *(.comment) }
/* DWARF debug sections.
Symbols in the DWARF debugging sections are relative to the beginning
of the section so we begin them at 0. */
/* DWARF 1 */
.debug 0 : { *(.debug) }
.line 0 : { *(.line) }
/* GNU DWARF 1 extensions */
.debug_srcinfo 0 : { *(.debug_srcinfo) }
.debug_sfnames 0 : { *(.debug_sfnames) }
/* DWARF 1.1 and DWARF 2 */
.debug_aranges 0 : { *(.debug_aranges) }
.debug_pubnames 0 : { *(.debug_pubnames) }
/* DWARF 2 */
.debug_info 0 : { *(.debug_info .gnu.linkonce.wi.*) }
.debug_abbrev 0 : { *(.debug_abbrev) }
.debug_line 0 : { *(.debug_line .debug_line.* .debug_line_end ) }
.debug_frame 0 : { *(.debug_frame) }
.debug_str 0 : { *(.debug_str) }
.debug_loc 0 : { *(.debug_loc) }
.debug_macinfo 0 : { *(.debug_macinfo) }
/* SGI/MIPS DWARF 2 extensions */
.debug_weaknames 0 : { *(.debug_weaknames) }
.debug_funcnames 0 : { *(.debug_funcnames) }
.debug_typenames 0 : { *(.debug_typenames) }
.debug_varnames 0 : { *(.debug_varnames) }
/* DWARF 3 */
.debug_pubtypes 0 : { *(.debug_pubtypes) }
.debug_ranges 0 : { *(.debug_ranges) }
/* DWARF Extension. */
.debug_macro 0 : { *(.debug_macro) }
.gnu.attributes 0 : { KEEP (*(.gnu.attributes)) }
/DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) }
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file can probably be reduced to ~20 lines. It's pretty well hidden in all the definitions but I infer you really only care about the nodetext symbols.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont see how it can be reduced to 20 lines. Much of it is what is used by the default linker script. Yes the only reason for me to modify it was to find the start of the actual text. I had trouble with moving the plt regions and wanted to avoid several of those regions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://sourceware.org/binutils/docs/ld/Implicit-Linker-Scripts.html#Implicit-Linker-Scripts - instead of replacing the default linker script, you use an implicit linker script that works in addition to the default linker script.

Replacing the default linker script isn't an option, IMO, because that will be a never-ending maintenance hassle for us.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implicit linker scripts wont really help as I cant do a symbol assignment for something that is already specified in the linker script. I will investigate if there is some other approach to get the address of the .text segment.

Copy link
Contributor Author

@suresh-srinivas suresh-srinivas Jun 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am able eliminate the ld.script by using the LD option -Wl,-Ttext=<address to place text segment>. This places the text segment at a specific address which I can then use as the start for the 2M mapping. This will hard code the text starting address.

The other way is to read the elf file and find the .text segment and determine the address.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A fixed address is bad for security reasons and reading it from the ELF header won't work when ASLR is enabled.

I guess you could compute the ASLR slide at run-time but that gets complicated fast when code and data is spread across several segments.

Another issue with that approach: two segments with different protection might be located within < 2 MB from each other.

Are you sure implicit linker scripts don't work? I remember using that to put symbols at fixed locations.

What about retrieving the path to the default linker script with ld --verbose and passing -T /path/to/default/script.ld -T custom.ld?

Copy link
Contributor Author

@suresh-srinivas suresh-srinivas Jun 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Let me look into the implicit linker script some more. Another option is to generate the linker script using the ld --verbose and to then add my symbol to it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to figure this out. GNU ld has the INSERT option which makes the externally supported script not override the default script but insert the code into the default. So I will delete the ld.script and insert a new ld.implicit.script

PROVIDE (_nodetext = .);
PROVIDE (nodetext = .);
INSERT BEFORE .text;

13 changes: 13 additions & 0 deletions node.gyp
Original file line number Diff line number Diff line change
Expand Up @@ -594,6 +594,19 @@
'src/tls_wrap.h'
],
}],
[ 'node_use_large_pages=="true"', {
'defines': [ 'NODE_ENABLE_LARGE_CODE_PAGES=1' ],
# The current implementation of Large Pages is under Linux.
# Other implementations are possible but not currently supported.
#
'conditions': [
[ 'OS=="linux"', {
'sources': [
'src/node_large_page.cc'
],
}],
]
} ],
],
},
{
Expand Down
13 changes: 12 additions & 1 deletion node.gypi
Original file line number Diff line number Diff line change
Expand Up @@ -188,7 +188,18 @@
},
},
'conditions': [
['OS!="aix" and node_shared=="false"', {
['OS=="linux" and node_shared=="false" \
and target_arch=="x64" \
and node_use_large_pages=="true"', {
'ldflags': [
'-Wl,-T <(PRODUCT_DIR)/../../ld.script',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested this on on other platforms than x86_64 Linux?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No only on x86_64 Linux, specifically Ubuntu 16.04 is what I have done most of my measurement and testing on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I figured as much. This needs an additional guard to ensure it's only activated when OS=="linux" and target_arch="x64".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks fixed it.

'-Wl,--whole-archive,<(obj_dir)/deps/uv/<(STATIC_LIB_PREFIX)'
'uv<(STATIC_LIB_SUFFIX)',
'-Wl,--no-whole-archive',
]
}],
['OS!="aix" and node_shared=="false" \
and node_use_large_pages=="false"', {
'ldflags': [
'-Wl,--whole-archive,<(obj_dir)/deps/uv/<(STATIC_LIB_PREFIX)'
'uv<(STATIC_LIB_SUFFIX)',
Expand Down
10 changes: 10 additions & 0 deletions src/node.cc
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,7 @@
#ifdef NODE_ENABLE_VTUNE_PROFILING
#include "../deps/v8/src/third_party/vtune/v8-vtune.h"
#endif
#include "node_large_page.h"

#include <errno.h>
#include <fcntl.h> // _O_RDWR
Expand Down Expand Up @@ -4377,6 +4378,15 @@ int Start(int argc, char** argv) {

CHECK_GT(argc, 0);

#ifdef NODE_ENABLE_LARGE_CODE_PAGES
if (node::IsLargePagesEnabled()) {
if ((node::MapStaticCodeToLargePages()) != 0) {
fprintf(stderr, "Reverting to default page size\n");
}
}
#endif


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason why we are doing it here, as opposed to the very beginning of everything, in main?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular reason. This was a place where other things were initialized like VTune support.

// Hack around with the argv pointer. Used for process.title = "blah".
argv = uv_setup_args(argc, argv);

Expand Down
Loading