-
Notifications
You must be signed in to change notification settings - Fork 0
/
ch01.xml
55 lines (55 loc) · 10.2 KB
/
ch01.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<chapter id="ch01">
<title>Introduction</title>
<para><indexterm><primary>Introduction</primary></indexterm>This document is a guide to using IBM's <ulink url="https://www.ibm.com/developerworks/learn/security/index.html">POWER8 in-core cryptography</ulink>. The purpose of the book is to document in-core cryptography more completely for developers and quality assurance personnel who wish to take advantage of the features.</para>
<para>POWER8 in-core cryptography includes CPU instructions to accelerate AES, SHA-256, SHA-512 and polynomial multiplication. Additionally POWER9 adds a random number generator called DARN. This document includes treatments of AES, SHA-256, SHA-512 and DARN. It does not include a discussion of polynomial multiplication at the moment, but the chapter is stubbed-out (and waiting for a contributor).</para>
<para>The POWER8 extensions for in-core cryptography find their ancestry in the Altivec SIMD coprocessor. The POWER8 vector unit includes Vector-Scalar Extensions (VSX) and the instruction set for in-core cryptography is a part of it. You can find additional information on VSX in Chapter 7 of the <ulink url="https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0">IBM Power ISA Version 3.0B</ulink> at the OpenPOWER Foundation website<!-- at IBM's website at <ulink url="https://www.ibm.com/developerworks/">TODO</ulink -->.</para>
<section id="intro_arch">
<title>Architecture</title>
<para>There are two POWER architectures that you will encounter as you are working on your implementation. The first is POWER7, and it is governed by ISA 2.06B documents. The second is POWER8, and it is governed by ISA 2.07B documents.</para>
<para>In-core cryptography requires POWER8 and ISA 2.07B support. POWER8 is the ISA that provides the instructions for AES, SHA and polynomial multiplication. POWER7 provides other useful instructions, like unaligned loads and stores. If you are working with POWER8, then you have everything in POWER7 and earlier.</para>
<para>Note that if you are developing for little-endian Linux on PowerPC, then POWER7 is not available. POWER8 and ISA 2.07B support is the minimum requirement for little-endian systems.</para>
</section>
<section id="intro_openpower">
<title>OpenPOWER Foundation</title>
<para>The <ulink url="https://openpowerfoundation.org">OpenPOWER Foundation</ulink> is an open technical community based on the POWER architecture. Its mission is to create an open ecosystem, using the POWER architecture to share expertise, investment, and server-class intellectual property to serve the evolving needs of customers and industry.</para>
<para>The Foundation publishes many technical documents for the POWER architectures in its <ulink url="https://openpowerfoundation.org/technical/resource-catalog/">Resource Catalog</ulink>, including POWER8 and ISA 2.07B. Change flags in the document show the content added relative to POWER7 and ISA 2.06B, which predates the OpenPOWER Foundation.</para>
</section>
<section id="intro_compilers">
<title>Compilers</title>
<para><indexterm><primary>GCC</primary></indexterm><indexterm><primary>XLC</primary></indexterm><indexterm><primary>Clang</primary></indexterm><indexterm><primary>Compiler</primary><secondary>GCC</secondary></indexterm><indexterm><primary>Compiler</primary><secondary>XLC</secondary></indexterm><indexterm><primary>Compiler</primary><secondary>Clang</secondary></indexterm>The book does not discriminate compilers. All the samples will compile with both GCC and IBM XL C/C++. XL C/C++ is IBM's flagship compiler, and it is referred to as XLC on occasion.</para>
<para>The samples may compile with LLVM's Clang but it was not tested. The compile farm does not have Clang installed so we could not test it. We would like to see how well Clang performs when compared to GCC and XLC. If you encounter a problem using Clang then please report it.</para>
<para>The compiler you use can make a measurable difference on you program. For example, you will probably obtain different benchmark results using GCC and XLC. You will even obtain different benchmark results among versions of the same compiler. For example, GCC 7.2 is generally faster than GCC 4.8.5, and both SHA-256 and SHA-512 builtin implementations will speed up by about 2 cycles per byte (cpb) using GCC 7.</para>
<para>Compilers are discussed in more detail at <xref linkend="gcc_compiler"/>, <xref linkend="xlc_compiler"/> and <xref linkend="clang_compiler"/>.</para>
</section>
<section id="intro_src_code">
<title>Source code</title>
<para><indexterm><primary>Source code</primary></indexterm>The source code in the book is a mix of C and C++. The SHA-256 and SHA-512 samples were written in C++ to avoid compile errors due to the SHA API requiring 4-bit literal constants. We could not pass parameters through functions and obtain the necessary <systemitem>constexpr-ness</systemitem> so template parameters were used instead.</para>
<para>There is no source code library to download <emphasis>per se</emphasis>. The code is taken from Botan, Crypto++ and OpenSSL free software projects. Some code is taken from Andy Polyakov and Cryptogams. Some code is taken from GitHub projects. And some code was written and thrown away after testing.</para>
<para>The code for AES and SHA was collected and placed at <ulink url="https://github.com/noloader/AES-Intrinsics">AES Intrinsics</ulink> and <ulink url="https://github.com/noloader/SHA-Intrinsics">SHA Intrinsics</ulink> GitHubs but the code is not in library format. Rather they are stand alone proof of concepts.</para>
</section>
<section id="intro_cfarm" xreflabel="Compile Farm">
<title>Compile Farm</title>
<para><indexterm><primary>Compile farm</primary></indexterm><indexterm><primary>Compile farm</primary><secondary>gcc112</secondary></indexterm><indexterm><primary>Compile farm</primary><secondary>gcc119</secondary></indexterm><indexterm><primary>Compile farm</primary><secondary>gcc135</secondary></indexterm>The GCC Compile Farm (Cfarm) is a collection of hardware available to free and open source software developers run by the GNU organization. The farm provides hardware for Intel x86, Aarch64, MIPS, SPARC and PowerPC.</para>
<para>The Compile Farm offers at least seven PowerPC machines. <systemitem class="domainname">gcc112</systemitem> and <systemitem class="domainname">gcc119</systemitem> are the POWER8 iron. <systemitem class="domainname">gcc135</systemitem> is the POWER9 iron. The other machines are POWER7 hardware. <systemitem>gcc112</systemitem> and <systemitem>gcc135</systemitem> are a Linux PowerPC, 64-bit, little-endian machine (ppc64-le), and <systemitem>gcc119</systemitem> is an AIX PowerPC, 64-bit, big-endian machine (ppc64-be).</para>
<para>Both POWER8 machines are IBM POWER System S822 with two CPU cards. <systemitem>gcc112</systemitem> has 160 logical CPUs and runs at 3.4 GHz. <systemitem>gcc119</systemitem> has 64 logical CPUs and runs at 4.1 GHz. At 4.1 GHz and 192 GB of RAM <systemitem>gcc119</systemitem> is probably one of the fastest machine you will work on.</para>
<para>If you are a free and open software developer then you are eligible for a free <ulink url="https://cfarm.tetaneutral.net/">GCC Compile Farm</ulink> account. Access is provided through SSH.</para>
<para>If you work on machines in the Compile Farm then be mindful of the default GCC compiler. It is probably GCC 4.8.5, and you usually get better code generation and performance with GCC 7.2 located at <systemitem>/opt/cfarm/gcc-latest</systemitem>.</para>
</section>
<section id="intro_contribute" xreflabel="Contributing">
<title>Contributing</title>
<para><indexterm><primary>Contributing</primary></indexterm>This book is free software. If you see an opportunity for improvement, an error or an omission then please submit a pull request or open a bug report.</para>
</section>
<section id="intro_organization" xreflabel="Book Organization">
<title>Organization</title>
<para><indexterm><primary>Administrivia</primary></indexterm>The book proceeds in eight parts. First, administrivia is discussed, like how to determine machine endianness and how to load and store a vector from memory. A full treatment of vector programming is its own book, but the discussion should be adequate to move on to the more interesting tasks.</para>
<para><indexterm><primary>Feature detection</primary></indexterm>Second, runtime feature detections is discussed for AIX and Linux. Runtime detection allows you to switch to a faster implementation at runtime when the hardware provides the support.</para>
<para><indexterm><primary>AES</primary></indexterm>Third, AES is discussed. AES is specified in <ulink url="https://nvlpubs.nist.gov/nistpubs/fips/nist.fips.197.pdf">FIPS 197, Advanced Encryption Standard (AES)</ulink>. You should read the standard if you are not familiar with the block cipher.</para>
<para><indexterm><primary>SHA</primary></indexterm>Fourth, SHA is discussed. SHA is specified in <ulink url="https://nvlpubs.nist.gov/nistpubs/fips/nist.fips.180-4.pdf">FIPS 180-4, Secure Hash Standard (SHS)</ulink>. You should read the standard if you are not familiar with the hash.</para>
<para><indexterm><primary>Polynomial multiplication</primary></indexterm>Fifth, polynomial multiplication is discussed. Polynomial multiplications is important for CRC-32, CRC-32C and GCM mode of operation for AES.</para>
<para>Sixth, performance is discussed. The implementations are compared against C and C++ routines and assembly language routines from OpenSSL. The OpenSSL routines are high quality and written by Andy Polyakov.</para>
<para><indexterm><primary>Assembly language</primary></indexterm>Seventh, assembly language integration is discussed. Andy Polyakov dual licenses his cryptographic implementations and you can use his routines once you know how to integrate them.</para>
<para><indexterm><primary>Performance</primary></indexterm>Finally, performance and benchmarking is discussed. C/C++, C++ using builtins and assembly language routines are benchmarked using GCC.</para>
</section>
</chapter>