[AVR] Cherry-pick 17 upstream AVR backend fixes into the Rust LLVM fork #66

A number of multiplication instructions (muls, mulsu, fmul, fmuls, fmulsu) had the wrong register class for an operand. This resulted in the wrong register being used for the instruction. Example: target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8" target triple = "avr-atmel-none" define i16 @sliceAppend(i16, i16, i16, i16, i16, i16) addrspace(1) { %d = mul i16 %0, %5 ret i16 %d } The first instruction would be muls r24, r31 before this patch. The r31 should have been r15 if you look at the intermediate forms during instruction selection / register allocation, but the generated instruction uses r31. After this patch, an extra movw is inserted to get %5 in range for muls. To make sure this bug is fixed everywhere, I checked all instructions and found that most multiplication instructions suffered from this bug, which I have fixed with this patch. No other instructions appear to be affected. Differential Revision: https://reviews.llvm.org/D74281

Adjusting by 2 breaks DWARF output. With this fix, programs start to compile and produce valid DWARF output. Differential Revision: https://reviews.llvm.org/D74213

Summary: LDRdPtr expanded from LDWRdPtr shouldn't define its second operand(SrcReg). The second operand is its source register. Add -verify-machineinstrs into command line of testcases can trigger this error. Reviewers: dylanmckay Reviewed By: dylanmckay Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75437

Found by the LLVM MemorySanitizer tests when switching AVR to a default backend. ELFArch must be initialized before the call to initializeSubtargetDependencies(). The uninitialized read would occur deep within TableGen'd code.

…target The initialization order was not correct. These bugs were discovered by valgrind. They appear to work fine in practice but this patch should unblock switching the AVR backend on by default as now a standard AVR llc invocation runs without memory errors. The AVRISelLowering constructor would run before the subtarget boolean fields were initialized to false. Now, the initialization order is correct.

In the past, AVR functions were only lowered with interrupt-specific machine code if the function was defined with the "avr-interrupt" or "avr-signal" calling conventions. This patch modifies the backend so that if the function does not have a special calling convention, but does have an "interrupt" attribute, that function is interpreted as a function with interrupts. This also extracts the "is this function an interrupt" logic from several disparate places in the backend into one AVRMachineFunctionInfo attribute. Bug found by Wilhelm Meier.

The avr-libc provides *divmodqi4, *divmodhi4, and *divmodsi4 functions, but does not provide a *divmoddi4. Instead it provides regular *divdi3 and *moddi3 functions. Note that avr-libc doesn't support *divti3 or *modti3 for 128-bit integer manipulation. Source: https://github.com/gcc-mirror/gcc/blob/releases/gcc-5.4.0/libgcc/config/avr/lib1funcs.S Differential Revision: https://reviews.llvm.org/D78437

Previously, the AVR backend would put functions in .progmem.data. This is probably a regression from when functions still lived in address space 0. With this change, only global constants are placed in .progmem.data. This is not complete: avr-gcc additionally respects -fdata-sections for progmem global constants, which LLVM doesn't yet do. But fixing that is a bit more complicated (and I believe other backends such as RISC-V might also have similar issues). Differential Revision: https://reviews.llvm.org/D78212

Summary: On XMEGA, I/O address space is same as data address space - there is no 0x20 offset, because CPU General Purpose Registers are not mapped in data address space. From https://en.wikipedia.org/wiki/AVR_microcontrollers > In the XMEGA variant, the working register file is not mapped into the data address space; as such, it is not possible to treat any of the XMEGA's working registers as though they were SRAM. Instead, the I/O registers are mapped into the data address space starting at the very beginning of the address space. Reviewers: dylanmckay Reviewed By: dylanmckay Subscribers: hiraditya, Jim, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77207 Patch by Vlastimil Labsky.

@LemonBoy

Summary: Handle the emission of `R_AVR_6` ELF relocation type. Reviewers: dylanmckay Reviewed By: dylanmckay Subscribers: hiraditya, Jim, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78721 Patch by @LemonBoy https://reviews.llvm.org/p/LemonBoy/

This patch fixes a bug in stack save/restore code. Because the frame pointer was saved/restored manually (not by marking it as clobbered) the StackSize variable was not updated accordingly. Most code still worked, but code that tried to load a parameter passed on the stack did not. This commit fixes this by marking the frame pointer as a callee-clobbered register. This will let it be saved without any effort in prolog/epilog code and will make sure the correct address is calculated for loading parameters that are passed on the stack. This approach is used by most other targets (such as X86, AArch64 and RISC-V). Differential Revision: https://reviews.llvm.org/D78579

@bar

An instruction like this will need to allocate some stack space for the last parameter: %x = call addrspace(1) i16 @bar(i64 undef, i64 undef, i16 undef, i16 0) This worked fine when passing an actual value (in this case 0). However, when passing undef, no value was pushed to the stack and therefore no push instructions were created. This caused an unbalanced stack leading to interesting results. This commit fixes that by replacing the push logic with a regular stack adjustment and stack-relative load/stores. This is less efficient but at least it correctly compiles the code. I can think of a few improvements in the future: * The stack should have been adjusted in the function prologue when there are no allocas in the function. * Many (if not most) stack adjustments can be replaced by pushing/popping the values directly. Exactly like the previous code attempted but didn't do correctly. * Small stack adjustments can be done more efficiently with a few push/pop instructions (pushing/popping bogus values), both for code size and for speed. All in all, as long as there are no allocas in the function I think that it is almost always more efficient to emit regular push/pop instructions. This is however left for future optimizations. Differential Revision: https://reviews.llvm.org/D78581

@foo

Code like the following: define i32 @foo(i32 %a, i1 zeroext %b) addrspace(1) { entry: %conv = zext i1 %b to i32 %add = add nsw i32 %conv, %a ret i32 %add } Would compile to the following (incorrect) code: foo: mov r18, r20 clr r19 add r22, r18 adc r23, r19 sbci r24, 0 sbci r25, 0 ret Those sbci instructions are clearly wrong, they should have been adc instructions. This commit improves codegen to use adc instead: foo: mov r18, r20 clr r19 ldi r20, 0 ldi r21, 0 add r22, r18 adc r23, r19 adc r24, r20 adc r25, r21 ret This code is not optimal (it could be just 5 instructions instead of the current 9) but at least it doesn't miscompile. Differential Revision: https://reviews.llvm.org/D78439

I'm not entirely sure why this was ever needed, but when I remove both adjustments all tests still pass. This fixes a bug where a long branch (using the `jmp` instead of the `rjmp` instruction) was incorrectly adjusted by 2 because it jumps to an absolute address instead of a PC-relative address. I could have added AVR::fixup_call to the list of exceptions, but it seemed more sensible to me to just remove this code. Differential Revision: https://reviews.llvm.org/D78459

@fun

Summary: The previous version relied on the standard calling convention using std::reverse() to try to force the AVR ABI. But this only works for simple cases, it fails for example with aggregate types. This patch rewrites the calling convention with custom C++ code, that implements the ABI defined in https://gcc.gnu.org/wiki/avr-gcc. To do that it adds a few 16-bit pseudo registers for unaligned argument passing, such as R24R23. For example this function: define void @fun({ i8, i16 } %a) will pass %a.0 in R22 and %a.1 in R24R23. There are no instructions that can use these pseudo registers, so a new register class, DREGSMOVW, is defined to make them apart. Also the ArgCC_AVR_BUILTIN_DIV is no longer necessary, as it is identical to the C++ behavior (actually the clobber list is more strict for __div* functions, but that is currently unimplemented). Reviewers: dylanmckay Subscribers: Gaelan, Sh4rK, indirect, jwagen, efriedma, dsprenkels, hiraditya, Jim, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68524 Patch by Rodrigo Rivas Costa.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AVR] Cherry-pick 17 upstream AVR backend fixes into the Rust LLVM fork #66

[AVR] Cherry-pick 17 upstream AVR backend fixes into the Rust LLVM fork #66

Commits on Jun 23, 2020