Sync branch with main #2
Commits on Aug 9, 2024
-
[IRBuilder] Generate nuw GEPs for struct member accesses (llvm#99538)
Generate nuw GEPs for struct member accesses, as inbounds + non-negative implies nuw. Regression tests are updated using update scripts where possible, and by find + replace where not.
Commit 94473f4
Revert "[mlir][ArmSME] Pattern to swap shape_cast(tranpose) with transpose(shape_cast) (llvm#100731)" (llvm#102457)
This reverts commit 88accd9. This change can be dropped in favor of just llvm#102017.
Commit fc4485b
[NFC] Use references to avoid copying (llvm#99863)
Change `auto` to `auto&` to avoid unnecessary copying.
Commit 3e806c8
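To illustrate why this kind of change matters, here is a minimal sketch (not code from the commit) that counts copies in a range-based loop. The `Tracked` type, the `copyCount` counter, and both function names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical counter, incremented on every copy of a Tracked object.
int copyCount = 0;

// Hypothetical element type that records how often it is copied.
struct Tracked {
  int value = 1;
  Tracked() = default;
  Tracked(const Tracked &other) : value(other.value) { ++copyCount; }
};

// `auto` deduces a value type, so each iteration copies the element.
int sumByValue(const std::vector<Tracked> &items) {
  int sum = 0;
  for (auto item : items)
    sum += item.value;
  return sum;
}

// `const auto &` binds a reference, so no element is copied.
int sumByReference(const std::vector<Tracked> &items) {
  int sum = 0;
  for (const auto &item : items)
    sum += item.value;
  return sum;
}
```

Calling both functions on the same vector returns the same sum, but only `sumByValue` increases `copyCount`, once per element.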
[clang] Implement CWG2627 Bit-fields and narrowing conversions (llvm#78112)
https://cplusplus.github.io/CWG/issues/2627.html
It is no longer a narrowing conversion when converting a bit-field to a type smaller than the field's declared type if the bit-field has a width small enough to fit in the target type. This includes integral promotions (`long long i : 8` promoted to `int` is no longer narrowing, allowing `c.i <=> c.i`) and list-initialization (`int n{ c.i };`).
Also applies back to C++11 as this is a defect report.
Commit 574e958
[mlir][vector] Disable `vector.matrix_multiply` for scalable vectors (llvm#102573)
Disables `vector.matrix_multiply` for scalable vectors. As per the docs:

> This is the counterpart of llvm.matrix.multiply in MLIR

I'm not aware of any use of matrix-multiply intrinsics in the context of scalable vectors, hence disabling.
Commit 4c19de9
Commit 24be4d5
[flang][OpenMP] Handle multiple ranges in `num_teams` clause (llvm#10…
Commit 3064646
[InstCombine] Remove unnecessary RUN line from test (NFC)
As all the necessary information is encoded using attributes nowadays, this test doesn't actually depend on the triple anymore.
Commit 0795ab4
[RISCV] Add Syntacore SCR5 RV32/64 processors definition (llvm#102285)
Syntacore SCR5 is an entry-level Linux-capable 32/64-bit RISC-V processor core.
Overview: https://syntacore.com/products/scr5
Scheduling model will be added in a subsequent PR.
Co-authored-by: Dmitrii Petrov <dmitrii.petrov@syntacore.com>
Co-authored-by: Anton Afanasyev <anton.afanasyev@syntacore.com>
Commit 02645d6
Revert "Enable logf128 constant folding for hosts with 128bit floats (llvm#96287)"
This reverts commit ccb2b01. Causes buildbot failures, e.g. on ppc64le builders.
Commit a15de17
LSV/test/AArch64: add missing lit.local.cfg; fix build (llvm#102607)
Follow up on 199d6f2 (LSV: document hang reported in llvm#37865) to fix the build when omitting the AArch64 target. Add the missing lit.local.cfg.
Commit fff78a5
[MemoryBuiltins] Handle allocator attributes on call-site
We should handle allocator attributes not only on function declarations, but also on the call-site. That way we can e.g. also optimize cases where the allocator function is a virtual function call. This was already supported in some of the MemoryBuiltins helpers, but not all of them. This adds support for allocsize, alloc-family and allockind("free").
Commit 1953629
[AArch64] Add invalid 1 x vscale costs for reductions and reduction-operations. (llvm#102105)
The code-generator is currently not able to handle scalable vectors of <vscale x 1 x eltty>. The usual "fix" for this until it is supported is to mark the costs of loads/stores with an invalid cost, preventing the vectorizer from vectorizing at those factors. But on rare occasions loops do not contain load/stores, only reductions. So whilst this is still unsupported, return an invalid cost to avoid selecting vscale x 1 VFs. The cost of a reduction is not currently used by the vectorizer, so this adds the cost to the add/mul/and/or/xor or min/max that should feed the reduction. It includes reduction costs too, for completeness.
This change will be removed when code-generation for these types is sufficiently reliable.
Fixes llvm#99760
Commit 0b745a1
Commit 8ce6449
[MemoryBuiltins] Simplify getCalledFunction() helper (NFC)
If nobuiltin is set, directly return nullptr instead of using a separate out parameter and having all callers check this.
Commit 5bc1f9e
AMDGPU/NewPM: Port SIFixSGPRCopies to new pass manager (llvm#102614)
This allows some tests relying on -stop-after=amdgpu-isel to move to checking -stop-after=finalize-isel instead, which will more reliably pass the verifier.
Commit cf54cae
Commit 1d77dd5
Fix a unit test input file (llvm#102567)
I forgot to update the version info in the SDKSettings file when I updated it to the real version relevant to the test.
Commit 4c5ef66
[MLIR][GPU-LLVM] Convert `gpu.func` to `llvm.func` (llvm#101664)
Add support in `-convert-gpu-to-llvm-spv` to convert `gpu.func` to `llvm.func` operations.

- `spir_kernel`/`spir_func` calling conventions used for kernels/functions.
- `workgroup` attributions encoded as additional `llvm.ptr<3>` arguments.
- No attribute used to annotate kernels.
- `reqd_work_group_size` attribute used to encode `gpu.known_block_size`.
- `llvm.mlir.workgroup_attrib_size` used to encode workgroup attribution sizes. This will be attached to the pointer argument workgroup attributions lower to.

**Note**: A notable missing feature that will be addressed in a follow-up PR is a `-use-bare-ptr-memref-call-conv` option to replace MemRef arguments with bare pointers to the MemRef element types instead of the current MemRef descriptor approach.

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
Commit d45de80
[mlir][spirv] Support `memref` in `convert-to-spirv` pass (llvm#102534)
This PR adds conversion patterns for MemRef to the `convert-to-spirv` pass, introduced in llvm#95942. Conversions from MemRef memory space to SPIR-V storage class were also included, and would run before the final dialect conversion phase.

**Future Plans**
- Add tests for ops other than `memref.load` and `memref.store`

Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
Commit 93fc459
Commit ff1cc5b
[AMDGPU][AsmParser][NFCI] All NamedIntOperands to be of the i32 type. (llvm#102616)
There's no need for them to have different types. Part of <llvm#62629>.
Commit 335bc3c
Commit dad1cb9
[Clang][OMPX] Add the code generation for multi-dim `num_teams` (llvm#101407)
This patch adds the code generation support for multi-dim `num_teams` clause when it is used with `target teams ompx_bare` construct.
Commit ee8100b
[SelectionDAG] Use unaligned store/load to move AVX registers onto stack for `insertelement` (llvm#82130)
Prior to this patch, SelectionDAG generated aligned moves onto the stack for AVX registers when the function was marked as a no-realign-stack function. This led to misalignment between the stack and the instruction generated. This patch fixes the issue. There was a similar issue reported for `extractelement`, which was fixed in a6614ec.
Co-authored-by: Manish Kausik H <hmamishkausik@gmail.com>
Commit 259742a
Commit 3bd63d4
[Clang] Simplify specifying passes via -Xoffload-linker (llvm#102483)
Make it possible to do things like the following, regardless of whether the offload target is nvptx or amdgpu:

```
$ clang -O1 -g -fopenmp --offload-arch=native test.c \
    -Xoffload-linker -mllvm=-pass-remarks=inline \
    -Xoffload-linker -mllvm=-force-remove-attribute=g.internalized:noinline \
    -Xoffload-linker --lto-newpm-passes='forceattrs,default<O1>' \
    -Xoffload-linker --lto-debug-pass-manager \
    -foffload-lto
```

To accomplish that:

- In clang-linker-wrapper, do not forward options via `-Wl` if they might have literal commas. Use `-Xlinker` instead.
- In clang-nvlink-wrapper, accept `--lto-debug-pass-manager` and `--lto-newpm-passes`.
- In clang-nvlink-wrapper, drop `-passes` because it's inconsistent with the interface of `lld`, which is used instead of clang-nvlink-wrapper when the target is amdgpu. Without this patch, `-passes` is passed to `nvlink`, producing an error anyway.

Co-authored-by: Joseph Huber <huberjn@outlook.com>
Commit 3c639b8
Commit 5c0eb1a
[gn] Give two scripts argparse.RawDescriptionHelpFormatter
Without this, the doc string is put in a single line. These scripts have multi-line docstrings, so this makes their --help output look much nicer. Otherwise, no behavior change.
Commit f4d5b14
[X86] Convert truncsat clamping patterns to use SDPatternMatch. NFC.
Inspired by llvm#99418 (which hopefully we can replace this code with at some point)
Commit 669d844
[Clang] Fix Handling of Init Capture with Parameter Packs in LambdaScopeForCallOperatorInstantiationRAII (llvm#100766)
This PR addresses issues related to the handling of `init capture` with parameter packs in Clang's `LambdaScopeForCallOperatorInstantiationRAII`.

Previously, `addInstantiatedCapturesToScope` would add `init capture` containing packs to the scope using the type of the `init capture` to determine the expanded pack size. However, this approach resulted in a pack size of 0 because `getType()->containsUnexpandedParameterPack()` returns `false`. After extensive testing, it appears that the correct pack size can only be inferred from `getInit`. But `getInit` may reference parameters and `init capture` from an outer lambda, as shown in the following example:

```cpp
auto L = [](auto... z) {
  return [... w = z](auto... y) {
    // ...
  };
};
```

To address this, `addInstantiatedCapturesToScope` in `LambdaScopeForCallOperatorInstantiationRAII` should be called last. Additionally, `addInstantiatedCapturesToScope` has been modified to only add `init capture` to the scope. The previous implementation incorrectly called `MakeInstantiatedLocalArgPack` for other non-init captures containing packs, resulting in a pack size of 0.

### Impact

This patch affects scenarios where `LambdaScopeForCallOperatorInstantiationRAII` is passed with `ShouldAddDeclsFromParentScope = false`, preventing the correct addition of the current lambda's `init capture` to the scope. There are two main scenarios for `ShouldAddDeclsFromParentScope = false`:

1. **Constraints**: Sometimes constraints are instantiated in place rather than delayed. In this case, `LambdaScopeForCallOperatorInstantiationRAII` does not need to add `init capture` to the scope.
2. **`noexcept` Expressions**: The expressions inside `noexcept` have already been transformed, and the packs referenced within have been expanded. Only `RebuildLambdaInfo` needs to add the expanded captures to the scope, without requiring `addInstantiatedCapturesToScope` from `LambdaScopeForCallOperatorInstantiationRAII`.

### Considerations

An alternative approach could involve adding a data structure within the lambda to record the expanded size of the `init capture` pack. However, this would increase the lambda's size and require extensive modifications.

This PR is a prerequisite for implementing llvm#61426
Commit 52126dc
[mlir] Verifier: steal bit to track seen instead of set. (llvm#102626)
Tracking a set containing every block and operation visited can become very expensive and is unnecessary.
Co-authored-by: Will Dietz <w@wdtz.org>
Commit 7a98071
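The trade-off described in this commit can be sketched in a simplified form. The code below is not MLIR's verifier: `Node`, its `seen` flag (standing in for a bit stored inside the visited object), and both traversal functions are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical IR node; `seen` stands in for a bit kept in the object itself.
struct Node {
  std::vector<Node *> children;
  bool seen = false;
};

// Baseline: a side set that grows with every visited node.
std::size_t countWithSet(Node *root) {
  std::unordered_set<Node *> visited;
  std::vector<Node *> worklist{root};
  while (!worklist.empty()) {
    Node *n = worklist.back();
    worklist.pop_back();
    if (!visited.insert(n).second)
      continue;  // already visited
    for (Node *c : n->children)
      worklist.push_back(c);
  }
  return visited.size();
}

// Alternative: mark the node directly; no hashing or per-node set storage.
std::size_t countWithBit(Node *root) {
  std::size_t count = 0;
  std::vector<Node *> worklist{root};
  while (!worklist.empty()) {
    Node *n = worklist.back();
    worklist.pop_back();
    if (n->seen)
      continue;  // already visited
    n->seen = true;
    ++count;
    for (Node *c : n->children)
      worklist.push_back(c);
  }
  return count;
}
```

Both functions visit each reachable node once. In this sketch the flag is never cleared, so a second `countWithBit` traversal over the same nodes would need a reset step first.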
[Arm][AArch64][Clang] Respect function's branch protection attributes. (llvm#101978)
Default attributes are assigned to all functions according to the command line parameters. Some functions might have their own attributes, and we need to set or remove attributes accordingly. Tests are updated to cover these scenarios too.
Commit 9e9fa00
[AMDGPU][AsmParser][NFC] Remove a misleading comment. (llvm#102604)
ParseRegularReg() should keep parsing the register as it was specified and should not try to translate it to anything else. It's up to operand predicates to decide what is and is not an acceptable register for an operand, including considering its expected register class, and for the rest of the AsmParser infrastructure to handle it accordingly from there on.
Commit 52220c2
[MLIR][DLTI][Transform] Introduce transform.dlti.query (llvm#101561)
This transform op makes it possible to query attributes associated to IR by means of the DLTI dialect. The op takes both a `key` and a target `op` to perform the query at. Facility functions automatically find the closest ancestor op which defines the appropriate DLTI interface or has an attribute implementing a DLTI interface. By default the lookup uses the data layout interfaces of DLTI. If the optional `device` parameter is provided, the lookup happens with respect to the interfaces for TargetSystemSpec and TargetDeviceSpec. This op uses new free-standing functions in the `dlti` namespace to not only look up specifications via the `DataLayoutSpecOpInterface` and on `ModuleOp`s but also on any ancestor op that has an appropriate DLTI attribute.
Commit 8f21ff9
Commit f4fb735
Commit 5c016bf
[IR] Add method to GlobalVariable to change type of initializer. (llvm#102553)
With opaque pointers, nothing directly uses the value type, so we can mutate it if we want. This avoids doing a complicated RAUW dance.
Commit 2f8f58d
Commit 23209d1
Commit 6f19a7b
[mlir][vector][test] Split tests from vector-transfer-flatten.mlir (llvm#102584)
Move tests that exercise DropUnitDimFromElementwiseOps and DropUnitDimsFromTransposeOp to a dedicated file. While these patterns are collected under populateFlattenVectorTransferPatterns (and are tested via -test-vector-transfer-flatten-patterns), they can actually be tested without the xfer Ops, and hence the split.

Note, this is mostly just moving tests from one file to another. The only real change is the removal of the following check-lines:

```mlir
// CHECK-128B-NOT: memref.collapse_shape
```

These were added specifically to check the "flattening" logic (which introduces `memref.collapse_shape`). However, these tests were never meant to test that logic (in fact, that's the reason I am moving them to a different file) and hence are being removed as copy&paste errors.

I also removed the following TODO:

```mlir
/// TODO: Potential duplication with tests from:
///   * "vector-dropleadunitdim-transforms.mlir"
///   * "vector-transfer-drop-unit-dims-patterns.mlir"
```

I've checked what patterns are triggered in those test files and neither DropUnitDimFromElementwiseOps nor DropUnitDimsFromTransposeOp does.
Commit 5123f2c
Commit 37c6683
Commit 7752fec
[msan] Support vst{2,3,4}_lane instructions (llvm#101215)
This generalizes MSan's Arm NEON vst support, to include the lane-specific variants. This also updates the test from llvm#100645.
Commit cb5ec37
Commit 95820ca
[libc][math][c23] Add fadd{l,f128} C23 math functions (llvm#102531)
Co-authored-by: OverMighty <its.overmighty@gmail.com>
Commit 8c81fb6
[AMDGPU] Move `AMDGPUAttributorPass` to full LTO post link stage (llvm#102086)
Currently `AMDGPUAttributorPass` is registered in the default optimizer pipeline. This allows the pass to run in the default pipeline as well as at the thinLTO post link stage. However, it will not run in the full LTO post link stage. This patch moves it to full LTO.
Commit 2fe61a5
[asan] Switch allocator to dynamic base address (llvm#98511)
This ports a fix from memprof (llvm#98510), which has a shadow mapping that is similar to ASan (8 bytes of shadow memory per 64 bytes of app memory). This patch changes the allocator to dynamically choose a base address, as suggested by Vitaly for memprof. This simplifies ASan's #ifdef's and avoids potential conflict in the event that ASan were to switch to a dynamic shadow offset in the future [1].

[1] Since shadow memory is mapped before the allocator is mapped:
- dynamic shadow and fixed allocator (old memprof): could fail if "unlucky" (e.g., https://lab.llvm.org/buildbot/#/builders/66/builds/1361/steps/17/logs/stdio)
- dynamic shadow and dynamic allocator (HWASan; current memprof): always works
- fixed shadow and fixed allocator (current ASan): always works, if constants are carefully chosen
- fixed shadow and dynamic allocator (ASan with this patch): always works
Commit 7ede1c4
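For context on the shadow mapping mentioned in this commit message, the 8-to-64 ratio means one shadow byte covers eight application bytes. A minimal sketch of such a fixed-offset mapping follows; the `kShadowOffset` value and the `memToShadow` name are illustrative assumptions, and the real constants are platform-specific.

```cpp
#include <cassert>
#include <cstdint>

// One shadow byte describes 8 application bytes, i.e. a shift of 3.
constexpr unsigned kShadowScale = 3;
// Illustrative fixed shadow offset (an assumption; real values vary by platform).
constexpr std::uint64_t kShadowOffset = 0x7fff8000ULL;

// Maps an application address to the address of its shadow byte.
constexpr std::uint64_t memToShadow(std::uint64_t addr) {
  return (addr >> kShadowScale) + kShadowOffset;
}
```

With a fixed shadow region like this, an allocator placed at a fixed address must be chosen so the two never overlap, which is the "constants are carefully chosen" case in the list above; a dynamically placed allocator avoids that coupling.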
[Clang] Add env var for nvptx-arch/amdgpu-arch timeout (llvm#102521)
When working on very busy systems, check-offload frequently fails many tests with this diagnostic:

```
clang: error: cannot determine amdgcn architecture: /tmp/llvm/build/bin/amdgpu-arch: Child timed out: ; consider passing it via '-march'
```

This patch accepts the environment variable `CLANG_TOOLCHAIN_PROGRAM_TIMEOUT` to set the timeout. It also increases the timeout from 10 to 60 seconds.
Commit 1ea0865
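The general pattern of an environment-variable override with a default can be sketched as below. This is not Clang's implementation: `parseTimeoutSeconds` and `toolchainProgramTimeout` are hypothetical helpers, and falling back to the default on empty, non-numeric, or non-positive input is an assumption, since the commit message does not say how such values are handled.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: parse a timeout in seconds, or return `fallback`.
// Assumption: empty, non-numeric, and non-positive inputs use the fallback.
unsigned parseTimeoutSeconds(const char *text, unsigned fallback = 60) {
  if (!text || *text == '\0')
    return fallback;
  char *end = nullptr;
  long value = std::strtol(text, &end, 10);
  if (*end != '\0' || value <= 0)
    return fallback;
  return static_cast<unsigned>(value);
}

// Reads the variable named in the commit message; 60 is the new default.
unsigned toolchainProgramTimeout() {
  return parseTimeoutSeconds(std::getenv("CLANG_TOOLCHAIN_PROGRAM_TIMEOUT"));
}
```

Under these assumptions, running with `CLANG_TOOLCHAIN_PROGRAM_TIMEOUT=120` would make `toolchainProgramTimeout()` return 120, and leaving it unset would return 60.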
[MIPS] Fix missing ANDI optimization (llvm#97689)
1. Add MipsPat to optimize (andi (srl (truncate i64 $1), x), y) to (andi (truncate (dsrl i64 $1, x)), y).
2. Add MipsPat to optimize (ext (truncate i64 $1), x, y) to (truncate (dext i64 $1, x, y)).

The assembly result is the same as gcc.
Fixes llvm#42826
Commit e711a0c
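The first rewrite can be sanity-checked with plain integer arithmetic. The sketch below models the two forms in C++ and compares them; the function names are hypothetical, and the bounds used (a 16-bit mask, as in an `andi` immediate, and shift amounts up to 16) are assumptions made for this example rather than the exact conditions of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Models (andi (srl (truncate x), shift), mask): truncate, then 32-bit shift.
std::uint32_t truncThenShift(std::uint64_t x, unsigned shift, std::uint32_t mask) {
  return (static_cast<std::uint32_t>(x) >> shift) & mask;
}

// Models (andi (truncate (dsrl x, shift)), mask): 64-bit shift, then truncate.
std::uint32_t shiftThenTrunc(std::uint64_t x, unsigned shift, std::uint32_t mask) {
  return static_cast<std::uint32_t>(x >> shift) & mask;
}

// Compares both forms for every shift in [0, 16] with a mask of at most 16 bits.
bool formsAgree(std::uint64_t x, std::uint32_t mask16) {
  for (unsigned shift = 0; shift <= 16; ++shift)
    if (truncThenShift(x, shift, mask16) != shiftThenTrunc(x, shift, mask16))
      return false;
  return true;
}
```

The two forms differ only in the top `shift` bits of the 32-bit result, where the 64-bit shift pulls in bits from the upper half of the input. A 16-bit mask discards those bits as long as the shift is at most 16; beyond that the forms can disagree.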
[scudo] Separated committed and decommitted entries. (llvm#101409)
Initially, the LRU list stored all mapped entries with no distinction between the committed (non-madvise()'d) entries and decommitted (madvise()'d) entries. Now these two types of entries are separated into two lists, allowing future cache logic to branch depending on whether or not entries are committed or decommitted. Furthermore, the retrieval algorithm will prioritize committed entries over decommitted entries. Specifically, committed entries that satisfy the MaxUnusedCachePages requirement are retrieved before optimal-fit, decommitted entries.
This commit addresses the compiler errors raised [here](llvm#100818 (comment)).
Commit 9f3ff8d
Suppress spurious warnings due to R_RISCV_SET_ULEB128
llvm-objdump -S issues unnecessary warnings for RISC-V relocatable files containing .debug_loclists or .debug_rnglists sections with ULEB128 relocations. This occurred because `DWARFObjInMemory` verifies support for all relocation types, triggering warnings for unsupported ones.

```
% llvm-objdump -S a.o
...
0000000000000000 <foo>:
warning: failed to compute relocation: R_RISCV_SUB_ULEB128, Invalid data was encountered while parsing the file
warning: failed to compute relocation: R_RISCV_SET_ULEB128, Invalid data was encountered while parsing the file
...
```

This change fixes llvm#101544 by declaring support for the two ULEB128 relocation types, silencing the spurious warnings.

---

In DWARF v5 builds, DW_LLE_offset_pair/DW_RLE_offset_pair might be generated in .debug_loclists/.debug_rnglists with ULEB128 relocations. They are only read by llvm-dwarfdump to dump section content and verbose DW_AT_location/DW_AT_ranges output for relocatable files. The DebugInfoDWARF user (e.g. DWARFDebugRnglists.cpp) calls `Data.getULEB128` without checking the ULEB128 relocations, as the unrelocated value holds meaning (refer to the assembler implementation https://reviews.llvm.org/D157657). This differs from `.quad .Lfoo`, which requires relocation reading (e.g. https://reviews.llvm.org/D74404).

Pull Request: llvm#101607
Commit edf45e4
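For readers unfamiliar with the encoding discussed here, ULEB128 stores an unsigned integer in 7-bit groups, least significant group first, with the high bit of each byte signalling that another byte follows. The decoder below is a minimal sketch under that definition; `readUleb128` is a hypothetical name and is not the LLVM API that `Data.getULEB128` refers to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decodes one ULEB128 value from `data` (at most `size` bytes).
// If `length` is non-null, it receives the number of bytes consumed.
std::uint64_t readUleb128(const std::uint8_t *data, std::size_t size,
                          std::size_t *length = nullptr) {
  std::uint64_t value = 0;
  unsigned shift = 0;
  std::size_t i = 0;
  while (i < size && shift < 64) {
    std::uint8_t byte = data[i++];
    value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;  // low 7 bits
    shift += 7;
    if ((byte & 0x80) == 0)  // high bit clear: last byte of this value
      break;
  }
  if (length)
    *length = i;
  return value;
}
```

As a usage example, the three bytes `0xE5 0x8E 0x26` decode to 624485. The sketch stops at the end of the buffer or after 64 bits and does not report malformed input, which a production decoder would need to do.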
Commit 6b77531
Commit b6cbd01
[NVPTX] support switch statement with brx.idx (reland) (llvm#102550)
Add custom lowering for `BR_JT` DAG nodes to the `brx.idx` PTX instruction ([PTX ISA 9.7.13.4. Control Flow Instructions: brx.idx](https://docs.nvidia.com/cuda/parallel-thread-execution/#control-flow-instructions-brx-idx)). Depending on the heuristics in DAG selection, `switch` statements may now be lowered using `brx.idx`.
Note: this fixes the previous issue in llvm#102400 by adding the isBarrier attribute to BRX_END.
Commit ccc3127
[RISCV] Move PseudoVSET(I)VLI expansion to use PseudoInstExpansion. (llvm#102496)
Instead of expanding in RISCVExpandPseudoInsts, expand during MachineInstr to MCInst lowering. We weren't doing anything in expansion other than copying operands.
Commit 31c75a1
[RISCV] Remove riscv-experimental-rv64-legal-i32. (llvm#102509)
This has received no development work in a while and is slowly bit rotting as new extensions are added. At the moment, I don't think this is viable without adding a new invariant that 32 bit values are always in sign extended form like Mips64 does. We are very dependent on computeKnownBits and ComputeNumSignBits in SelectionDAG to remove sign extends created for ABI reasons. If we can't propagate sign bit information through 64-bit values in SelectionDAG, we can't effectively clean up those extends.
Commit ca7ad38
[clang] Wire -fptrauth-returns to "ptrauth-returns" fn attribute. (llvm#102416)
We already ended up with -fptrauth-returns, the feature macro, the lang opt, and the actual backend lowering. The only part left is threading it all through PointerAuthOptions, to drive the addition of the "ptrauth-returns" attribute to generated functions. While there, do minor cleanup on ptrauth-function-attributes.c. This also adds ptrauth_key_return_address to ptrauth.h.
Commit 2eb6e30
Return available function types for BindingDecls. (llvm#102196)
Only return nullptr when we don't have an available QualType.
Commit e5697d7
Commit 6b27a57
Revert "[AMDGPU] Move `AMDGPUAttributorPass` to full LTO post link stage (llvm#102086)"
This reverts commit 2fe61a5.
Commit 492484e
[LLVM][rtsan] rtsan transform to preserve CFGAnalyses (llvm#102651)
Follow-up to llvm#101232: as suggested in the comments, narrow the scope of the preserved analyses.
Commit 22cce65
[clang] Implement -fptrauth-auth-traps. (llvm#102417)
This provides -fptrauth-auth-traps, which at the frontend level only controls the addition of the "ptrauth-auth-traps" function attribute. The attribute in turn controls various aspects of backend codegen, by providing the guarantee that every "auth" operation generated will trap on failure. This can either be delegated to the hardware (if AArch64 FPAC is known to be available), in which case this attribute doesn't change codegen. Otherwise, if FPAC isn't available, this asks the backend to emit additional instructions to check and trap on auth failure.
Commit d179acd
[libc] Use cpp::numeric_limits in preference to C23 <limits.h> macros (llvm#102665)
This updates some code to consistently use cpp::numeric_limits, the src/__support polyfill for std::numeric_limits, rather than the C <limits.h> macros. This is in keeping with the general C++-oriented style in libc code, and also sidesteps issues about the new C23 *_WIDTH macros that the compiler-provided header does not define outside C23 mode.
Bug: https://issues.fuchsia.dev/358196552
Commit 2f6a879
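As an illustration of the approach, the width of an integer type can be derived from `numeric_limits` without relying on the C23 `*_WIDTH` macros. The sketch below uses `std::numeric_limits` from the standard library rather than libc's internal `cpp::numeric_limits`, and `widthOf` is a hypothetical helper name.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Width in bits of an integer type, computed from numeric_limits.
// `digits` excludes the sign bit, so it is added back for signed types.
template <typename T>
constexpr int widthOf() {
  return std::numeric_limits<T>::digits +
         (std::numeric_limits<T>::is_signed ? 1 : 0);
}
```

For example, `widthOf<unsigned int>()` plays the role of `UINT_WIDTH`, and it is available in any C++ mode because it does not depend on which macros the compiler's `<limits.h>` chooses to define.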
[lldb] Move definition of SBSaveCoreOptions dtor out of header (llvm#102539)
This class is technically not usable in its current state. When you use it in a simple C++ project, your compiler will complain about an incomplete definition of SaveCoreOptions. Normally this isn't a problem, other classes in the SBAPI do this. The difference is that SBSaveCoreOptions has a default destructor in the header, so the compiler will attempt to generate the code for the destructor with an incomplete definition of the impl type. All methods for every class, including constructors and destructors, must have a separate implementation not in a header.
Commit 101cf54
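The underlying C++ rule can be shown with a small pointer-to-implementation sketch. The class names below are hypothetical stand-ins, not the LLDB classes, and the "header" and "source" parts are shown in one file only so the example is self-contained.

```cpp
#include <cassert>
#include <memory>

// --- Part that would live in the public header ---
class CoreOptionsImpl;  // hypothetical impl type, only forward-declared here

class CoreOptions {  // hypothetical stand-in for an API class
public:
  CoreOptions();
  // Declared only. Defaulting it in the header would make user code generate
  // the destructor while CoreOptionsImpl is still an incomplete type.
  ~CoreOptions();
  void SetStyle(int style);
  int GetStyle() const;

private:
  std::unique_ptr<CoreOptionsImpl> m_impl;
};

// --- Part that would live in the source file ---
class CoreOptionsImpl {
public:
  int style = 0;
};

CoreOptions::CoreOptions() : m_impl(std::make_unique<CoreOptionsImpl>()) {}
CoreOptions::~CoreOptions() = default;  // impl type is complete at this point
void CoreOptions::SetStyle(int style) { m_impl->style = style; }
int CoreOptions::GetStyle() const { return m_impl->style; }
```

Because `std::unique_ptr` needs the complete type in order to delete it, the destructor definition has to appear where `CoreOptionsImpl` is fully defined, which is what moving it out of the header achieves.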
[mlir][ODS] Consistent `cppType`/`cppClassName` usage (llvm#102657)
Make sure that the usage of `cppType` and `cppClassName` of type and attribute definitions/constraints is consistent in TableGen.
- `cppClassName`: The C++ class name of the type or attribute.
- `cppType`: The fully qualified C++ class name: C++ namespace and C++ class name.
Basically, we should always use the fully qualified C++ class name for parameter types, return types or template arguments. Also some minor cleanups. Fixes llvm#57279.
Commit 35f55f5
[LTO] enable `ObjCARCContractPass` only on optimized build (llvm#101114)
llvm#92331 tried to enable `ObjCARCContractPass` by default, but it caused a regression on O0 builds and was reverted. This patch tries to bring that back by:
1. Reverting the [revert](llvm@1579e9c).
2. Calling `createObjCARCContractPass` only on optimized builds.
Tests are updated to reflect the changes. Specifically, all `O0` tests should not include `ObjCARCContractPass`. Signed-off-by: Peter Rong <PeterRong@meta.com>
Commit 74e4694
[mlir][ODS] Verify type constraints in Types and Attributes (llvm#102326)
When a type/attribute is defined in TableGen, a type constraint can be used for parameters, but the type constraint verification was missing. Example:
```
def TestTypeVerification : Test_Type<"TestTypeVerification"> {
  let parameters = (ins AnyTypeOf<[I16, I32]>:$param);
  // ...
}
```
No verification code was generated to ensure that `$param` is I16 or I32. When type constraints are present, a new method will be generated for types and attributes: `verifyInvariantsImpl`. (The naming is similar to op verifiers.) The user-provided verifier is called `verify` (no change). There is now a new entry point to type/attribute verification: `verifyInvariants`. This function calls both `verifyInvariantsImpl` and `verify`. If neither of those two verifications is present, the `verifyInvariants` function is not generated. When a type/attribute is not defined in TableGen, but a verifier is needed, users can implement the `verifyInvariants` function. (This function was previously called `verify`.) Note for LLVM integration: If you have an attribute/type that is not defined in TableGen (i.e., just C++), you have to rename the verification function from `verify` to `verifyInvariants`. (Most attributes/types have no verification, in which case there is nothing to do.) Depends on llvm#102657.
Commit 7359a6b
[libc] Fix use of cpp::numeric_limits<...>::digits (llvm#102674)
The previous change replaced INT_WIDTH with cpp::numeric_limits<int>::digits, but these don't have the same value. While INT_WIDTH == UINT_WIDTH, the same is not true of ::digits, so use cpp::numeric_limits<unsigned int>::digits et al. instead for the intended effect. Bug: https://issues.fuchsia.dev/358196552
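The difference comes from `digits` counting value bits only, which excludes the sign bit of a signed type. The sketch below uses `std::numeric_limits` from the standard library as a stand-in for the libc-internal `cpp::numeric_limits` polyfill (assumed here to behave the same way), and assumes a mainstream platform where `unsigned int` has no padding bits. The constant names are illustrative only.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// digits counts value bits only, so the sign bit of a signed type is excluded.
constexpr int kIntDigits = std::numeric_limits<int>::digits;
constexpr int kUIntDigits = std::numeric_limits<unsigned int>::digits;

// Total storage width in bits, computed without the C23 *_WIDTH macros.
constexpr int kIntWidth = static_cast<int>(sizeof(int) * CHAR_BIT);

// The signed type has exactly one fewer value bit than its unsigned partner.
static_assert(kIntDigits == kUIntDigits - 1,
              "int has one fewer value bit than unsigned int");

// Assumption: no padding bits, so the unsigned digits equal the full width.
static_assert(kUIntDigits == kIntWidth,
              "unsigned int digits match the storage width");
```

On a typical 32-bit `int`, this means the signed `digits` is one short of the width that `INT_WIDTH` reports, which is why the unsigned type is the right substitute.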
Commit a21cf56
[SandboxIR] Implement the InsertElementInst class (llvm#102404)
Heavily based on work by @vporpo.
Commit 66d8735
[flang][cuda] Convert cuf.alloc for box to fir.alloca in device context (llvm#102662)
In device context, managed memory is not available, so it makes no sense to allocate the descriptor using it. Fall back to fir.alloca as it is handled well in device code. cuf.free is just dropped.
Commit 841327d
[libc] Clean up remaining use of *_WIDTH macros in printf (llvm#102679)
The previous change missed the second spot doing the same thing. Bug: https://issues.fuchsia.dev/358196552
Commit 6e8a751
Commit e8eec71
Commit 842789b
[mlir] Add support for parsing nested PassPipelineOptions (llvm#101118)
- Added a default parsing implementation to `PassOptions` to allow `Option`/`ListOption` to wrap PassOption objects. This is helpful when creating meta-pipelines (pass pipelines composed of pass pipelines).
- Updated `ListOption` printing to enable round-tripping the output of `dump-pass-pipeline` back into `mlir-opt` for more complex structures.
Commit 165c6d1
[NVPTX][NFC] Update tests to use bfloat type (llvm#101493)
Intrinsics are defined with a bfloat type as of commit 250f2bb, not i16 and i32 storage types. As such declarations are no longer needed once the correct types are used.
Commit 8a5e179
Commit 13fc914
[SandboxIR] Clean up tracking code with the help of emplaceIfTracking() (llvm#102406)
This patch introduces Tracker::emplaceIfTracking(), a wrapper of Tracker::track() that will conditionally create the change object if tracking is enabled. This patch also removes the `Parent` member field of `IRChangeBase`.
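The idea of a wrapper that only constructs the change object when tracking is on can be sketched as follows. This is not the SandboxIR implementation: `MiniTracker`, `Change`, and `SetOperand` are hypothetical names, and the signatures are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical base class for recorded changes.
struct Change {
  virtual ~Change() = default;
};

// Hypothetical change type carrying one piece of state.
struct SetOperand : Change {
  int index;
  explicit SetOperand(int i) : index(i) {}
};

class MiniTracker {
public:
  void setTracking(bool on) { tracking_ = on; }

  // Unconditionally records an already-constructed change.
  void track(std::unique_ptr<Change> c) { changes_.push_back(std::move(c)); }

  // Constructs and records a ChangeT only when tracking is enabled, so
  // callers neither check the flag nor pay for an unused allocation.
  template <typename ChangeT, typename... Args>
  bool emplaceIfTracking(Args &&...args) {
    if (!tracking_)
      return false;
    track(std::make_unique<ChangeT>(std::forward<Args>(args)...));
    return true;
  }

  std::size_t size() const { return changes_.size(); }

private:
  bool tracking_ = false;
  std::vector<std::unique_ptr<Change>> changes_;
};
```

With this shape, call sites shrink from an explicit "if tracking, allocate, then track" sequence to a single call such as `tracker.emplaceIfTracking<SetOperand>(2)`.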
Commit f7ad495
[CodeGen][NFCI] Don't re-implement parts of ASTContext::getIntWidth (llvm#101765)
ASTContext::getIntWidth returns 1 if isBooleanType(), and falls back on getTypeSize in the default case, which itself just returns the Width from getTypeInfo's returned struct, so can be used in all cases here, not just for _BitInt types.
Commit e91e0f5
Commit 4bffbba
[compiler-rt][NFC] Replace environment variable with %t (llvm#102197)
Certain tests within the compiler-rt subproject encountered "command not found" errors when using lit's internal shell, particularly when trying to use the `DIR` environment variable. When checking with the command `LIT_USE_INTERNAL_SHELL=1 ninja check-compiler-rt`, I encountered the following error: ``` ******************** Testing: FAIL: SanitizerCommon-ubsan-i386-Linux :: sanitizer_coverage_trace_pc_guard-init.cpp (146 of 9570) ******************** TEST 'SanitizerCommon-ubsan-i386-Linux :: sanitizer_coverage_trace_pc_guard-init.cpp' FAILED ******************** Exit Code: 127 Command Output (stdout): -- # RUN: at line 5 DIR=/usr/local/google/home/harinidonthula/llvm-project/build/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/ubsan-i386-Linux/Output/sanitizer_coverage_trace_pc_guard-init.cpp.tmp_workdir # executed command: DIR=/usr/local/google/home/harinidonthula/llvm-project/build/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/ubsan-i386-Linux/Output/sanitizer_coverage_trace_pc_guard-init.cpp.tmp_workdir # .---command stderr------------ # | 'DIR=/usr/local/google/home/harinidonthula/llvm-project/build/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/ubsan-i386-Linux/Output/sanitizer_coverage_trace_pc_guard-init.cpp.tmp_workdir': command not found # `----------------------------- # error: command failed with exit status: 127 ``` In this patch, I resolved these issues by removing the use of the `DIR` environment variable. Instead, the tests now directly utilize `%t_workdir` for managing temporary directories. Additionally, I simplified the tests by embedding the clang command arguments directly into the test scripts, which avoids complications with environment variable expansion under lit's internal shell. This fix ensures that the tests run smoothly with lit's internal shell and prevents the "command not found" errors, improving the reliability of the test suite when executed in this environment. 
fixes: llvm#102395 [link to RFC](https://discourse.llvm.org/t/rfc-enabling-the-lit-internal-shell-by-default/80179)
Commit c69b8c4
[TargetLowering] Handle vector types in expandFixedPointMul (llvm#102635)
In TargetLowering::expandFixedPointMul when expanding fixed point multiplication, and when using a widened MUL as strategy for the lowering, there was a bug resulting in assertion failures like this: Assertion `VT.isVector() == N1.getValueType().isVector() && "SIGN_EXTEND result type type should be vector iff the operand " "type is vector!"' failed. The problem was that we did not consider that VT could be a vector type when setting up the WideVT. This patch should fix that bug.
Commit bbefd57
[libc] Fix CFP long double and add tests (llvm#102660)
The previous patch removing the fenv requirement for str to float had an error that got missed due to a lack of tests. This patch fixes the issue and adds tests, as well as updating the existing tests.
Commit 7299c7f
[libc] Moved range_reduction_double ifdef statement (llvm#102659)
Sin/cos/tan fuzzers were having issues with ONE_TWENTY_EIGHT_OVER_PI, so the LIBC_TARGET_CPU_HAS_FMA ifdef statement got moved from the sin/cos/tan .cpp files to the range_reduction_double_common.cpp file.
Commit 1d8d5d6
[SandboxIR][NFC] Use Tracker.emplaceIfTracking()
This patch replaces some of the remaining uses of Tracker::track() with Tracker::emplaceIfTracking().
Commit 44f30c8
Commit 93a31cd
[ThinLTO]Clean up 'import-assume-unique-local' flag. (llvm#102424)
While manual compiles can specify full file paths and build automation tools use full, unique paths in practice, it's not clear whether it's a general good practice to enforce full paths (fail a build if relative paths are used). `NumDefs == 1` condition [1] should hold true for many internal-linkage vtables as long as full paths are indeed used to salvage the marginal performance when local-linkage vtables are imported due to indirect reference. llvm#100448 (comment) has more details. [1] https://github.com/llvm/llvm-project/pull/100448/files#diff-e7cb370fee46f0f773f2b5429dfab36b75126d3909ae98ee87ff3d0e3f75c6e9R215
Commit 51a3bc1
Commits on Aug 10, 2024
-
[SandboxIR][NFC] SingleLLVMInstructionImpl class (llvm#102687)
This patch introduces the SingleLLVMInstructionImpl class, which implements a couple of functions shared across all Instructions that map to a single LLVM Instruction. This avoids code duplication.
Commit 5351723
Commit 786c409
Commit 23c8128
AMDGPU/NewPM: Port SIAnnotateControlFlow to new pass manager (llvm#102653)
Does not yet add it to the pass pipeline. Somehow it causes 2 tests to assert in SelectionDAG, in functions without any control flow.
Commit 76f722f
Commit 77e68fb
Commit 3696a34
Commit 6c8d479
[msan] Use namespace qualifier. NFC
nsan will port msan_allocator.cpp and msan_thread.cpp. Clean up the two files first.
Commit e0ddd42
[llvm] Construct SmallVector with ArrayRef (NFC) (llvm#102712)
Without this patch, the constructor arguments come from SmallVectorImpl, not ArrayRef. This patch switches them to ArrayRef so that we can construct SmallVector with a single argument. Note that LLVM Programmer’s Manual prefers ArrayRef to SmallVectorImpl for flexibility.
Commit e9a47a6
Commit fcf6dc3
Commit 165f453
Commit 109f2f0
Commit 0c783be
Commit 7a6acd9
[llvm-objdump,test] Fix source-interleave.ll when /proc/self/cwd is unavailable
e.g. on Mach-O
Commit a52e486
Commit 9a227ba
[libc++] re-enable clang-tidy in the CI and fix any issues (llvm#102658)
It looks like we've accidentally disabled clang-tidy in the CI. This re-enables it and fixes the issues accumulated while it was disabled.
Commit 5c717d6
[clang][Interp] Improve "in call to" call argument printing (llvm#102735)
Commit 979abf1
Commit 86691f8
Commit 3b57f6b
[Polly] Use separate DT/LI/SE for outlined subfn. NFC. (llvm#102460)
DominatorTree, LoopInfo, and ScalarEvolution are function-level analyses that expect to be called only on instructions and basic blocks of the function they were originally created for. When Polly outlined a parallel loop body into a separate function, it reused the same analyses, which seemed to work until new checks were added in llvm#101198. This patch creates new analyses for the subfunctions. GenDT, GenLI, and GenSE now refer to the analyses of the current region of code. Outside of an outlined function, they refer to the same analysis as used for the SCoP, but are substituted within an outlined function. In addition to the cross-function queries of DT/LI/SE, we must not create SCEVs that refer to a mix of expressions for old and generated values. Currently, SCEVs themselves do not "remember" which ScalarEvolution analysis they were created for, but mixing them is just as unexpected as using DT/LI across function boundaries. Hence `SCEVLoopAddRecRewriter` was combined into `ScopExpander`. `SCEVLoopAddRecRewriter` only replaced induction variables but left SCEVUnknowns to reference the old function. `SCEVParameterRewriter` would have done so but its job was effectively superseded by `ScopExpander`, and now also `SCEVLoopAddRecRewriter`. Some issues persist but are marked with a FIXME in the code. Changing them would possibly cause this patch to be not NFC anymore.
Commit 22c77f2
Commit 59f7a80
[LLD][NFC] Don't use x64 import library for x86 target in safeseh-md tests. (llvm#102736)
Use llvm-lib to generate input library instead of a binary blob.
Commit 955be52
Commit 2849ebb
[mlir][vector] Use `DenseI64ArrayAttr` in vector.multi_reduction (llvm#102637)
This prevents some unnecessary conversions to/from int64_t and IntegerAttr.
Commit 5f26497
[clang][Interp] Only zero-init first union member (llvm#102744)
Zero-initializing all of them accidentally left the last member active. Only initialize the first one.
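For context, the language rule a constant evaluator has to model here is that initializing a union from an empty brace list initializes its first member only (when no member has a default member initializer), and that member becomes the active one. The sketch below shows this at the source level; `IntOrFloat` and `firstMember` are illustrative names and say nothing about how the interpreter implements it.

```cpp
#include <cassert>

union IntOrFloat {
  int i;   // first member: the one initialized by empty braces
  float f; // not active after IntOrFloat{}
};

// Empty-brace initialization makes i the active member, with value 0.
constexpr IntOrFloat kZero{};
static_assert(kZero.i == 0, "first member is active and zero");

// Reading kZero.f in a constant expression would be rejected, because f is
// not the active member. That is the kind of bookkeeping error that marking
// the wrong member active would cause.

constexpr int firstMember() { return IntOrFloat{}.i; }
```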
Commit ac47edd
Commit 1c26992
[clang][Interp] Ignore unnamed bitfields when zeroing records (llvm#102749)
Including unions, where this is more important.
Commit 8d908b8
[clang][Interp] Fix activating via indirect field initializers (llvm#102753)
Pointer::activate() propagates up anyway, so that is handled. But we need to call activate() in any case since the parent might not be a union, but the activate() is still needed. Always call it and hope that the InUnion flag takes care of the potential performance problems.
Commit 9d6cec5
[NFC] Fix TableGen include guards to match paths (llvm#102746)
- Fix include guards for headers under utils/TableGen to match their paths.
Commit 8a61bfc
[GISel] Handle more opcodes in constant_fold_binop (llvm#102640)
Update the list of opcodes handled by the constant_fold_binop combine to match the ones that are folded in CSEMIRBuilder::buildInstr.
Commit 9bb7c11
[Support] Assert that DomTree nodes share parent (llvm#101198)
A dominance query of a block that is in a different function is ill-defined, so assert that getNode() is only called for blocks that are in the same function. There are two cases where this behavior did occur. LoopFuse didn't explicitly do this, but didn't invalidate the SCEV block dispositions, leaving dangling pointers to freed basic blocks behind, causing use-after-free. We do, however, want to be able to dereference basic blocks inside the dominator tree, so that we can refer to them by a number stored inside the basic block.
Commit 8101d18
This patch fixes: clang/lib/Serialization/ASTReader.cpp:11484:13: error: unused variable '_' [-Werror,-Wunused-variable]
Commit ac83582
[Serialization] Use traditional for loops (NFC) (llvm#102761)
The use of _ requires either: - (void)_ and curly braces, or - [[maybe_unused]]. For simple repetitions like these, we can use traditional for loops for readable warning-free code.
Commit 4ce2f98
[clang][Interp] Handle union copy/move ctors (llvm#102762)
They don't have a body and we need to implement them ourselves. Use the Memcpy op to do that.
Commit 496b224
[sanitizer,test] Restore -fno-sized-deallocation coverage
-fsized-deallocation was recently made the default for C++17 onwards (llvm#90373). While here, remove unneeded -faligned-allocation.
Commit c27415f
Commit 80eea01
[Utils] Add new merge-release-pr.py script. (llvm#101630)
This script helps the release managers merge backport PRs. It does the following things:
* Validates the PR, checking approval, target branch and many other things.
* Rebases the PR
* Checks out the PR locally
* Pushes the PR to the release branch
* Deletes the local branch
I have found the script very helpful for merging the PRs.
Commit f3e950a
[DFAJumpThreading] Rewrite the way paths are enumerated (llvm#96127)
I tried to add a limit to the number of blocks visited in the paths() function, but even with a very high limit the transformation coverage was being reduced. After looking at the code, it seemed that the function was trying to create paths of the form `SwitchBB...DeterminatorBB...SwitchPredecessor`. This is inefficient because a lot of nodes in those paths (nodes before DeterminatorBB) would be irrelevant to the optimization. We only care about paths of the form `DeterminatorBB_Pred DeterminatorBB...SwitchBB`. This weeds out a lot of visited nodes. In this patch I have added a hard limit to the number of nodes visited and changed the algorithm for path calculation. Primarily I am traversing the use-def chain for the PHI nodes that define the state. If we have a hole in the use-def chain (no immediate predecessors) then I call the paths() function. I also had to change the select instruction unfolding code to insert redundant one-input PHIs to allow the use of the use-def chain in calculating the paths. The test suite coverage with this patch (including a limit on nodes visited) is as follows: Geomean diff: dfa-jump-threading.NumTransforms: +13.4% dfa-jump-threading.NumCloned: +34.1% dfa-jump-threading.NumPaths: -80.7% Compile time effect vs baseline (pass enabled by default) is mostly positive: https://llvm-compile-time-tracker.com/compare.php?from=ad8705fda25f64dcfeb6264ac4d6bac36bee91ab&to=5a3af6ce7e852f0736f706b4a8663efad5bce6ea&stat=instructions:u Change-Id: I0fba9e0f8aa079706f633089a8ccd4ecf57547ed
Commit b167ada
Commit fe31363
[Clang][CodeGen] Fix bad codegen when building Clang with latest MSVC (llvm#102681)
Before this PR, when using the latest MSVC `Microsoft (R) C/C++ Optimizing Compiler Version 19.40.33813 for x64`, one of the Clang unit tests used to fail: `CodeGenObjC/gnustep2-direct-method.m`, see full failure log: [here](llvm#100517 (comment)). This PR temporarily shuffles around the code to make the MSVC inliner/optimizer happy and avoid the bug. MSVC bug report: https://developercommunity.visualstudio.com/t/Bad-code-generation-when-building-LLVM-w/10719589?port=1025&fsid=e572244a-cde7-4d75-a73d-9b8cd94204dd
Commit 2ba1cc8
[clang-format] Add BreakBinaryOperations configuration (llvm#95013)
By default, clang-format packs binary operations, but it may be desirable to have compound operations be on individual lines instead of being packed. This PR adds the option `BreakBinaryOperations` to break up large compound binary operations to be on one line each. This applies to all logical and arithmetic/bitwise binary operations. Maybe partially addresses llvm#79487? Closes llvm#58014. Closes llvm#57280.
Commit c5a4291
[clang-format] Fix a serious bug in `git clang-format -f` (llvm#102629)
With the --force (or -f) option, git-clang-format wipes out input files excluded by a .clang-format-ignore file if they have unstaged changes. This patch adds a hidden clang-format option --list-ignored that lists such excluded files for git-clang-format to filter out. Fixes llvm#102459.
Commit 986bc3d
[llvm-exegesis][unittests] Also disable SubprocessMemoryTest on SPARC (llvm#102755)
Three `llvm-exegesis` tests
```
LLVM-Unit :: tools/llvm-exegesis/./LLVMExegesisTests/SubprocessMemoryTest/DefinitionFillsCompletely
LLVM-Unit :: tools/llvm-exegesis/./LLVMExegesisTests/SubprocessMemoryTest/MultipleDefinitions
LLVM-Unit :: tools/llvm-exegesis/./LLVMExegesisTests/SubprocessMemoryTest/OneDefinition
```
`FAIL` on Linux/sparc64 like
```
llvm/unittests/tools/llvm-exegesis/X86/SubprocessMemoryTest.cpp:68: Failure
Expected equality of these values:
  SharedMemoryMapping[I]
    Which is: '\0'
  ExpectedValue[I]
    Which is: '\xAA' (170)
```
It seems like this test only works on little-endian hosts: three sub-tests are already disabled on powerpc and s390x (both big-endian), and the fourth is additionally guarded against big-endian hosts (making the other guards unnecessary). However, since it has not been analyzed whether this is really an endianness issue, this patch disables the whole test on powerpc and s390x as before, adding sparc to the mix. Tested on `sparc64-unknown-linux-gnu` and `x86_64-pc-linux-gnu`.
Commit a417083
Commit b728f37
Commit 8c4e039
Commits on Aug 11, 2024
-
Revert "[Support] Assert that DomTree nodes share parent" (llvm#102780)
Commit 3c3df1b
Commit f498638
Commit fa12aa7
Commit 4ac42af
[profgen][NFC] Pass parameter as const_ref
Pass `ProbeNode` parameter of `trackInlineesOptimizedAway` as const reference. Reviewers: wlei-llvm, WenleiHe Reviewed By: WenleiHe Pull Request: llvm#102787
Commit 242f4e8
[MC][profgen][NFC] Expand auto for MCDecodedPseudoProbe
Expand autos in select places in preparation for llvm#102789. Reviewers: dcci, maksfb, WenleiHe, rafaelauler, ayermolo, wlei-llvm Reviewed By: WenleiHe, wlei-llvm Pull Request: llvm#102788
Commit cd15d12
Commit 073b057
[clang][Interp] Properly adjust instance pointer in virtual calls (llvm#102800)
`getDeclPtr()` will not just return what we want, but in this case a pointer to the `vu` local variable.
Commit 712ab80
Commit 2a00bf4
[Docs] Update meetup contact mail address (llvm#99321)
Arnaud is no longer active.
Commit 3036bcd
[NFC][libclang/python] Fix code highlighting in release notes (llvm#102807)
This corrects a release note introduced in llvm#98745
Commit a245a98
[VPlan] Move VPWidenLoadRecipe::execute to VPlanRecipes.cpp (NFC).
Move VPWidenLoadRecipe::execute to VPlanRecipes.cpp in line with other ::execute implementations that don't depend on anything defined in LoopVectorization.cpp
Commit 35d3625
AMDGPU: Try to add some more amdgpu-perf-hint tests (llvm#102644)
This test has hardly any test coverage, and no IR tests. Add a few more tests involving calls, and add some IR checks. This pass needs a lot of work to improve the test coverage, and to actually use the cost model instead of making up its own accounting scheme.
Configuration menu - View commit details
-
Copy full SHA for 2b0a88f - Browse repository at this point
Copy the full SHA 2b0a88fView commit details -
NewPM/AMDGPU: Port AMDGPUPerfHintAnalysis to new pass manager (llvm#1…
…02645) This was much more difficult than I anticipated. The pass is not in a good state, with poor test coverage. The legacy PM does seem to be relying on maintaining the map state between different SCCs, which seems bad. The pass is going out of its way to avoid putting the attributes it introduces onto non-callee functions. If it just added them, we could use them directly instead of relying on the map, I would think. The NewPM path uses a ModulePass; I'm not sure if we should be using CGSCC here but there seems to be some missing infrastructure to support backend defined ones.
Configuration menu - View commit details
-
Copy full SHA for dd094b2 - Browse repository at this point
Copy the full SHA dd094b2View commit details -
[CI][libclang] Add PR autolabeling for libclang (llvm#102809)
This automatically adds the `clang:as-a-library` label on PRs for the C and Python bindings and the libclang library --------- Co-authored-by: Vlad Serebrennikov <serebrennikov.vladislav@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for f070f61 - Browse repository at this point
Copy the full SHA f070f61View commit details -
[clang-tidy] Fix modernize-use-std-format lit test signature (llvm#10…
…2759) My fix for my original fix of issue llvm#92896 in 666d224 modified the function signature for fmt::sprintf to more accurately match the real implementation in libfmt but failed to do the same for absl::StrFormat. The latter fix applied equally well to absl::StrFormat so it's important that its test verifies that the bug is fixed too.
Configuration menu - View commit details
-
Copy full SHA for 4589bf9 - Browse repository at this point
Copy the full SHA 4589bf9View commit details -
[LV] Collect profitable VFs in ::getBestVF. (NFCI)
Move collecting profitable VFs to ::getBestVF, in preparation for retiring selectVectorizationFactor.
Configuration menu - View commit details
-
Copy full SHA for 7024cec - Browse repository at this point
Copy the full SHA 7024cecView commit details -
[LV] Adjust test for llvm#48188 to use AVX level closer to report.
Update AVX level for llvm#48188 to be closer to the one used in the reproducer.
Configuration menu - View commit details
-
Copy full SHA for 4399dbe - Browse repository at this point
Copy the full SHA 4399dbeView commit details -
[LV] Regenerate check lines in preparation for llvm#99808.
Regenerate check lines for test to avoid unrelated changes in llvm#99808.
Configuration menu - View commit details
-
Copy full SHA for 5286656 - Browse repository at this point
Copy the full SHA 5286656View commit details -
Configuration menu - View commit details
-
Copy full SHA for 94e6786 - Browse repository at this point
Copy the full SHA 94e6786View commit details -
[RFC][GlobalISel] InstructionSelect: Allow arbitrary instruction eras…
Configuration menu - View commit details
-
Copy full SHA for d2336fd - Browse repository at this point
Copy the full SHA d2336fdView commit details -
Configuration menu - View commit details
-
Copy full SHA for d1957dd - Browse repository at this point
Copy the full SHA d1957ddView commit details -
[GlobalISel] Combiner: Install Observer into MachineFunction
The Combiner doesn't install the Observer into the MachineFunction. This probably went unnoticed, because MachineFunction::getObserver() is currently only used in constrainOperandRegClass(), but this might cause issues down the line. Pull Request: llvm#102156
Configuration menu - View commit details
-
Copy full SHA for bf3aa88 - Browse repository at this point
Copy the full SHA bf3aa88View commit details -
Configuration menu - View commit details
-
Copy full SHA for fe59b84 - Browse repository at this point
Copy the full SHA fe59b84View commit details -
[GlobalISel] Don't remove from unfinalized GISelWorkList
Remove a hack from GISelWorkList caused by the Combiner removing instructions from an unfinalized GISelWorkList during the DCE phase. This is in preparation for larger changes to the WorkListMaintainer. Pull Request: llvm#102158
Configuration menu - View commit details
-
Copy full SHA for 65c7213 - Browse repository at this point
Copy the full SHA 65c7213View commit details -
Configuration menu - View commit details
-
Copy full SHA for 846dccc - Browse repository at this point
Copy the full SHA 846dcccView commit details -
[LegalizeTypes][RISCV] Use SExtOrZExtPromotedOperands to promote oper…
…ands for USUBSAT. (llvm#102781) It doesn't matter which extend we use to promote the operands. Use whatever is the most efficient. The custom handler for RISC-V was using SIGN_EXTEND when the Zbb extension is enabled so we no longer need that.
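A minimal sketch of why the choice of extension does not matter here, assuming an 8-bit USUBSAT promoted to 32 bits. The helper names (`usubsat32`, `viaZext`, `viaSext`, `promotionsAgree`) are hypothetical and not taken from LLVM. Both extensions preserve the unsigned ordering of the operands, and only the low 8 bits of the promoted result are treated as meaningful.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned saturating subtraction on the promoted (32-bit) type.
inline uint32_t usubsat32(uint32_t A, uint32_t B) { return A > B ? A - B : 0u; }

// Promote with zero extension, operate, then truncate back to 8 bits.
inline uint8_t viaZext(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>(usubsat32(A, B));
}

// Promote with sign extension, operate, then truncate back to 8 bits.
inline uint8_t viaSext(uint8_t A, uint8_t B) {
  uint32_t WA = static_cast<uint32_t>(static_cast<int32_t>(static_cast<int8_t>(A)));
  uint32_t WB = static_cast<uint32_t>(static_cast<int32_t>(static_cast<int8_t>(B)));
  return static_cast<uint8_t>(usubsat32(WA, WB));
}

// Exhaustively compare both promotions against the 8-bit reference result.
inline bool promotionsAgree() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B) {
      uint8_t Ref = A > B ? static_cast<uint8_t>(A - B) : 0;
      uint8_t NA = static_cast<uint8_t>(A), NB = static_cast<uint8_t>(B);
      if (viaZext(NA, NB) != Ref || viaSext(NA, NB) != Ref)
        return false;
    }
  return true;
}
```

The upper bits of the sign-extended result can differ from the zero-extended one, which is acceptable only because the sketch truncates before comparing.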
Configuration menu - View commit details
-
Copy full SHA for 257c479 - Browse repository at this point
Copy the full SHA 257c479View commit details -
[nsan] Add NsanThread and clear static TLS shadow
On thread creation, asan/hwasan/msan/tsan unpoison the thread stack and static TLS blocks in case the blocks reuse previously freed memory that is possibly poisoned. glibc nptl/allocatestack.c allocates thread stack using a hidden, non-interceptable function. nsan is similar: the shadow types for the thread stack and static TLS blocks should be set to unknown, otherwise if the static TLS blocks reuse previous shadow memory, and `*p += x` instead of `*p = x` is used for the first assignment, the mismatching user and shadow memory could lead to false positives. NsanThread is also needed by the next patch to use the sanitizer allocator. Pull Request: llvm#102718
Configuration menu - View commit details
-
Copy full SHA for 249db51 - Browse repository at this point
Copy the full SHA 249db51View commit details -
Bump CI container clang version to 18.1.8 (llvm#102803)
This patch bumps the CI container LLVM version to 18.1.8. This should've been bumped a while ago, but I just noticed that it was out of date. This also allows us to drop a patch that we manually had to add as it is by default included in v18.
Configuration menu - View commit details
-
Copy full SHA for 167c71a - Browse repository at this point
Copy the full SHA 167c71aView commit details -
[mlir][affine] Fix crash in mlir::affine::getForInductionVarOwner() (l…
…lvm#102625) This change fixes a crash when getOwner()->getParent() is a nullptr
Configuration menu - View commit details
-
Copy full SHA for d1bc41f - Browse repository at this point
Copy the full SHA d1bc41fView commit details -
[LV] Support generating masks for switch terminators. (llvm#99808)
Update createEdgeMask to create masks where the terminator in Src is a switch. We need to handle 2 separate cases: 1. Dst is not the default destination. Dst is reached if any of the cases with destination == Dst are taken. Join the conditions for each case where destination == Dst using a logical OR. 2. Dst is the default destination. Dst is reached if none of the cases with destination != Dst are taken. Join the conditions for each case where the destination is != Dst using a logical OR and negate it. Edge masks are created for every destination of cases and/or default when requesting a mask where the source is a switch. Fixes llvm#48188. PR: llvm#99808
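The two cases described above can be sketched on a toy model. This is not the VPlan implementation: `SwitchModel` and `edgeMask` are hypothetical names, destinations are plain integer block ids, and each vector lane carries its own switch condition value.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model of a switch terminator: (case value, destination) pairs
// plus a default destination.
struct SwitchModel {
  std::vector<std::pair<int, int>> Cases;
  int DefaultDest;
};

// Compute a per-lane mask for the edge Src -> Dst, where Cond holds the switch
// condition of each vector lane.
inline std::vector<bool> edgeMask(const SwitchModel &SW,
                                  const std::vector<int> &Cond, int Dst) {
  std::vector<bool> Mask(Cond.size(), false);
  const bool IsDefault = (Dst == SW.DefaultDest);
  for (std::size_t Lane = 0; Lane < Cond.size(); ++Lane) {
    bool Any = false;
    for (const auto &C : SW.Cases) {
      // Non-default Dst: OR the cases that target Dst.
      // Default Dst: OR the cases that target some other block.
      const bool Relevant = IsDefault ? (C.second != Dst) : (C.second == Dst);
      if (Relevant && Cond[Lane] == C.first)
        Any = true;
    }
    // The default edge is taken when none of the other cases matched.
    Mask[Lane] = IsDefault ? !Any : Any;
  }
  return Mask;
}
```

For a switch with cases 1 and 2 going to block 10, case 3 going to block 20, and default block 30, lanes holding 1, 2, 3, 4 take the edge to block 10 in the first two lanes only, and only the last lane takes the default edge.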
Configuration menu - View commit details
-
Copy full SHA for f0df4fb - Browse repository at this point
Copy the full SHA f0df4fbView commit details -
Make msan_allocator.cpp more conventional. NFC
nsan will port msan_allocator.cpp.
Configuration menu - View commit details
-
Copy full SHA for 2438f41 - Browse repository at this point
Copy the full SHA 2438f41View commit details -
[msan] Remove unneeded nullness CHECK
The pointer will immediately be dereferenced.
Configuration menu - View commit details
-
Copy full SHA for 1d0d1f2 - Browse repository at this point
Copy the full SHA 1d0d1f2View commit details -
Configuration menu - View commit details
-
Copy full SHA for 4134592 - Browse repository at this point
Copy the full SHA 4134592View commit details -
[LV] Handle SwitchInst in ::isPredicatedInst.
After f0df4fb, isPredicatedInst needs to handle SwitchInst as well. Handle it the same as BranchInst. This fixes a crash in the newly added test and improves the results for one of the existing tests in predicate-switch.ll. Should fix https://lab.llvm.org/buildbot/#/builders/113/builds/2099.
Configuration menu - View commit details
-
Copy full SHA for 60680f7 - Browse repository at this point
Copy the full SHA 60680f7View commit details -
[CMake] Followup to llvm#102396 and restore old DynamicLibrary symbol…
…s behavior (llvm#102671) Followup to llvm#102138 and llvm#102396, restore more old behavior to fix ppc64-aix bot.
Configuration menu - View commit details
-
Copy full SHA for 32973b0 - Browse repository at this point
Copy the full SHA 32973b0View commit details -
[NFC] Eliminate top-level "using namespace" from some headers. (llvm#…
…102751) - Eliminate top-level "using namespace" from some headers.
Configuration menu - View commit details
-
Copy full SHA for 1753008 - Browse repository at this point
Copy the full SHA 1753008View commit details -
libc: Remove `extern "C"` from main declarations (llvm#102825)
This is invalid in C++, and clang recently started warning on it as of llvm#101853
Configuration menu - View commit details
-
Copy full SHA for 1b71c47 - Browse repository at this point
Copy the full SHA 1b71c47View commit details -
Configuration menu - View commit details
-
Copy full SHA for b7c7dbd - Browse repository at this point
Copy the full SHA b7c7dbdView commit details -
[rtsan] Make sure rtsan gets initialized on mac (llvm#100188)
Intermittently on my mac I was getting the same nullptr crash in dlsym. We need to make sure rtsan gets initialized on mac between when the binary starts running, and the first intercepted function is called. Until that point we should use the DlsymAllocator.
Configuration menu - View commit details
-
Copy full SHA for 0a2a319 - Browse repository at this point
Copy the full SHA 0a2a319View commit details -
This fixes: ``` [6831/7617] Building CXX object tools\lldb\source\Target\CMakeFiles\lldbTarget.dir\ThreadPlanSingleThreadTimeout.cpp.obj C:\src\git\llvm-project\lldb\source\Target\ThreadPlanSingleThreadTimeout.cpp(66) : warning C4715: 'lldb_private::ThreadPlanSingleThreadTimeout::StateToString': not all control paths return a value ```
Configuration menu - View commit details
-
Copy full SHA for af09dd6 - Browse repository at this point
Copy the full SHA af09dd6View commit details -
[openmp][runtime] Silence warnings
This fixes several of those when building with MSVC on Windows: ``` [3625/7617] Building CXX object projects\openmp\runtime\src\CMakeFiles\omp.dir\kmp_affinity.cpp.obj C:\src\git\llvm-project\openmp\runtime\src\kmp_affinity.cpp(2637): warning C4062: enumerator 'KMP_HW_UNKNOWN' in switch of enum 'kmp_hw_t' is not handled C:\src\git\llvm-project\openmp\runtime\src\kmp.h(628): note: see declaration of 'kmp_hw_t' ```
Configuration menu - View commit details
-
Copy full SHA for 20baa9a - Browse repository at this point
Copy the full SHA 20baa9aView commit details -
[compiler-rt] Silence warnings
This fixes a few of these warnings, when building with Clang ToT on Windows: ``` [622/7618] Building CXX object projects\compiler-rt\lib\sanitizer_common\CMakeFiles\RTSanitizerCommonSymbolizer.x86_64.dir\sanitizer_symbolizer_win.cpp.obj C:\src\git\llvm-project\compiler-rt\lib\sanitizer_common\sanitizer_symbolizer_win.cpp(74,3): warning: cast from 'FARPROC' (aka 'long long (*)()') to 'decltype(::StackWalk64) *' (aka 'int (*)(unsigned long, void *, void *, _tagSTACKFRAME64 *, void *, int (*)(void *, unsigned long long, void *, unsigned long, unsigned long *), void *(*)(void *, unsigned long long), unsigned long long (*)(void *, unsigned long long), unsigned long long (*)(void *, void *, _tagADDRESS64 *))') converts to incompatible function type [-Wcast-function-type-mismatch] ``` This is similar to llvm#97905
Configuration menu - View commit details
-
Copy full SHA for 7202fe5 - Browse repository at this point
Copy the full SHA 7202fe5View commit details -
This fixes the following warning, when building with Clang ToT on Windows: ``` [6668/7618] Building CXX object tools\lldb\source\Plugins\Process\Windows\Common\CMakeFiles\lldbPluginProcessWindowsCommon.dir\TargetThreadWindows.cpp.obj C:\src\git\llvm-project\lldb\source\Plugins\Process\Windows\Common\TargetThreadWindows.cpp(182,22): warning: cast from 'FARPROC' (aka 'long long (*)()') to 'GetThreadDescriptionFunctionPtr' (aka 'long (*)(void *, wchar_t **)') converts to incompatible function type [-Wcast-function-type-mismatch] ``` This is similar to: llvm#97905
Configuration menu - View commit details
-
Copy full SHA for a819b0e - Browse repository at this point
Copy the full SHA a819b0eView commit details -
[lldb] Fix dangling expression
This fixes the following: ``` [6603/7618] Building CXX object tools\lldb\source\Plugins\ObjectFile\PECOFF\CMakeFiles\lldbPluginObjectFilePECOFF.dir\WindowsMiniDump.cpp.obj C:\src\git\llvm-project\lldb\source\Plugins\ObjectFile\PECOFF\WindowsMiniDump.cpp(29,25): warning: object backing the pointer will be destroyed at the end of the full-expression [-Wdangling-gsl] 29 | const auto &outfile = core_options.GetOutputFile().value(); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 warning generated. ```
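A minimal sketch of the pattern behind this warning and one way to avoid it, assuming the getter returns a `std::optional` by value (which is what the diagnostic suggests). `CoreOptions` and `outputFileOrEmpty` are hypothetical stand-ins, not the lldb types, and the actual change in the commit may differ.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for an options object whose getter returns the
// optional by value, so each call yields a temporary.
struct CoreOptions {
  std::optional<std::string> GetOutputFile() const { return Path; }
  std::optional<std::string> Path;
};

inline std::string outputFileOrEmpty(const CoreOptions &Opts) {
  // Binding `const auto &` to Opts.GetOutputFile().value() would refer into
  // the temporary optional, which is destroyed at the end of that
  // full-expression. Keeping the optional alive in a local avoids this.
  const std::optional<std::string> Outfile = Opts.GetOutputFile();
  return Outfile ? *Outfile : std::string();
}
```

Holding the optional in a named local costs one copy of the string but makes the lifetime explicit.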
Configuration menu - View commit details
-
Copy full SHA for e79e601 - Browse repository at this point
Copy the full SHA e79e601View commit details -
Configuration menu - View commit details
-
Copy full SHA for e2f9c18 - Browse repository at this point
Copy the full SHA e2f9c18View commit details
Commits on Aug 12, 2024
-
[mlir] Fix build after ec50f58 (llvm#101021)
This commit fixes what appears to be invalid C++ -- a lambda capturing a variable before it is declared. The code compiles with GCC and Clang but not MSVC.
Configuration menu - View commit details
-
Copy full SHA for 80ff391 - Browse repository at this point
Copy the full SHA 80ff391View commit details -
[LoopVectorize][X86][AMDLibm] Add Missing AMD LibM trig vector intrin…
…sics (llvm#101125) Adding the following linked to their docs: - [amd_vrs16_acosf](https://github.com/amd/aocl-libm-ose/blob/9c0b67293ba01e509a6308247d82a8f1adfbbc67/scripts/libalm.def#L221) - [amd_vrd2_cosh](https://github.com/amd/aocl-libm-ose/blob/9c0b67293ba01e509a6308247d82a8f1adfbbc67/scripts/libalm.def#L124) - [amd_vrs16_tanhf](https://github.com/amd/aocl-libm-ose/blob/9c0b67293ba01e509a6308247d82a8f1adfbbc67/scripts/libalm.def#L224)
Configuration menu - View commit details
-
Copy full SHA for efc6b50 - Browse repository at this point
Copy the full SHA efc6b50View commit details -
[NFC] [C++20] [Modules] Adjust the implementation of wasDeclEmitted t…
…o make it more clear The previous implementation of wasDeclEmitted could be confusing as to why we need to filter out declarations that are not from modules. Adjust the implementation to avoid the problem.
Configuration menu - View commit details
-
Copy full SHA for 4399f2a - Browse repository at this point
Copy the full SHA 4399f2aView commit details -
Revert "[CMake] Followup to llvm#102396 and restore old DynamicLibrar…
…y symbols behavior (llvm#102671)" This reverts commit 32973b0. This fix doesn't fix the build failure as expected and breaks a few other configurations too.
Configuration menu - View commit details
-
Copy full SHA for 435654b - Browse repository at this point
Copy the full SHA 435654bView commit details -
[Sanitizer] Make sanitizer passes idempotent (llvm#99439)
This PR changes the sanitizer passes to be idempotent. When any sanitizer pass is run after it has already been run before, double instrumentation is seen in the resulting IR. This happens because there is no check in the pass, to verify if IR has been instrumented before. This PR checks if "nosanitize_*" module flag is already present and if true, return early without running the pass again.
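The early-return check can be sketched on a toy module. `ToyModule` and `runToySanitizerPass` are hypothetical, and the concrete flag name is an assumption for illustration, since the commit message only gives the pattern "nosanitize_*".

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, heavily simplified module: a set of module flags and a
// counter standing in for inserted instrumentation.
struct ToyModule {
  std::set<std::string> Flags;
  int InstrumentationCount = 0;
};

// Returns true if the module was changed. Flag is the marker this pass
// leaves behind (an assumed name, not one taken from LLVM).
inline bool runToySanitizerPass(ToyModule &M, const std::string &Flag) {
  if (M.Flags.count(Flag))
    return false;            // Already instrumented: return early.
  ++M.InstrumentationCount;  // Stand-in for the real instrumentation.
  M.Flags.insert(Flag);      // Record that this pass has run.
  return true;
}
```

Running the toy pass twice on the same module changes it only once, which is the idempotence property the commit describes.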
Configuration menu - View commit details
-
Copy full SHA for 62ced81 - Browse repository at this point
Copy the full SHA 62ced81View commit details -
[mlir][IR] Auto-generate element type verification for VectorType (ll…
…vm#102449) llvm#102326 enables verification of type parameters that are type constraints. The element type verification for `VectorType` (and maybe other builtin types in the future) can now be auto-generated. Also remove redundant error checking in the vector type parser: element type and dimensions are already checked by the verifier (which is called from `getChecked`). Depends on llvm#102326.
Configuration menu - View commit details
-
Copy full SHA for 7d4aa1f - Browse repository at this point
Copy the full SHA 7d4aa1fView commit details -
[clang][Interp][NFC] Cleanup CheckActive()
Assert that the given pointer is in a union if it's not active and use a range-based for loop to find the active field.
Configuration menu - View commit details
-
Copy full SHA for c6062d3 - Browse repository at this point
Copy the full SHA c6062d3View commit details -
[mlir][linalg] fix linalg.batch_reduce_matmul auto cast (llvm#102585)
Fix the auto-cast of `linalg.batch_reduce_matmul` from `cast_to_T(A * cast_to_T(B)) + C` to `cast_to_T(A) * cast_to_T(B) + C`
Configuration menu - View commit details
-
Copy full SHA for 558d7ad - Browse repository at this point
Copy the full SHA 558d7adView commit details -
[clang][Interp][NFC] Move ctor compilation to compileConstructor
In preparation for having a similar function for destructors.
Configuration menu - View commit details
-
Copy full SHA for 27ed9b4 - Browse repository at this point
Copy the full SHA 27ed9b4View commit details -
Revert "[NFC] [C++20] [Modules] Adjust the implementation of wasDeclE…
…mitted to make it more clear" This reverts commit 4399f2a. This fails with Modules/aarch64-sme-keywords.cppm
Configuration menu - View commit details
-
Copy full SHA for cb372bd - Browse repository at this point
Copy the full SHA cb372bdView commit details -
Reapply "[AMDGPU] Always lower s/udiv64 by constant to MUL" (llvm#101942)
Reland llvm#100723, fixing the ARM issue at the cost of a small loss of optimization in `test/CodeGen/AMDGPU/fshr.ll`. Solves llvm#100383
Configuration menu - View commit details
-
Copy full SHA for 7389545 - Browse repository at this point
Copy the full SHA 7389545View commit details -
[clang] Avoid triggering vtable instantiation for C++23 constexpr dtor (llvm#102605)
In C++23 anything can be constexpr, including a dtor of a class whose members and bases don't have constexpr dtors. Avoid early triggering of vtable instantiation in this case. Fixes llvm#102293
Configuration menu - View commit details
-
Copy full SHA for d469794 - Browse repository at this point
Copy the full SHA d469794View commit details -
[CMake] Don't pass -DBUILD_EXAMPLES to the build (llvm#102838)
The only use in `opt.cpp` was removed in d291f1f.
Configuration menu - View commit details
-
Copy full SHA for f696489 - Browse repository at this point
Copy the full SHA f696489View commit details -
[DataLayout] Move `operator=` to cpp file (NFC) (llvm#102849)
`DataLayout` isn't exactly cheap to copy (448 bytes on a 64-bit host). Move `operator=` to cpp file to improve compilation time. Also move `operator==` closer to `operator=` and add a couple of FIXMEs.
Configuration menu - View commit details
-
Copy full SHA for 875b652 - Browse repository at this point
Copy the full SHA 875b652View commit details -
[GlobalISel] Fix implementation of CheckNumOperandsLE/GE
The condition was backwards - it was rejecting when the condition was met. Fixes llvm#102719
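A small sketch of the intended polarity, using hypothetical helpers rather than the actual GlobalISel match-table code: a check must reject a candidate only when its condition is not met.

```cpp
#include <cassert>

// Hypothetical helpers: return true when the match should be REJECTED.
// An "LE" check rejects only when the operand count exceeds the bound.
inline bool rejectNumOperandsLE(unsigned NumOperands, unsigned Expected) {
  return !(NumOperands <= Expected);
}

// A "GE" check rejects only when the operand count is below the bound.
inline bool rejectNumOperandsGE(unsigned NumOperands, unsigned Expected) {
  return !(NumOperands >= Expected);
}
```

The bug described above corresponds to dropping the negation, which would reject exactly the instructions that satisfy the check.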
Configuration menu - View commit details
-
Copy full SHA for 50f4168 - Browse repository at this point
Copy the full SHA 50f4168View commit details -
[VPlan] Mark VPVectorPointer as only using the first part of the ptr.
VPVectorPointerRecipe only uses the first part of the pointer operand, so mark it accordingly. Follow-up suggested as part of llvm#99808.
Configuration menu - View commit details
-
Copy full SHA for 5a42a67 - Browse repository at this point
Copy the full SHA 5a42a67View commit details -
[mlir][Transforms] Add missing check in tosa::transpose::verify() (ll…
…vm#102099) The tosa::transpose::verify() should make sure that the permutation numbers are within the size of the input array. Otherwise it will cause a cryptic array out of bound assertion later. Fix llvm#99513.
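The kind of range check described above might look like the following sketch. `permsInRange` is a hypothetical helper, not the MLIR verifier itself, and it only tests that each index addresses a valid input dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true when every permutation index lies in [0, Rank), i.e. it
// addresses a valid dimension of the input.
inline bool permsInRange(const std::vector<int32_t> &Perms, std::size_t Rank) {
  for (int32_t P : Perms)
    if (P < 0 || static_cast<std::size_t>(P) >= Rank)
      return false;
  return true;
}
```

Rejecting out-of-range indices up front lets the verifier report a clear error instead of failing later on an out-of-bounds access.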
Configuration menu - View commit details
-
Copy full SHA for c8b5d30 - Browse repository at this point
Copy the full SHA c8b5d30View commit details -
[AMDGPU] add missing checks in processBaseWithConstOffset (llvm#102310)
fixes llvm#102231 by inserting missing checks.
Configuration menu - View commit details
-
Copy full SHA for 273e0a4 - Browse repository at this point
Copy the full SHA 273e0a4View commit details -
[InstCombine] Don't change fn signature for calls to declarations (ll…
…vm#102596) transformConstExprCastCall() implements a number of highly dubious transforms attempting to make a call function type line up with the function type of the called function. Historically, the main value this had was to avoid function type mismatches due to pointer type differences, which is no longer relevant with opaque pointers. This patch is a step towards reducing the scope of the transform, by applying it only to definitions, not declarations. For declarations, the declared signature might not match the actual function signature, e.g. `void @fn()` is sometimes used as a placeholder for functions with unknown signature. The implementation already bailed out in some cases for declarations, but I think it would be safer to disable the transform entirely. For the test cases, I've updated some of them to use definitions instead, so that the test coverage is preserved.
Configuration menu - View commit details
-
Copy full SHA for cc14ecc - Browse repository at this point
Copy the full SHA cc14eccView commit details -
[llvm][llvm-readobj] Add NT_ARM_FPMR corefile note type (llvm#102594)
This contains the fpmr register which was added in Armv9.5-a. This register mainly contains controls for fp8 formats. It was added to the Linux Kernel in torvalds/linux@4035c22.
Configuration menu - View commit details
-
Copy full SHA for a07c6d9 - Browse repository at this point
Copy the full SHA a07c6d9View commit details -
[analyzer][NFC] Trivial refactoring of region invalidation (llvm#102456)
This commit removes `invalidateRegionsImpl()`, moving its body to `invalidateRegions(ValueList Values, ...)`, because it was a completely useless layer of indirection. Moreover I'm fixing some strange indentation within this function body and renaming two variables to the proper `UpperCamelCase` format.
Configuration menu - View commit details
-
Copy full SHA for b680862 - Browse repository at this point
Copy the full SHA b680862View commit details -
[VPlan] Replace hard-coded value number in test with pattern.
Make test more robust w.r.t. future changes.
Configuration menu - View commit details
-
Copy full SHA for 55d7e59 - Browse repository at this point
Copy the full SHA 55d7e59View commit details -
Configuration menu - View commit details
-
Copy full SHA for d12250c - Browse repository at this point
Copy the full SHA d12250cView commit details -
[dwarf2yaml] Correctly emit type and split unit headers (llvm#102471)
(DWARFv5) split units have an extra `dwo_id` field in the header. Type units have `type_signature` and `type_offset`.
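To make the extra fields concrete, here is a sketch that computes the DWARF v5 unit header size for the 32-bit DWARF format. `UnitKind` and `dwarf5HeaderSize32` are hypothetical names, not part of dwarf2yaml; the field sizes follow the DWARF v5 unit header layout.

```cpp
#include <cassert>

// Hypothetical grouping of unit kinds by which extra header fields they carry.
enum class UnitKind { Compile, SplitOrSkeleton, Type };

// Header size in bytes for a DWARF v5 unit in the 32-bit DWARF format.
inline unsigned dwarf5HeaderSize32(UnitKind Kind) {
  // Common prefix: unit_length (4), version (2), unit_type (1),
  // address_size (1), debug_abbrev_offset (4).
  unsigned Size = 4 + 2 + 1 + 1 + 4;
  switch (Kind) {
  case UnitKind::Compile:
    break;
  case UnitKind::SplitOrSkeleton:
    Size += 8; // dwo_id
    break;
  case UnitKind::Type:
    Size += 8; // type_signature
    Size += 4; // type_offset
    break;
  }
  return Size;
}
```

Under these assumptions a plain compile unit header is 12 bytes, a split or skeleton unit header is 20 bytes, and a type unit header is 24 bytes.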
Configuration menu - View commit details
-
Copy full SHA for 8a1846d - Browse repository at this point
Copy the full SHA 8a1846dView commit details -
[LV] Only OR unique edges when creating block-in masks.
This removes redundant ORs of matching masks. Follow-up to f0df4fb to reduce the number of redundant ORs for masks.
Configuration menu - View commit details
-
Copy full SHA for db0603c - Browse repository at this point
Copy the full SHA db0603cView commit details -
Configuration menu - View commit details
-
Copy full SHA for 11ba72e - Browse repository at this point
Copy the full SHA 11ba72eView commit details -
[clang][analyzer] Remove array bounds check from PointerSubChecker (l…
…lvm#102580) In pointer subtraction, only pointers that point into an array (or one past the end) are allowed; this fact was checked by the checker. This check is now removed because it is a special case of an array indexing error that is handled by different checkers (like ArrayBoundsV2).
Configuration menu - View commit details
-
Copy full SHA for e607360 - Browse repository at this point
Copy the full SHA e607360View commit details -
[lldb] Tolerate multiple compile units with the same DWO ID (llvm#100577)
I ran into this when LTO completely emptied two compile units, so they ended up with the same hash (see llvm#100375). Although, ideally, the compiler would try to ensure we don't end up with a hash collision even in this case, guaranteeing their absence is practically impossible. This patch ensures this situation does not bring down lldb.
Configuration menu - View commit details
-
Copy full SHA for 32a62eb - Browse repository at this point
Copy the full SHA 32a62ebView commit details -
[Flang][OpenMP] NFC: Use ConstructQueue::const_iterator (llvm#102612)
This patch replaces `ConstructQueue::iterator` arguments with `ConstructQueue::const_iterator` where it's used as a pointer to an element inside of a `const ConstructQueue &` passed along with it. Since these functions don't intend to modify the list or any elements in it, keeping constness consistent between both makes it simpler to work with.
Configuration menu - View commit details
-
Copy full SHA for ebf530c - Browse repository at this point
Copy the full SHA ebf530cView commit details -
[analyzer][NFC] Improve documentation of `invalidateRegion` methods (l…
…lvm#102477) ... within the classes `StoreManager` and `ProgramState` and describe the connection between the two methods.
Configuration menu - View commit details
-
Copy full SHA for 908c89e - Browse repository at this point
Copy the full SHA 908c89eView commit details -
[AArch64] Implement promotion type legalisation for histogram intrins…
…ic (llvm#101017) Currently the histogram intrinsic (llvm.experimental.vector.histogram.add) only allows i32 and i64 types for the memory locations to be updated, matching the restrictions of the histcnt instruction. This patch adds support for the legalisation of smaller types (i8 and i16) via promotion.
Configuration menu - View commit details
-
Copy full SHA for 670d208 - Browse repository at this point
Copy the full SHA 670d208View commit details -
Configuration menu - View commit details
-
Copy full SHA for a0241e7 - Browse repository at this point
Copy the full SHA a0241e7View commit details -
[Serialization] Add a callback to register new created predefined dec…
…ls for DeserializationListener (llvm#102855) Close llvm#102684 The root cause of the issue is, it is possible that the predefined decl is not registered at the beginning of writing a module file but got created during the process of writing from reading. This is incorrect. The predefined decls should always be predefined decls. Another deep thought about the issue is, we shouldn't read any new things after we start to write the module file. But this is another deeper question.
Configuration menu - View commit details
-
Copy full SHA for 4915fdd - Browse repository at this point
Copy the full SHA 4915fddView commit details -
[X86] SimplifyDemandedVectorEltsForTargetNode - reduce width of X86IS…
…D::BLENDV nodes when upper elements are not demanded. Prep work for llvm#83402
Configuration menu - View commit details
-
Copy full SHA for 8949290 - Browse repository at this point
Copy the full SHA 8949290View commit details -
IR/AMDGPU: Autoupgrade amdgpu-unsafe-fp-atomics attribute (llvm#101698)
Delete the attribute and annotate any atomicrmw instructions in the function with new metadata.
Configuration menu - View commit details
-
Copy full SHA for 70feafd - Browse repository at this point
Copy the full SHA 70feafdView commit details -
[MLIR][DLTI][Transform] Introduce transform.dlti.query - 2nd attempt (l…
…lvm#102652) This transform op makes it possible to query attributes associated to IR by means of the DLTI dialect. The op takes both a `key` and a target `op` to perform the query at. Facility functions automatically find the closest ancestor op which defines the appropriate DLTI interface or has an attribute implementing a DLTI interface. By default the lookup uses the data layout interfaces of DLTI. If the optional `device` parameter is provided, the lookup happens with respect to the interfaces for TargetSystemSpec and TargetDeviceSpec. This op uses new free-standing functions in the `dlti` namespace to not only look up specifications via the `DataLayoutSpecOpInterface` and on `ModuleOp`s but also on any ancestor op that has an appropriate DLTI attribute.
Configuration menu - View commit details
-
Copy full SHA for 2ad3bcd - Browse repository at this point
Copy the full SHA 2ad3bcdView commit details -
AMDGPU: Use GCNTargetMachine in AMDGPUCodeGenPassBuilder (llvm#102805)
R600 has a separate CodeGenPassBuilder anyway.
Configuration menu - View commit details
-
Copy full SHA for 1c764b9 - Browse repository at this point
Copy the full SHA 1c764b9View commit details -
[lldb][test][AArch64] Regex match field values in register test
As these are flags they can be set or not depending on what the system libraries did prior to loading the program.
Configuration menu - View commit details
-
Copy full SHA for afe019c - Browse repository at this point
Copy the full SHA afe019cView commit details -
Configuration menu - View commit details
-
Copy full SHA for 05b75e0 - Browse repository at this point
Copy the full SHA 05b75e0View commit details -
StructurizeCFG: Add SkipUniformRegions pass parameter to new PM versi…
…on (llvm#102812) Keep respecting the old cl::opt for now.
Configuration menu - View commit details
-
Copy full SHA for f86da4c - Browse repository at this point
Copy the full SHA f86da4cView commit details -
[X86] Fold extract_subvector(fp_to_uint(x)) case to match existing fp…
…_to_sint fold (necessary to fix llvm#83402 on AVX512 targets). Prep work for llvm#83402
Configuration menu - View commit details
-
Copy full SHA for 0ea9cdb - Browse repository at this point
Copy the full SHA 0ea9cdbView commit details -
[mlir][mesh] Shardingcontrol (llvm#102598)
This is a fixed copy of llvm#98145 (necessary after it got reverted).

@sogartar @yaochengji This PR adds the following to llvm#98145:
- `UpdateHaloOp` accepts a `memref` (instead of a tensor) and not returning a result to clarify its inplace-semantics
- `UpdateHaloOp` accepts `split_axis` to allow multiple mesh-axes per tensor/memref-axis (similar to `mesh.sharding`)
- The implementation of `Shardinginterface` for tensor operation (`tensor.empty` for now) moved from the tensor library to the mesh interface library. `spmdize` uses features from `mesh` dialect. @rengolin agreed that `tensor` should not depend on `mesh` so this functionality cannot live in a `tensor`s lib. The unfulfilled dependency caused the issues leading to reverting llvm#98145. Such cases are generally possible and might lead to re-considering the current structure (like for tosa ops).
- rebased onto latest main

--------------------------

Replacing `#mesh.sharding` attribute with operation `mesh.sharding`
- extended semantics now allow providing optional `halo_sizes` and `sharded_dims_sizes`
- internally a sharding is represented as a non-IR class `mesh::MeshSharding`

What previously was
```mlir
%sharded0 = mesh.shard %arg0 <@Mesh0, [[0]]> : tensor<4x8xf32>
%sharded1 = mesh.shard %arg1 <@Mesh0, [[0]]> annotate_for_users : tensor<16x8xf32>
```
is now
```mlir
%sharding = mesh.sharding @Mesh0, [[0]] : !mesh.sharding
%0 = mesh.shard %arg0 to %sharding : tensor<4x8xf32>
%1 = mesh.shard %arg1 to %sharding annotate_for_users : tensor<16x8xf32>
```
and allows additional annotations to control the shard sizes:
```mlir
mesh.mesh @Mesh0 (shape = 4)
%sharding0 = mesh.sharding @Mesh0, [[0]] halo_sizes = [1, 2] : !mesh.sharding
%0 = mesh.shard %arg0 to %sharding0 : tensor<4x8xf32>
%sharding1 = mesh.sharding @Mesh0, [[0]] sharded_dims_sizes = [3, 5, 5, 3] : !mesh.sharding
%1 = mesh.shard %arg1 to %sharding1 annotate_for_users : tensor<16x8xf32>
```
- `mesh.shard` op accepts additional optional attribute `force`, useful for halo updates
- Some initial spmdization support for the new semantics
- Support for `tensor.empty` reacting on `sharded_dims_sizes` and `halo_sizes` in the sharding
- New collective operation `mesh.update_halo` as a spmdized target for shardings with `halo_sizes`

---------

Co-authored-by: frank.schlimbach <fschlimb@smtp.igk.intel.com>
Co-authored-by: Jie Fu <jiefu@tencent.com>
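As a rough illustration of how `sharded_dims_sizes` and `halo_sizes` could determine the size of a local shard, here is a minimal sketch. It is not code from the mesh dialect: `localExtent` and its parameters are hypothetical, and it assumes that a dimension without explicit sizes is split evenly and that halo cells are stored in addition to the core shard.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper, not part of the mesh dialect: extent of the local shard
// along one sharded tensor dimension for the device at `deviceIdx`.
std::int64_t localExtent(std::int64_t dimSize, std::int64_t numDevices,
                         std::int64_t deviceIdx,
                         const std::vector<std::int64_t> &shardedDimsSizes,
                         std::int64_t haloLow, std::int64_t haloHigh) {
  assert(deviceIdx >= 0 && deviceIdx < numDevices);
  std::int64_t core;
  if (shardedDimsSizes.empty()) {
    // Assumed default: the dimension divides evenly across the mesh axis.
    assert(dimSize % numDevices == 0);
    core = dimSize / numDevices;
  } else {
    // Explicit per-device sizes must cover the whole dimension.
    assert(static_cast<std::int64_t>(shardedDimsSizes.size()) == numDevices);
    assert(std::accumulate(shardedDimsSizes.begin(), shardedDimsSizes.end(),
                           std::int64_t{0}) == dimSize);
    core = shardedDimsSizes[deviceIdx];
  }
  // Assumed: halo cells are stored in addition to the core shard.
  return haloLow + core + haloHigh;
}
```

Under these assumptions, the `tensor<4x8xf32>` example with `halo_sizes = [1, 2]` on a mesh of 4 devices would give `localExtent(4, 4, 0, {}, 1, 2)`, and the `tensor<16x8xf32>` example with `sharded_dims_sizes = [3, 5, 5, 3]` would give `localExtent(16, 4, i, {3, 5, 5, 3}, 0, 0)` for device `i`.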
Commit baabcb2
Clean up after transition into opaque pointers. NFC (llvm#102631)
LegacyPointerTypes is not used any longer and can be removed from the LLVM context. Also remove a copy-pasted code comment in TypedPointerType that doesn't make sense (since there is no special case for address space zero in the TypedPointerType::get implementation).
Commit 6ca6780
[verifier] Get rid of getResolverFunctionType. NFC (llvm#102631)
With opaque pointers we can just get the pointer type for the resolver function by using PointerType::get, making the GlobalIFunc::getResolverFunctionType function obsolete.
Commit 1ff06c5
Commit 145aff6
TargetMachine: Move trivial setter/getter to header
The others are already inline here.
Commit 7fe486a
[AMDGPU][NFCI] Mark AGPRs and VGPRs with different flags in HWEncodin…
…g. (llvm#102650) Simplifies checks for AGPRs and VGPRs and makes them more explicit and less fragile.
Commit c7107ca
[AMDGPU][AsmParser] Eliminate validateExeczVcczOperands(). (llvm#102600)
Mention the names of unavailable registers in error messages so that the diagnostics for execz/vccz are no less informative than before. Clean up unnecessary name qualifications while there. Part of <llvm#62629>.
Commit 7727853
[lldb/DWARF] Search fallback to the manual index in GetFullyQualified… (
llvm#102123) …Type This is needed to ensure we find a type if its definition is in a CU that wasn't indexed. This can happen if the definition is in some precompiled code (e.g. the c++ standard library) which was built with different flags than the rest of the binary.
Commit 21ef272
[lldb][test] Disable procfile by thread ID test when LLVM_ENABLE_THRE…
…ADS is not defined When LLVM_ENABLE_THREADS is not defined, llvm::get_threadid returns 0 which makes this test case fail. This is a pretty niche setting, Linaro uses it to stop lld crashing our 32 bit containers. So the test will get plenty of runs elsewhere. In lldb's code it's not getting the current thread ID anyway, it's using a value it got from ptrace. So even if that copy of lldb was built with LLVM_ENABLE_THREADS off, it should still be able to debug threads.
Commit f2991bd
[Clang][OpenMP] Fix the wrong transform of `num_teams` clause introduced in llvm#99732 (llvm#102716)
Commit aa86e5b
[PS4/PS5][Driver] Allow -static in PlayStation drivers (llvm#102020)
On PlayStation, allow users to supply -static to the linker, via the driver. An initial step. Later changes will have the PS5 driver supply additional options to the linker, if and when -static is passed. SIE tracker: TOOLCHAIN-16704
Commit 895ca18
Commit c876761
[lldb][test] Break early when walking backtrace in concurrent tests
We only need to see that 1 frame of the stack is in user code. No need to carry on looking. Doing so actually caused a test failure on Armv8 Ubuntu Jammy where a libc function does not have a display name. I'm sure I'm going to get stung by this elsewhere, but for this test, breaking early sidesteps the problem.
Commit 513c372
[SCEV] Fix incorrect extension in computeConstantDifference()
The Mul factor was zero-extended here, resulting in incorrect results for integers larger than 64-bit. As we currently only multiply by 1 or -1, just split this into two cases -- there's no need for a full multiplication here. Fixes llvm#102597.
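To illustrate why the kind of extension matters, here is a minimal sketch with scaled-down widths: an 8-bit factor stands in for the 64-bit `Mul` value, and 16-bit wrap-around arithmetic stands in for the wider integer type. All helper names are hypothetical and are not taken from the SCEV code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: an 8-bit factor stands in for the 64-bit Mul value,
// and 16-bit wrap-around arithmetic stands in for a wider integer type.

// Widen the factor by zero-extension: the bit pattern of -1 (0xFF) becomes 255.
std::uint16_t widenZext(std::int8_t factor) {
  return static_cast<std::uint16_t>(static_cast<std::uint8_t>(factor));
}

// Widen the factor by sign-extension: -1 stays -1 (0xFFFF).
std::uint16_t widenSext(std::int8_t factor) {
  return static_cast<std::uint16_t>(static_cast<std::int16_t>(factor));
}

// Multiply modulo 2^16, as fixed-width integer arithmetic would.
std::uint16_t mulWrap(std::uint16_t a, std::uint16_t b) {
  return static_cast<std::uint16_t>(static_cast<std::uint32_t>(a) * b);
}

// Since the factor is only ever 1 or -1, the multiplication can be replaced
// by two cases: keep the value, or negate it.
std::uint16_t applyUnitFactor(std::uint16_t value, int factor) {
  assert(factor == 1 || factor == -1);
  return factor == 1 ? value : static_cast<std::uint16_t>(0u - value);
}
```

In this model, multiplying by the zero-extended factor scales the value by 255 instead of negating it, while the sign-extended factor and the two-case helper both produce the wrapped negation.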
Commit 3512bcc
[AArch64] Add FEAT_SME_B16B16 and remove FEAT_B16B16 (llvm#102501)
Implement FEAT_SME_B16B16 to enable ZA-targeting non-widening SME BFloat16 instructions. Remove the now redundant FEAT_B16B16 which has been replaced by FEAT_SVE_B16B16 and FEAT_SME_B16B16 (this commit), see llvm#101480 for the details and reasoning of this change to LLVM. FEAT_SME_B16B16 is documented under the latest Armv9.4 feature documentation: https://developer.arm.com/documentation/109697/0100/Feature-descriptions/The-Armv9-4-architecture-extensio
- Changes to Clang AArch64 frontend
  - Change target guard of SME2 ZA-targeting non-widening BFloat16 intrinsics to 'sme-b16b16'
- Changes to LLVM AArch64 backend
  - llvm/lib/Target/AArch64/AArch64Features.td
    - Create FeatureSMEB16B16, which implies FeatureSME2 and FeatureSVEB16B16
    - Remove FeatureB16B16
    - Fix description of FeatureSVEB16B16
  - llvm/lib/Target/AArch64/AArch64InstrInfo.td
    - Create HasSMEB16B16 predicate
  - llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
    - Change predication of SME2 ZA-targeting non-widening BFloat16 instructions to new HasSMEB16B16
  - llvm/lib/Target/AArch64/AArch64.td
    - Add HasSMEB16B16 to SME2Unsupported (FEAT_SME_B16B16 implies FEAT_SME2)
  - llvm/lib/AArch64/AsmParser/AArch64AsmParser.cpp
    - Remove flag 'b16b16' mapping to removed FeatureB16B16
    - Add flag 'sme-b16b16' mapping to new FeatureSMEB16B16
- Changes to LLVM unit tests
  - llvm/unittests/TargetParser/TargetParserTest.cpp
    - Add new sme-b16b16 flag to existing target parser tests
    - Add tests for the sme-b16b16 dependencies:
      - 'sme-b16b16' should enable 'sme2', 'sve-b16b16'.
    - Remove 'b16b16' from bf16 dependency test
- Added MC tests
  - llvm/test/MC/AArch64/SME2p1
    - To ensure that ZA-targeting multi-vector non-widening BFloat16 instructions are enabled by +sme-b16b16, and that this feature is removed by +nosme-b16b16.
- Modified tests
  - All CodeGen, Semantic, and MC tests that are affected by the removal of 'b16b16' have been modified to supply and/or expect 'sme-b16b16' where appropriate.
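The dependency stated above ('sme-b16b16' should enable 'sme2' and 'sve-b16b16') can be pictured as a transitive closure over an implication table. The sketch below is only an illustration: `kImplies` and `enabledFeatures` are hypothetical names, the table contains just the relationship named in the commit message, and the real definitions live in TableGen rather than C++.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical implication table limited to the relationship named in the
// commit message; the real feature definitions are written in TableGen.
const std::map<std::string, std::vector<std::string>> kImplies = {
    {"sme-b16b16", {"sme2", "sve-b16b16"}},
};

// Collect the requested feature plus everything it transitively implies.
std::set<std::string> enabledFeatures(const std::string &feature) {
  std::set<std::string> enabled;
  std::vector<std::string> worklist{feature};
  while (!worklist.empty()) {
    std::string current = worklist.back();
    worklist.pop_back();
    if (!enabled.insert(current).second)
      continue;  // already visited
    auto it = kImplies.find(current);
    if (it != kImplies.end())
      worklist.insert(worklist.end(), it->second.begin(), it->second.end());
  }
  return enabled;
}
```

A worklist is used so that longer implication chains would also be followed if more entries were added to the table.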
Commit 1b936e4
[LV] Include chains feeding inductions in cost precomputation.
Include chain of ops feeding inductions in cost precomputation for inductions, not just the induction increment. In VPlan, those instructions will be cleaned up, as both phi and increment are generated by VPWidenIntOrFpInductionRecipe independently. Fixes llvm#101337.
Commit cd08fad
[SPIR-V] Emit valid Lifestart/Lifestop instructions (llvm#98475)
This PR ensures that valid OpLifetimeStart/OpLifetimeStop instructions are emitted. According to https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpLifetimeStart: "Size must be 0 if Pointer is a pointer to a non-void type or the Addresses [capability](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Capability) is not declared.". The `Size` argument is set from the corresponding intrinsic's argument, so when Size is not zero we must ensure that Pointer has the required type, inserting a bitcast if needed.
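The quoted rule can be read as a small predicate. The following is a minimal sketch of that reading only; `LifetimeOperands`, `isValidLifetimeSize`, and `needsBitcast` are hypothetical names and do not come from the SPIR-V backend.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of the operands of a lifetime instruction.
struct LifetimeOperands {
  bool pointeeIsVoid;  // true if Pointer points to a void type
  std::uint32_t size;  // the Size operand
};

// Encodes the quoted rule: Size must be 0 if Pointer points to a non-void
// type or the Addresses capability is not declared.
bool isValidLifetimeSize(const LifetimeOperands &ops,
                         bool hasAddressesCapability) {
  if (ops.size == 0)
    return true;
  return ops.pointeeIsVoid && hasAddressesCapability;
}

// A nonzero Size combined with a non-void pointee is the case that a cast of
// the pointer could repair; a missing capability cannot be fixed this way.
bool needsBitcast(const LifetimeOperands &ops) {
  return ops.size != 0 && !ops.pointeeIsVoid;
}
```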
Commit 281f59f
[SPIR-V] Rework usage of virtual registers' types and classes (llvm#1…
…01732) This PR changes virtual register processing to improve the correctness of the MIR emitted between passes, as judged by MachineVerifier. This potentially helps to detect previously missed flaws in code emission and to harden the test suite. As a measure of the correctness and usefulness of this PR, we may use a mode with expensive checks enabled, in which MachineVerifier reports problems in the test suite. Satisfying MachineVerifier's requirements for MIR correctness requires not only a rework of how virtual registers' types and classes are used, but also corrections to the pre-legalizer and instruction selection logic. Namely, the following changes are introduced:
* scalar virtual registers have proper bit width,
* detect register class by SPIR-V type,
* add a superclass for id virtual register classes,
* fix Tablegen rules used for instruction selection,
* fixes for minor existing issues (a missing flag for proper representation of a null constant for OpenCL vs. HLSL, wrong usage of integer virtual registers as a synonym of any non-type virtual register).
Commit f9c9806
Commit 34514ce
Commit c8a4568
Issue bloomberg#26: Fix splices in requires clause (bloomberg#86)
* rename CXXIndeterminateSpliceExpr in the readme too
  Signed-off-by: delimbetov <1starfall1@gmail.com>
* make TryAnnotateOptionalCXXScopeToken work
  Signed-off-by: delimbetov <1starfall1@gmail.com>
* make splice work in requires clause
  Signed-off-by: delimbetov <1starfall1@gmail.com>
* add tests for splice in requires expr
  Signed-off-by: delimbetov <1starfall1@gmail.com>
* add typename and newline at the end of the file
  Signed-off-by: delimbetov <1starfall1@gmail.com>
* add comments
  Signed-off-by: delimbetov <1starfall1@gmail.com>

---------

Signed-off-by: delimbetov <1starfall1@gmail.com>
Commit 058106a
Commit 78a4192
Commits on Aug 15, 2024
-
Initial support for importing reflections between modules.
Some work remains: In particular, if this is going to "work" (i.e., supported by P2996), we need to think carefully about reachability, TU-local entities, etc. There probably need to be some constraints around use of imported reflections, and possibly some 'is_reachable' metafunction. Not entirely sure - need to experiment further. Closes issue bloomberg#4.
Commit dcc8c34
Add 'is_access_specified' metafunction.
TBD whether to keep this, but adding it so it can be played around with.
Commit 1973df5
Commits on Aug 19, 2024
-
Commit ecd638b
Commit 3d897ec
Issue bloomberg#88: Add has_{thread,automatic}_storage_duration funct…
…ions (bloomberg#89)
* basic impl
  Signed-off-by: delimbetov <1starfall1@gmail.com>
* add test for the new storage duration funcs
  Signed-off-by: delimbetov <1starfall1@gmail.com>
* code style
  Signed-off-by: delimbetov <1starfall1@gmail.com>
* run libcxx generators to pass CI
  Signed-off-by: delimbetov <1starfall1@gmail.com>
* fix indentation
  Signed-off-by: delimbetov <1starfall1@gmail.com>

---------

Signed-off-by: delimbetov <1starfall1@gmail.com>
Commit 33bebfb
Commits on Aug 20, 2024
-
Commit b31a899
Commit 43e19fb
Commit 8d34e90