3.23

@alex-aizman alex-aizman released this 28 May 14:49
· 219 commits to main since this release

Version 3.23 arrives three months after the previous one. Beyond datapath optimizations and bug fixes, most of the changes fall under the sections listed below.

Table of Contents

  • List Objects; Bucket Inventory
  • Selecting Primary at startup; Restarting cluster when node IPs change (K8s)
  • S3 (backend, frontend)
  • BLOBs
  • Mountpath labels
  • Reading shards; Reading from shards

See also:

  • Core
  • CLI
  • Python
  • Build, CI
  • Documentation

List Objects; Bucket Inventory

  • S3 backend: S3 ListObjectsV2 may return a directory !6672
  • list very large buckets using bucket inventory !6682, !6684, !6686, !6689, !6692
  • list-objects: optimize for prefix; add 'dont-optimize' feature flag !6685
  • list very large buckets using bucket inventory (major update, API changes) !6695, !6698
  • list very large buckets using bucket inventory !6704
  • list-objects: support non-recursive operation (new) !6711, !6712
  • refactor and code-generate (message pack) list-objects results !6714
  • bucket inventory; generic no-recursion helper !6715
  • bucket inventory: support arbitrary schema; add validation !6769
  • list-objects: micro-optimize setting custom properties of remote objects !6770
  • list very large buckets using bucket inventory !6775, !6776, !6777, !6778
  • list very large buckets using bucket inventory (major) !6810, !6811
  • list very large buckets using bucket inventory !6815
  • list-objects: skip virtual directories !6835
  • list very large buckets using bucket inventory !6847, !6851, !6853
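As an illustration of what non-recursive listing and skipping virtual directories mean over a flat object namespace, here is a minimal sketch. It is not the AIStore implementation: the function name `list_non_recursive` and the sample keys are hypothetical, and the sketch only assumes that object names use `/` as a separator.

```python
def list_non_recursive(keys, prefix=""):
    """Return (objects, virtual_dirs) located directly under `prefix`.

    Hypothetical helper for illustration only; not an AIStore API.
    """
    objects, dirs = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            # The prefix itself, e.g. a directory marker such as "a/".
            continue
        head, sep, _ = rest.partition("/")
        if sep:
            # Anything deeper than one level collapses into a virtual directory.
            dirs.add(prefix + head + "/")
        else:
            objects.append(key)
    return objects, sorted(dirs)


keys = ["a/", "a/1.txt", "a/b/2.txt", "a/b/c/3.txt", "a/d/4.txt", "top.txt"]
objects, dirs = list_non_recursive(keys, "a/")
print(objects)  # ['a/1.txt']
print(dirs)     # ['a/b/', 'a/d/']
```

A recursive listing of the same prefix would instead return every key under `a/`; a caller that wants to skip virtual directories would simply ignore the second return value.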

Selecting Primary at startup; Restarting cluster when node IPs change (K8s)

  • primary role: add 'is-secondary' environment; precedence !6746
  • 'original' & 'discovery' URLs (major) !6747, !6749
  • cluster config: new convention for primary URL; role of the primary during: initial deployment, cluster restart !6752, !6755
  • cluster restart with simultaneous change of primary (major) !6758, !6760, !6761
  • primary startup: always update node net-infos !6762
  • all proxies to store RMD (previously, only primary) !6764
  • node join: remove duplicate IP check (is redundant) !6783
  • K8s startup when proxies change their network infos !6785
  • primary startup: initial version of the cluster map !6787
  • non-primary startup: retry and refactor; factor in !6788
  • K8s: primary startup when net-infos change !6789

S3 (backend, frontend)

  • backend put-object interface; presigned S3 (refactoring & cleanup) !6662
  • default AWS region (cleanup) !6679
  • s3cmd: add negative testing !6681
  • backend: S3 ListObjectsV2 may return a directory !6672
  • backend: consolidate environment and defaults !6678
  • backend: retain S3-specific error code !6688, !6691
  • move presigned URLs code to backend package !6801
  • multipart upload: read and send next part in parallel !6803
  • backend: refactor and simplify !6819
  • new feature flag to enable (older) path-style addressing !6821
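For readers unfamiliar with the two S3 addressing styles mentioned in the last item, the following sketch shows how the same object maps to a virtual-hosted-style URL versus an (older) path-style URL. The helper `s3_object_url` and the bucket and key names are hypothetical, and the sketch assumes the standard regional AWS endpoint format; it does not show how AIStore or its feature flag builds requests.

```python
from urllib.parse import quote


def s3_object_url(bucket, key, region="us-east-1", path_style=False):
    """Build an S3 object URL in either addressing style (illustration only)."""
    host = f"s3.{region}.amazonaws.com"
    quoted_key = quote(key, safe="/")  # keep '/' separators, escape the rest
    if path_style:
        # Path-style: the bucket is the first path segment.
        return f"https://{host}/{bucket}/{quoted_key}"
    # Virtual-hosted-style: the bucket is part of the host name.
    return f"https://{bucket}.{host}/{quoted_key}"


print(s3_object_url("my-bucket", "dir/file 1.txt"))
print(s3_object_url("my-bucket", "dir/file 1.txt", path_style=True))
```

Path-style addressing is typically needed for S3-compatible endpoints that cannot resolve per-bucket host names.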

BLOBs

  • config change: assorted feature flags now have bucket scope (major) !6664, !6666
  • Python: blob-download API !6687
  • Python: get and prefetch with blob-download !6708
  • blob downloader (minor ref) !6793
  • blob-downloader: finalize control structures; refactor !6812
  • GET via blob-download !6873
  • multiple blob-download jobs (fixes) !6876
  • prefetch via blob-downloader !6882

Mountpath labels

  • override-config, fspaths section (minor ref) !6718
  • config change, API change: mountpath labels (major) !6721, !6722, !6725, !6726, !6733, !6734, !6735, !6736, !6738
  • backward compatibility v3.22 and prior; bump CLI version !6740, !6742
  • log: mountpath labels vs shared filesystems; memory pressure !6744

Reading shards; Reading from shards

  • reading (from) shards: add read-until, read-one, and read-regex methods !6823
  • reading shards: read-until, read-one, read-regex !6824
  • WebDataset: add wds-key; add comments !6826
  • reading .TAR, .TGZ, etc. formatted objects (a.k.a. shards) - multiple selection !6827
  • GET request to select multiple archived files (feature) !6859
  • GET multiple archived files in one shot (feature) !6861, !6862, !6863, !6864, !6866
  • Python: GET multiple files from an archive (shard) !6860
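To make the idea of selecting multiple archived files from a shard more concrete, here is a small local sketch that uses only the Python standard library. It is not the AIStore API or SDK: `make_shard` and `select_from_shard` are hypothetical helpers, and regex matching on member names is just one assumed selection mode.

```python
import io
import re
import tarfile


def make_shard(files):
    """Pack a {name: bytes} mapping into an in-memory TAR (hypothetical helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def select_from_shard(shard, pattern):
    """Return {name: bytes} for every regular file whose name matches `pattern`."""
    rx = re.compile(pattern)
    selected = {}
    # mode="r" lets tarfile detect compression (.tar, .tgz, ...) transparently.
    with tarfile.open(fileobj=io.BytesIO(shard), mode="r") as tar:
        for member in tar:
            if member.isfile() and rx.search(member.name):
                selected[member.name] = tar.extractfile(member).read()
    return selected


shard = make_shard({
    "0001.jpg": b"img1", "0001.cls": b"7",
    "0002.jpg": b"img2", "0002.cls": b"3",
})
print(select_from_shard(shard, r"\.cls$"))
```

In a storage cluster the benefit of such a request is that the selection happens server-side, so only the matching files travel over the network rather than the entire shard.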

Core

  • backend put-object interface (refactoring & cleanup) !6662
  • get-stats API vs attach/detach mountpaths !6669
  • unwrap URL errors; remove mux.unhandle; CLI: more tips !6673
  • removing a node from a 2-node cluster (in re: rebalance) !6674
  • POST /v1/buckets handler: add one more check to URI validation !6690
  • last byte (minor ref) !6694
  • project layout: move and consolidate all scripts !6699
  • extend RMD to reinforce cluster integrity checking !6702
  • micro-optimize fast-path fqn parsing !6707
  • continued refactoring !6709, !6710
  • security dependabot: fix #15 and #16 !6713
  • aisnode: remove logs from conf !6727
  • extract and unify cluster information; add flags !6741
  • copy shared FS capacity; color high/low usage pct; up cli !6743
  • node flags in a cluster map vs (node | cluster) restart; node equality !6765
  • receive cluster-level metadata (minor ref) !6766
  • dsort: write compressed tar !6771
  • dsort: read compressed tar; add linter !6772
  • backend: uniform naming, common base !6774
  • remove AIS_IS_PRIMARY environment (is obsolete) !6781
  • nlog: allow setting logging to STDERR flag in config !6791
  • feature flag 'fsync-put' now also has bucket scope !6804
  • cold GET: write locally and transmit in parallel (new) !6805, !6807
  • move atomic 'stopping' (ref) !6817
  • aisloader: add 's3-use-path-style' command line, to use older path-style addressing !6822
  • cold GET (fast): fclose and check !6825
  • speed-up batch jobs (prefetch, archive, copy/transform, multi-object evict/delete) !6830
  • LOM: add open-file method !6836
  • nlog: while stopping !6837
  • multi-object TCB/TCO; not in-cluster objects; multi-page fix !6840, !6842
  • xaction registry: when hk call is premature !6843
  • add metrics: get-size and put-size !6849
  • memsys/SGL: add compliant 'write-to' interface impl.; amend fast/simplified 'write-to' !6854, !6856, !6857
  • stats and metrics: report cumulative GET and PUT sizes in bytes !6855
  • datapath query parameters: preparse, reduce size !6858
  • stats: fix Prometheus label for total size !6871
  • imports (ref) !6878
  • move and rename 'node-state-info' and 'node-state-flags' (ref) !6879
  • new metric: node-state-flags (bitwise, gauge) !6880
  • add management alerts: out-of-space & low-capacity (major) !6883
  • add management alerts: out-of-memory & low-on-memory !6885
  • microbench: use math/rand/v2 !6886
  • transition to Go 1.22 math/rand/v2; crypto/rand reader !6887
  • dsort test: use rand.v2 !6888
  • transition to Go 1.22 math/rand/v2; add seeded-reader !6890
  • cleanup 'cos/math' (ref) !6891
  • tests: fix prefix-test for remote ais cluster !6893
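The "node-state-flags (bitwise, gauge)" metric and the new alerts can be pictured as a set of bit flags packed into a single integer. The sketch below is an assumption-laden illustration: the flag names mirror the alerts listed above, but the bit positions and the `encode`/`decode` helpers are hypothetical and do not reflect the actual AIStore values.

```python
from enum import IntFlag


class NodeStateFlags(IntFlag):
    # Bit positions are hypothetical; only the alert names come from the notes.
    OUT_OF_SPACE = 1 << 0
    LOW_CAPACITY = 1 << 1
    OUT_OF_MEMORY = 1 << 2
    LOW_MEMORY = 1 << 3


def encode(flags):
    """Pack a combination of flags into one integer suitable for a gauge."""
    return int(flags)


def decode(value):
    """Recover the names of all flags set in a gauge value."""
    return [f.name for f in NodeStateFlags if f & value]


gauge = encode(NodeStateFlags.LOW_CAPACITY | NodeStateFlags.LOW_MEMORY)
print(gauge)          # 10
print(decode(gauge))  # ['LOW_CAPACITY', 'LOW_MEMORY']
```

Exporting the combined value as one gauge lets a monitoring rule test individual conditions with a bitwise AND instead of requiring a separate metric per alert.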

CLI

  • 'more' fixes !6665
  • more tips !6673
  • warn when switching cluster to operate in reverse proxy mode !6703
  • show feature flags symbolically !6705
  • backward compatibility v3.22 and prior; bump CLI version !6740
  • 'ais show cluster' to highlight nodes that are low on memory !6745
  • 'ls' and 'show object' to support size units (raw, SI, IEC) !6795
  • progress bar decorators; elapsed time !6797
  • fix used and available capacity !6806
  • fix 'show throughput' to not show throughput when !6813
  • quiet 'show cluster', 'show performance'; misplaced flags !6814
  • 'ais ls' help and inline examples; native GET: add query params !6816
  • copying remote objects; progress bar; usability !6839
  • extend 'ais gen-shards' to generate WD-formatted shards !6865
  • add '--count-and-time-only' option !6868, !6869
  • max-pages and limit !6870
  • stopping jobs !6875
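The three size units mentioned for 'ls' and 'show object' (raw, SI, IEC) differ only in the divisor and suffixes used. Here is a minimal sketch of that distinction; `format_size` is a hypothetical helper, and the exact rounding and suffix spelling used by the CLI may differ.

```python
def format_size(n, units="iec"):
    """Format a byte count as raw bytes, SI (powers of 1000) or IEC (powers of 1024)."""
    if units == "raw":
        return str(n)
    if units == "si":
        base, suffixes = 1000, ["B", "kB", "MB", "GB", "TB"]
    elif units == "iec":
        base, suffixes = 1024, ["B", "KiB", "MiB", "GiB", "TiB"]
    else:
        raise ValueError(f"unknown units: {units!r}")
    value, i = float(n), 0
    # Divide until the value fits the current suffix or suffixes run out.
    while value >= base and i < len(suffixes) - 1:
        value /= base
        i += 1
    if i == 0:
        return f"{n}B"
    return f"{value:.2f}{suffixes[i]}"


for units in ("raw", "si", "iec"):
    print(units, format_size(1048576, units))
```

For 1048576 bytes this prints `1048576`, `1.05MB`, and `1.00MiB` respectively, which shows why the unit choice matters when comparing sizes across tools.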

Python

  • add test for invalid bucket name !6683
  • blob-download API !6687
  • add timeout option to client + version bump !6693
  • get and prefetch with blob-download !6708
  • tests constants and refactoring !6717
  • prefetch blob-download tests !6719
  • cluster performance API !6724
  • remote-enabled tests: clean up and refactor !6731
  • add missing job tests !6737
  • fix formatting issues !6753
  • PyTorch: add Iterable-style datasets for AIS Backend !6759
  • writer for image dataset !6767
  • AISSource: list all objects !6779
  • add example for dataset_writer !6794
  • add tests for dataset writer !6799
  • log missing attributes in write_dataset !6820
  • update docs !6844
  • add MultiShard Stream to PyTorch !6852
  • GET multiple files from an archive !6860

Build, CI

  • transition to Go 1.22 !6675
  • upgrade OSS packages !6680, !6750, !6768
  • lint: upgrade; Go 1.22 int range !6728, !6732
  • CI: MacOS fix !6729
  • remove HDFS backend !6773
  • upgrade golang.org/x/net !6831
  • lint; min/max shadow !6850
  • build: transition to Go 1.22 math/rand/v2 !6892
  • CI: maintenance !6838
  • lint: golangci-lint !6894

Documentation

  • docs: fix https getting-started !6668
  • docs: amend getting started !6670
  • docs: fix the broken table of contents link !6677
  • blog: Very large !6874