Version 0.0.2 Release
This release introduces several major changes:
- Support data import from Parquet and Arrow IPC files.
- Support data export to PyG, NetworkX, Pandas, Arrow.
- Support UTF-8 strings and regular expression matching for strings.
- Support ALTER TABLE DDL and SET/DROP for node/rel properties.
- Support multi-labelled and unlabelled queries.
- New expressions and functions (CASE expression, string regular expression match ~=, etc.).
We've written a blog post to explain our major new features in this release.
In case you missed it, Semih has also written two nice blog posts explaining the goals and vision of Kùzu and how we implemented factorization inside Kùzu.
Enjoy your reading and don't forget to `pip install kuzu`!
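As a rough illustration of the export targets listed above, here is a minimal sketch of a helper that runs a query and converts the result. It assumes `conn` is an already opened connection from the kuzu Python package. `get_as_torch_geometric` is named in the changelog below, while `get_as_df` and `get_as_networkx` are assumed converter names and may differ in this release. The helper `export_result` itself is hypothetical and not part of Kùzu.

```python
# Hypothetical helper, not part of the kuzu package.
# Maps a target name to the (assumed) converter method on a query result.
CONVERTERS = {
    "pandas": "get_as_df",                # assumed method name
    "networkx": "get_as_networkx",        # assumed method name
    "pyg": "get_as_torch_geometric",      # named in the changelog
}


def export_result(conn, query, target="pandas"):
    """Execute `query` on `conn` and convert the result to `target`.

    `conn` only needs an `execute(query)` method whose return value
    exposes the converter methods listed in CONVERTERS.
    """
    if target not in CONVERTERS:
        raise ValueError(f"unknown export target: {target!r}")
    result = conn.execute(query)
    # Look up the converter by name and call it with no arguments.
    return getattr(result, CONVERTERS[target])()
```

With a real connection, a call might look like `export_result(conn, "MATCH (a) RETURN a", target="networkx")`. The query here is only a placeholder, and the NetworkX and PyG targets need those packages installed.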
All Changes
- Fix issue 948 by @aziz-mu in #1018
- Change default logging level by @acquamarin in #1036
- Fix issue 938 by @aziz-mu in #1035
- Add logging level option by @acquamarin in #1038
- Change default bm size for testing to 64MB by @acquamarin in #1039
- Remove parser unit test and unstr related operators by @andyfengHKU in #1043
- Fix issue 967 by @aziz-mu in #1040
- Add setValue and getValue for ValueVector by @ray6080 in #1045
- Rework srcDstTableID by @acquamarin in #1041
- UTF-8 string by @anuchak in #1037
- Remove unstructured properties by @acquamarin in #1051
- Plan delete rel by @andyfengHKU in #1050
- Refactor implicit cast and logical scan of unstructured properties by @andyfengHKU in #1052
- Fix multi-line query slow pasting issue by @aziz-mu in #1048
- Enable multi-thread testing by @andyfengHKU in #1060
- Fix issue 968 by @aziz-mu in #1054
- Move from bazel to cmake by @ray6080 in #1064
- Update build guideline in README by @ray6080 in #1067
- Multi labeled node scan by @andyfengHKU in #1057
- Scan multi-labeled node properties by @andyfengHKU in #1069
- Remove excessive logging, remove json from buffer manager by @andyfengHKU in #1071
- Add `getNextTupleInternal` by @andyfengHKU in #1072
- Scan rel ID blindly by @andyfengHKU in #1074
- Add init global state interface by @andyfengHKU in #1077
- Fix failing tests by @acquamarin in #1081
- Fix memory sanitizer issues by @acquamarin in #1084
- Integrate pytest with ctest by @acquamarin in #1090
- Unlabeled rel in match patterns by @andyfengHKU in #1087
- Fix issue 606 by @andyfengHKU in #1091
- Delete rels without transaction inside storage by @acquamarin in #1075
- Init local state by @andyfengHKU in #1093
- Fix CLI utf8 issue by @aziz-mu in #1080
- Remove `getPositionOfCurrIdx` by @ray6080 in #1095
- Unlabeled rel property by @andyfengHKU in #1096
- Fix issue 1047 by @aziz-mu in #1094
- Add installation guideline and simple examples to README by @ray6080 in #1102
- Multi labeled graph pattern by @andyfengHKU in #1104
- Extend property reading by @andyfengHKU in #1108
- Rework logical operator type by @andyfengHKU in #1097
- Rework physical operator type by @andyfengHKU in #1110
- Unlabeled query by @andyfengHKU in #1114
- Refactoring filtering and flatten operator by @ray6080 in #1118
- Move schema to operator by @andyfengHKU in #1119
- Fix issue 941 by @aziz-mu in #1120
- Remove mapper context by @andyfengHKU in #1121
- Wrap pybind11 API and fix #1106 by @mewim in #1124
- Numeric ops by @aziz-mu in #1123
- Add transaction to rel deletions by @acquamarin in #1126
- Delete rels from many-one, one-one tables by @acquamarin in #1132
- Source sink op interface by @andyfengHKU in #1127
- Case expression by @andyfengHKU in #1125
- Plan rel property update by @andyfengHKU in #1136
- Simplify initListReadingState by @ray6080 in #1138
- Rework table scans by @ray6080 in #1141
- Enable large lists scan to copy multi-pages sequentially by @ray6080 in #1143
- Fix issue 1092 by @ray6080 in #1144
- Avoid setting result expression state at runtime by @andyfengHKU in #1137
- Sink subset of expressions by @andyfengHKU in #1148
- Fix issue 1033 by @anuchak in #1145
- Refactor property variable expression by @andyfengHKU in #1152
- Update rel property by @acquamarin in #1149
- Refactor benchmark script by @mewim in #1151
- Clean up unstructured related code and force using property_id_t by @ray6080 in #1147
- Arrow node copier by @printfCalvin in #1146
- Fix issue 1129 by @anuchak in #1153
- Build Arrow from source by @mewim in #1157
- Add dependencies to CI workflow by @mewim in #1158
- Fix issue 1100 by @anuchak in #1155
- Node map agg by @andyfengHKU in #1160
- Fix for CMake linking issue on Ubuntu 18.04 by @mewim in #1164
- Return node and rel data type by @andyfengHKU in #1168
- Delete node/rel properties without transactions by @acquamarin in #1169
- Fix issue 1161 by @andyfengHKU in #1171
- Fix issue 1073 by @andyfengHKU in #1174
- Add Slack workspace to README by @mewim in #1176
- Add transaction to update rel by @acquamarin in #1159
- Add transaction to drop property statement by @acquamarin in #1175
- Arrow rel copier by @weipang142857 in #1154
- Update rel properties stored as columns by @acquamarin in #1177
- Fix issue 1112 by @ray6080 in #1179
- Fix arrow path search issue for RHEL-based Linux distros by @mewim in #1180
- Separate compilation of source and tests by @ray6080 in #1181
- Compile CI test in parallel by @ray6080 in #1183
- Value literal by @andyfengHKU in #1178
- Add address sanitizer to CI pipeline with `LD_PRELOAD` by @mewim in #1185
- Alter table add column by @acquamarin in #1186
- Add Python binding for NODE & REL types; output query results to NetworkX by @mewim in #1192
- Export query result (fixed sized values only) to arrow by @ray6080 in #1193
- Add transaction to add property statement by @acquamarin in #1194
- Alter table rename by @acquamarin in #1198
- Export query result to arrow: string data type by @ray6080 in #1199
- Add ldbc snb IS and IC benchmark queries by @anuchak in #1197
- Split expression binding to multiple cpp files by @andyfengHKU in #1210
- Implement `get_as_torch_geometric` by @mewim in #1200
- Add clang-format python script by @ray6080 in #1211
- Fix tests for add property by @acquamarin in #1214
- Rework RelID from global to local by @acquamarin in #1207
- Add scripts to generate cypher parser by @andyfengHKU in #1215
- Return unconverted properties for PyG converter by @mewim in #1213
- Temp patch Arrow to remove hash check by @mewim in #1217
- Revert arrow patch by @mewim in #1219
- Regex match by @anuchak in #1208
- Fix PyG converter crash when no property is extracted at all by @mewim in #1225
- Remove in-memory-mode by @acquamarin in #1224
- Rework bm frames with mmap by @ray6080 in #1218
- Export query result to arrow: NodeID, Node, Rel, LIST data types by @ray6080 in #1209
- Fix issue 1203 by @acquamarin in #1223
- Index nested loop join optimizer by @andyfengHKU in #1226
- Fix bm mmap size by @ray6080 in #1232
- Flatten mark join probe keys by @ray6080 in #1233
- Rel label function by @andyfengHKU in #1230
- Greedy search for large joins by @andyfengHKU in #1231
- Continue benchmark in same dir on benchmark failure by @mewim in #1234
- Rewrite ldbc snb queries by @anuchak in #1227
- Fix PyG conversion issues by @mewim in #1238
- Fix PyG test errors by @mewim in #1239
- CMake Python wheel pipeline for macOS and Linux by @mewim in #1240
- Expose edge properties for PyG converter by @mewim in #1247
- Rework header file to get rid of `using namespace` by @ray6080 in #1248
- Fix package script for Python on Linux by @mewim in #1252
- Remove multiple src/dst nodetable support by @acquamarin in #1241
- Return primary key instead of offset for networkx by @mewim in #1256
- Fix alter table bug by @acquamarin in #1253
- Fix rel-update bug by @acquamarin in #1258
- Rework headers for APIs by @ray6080 in #1257
- Fix issue 1255 by @andyfengHKU in #1261
- Add script to collect and merge headers by @mewim in #1262
- Fix join benchmark by @andyfengHKU in #1235
- Add contributing guideline by @ray6080 in #1260
- Change `setLoggingLevel` to take string as the input param by @ray6080 in #1264
- Fix pipeline for building C++ lib and CLI by @mewim in #1266
- Add c++ api documentation by @acquamarin in #1265
- Disallow user to execute copy commands twice on a rel table by @acquamarin in #1268
- Fix order by bugs by @acquamarin in #1269
- Remove DatabaseConfig by @ray6080 in #1270
- Demo db test update by @andyfengHKU in #1267
- Generate python documents on CI by @mewim in #1272
- Fix Python document generation by @mewim in #1274
- Add std to fix gcc-12 compatibility by @mewim in #1275
- Update cpp api docs by @ray6080 in #1271
- Add algorithm include by @mewim in #1277
- Fix delete nodes error bug by @acquamarin in #1276
- Fix incorrect size calculation for LIST by @ray6080 in #1278
- Fix incorrect substring over list by @ray6080 in #1279
- Error when returning empty list by @andyfengHKU in #1280
- Default to multi-thread execution by @andyfengHKU in #1283
- Upload C++ lib and cli separately for ci build by @mewim in #1281
- Update python documentation by @andyfengHKU in #1284
Full Changelog: 0.0.1...v0.0.2