feat: Support creating c++ stored procedure via gsctl (#4222)

## What do these changes do?

- [x] Support creating c++ stored procedure via gsctl
```
gsctl create storedproc -f procedure.yaml

# cat procedure.yaml
name: "cpp-procedure-name"
description: "This is a cpp test procedure"
query: "@/home/graphscope/sample_app.cc"
type: "cpp"
```

```cpp
// cat sample_app.cc
namespace gs {
    class ExampleQuery : public CypherReadAppBase<int32_t> {};
}
```
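
As the change to `stored_procedure.py` in this commit shows, a `query` value that starts with `@` is treated as a local file path, and the file's contents replace it before the request is sent. A minimal sketch of that step in isolation, assuming a hypothetical helper named `resolve_query` (it is not part of gsctl):

```python
import os
import tempfile


def resolve_query(stored_procedure: dict) -> dict:
    """Return a copy whose "query" holds the file contents if it began with "@"."""
    resolved = dict(stored_procedure)
    query = resolved["query"]
    if query.startswith("@"):
        # Everything after the leading "@" is interpreted as a local file path.
        with open(query[1:], "r") as f:
            resolved["query"] = f.read()
    return resolved


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample_app.cc")
        with open(path, "w") as f:
            f.write("namespace gs {}\n")
        proc = {"name": "cpp-procedure-name", "type": "cpp", "query": "@" + path}
        print(resolve_query(proc)["query"])
```

Unlike the code in the commit, which overwrites `stored_procedure["query"]` in place, this sketch returns a copy so the caller's dictionary keeps the original `@` path. A query without the `@` prefix passes through unchanged.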


## Related issue number


Fixes #4216
lidongze0629 authored Sep 12, 2024
1 parent eeececf commit 1cc6569
Showing 10 changed files with 487 additions and 88 deletions.
386 changes: 338 additions & 48 deletions docs/_static/coordinator_restful_api.html

Large diffs are not rendered by default.

14 changes: 8 additions & 6 deletions docs/analytical_engine/dev_and_test.md
@@ -83,9 +83,9 @@ With `gs` command-line utility, you can build analytical engine of GraphScope wi

```bash
# Clone a repo if needed
# git clone https://github.com/alibaba/graphscope
# cd graphscope
python3 gsctl.py make analytical
# git clone https://github.com/alibaba/GraphScope
# cd GraphScope
make analytical
```

The code of analytical engine is a cmake project, with a `CMakeLists.txt` in the its root directory (`/analytical_engine`). After the building with `gs`, you may found the built artifacts in `analytical_engine/build/grape_engine`.
@@ -95,7 +95,7 @@ Together with the `grape_engine` are shared libraries, or there may have a bunch
You could install it to a location by

```bash
python3 gsctl.py make analytical-install --install-prefix /usr/local
make analytical-install INSTALL_PREFIX=/usr/local
```

````{note}
@@ -116,7 +116,9 @@ export GRAPHSCOPE_HOME=`pwd`
See more about `GRAPHSCOPE_HOME` in [run tests](../development/how_to_test.md#run-tests)

```bash
python3 gsctl.py test analytical
# git clone -b master --single-branch --depth=1 https://github.com/7br/gstest.git ${GS_TEST_DIR}
cd analytical_engine/build
../test/app_tests.sh --test_dir ${GS_TEST_DIR}
```

It would download the test dataset to the `/tmp/gstest` (if not exists) and run multiple algorithms against various graphs, and compare the result with the ground truth.
It would download the test dataset to the `${GS_TEST_DIR}` (if not exists) and run multiple algorithms against various graphs, and compare the result with the ground truth.
45 changes: 31 additions & 14 deletions docs/development/dev_guide.md
@@ -34,25 +34,29 @@ To use the dev containers for GraphScope development, you can follow these steps

### Install deps on local

To install all dependencies on your local, use the GraphScope command-line utility [gs](https://github.com/alibaba/GraphScope/blob/main/gs) with the subcommand `install-deps` like this
To install all dependencies on your local, use the GraphScope command-line utility [gsctl](../utilities/gs.md) with the subcommand `install-deps` like this

```bash
python3 gsctl.py install-deps dev
# pip3 install gsctl
gsctl install-deps dev

# for more usage, try
# python3 gsctl.py install-deps -h
# gsctl install-deps -h
```

You could download the `gs` directly or clone the [GraphScope](https://github.com/alibaba/GraphScope) to local, the `gs` is located in the root directory of GraphScope.

You could download the `gsctl` directly or clone the [GraphScope](https://github.com/alibaba/GraphScope) to local, the `gsctl` is located in the root directory of GraphScope.

```
# git clone https://github.com/alibaba/GraphScope.git
cd GraphScope && python3 gsctl.py install-deps dev
```

## Build All Targets for GraphScope

With `gs` command-line utility, you can build all targets for GraphScope with a single command.
You can build all targets for GraphScope with a single command.

```bash
python3 gsctl.py make all
cd GraphScope && make
```

This would build all targets sequentially, here we
@@ -65,10 +69,10 @@ You may found the built artifacts in several places according to each components
And you could install them to one place by

```bash
python3 gsctl.py make install [--prefix=/opt/graphscope]
make install [INSTALL_PREFIX=/opt/graphscope]
```

By default it would install all artifacts to `/opt/graphscope`, and you could specify another location by assigning the value of `--prefix`.
By default it would install all artifacts to `/opt/graphscope`, and you could specify another location by assigning the value of `INSTALL_PREFIX`.

## Build Components Individually

@@ -86,27 +90,40 @@ You may find the guides for building and testing each engine as below.

The gscoordinator package is responsible for launching engines, circulating and propagating messages and errors, and scheduling the workload operations.


This will install coordinator package, thus make `import gscoordinator` work

````{tip}
The package would be installed in [editable mode](https://pip.pypa.io/en/stable/cli/pip_install/#cmdoption-e), which means any changed you made in local directory will take effect.
````

```shell
python3 gsctl.py make coordinator
make coordinator
```

### Build Python Client

The `graphscope` package is the entrypoint for playing with GraphScope.
The `graphscope` package is a python entrypoint for playing with GraphScope.

This will install the graphscope-client package, thus make `import graphscope` work.

````{tip}
This package would also be installed in [editable mode](https://pip.pypa.io/en/stable/cli/pip_install/#cmdoption-e)
This package would also be installed in [editable mode](https://pip.pypa.io/en/stable/cli/pip_install/#cmdoption-e), which means any changed you made in local directory will take effect.
````

```shell
make client
```

### Build gsctl

The `gsctl` package is a command-line utility for GraphScope. It provides a set of functionalities to make it easy to use GraphScope.

This will install the gsctl package, thus make `gsctl` work.

````{tip}
This package would also be installed in [editable mode](https://pip.pypa.io/en/stable/cli/pip_install/#cmdoption-e), which means any changed you made in local directory will take effect.
````

```shell
python3 gsctl.py make client
make gsctl
```
8 changes: 4 additions & 4 deletions docs/learning_engine/dev_and_test.md
@@ -14,10 +14,10 @@ To install all dependencies on your local, use the GraphScope command-line utili
`install-deps` like this

```bash
python3 gsctl.py install-deps dev
gsctl install-deps dev-learning

# for more usage, try
# python3 gsctl.py install-deps -h
# gsctl install-deps -h
```

### Dev on docker container
@@ -40,13 +40,13 @@ More options about `docker` command can be found [here](https://docs.docker.com/
You can build all targets for GraphScope Learning Engine with a single command.

```bash
python3 gsctl.py make learning
make learning
```

You could install it to a location by:

```bash
python3 gsctl.py make learning-install --install-prefix /opt/graphscope
make learning-install INSTALL_PREFIX=/opt/graphscope
```

## How to Test
27 changes: 17 additions & 10 deletions docs/utilities/gs.md
@@ -1,21 +1,22 @@
# Command-line Utility `gsctl`

`gsctl` is a command-line utility for GraphScope. It provides a set of functionalities to make it easy to use GraphScope. These functionalities include building and testing binaries, managing sessions and resources, and more.
`gsctl` is a command-line utility for GraphScope. It provides a set of functionalities to make it easy to use GraphScope. These functionalities include building images and packages, managing sessions and resources, and more.

## Install/Update `gsctl`

```bash
$ pip3 install gsctl
# or force reinstall gsctl by:
$ pip3 install gsctl --force-reinstall -U
```

In some cases, such as development on `gsctl`, you may want to build it from source.
To do this, navigate to the directory where the source code is located and run the following command:

```bash
$ cd REPO_HOME
$ cd ${REPO_HOME}
# If you want to develop gsctl,
# please note the entry point is located on:
# /python/graphscope/gsctl/gsctl.py
# please note the entry point is located on /python/graphscope/gsctl/gsctl.py
$ make gsctl
```
This will install `gsctl` in an editable mode, which means that any changes you make to the source code will be reflected in the installed version of `gsctl`.
@@ -32,20 +33,19 @@ Default, the `gsctl` provide helper functions and utilities that can be run usin
`gsctl` acts as the command-line entrypoint for GraphScope. Some examples of utility scripts are:

- `gsctl install-deps`, install dependencies for building GraphScope.
- `gsctl make`, build GraphScope executable binaries and artifacts.
- `gsctl make-image`, build GraphScope docker images.
- `gsctl test`, trigger test suites.
- `gsctl connect`, connect to the launched coordinator by ~/.gs/config.
- `gsctl connect`, connect to the launched coordinator with configuration file ~/.gsctl.
- `gsctl close`, Close the connection from the coordinator.
- `gsctl flexbuild`, Build docker image for Interactive, Insight product.
- `gsctl version`, Print the client version information.
- `gsctl instance`, Deploy, destroy a GraphScope Flex instance.

### Client/Server Mode

To switch to the client/server mode, use the `gsctl connect` command. By default, this command connects gsctl to a launched coordinator using the configuration file located at `${HOME}/.gsctl`; If `--coordinator-endpoint` parameter is specified, it will treat it as current context and override the configuration file.

Once connected, you can use `gsctl` to communicate with the coordinator which serves the specific Flex product behind it.

#### Change scope
#### Change Scope

In `gsctl`, you can run commands on a global scope or a local scope. When you connect to a coordinator, you are in the global scope. To change to local scope of a graph, run the `gsctl use GRAPH <graph_identifier>` command. You can find the graph identifier with `gsctl ls` command.

@@ -62,6 +62,13 @@ Using GLOBAL
```
Different scopes have different commands. Always remember to use `--help` on a command to get more information.

#### Close the connection
#### Close the Connection

To disconnect from the coordinator and switch back to the utility scripts mode, you can use the `gsctl close` command. This command closes the connection from the coordinator and allows you to use `gsctl` as a standalone utility again.


## What's the Next

- [FLEX Coordinator](../flex/coordinator.md)
- [Install dependencies on local environment](../development/dev_guide.md)
- [Manage GraphScope Interactive by gsctl](../flex/interactive_intro.md)
7 changes: 4 additions & 3 deletions k8s/dockerfiles/flex-interactive.Dockerfile
@@ -34,19 +34,20 @@ RUN . ${HOME}/.cargo/env && cd ${HOME}/GraphScope/flex && \
cp ~/GraphScope/interactive_engine/executor/ir/target/release/libir_core.so /opt/flex/lib/

# build coordinator
RUN mkdir -p /opt/flex/wheel
RUN if [ "${ENABLE_COORDINATOR}" = "true" ]; then \
export PATH=${HOME}/.local/bin:${PATH} && \
cd ${HOME}/GraphScope/flex/interactive/sdk && \
./generate_sdk.sh -g python && cd python && \
python3 -m pip install --upgrade pip && python3 -m pip install -r requirements.txt && \
python3 setup.py build_proto && python3 setup.py bdist_wheel && \
mkdir -p /opt/flex/wheel && cp dist/*.whl /opt/flex/wheel/ && \
cp dist/*.whl /opt/flex/wheel/ && \
cd ${HOME}/GraphScope/python && \
export WITHOUT_LEARNING_ENGINE=ON && python3 setup.py bdist_wheel && \
mkdir -p /opt/flex/wheel && cp dist/*.whl /opt/flex/wheel/ && \
cp dist/*.whl /opt/flex/wheel/ && \
cd ${HOME}/GraphScope/coordinator && \
python3 setup.py bdist_wheel && \
mkdir -p /opt/flex/wheel && cp dist/*.whl /opt/flex/wheel/; \
cp dist/*.whl /opt/flex/wheel/; \
fi


7 changes: 7 additions & 0 deletions python/graphscope/gsctl/commands/common.py
@@ -28,6 +28,7 @@
from graphscope.gsctl.utils import err
from graphscope.gsctl.utils import info
from graphscope.gsctl.utils import succ
from graphscope.gsctl.version import __version__


@click.group()
@@ -36,6 +37,12 @@ def cli():
pass


@cli.command()
def version():
"""Print the client version information"""
info(__version__)


@cli.command()
@click.option(
"-c",
5 changes: 5 additions & 0 deletions python/graphscope/gsctl/impl/stored_procedure.py
@@ -26,6 +26,11 @@


def create_stored_procedure(graph_identifier: str, stored_procedure: dict) -> str:
# path begin with "@" represents the local file
if stored_procedure["query"].startswith("@"):
location = stored_procedure["query"][1:]
with open(location, "r") as f:
stored_procedure["query"] = f.read()
context = get_current_context()
with graphscope.flex.rest.ApiClient(
graphscope.flex.rest.Configuration(context.coordinator_endpoint)
71 changes: 69 additions & 2 deletions python/graphscope/gsctl/tests/test_interactive.py
@@ -50,6 +50,51 @@
COORDINATOR_ENDPOINT = "http://127.0.0.1:8080"


sample_cc = """
#include "flex/engines/hqps_db/app/interactive_app_base.h"
#include "flex/engines/hqps_db/core/sync_engine.h"
#include "flex/utils/app_utils.h"
namespace gs {
class ExampleQuery : public CypherReadAppBase<int32_t> {
public:
using Engine = SyncEngine<gs::MutableCSRInterface>;
using label_id_t = typename gs::MutableCSRInterface::label_id_t;
using vertex_id_t = typename gs::MutableCSRInterface::vertex_id_t;
ExampleQuery() {}
// Query function for query class
results::CollectiveResults Query(const gs::GraphDBSession& sess,
int32_t param1) override {
LOG(INFO) << "param1: " << param1;
gs::MutableCSRInterface graph(sess);
auto ctx0 = Engine::template ScanVertex<gs::AppendOpt::Persist>(
graph, 0, Filter<TruePredicate>());
auto ctx1 = Engine::Project<PROJ_TO_NEW>(
graph, std::move(ctx0),
std::tuple{gs::make_mapper_with_variable<INPUT_COL_ID(0)>(
gs::PropertySelector<int64_t>("id"))});
auto ctx2 = Engine::Limit(std::move(ctx1), 0, 5);
auto res = Engine::Sink(graph, ctx2, std::array<int32_t, 1>{0});
LOG(INFO) << "res: " << res.DebugString();
return res;
}
};
} // namespace gs
extern "C" {
void* CreateApp(gs::GraphDBSession& db) {
gs::ExampleQuery* app = new gs::ExampleQuery();
return static_cast<void*>(app);
}
void DeleteApp(void* app) {
gs::ExampleQuery* casted = static_cast<gs::ExampleQuery*>(app);
delete casted;
}
}
"""

modern_graph = {
"name": "modern_graph",
"description": "This is a test graph",
@@ -305,7 +350,7 @@ def test_bulk_loading(self, tmpdir):
assert not ds["edge_mappings"]
delete_graph_by_id(graph_id)

def test_procedure(self):
def test_cypher_procedure(self):
stored_procedure_dict = {
"name": "procedure_name",
"description": "This is a test procedure",
@@ -319,7 +364,7 @@ def test_procedure(self):
new_procedure_exist = False
procedures = list_stored_procedures(graph_id)
for p in procedures:
if p.id == stored_procedure_id and p.name == "procedure_name":
if p.id == stored_procedure_id and p.name == stored_procedure_dict["name"]:
new_procedure_exist = True
assert new_procedure_exist
# test update a procedure
@@ -339,6 +384,28 @@ def test_procedure(self):
assert not new_procedure_exist
delete_graph_by_id(graph_id)

def test_cpp_procedure(self, tmpdir):
# generate sample_app.cc
cpp_procedure_file = tmpdir.join("sample_app.cc")
cpp_procedure_file.write(sample_cc)
# test create a new cpp stored procedure
stored_procedure_dict = {
"name": "cpp_stored_procedure_name",
"description": "This is a cpp test stored procedure",
"query": f"@{str(cpp_procedure_file)}",
"type": "cpp",
}
graph_id = create_graph(modern_graph)
stored_procedure_id = create_stored_procedure(graph_id, stored_procedure_dict)
assert stored_procedure_id is not None
new_procedure_exist = False
procedures = list_stored_procedures(graph_id)
for p in procedures:
if p.id == stored_procedure_id and p.name == stored_procedure_dict["name"]:
new_procedure_exist = True
assert new_procedure_exist
delete_graph_by_id(graph_id)

def test_service(self):
original_graph_id = None
status = list_service_status()
5 changes: 4 additions & 1 deletion python/graphscope/gsctl/utils.py
@@ -188,9 +188,12 @@ def create_stored_procedure_node(self, graph, stored_procedures):
parent=graph.id,
)
for p in stored_procedures:
query = p.query.replace("\n", "\\\\n")
if len(query) > 100:
query = query[:100] + "..."
self.tree.create_node(
tag="StoredProc(identifier: {0}, type: {1}, runnable: {2}, query: {3}, description: {4})".format(
p.id, p.type, p.runnable, p.query, p.description
p.id, p.type, p.runnable, query, p.description
),
identifier=f"{stored_procedure_identifier}_{p.id}",
parent=stored_procedure_identifier,
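
The hunk above shortens the `query` field before it is placed in the tree label: newlines are replaced with an escaped form, and anything past 100 characters is cut and followed by `...`. This matters once `query` can hold an entire C++ source file. A standalone sketch of that behavior, assuming a hypothetical function name `shorten_query` and a hypothetical `limit` parameter (the commit hard-codes 100 inline):

```python
def shorten_query(query: str, limit: int = 100) -> str:
    """Collapse a multi-line query into one line and cap its length."""
    # Same replacement string as in the diff: each newline becomes
    # two backslashes followed by the letter "n".
    one_line = query.replace("\n", "\\\\n")
    if len(one_line) > limit:
        # Keep the first `limit` characters and mark the cut.
        one_line = one_line[:limit] + "..."
    return one_line


if __name__ == "__main__":
    source = "namespace gs {\n" + "// filler line\n" * 20 + "}\n"
    print(shorten_query(source))
```

The length check runs after the newline replacement, so the escaped characters count toward the 100-character budget, as in the commit.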