
Integration Testing #331

Closed
distributedstatemachine opened this issue Apr 16, 2024 · 11 comments · Fixed by #434

Labels
blue team defensive programming, CI, etc

Comments

@distributedstatemachine
Contributor

Description

To ensure ongoing reliability between our subtensor node and the Bittensor package, we need to implement a comprehensive GitHub Actions workflow. This workflow will automate the entire testing process: building the blockchain node using the localnet.sh script, installing the Bittensor package from a configurable branch, and finally running the test_subtensor_integration.py integration test.

The primary objective of this setup is to verify that any changes introduced to the subtensor codebase do not break or introduce regressions in the Bittensor Python code. By parameterizing the Bittensor repository branch, we can test against various development stages and release candidates, ensuring compatibility and robustness across different versions.
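As one concrete piece of the glue between "node built and started" and "tests can run", the sketch below shows a minimal readiness check the workflow could run after localnet.sh starts the node and before the test step executes. The host, port, and timeout here are assumptions, not values taken from the script:

```python
# wait_for_node.py - minimal readiness check, assuming the local node exposes a
# WebSocket RPC endpoint on 127.0.0.1:9946 (adjust to whatever localnet.sh binds).
import socket
import sys
import time

HOST, PORT = "127.0.0.1", 9946
DEADLINE = time.monotonic() + 120  # give the node up to two minutes to come up

while time.monotonic() < DEADLINE:
    try:
        with socket.create_connection((HOST, PORT), timeout=2):
            print(f"node is accepting connections on {HOST}:{PORT}")
            sys.exit(0)
    except OSError:
        time.sleep(2)  # not up yet; retry until the deadline

print(f"timed out waiting for {HOST}:{PORT}", file=sys.stderr)
sys.exit(1)  # non-zero exit fails the workflow step, surfacing the error early
```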

Acceptance Criteria

  • The GitHub Actions workflow should accept the Bittensor repository branch as a configurable input parameter.
  • The workflow should trigger automatically on push or pull request events to specified branches, with an option for manual triggers.
  • It should build the latest version of our blockchain node using the localnet.sh script.
  • The Bittensor package should be fetched and installed from the specified branch of the GitHub repository.
  • The test_subtensor_integration.py integration test should be executed after successful installation of the Bittensor package.
  • Test results, including any failures or errors, should be prominently reported within the GitHub Actions workflow.
  • The workflow should gracefully handle and report any errors encountered during the build, installation, or testing process.

Tasks

  • Design and implement a GitHub Actions workflow file in the .github/workflows directory.
    • Define a workflow trigger based on push or pull request events to specified branches.
    • Include an option for manual workflow triggers.
    • Add an input parameter for specifying the Bittensor repository branch.
  • Integrate the localnet.sh script into the workflow for building and starting the blockchain nodes.
    • Ensure the script is executed with the appropriate parameters and environment variables.
    • Handle any errors or failures during the node build and startup process.
  • Implement steps to fetch and install the Bittensor package from the specified branch.
    • Use the input parameter to dynamically set the branch for Bittensor package installation.
    • Handle any dependencies or setup required for the Bittensor package.
  • Configure the environment and execute the test_subtensor_integration.py integration test.
    • Set up any necessary environment variables or configurations for the test.
    • Run the integration test and capture the test results.
  • Implement comprehensive error handling and reporting throughout the workflow.
    • Catch and handle any errors or exceptions that may occur during each step.
    • Provide clear and informative error messages in the workflow logs.
  • Optimize the workflow for performance and efficiency.
    • Implement caching mechanisms for dependencies and build artifacts.
    • Parallelize independent tasks wherever possible.
  • Optional: Implement notifications or integrate with monitoring tools for test failures or critical issues.

Additional Considerations

  • Utilize Docker containers within the GitHub Actions workflow to provide a consistent and isolated environment for building, installing, and testing.
  • Implement a matrix strategy to test against multiple versions or configurations of the Bittensor package and subtensor node.
  • Consider integrating code coverage reporting to monitor and maintain high test coverage.
  • Explore opportunities for performance optimization, such as parallel test execution or selective test runs based on changed files.

Related Links

@distributedstatemachine distributedstatemachine added the blue team defensive programming, CI, etc label Apr 16, 2024
@sam0x17 sam0x17 self-assigned this Apr 18, 2024
@distributedstatemachine
Contributor Author

TODO:

  • Update issue with discussion from today; use a simple Python script that connects to the node and executes the tests. Should this live in bittensor?

@sam0x17 sam0x17 removed their assignment May 2, 2024
@sam0x17
Contributor

sam0x17 commented May 2, 2024

@distributedstatemachine I think you mentioned you had started some of this somewhere? Is there any code that can be reused?

@distributedstatemachine
Contributor Author

@sam0x17 @orriin #332. Although this description is probably outdated, since we won't be using https://github.com/opentensor/bittensor/blob/stao/tests/integration_tests/test_subtensor_integration.py and would have to write an integration test that doesn't use mocks, i.e. we will have to spin up the local node in the CI.

@orriin
Contributor

orriin commented May 4, 2024

@sam0x17
Contributor

sam0x17 commented May 5, 2024

Yeah, worth mentioning: as long as you specify that you are using SubtensorCI like the other GitHub workflows do, it will be a very beefy node with 128 GB of RAM.

@orriin
Contributor

orriin commented May 13, 2024

I've been exploring the bittensor codebase, and have some questions about how to proceed with this issue.

1. What to test?

After speaking with @distributedstatemachine, it has become apparent that test_subtensor_integration.py is designed only to work with a mocked version of substrate, making it unsuitable for the integration tests described in this issue.

Therefore, an entirely new test suite will need to be written for these e2e tests.

For basic tests, I could work through each file in bittensor/commands one by one and write e2e tests for all the combinations of logic in each subcommand. Is this something we want to do? Are any commands higher priority than others? It feels like it will take quite a long time to write tests for every subcommand; maybe we only want to cover some of them?

For multi-step tests, there was already a scenario described here: https://discord.com/channels/799672011265015819/1176889736636407808/1236057424134144152 . Are there any other complex e2e cases we should test?

2. How do I call into the CLI?

I suggest directly calling run on the commands exported from bittensor/commands. That way, it's easier to mock the command args than when calling the CLI binary directly.

3. How to clear state between each e2e test?

Since, from inside the bittensor script, we have no way to restart the chain (necessary between e2e tests to prevent polluting state and potential race conditions), we will need a test harness that runs a new ./localnet.sh instance for every test.

I'm thinking of creating an orchestrator file which will spin up a localnet.sh instance, run a test, close the instance, and repeat for each e2e test.

There may also be some voodoo possible with pytest's setup/teardown hooks (the beforeEach/afterEach equivalent, i.e. fixtures), but I'm not sure whether it would be worth the extra effort to get those working.
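A minimal sketch of what that orchestrator could look like, assuming localnet.sh can be launched from the repo root and that each e2e test is an ordinary pytest file (the script path, the fixed startup wait, and the file-discovery rule are all assumptions):

```python
# run.py - hypothetical orchestrator: start a fresh localnet, run one e2e test
# against it, tear everything down, and repeat, so every test sees a clean chain.
import glob
import os
import signal
import subprocess
import time

LOCALNET = "./scripts/localnet.sh"  # path is an assumption; adjust to the repo layout

def run_one(test_file: str) -> int:
    # start_new_session puts the script and any node processes it spawns into one
    # process group, so teardown can signal all of them together.
    node = subprocess.Popen([LOCALNET], start_new_session=True)
    try:
        time.sleep(30)  # crude startup wait; polling the RPC port would be tighter
        return subprocess.run(["pytest", test_file]).returncode
    finally:
        os.killpg(os.getpgid(node.pid), signal.SIGTERM)  # always tear down
        node.wait(timeout=30)

if __name__ == "__main__":
    tests = [f for f in sorted(glob.glob("tests/e2e_tests/**/*.py", recursive=True))
             if not f.endswith(("run.py", "common.py"))]  # skip the harness itself
    failed = [f for f in tests if run_one(f) != 0]
    raise SystemExit(1 if failed else 0)
```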

4. How to structure the test files?

I'm thinking of creating a new dir tests/e2e_tests for these as they are more e2e than integration, and there's already a dir tests/integration_tests which is used for mocked substrate testing.

My proposed structure for the new testing dir is:

.
└── tests/
    └── e2e_tests/
        ├── subcommands/
        │   ├── subnets/
        │   │   ├── list.py
        │   │   └── ... # other subnets commands here
        │   └── ... # other subcommands here
        ├── multistep/
        │   ├── tx_rate_limit_exceeded.py
        │   └── ... # other multi-step e2e tests here
        ├── common.py # common utils 
        └── run.py # test orchestrator which will spin up a new `localnet.sh`, run an e2e test, and repeat for each e2e test defined

@distributedstatemachine
Contributor Author

  2. How do I call into the CLI?
    I suggest directly calling run on the commands exported from bittensor/commands. That way, it's easier to mock the command args than when calling the CLI binary directly.

I think we can call the CLI the same way it's currently done:

https://github.com/opentensor/bittensor/blob/master/tests/integration_tests/test_cli.py#L445-L5

  3. How to clear state between each e2e test?
    Since, from inside the bittensor script, we have no way to restart the chain (necessary between e2e tests to prevent polluting state and potential race conditions), we will need a test harness that runs a new ./localnet.sh instance for every test.
    I'm thinking of creating an orchestrator file which will spin up a localnet.sh instance, run a test, close the instance, and repeat for each e2e test.
    There may also be some voodoo possible with pytest's setup/teardown hooks (the beforeEach/afterEach equivalent, i.e. fixtures), but I'm not sure whether it would be worth the extra effort to get those working.

I like this. I think we can call purge-chain on the binary. Alternatively, we can write one long test that covers all the happy paths and not have to purge the chain at all.
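For reference, a rough sketch of what calling purge-chain between tests could look like, assuming a Substrate-style node binary that exposes the standard purge-chain subcommand (the binary path and flags below are assumptions, not taken from this repo):

```python
# Hypothetical helper: wipe the local chain's database between e2e tests instead
# of restarting the whole localnet. The node has to be stopped first, since a
# running node holds a lock on its database.
import subprocess

def purge_local_chain(binary: str = "./target/release/node-subtensor") -> None:
    # "purge-chain -y" is the standard Substrate subcommand for deleting chain
    # data without an interactive confirmation; "--dev" selects the dev chain spec
    # (swap for whatever chain spec localnet.sh actually uses).
    subprocess.run([binary, "purge-chain", "--dev", "-y"], check=True)
```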

@orriin
Contributor

orriin commented May 13, 2024

@sam0x17 DM'd me about point 1, saying that for this PR it's more important to get a nice process, structure, and examples in place for writing e2e tests than to achieve full test coverage.

So I'll start with two examples:

  1. a very basic one (a single CLI call), and
  2. a more complex one involving multiple CLI calls (described here: https://discord.com/channels/799672011265015819/1176889736636407808/1236057424134144152).

@sam0x17
Contributor

sam0x17 commented May 13, 2024

Also, some key requirements I would like to meet with this, if possible:

  • each test runs with a completely clean state, meaning there are no run-order dependencies between tests, and this state is thrown away at the end of the test regardless of whether it fails or succeeds. So, for example, even if a test somehow wipes out all balances, the scope of that is completely limited to that test. AKA no side effects.
  • (if possible) tests can run in parallel, which most likely means we have to do some kind of process forking. The mental model is that we want to simulate starting and then reverting a DB transaction. This will eventually let us comfortably have thousands of integration tests without the CI taking 30+ minutes to run (a small port-allocation sketch follows below).
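As one small building block for the parallel case, each test process can ask the OS for an unused port and hand it to its own node instance. A stdlib-only sketch; how the port would actually be passed to localnet.sh is left open, since the script's interface isn't shown here:

```python
# Hypothetical helper for parallel e2e runs: reserve an unused TCP port so each
# concurrently running local node can bind its RPC endpoint without clashing.
import socket

def free_port() -> int:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))   # port 0 tells the OS to pick any free port
        return s.getsockname()[1]  # note: small race window before the node binds it
```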

@orriin
Contributor

orriin commented May 14, 2024

I have a PoC using pytest fixtures (https://docs.pytest.org/en/6.2.x/fixture.html) to spin up and spin down localnet nodes between tests.

The initial implementation will need to run serially, but with some additional logic to find free ports, it should be possible to add the ability to run in parallel in the future.
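A rough sketch of that fixture shape, assuming localnet.sh can be launched from the repo root and that the node's WebSocket endpoint ends up on a known local port (the path, port, and startup wait below are assumptions):

```python
# conftest.py - hypothetical per-test localnet fixture: every test gets a freshly
# started node, and the process group is torn down afterwards, pass or fail.
import os
import signal
import subprocess
import time

import pytest

LOCALNET = "./scripts/localnet.sh"   # path is an assumption
ENDPOINT = "ws://127.0.0.1:9946"     # assumed default RPC endpoint of the local node

@pytest.fixture
def local_chain():
    # start_new_session groups the script and its child node processes so teardown
    # can signal all of them at once.
    proc = subprocess.Popen([LOCALNET], start_new_session=True)
    time.sleep(30)   # crude startup wait; a readiness poll on the port is tighter
    yield ENDPOINT   # the test body runs here
    os.killpg(os.getpgid(proc.pid), signal.SIGTERM)  # teardown runs pass or fail
    proc.wait(timeout=30)

def test_smoke(local_chain):
    # placeholder: a real e2e test would drive btcli / bittensor against local_chain
    assert local_chain.startswith("ws://")
```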

@sam0x17
Contributor

sam0x17 commented May 14, 2024

I have a PoC using pytest fixtures (https://docs.pytest.org/en/6.2.x/fixture.html) to spin up and spin down localnet nodes between tests.

The initial implementation will need to run serially, but with some additional logic to find free ports, it should be possible to add the ability to run in parallel in the future.

awesome, this is a great first stab at this 💯

can you link the PR to this with a fixes #331?
