DISCLAIMER: This should only be used for educational and legal purposes such as publicly archiving data. Koii does not condone using tasks to steal data or infringe on personal privacy.
One of the best use cases for Koii tasks is web crawling. While many webpages let users scroll endlessly without an account, most content-based websites show only a small snippet of content unless you log in. An autonomous web crawler has limited usefulness without a means to log in.
This is where secrets become very handy!
If you have explored the Node enough, you may have noticed that some tasks require Task Extensions before you're able to run them. These task extensions are secrets, similar to environment variables in a .env file.
Secrets allow a user to save private information so that we can use it without ever having access to the actual values. With the help of secrets, your crawler task can use a user's account to log in, get full access to a webpage's content, and start archiving information.
In Lesson 2, we added something to the requirements section of our config-task.yml to make UPnP configuration work. A task developer can also use the requirements section to specify any number of secrets a task needs in order to function. For example, this is Archive Twitter's requirements section:
requirementsTags:
  - type: TASK_VARIABLE
    value: 'TWITTER_USERNAME'
    description: 'The username of your volunteer Twitter account.'
  - type: TASK_VARIABLE
    value: 'TWITTER_PASSWORD'
    description: 'The password of your volunteer Twitter account.'
  - type: TASK_VARIABLE
    value: 'TWITTER_PHONE'
    description: 'If verification is required, will use your phone number to login.'
  - type: CPU
    value: '4-core'
  - type: RAM
    value: '5 GB'
  - type: STORAGE
    value: '5 GB'
(You can look at the Archive Twitter task in more depth here.)
As you can see, each task extension has a corresponding value and description, which will be used by the Node. These task extensions are automatically exposed as environment variables and can be accessed just like an entry in your .env, e.g. process.env.TWITTER_USERNAME.
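As a minimal sketch of how task code might read these values, the helper below collects the three variables from the example above and reports any that are missing. The function name readTwitterCredentials and the choice to treat TWITTER_PHONE as optional are assumptions made for illustration; they are not taken from the Archive Twitter task itself.

```javascript
// Hypothetical helper: reads the task variables declared in requirementsTags.
// The variable names match the example above; the function is illustrative.
function readTwitterCredentials(env = process.env) {
  const required = ['TWITTER_USERNAME', 'TWITTER_PASSWORD'];

  // Collect every required name that is unset or empty.
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing task variables: ${missing.join(', ')}`);
  }

  return {
    username: env.TWITTER_USERNAME,
    password: env.TWITTER_PASSWORD,
    // Assumed optional: only needed if verification is requested.
    phone: env.TWITTER_PHONE || null,
  };
}
```

Failing early with the list of missing names makes a misconfigured task easier to diagnose than a login error that surfaces later in the crawl.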
One final thing to note is that during local development, you must specify these secrets in your .env file. For example, if you have a secret called FIRST_NAME in your config-task.yml, you would need to have a FIRST_NAME entry in your local .env file.
Effectively, adding requirements allows you to build a .env on the user's computer, which can be used when they run the task code.
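For local development, a small pre-flight check can catch secrets that are declared in the config but absent from your environment. The sketch below is illustrative only: findMissingSecrets is a hypothetical helper, the requirements list is written as a plain array rather than parsed from config-task.yml, and it assumes your local setup has already loaded .env into process.env (for example with the dotenv package).

```javascript
// Illustrative pre-flight check for local development.
// `requirementsTags` mirrors the structure in config-task.yml, written here
// as a plain array so the sketch needs no YAML parser.
const requirementsTags = [
  { type: 'TASK_VARIABLE', value: 'FIRST_NAME' },
  { type: 'CPU', value: '4-core' },
];

// Returns the names of declared secrets that are unset or empty in `env`.
// Only TASK_VARIABLE entries are secrets; CPU, RAM, etc. are skipped.
function findMissingSecrets(tags, env = process.env) {
  return tags
    .filter((tag) => tag.type === 'TASK_VARIABLE')
    .map((tag) => tag.value)
    .filter((name) => env[name] === undefined || env[name] === '');
}

const missing = findMissingSecrets(requirementsTags);
if (missing.length > 0) {
  console.warn(`Add these entries to your local .env: ${missing.join(', ')}`);
}
```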
Let's take a deeper look at the options in the config-task.yml file.
Part II. The Task Config File