Kaskada is an event-processing engine composed of two main components:
kaskada-engine is a stateful compute engine for transforming event data.
kaskada-manager is an (optionally) stateful API for communicating with clients.
To use Kaskada with Python, you’ll need Python version 3.6 or above installed. Once Python is installed, ensure that you can run it: open a terminal (the command prompt on Windows) and check the output of the following commands:
python --version # Python 3.10.6
pip install kaskada
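The same checks can be made from inside Python. The sketch below is a minimal example, assuming only the 3.6 minimum stated above and that the package is importable under the name `kaskada` (as the import statements in this guide suggest). The helper name `check_environment` is hypothetical and not part of the Kaskada library.

```python
import importlib.util
import sys

MIN_VERSION = (3, 6)  # minimum Python version stated in this guide


def check_environment(package="kaskada", min_version=MIN_VERSION):
    """Return (version_ok, package_found) for the running interpreter."""
    version_ok = sys.version_info[:2] >= min_version
    # find_spec returns None when a top-level package cannot be found.
    package_found = importlib.util.find_spec(package) is not None
    return version_ok, package_found


if __name__ == "__main__":
    version_ok, package_found = check_environment()
    print("Python version OK:", version_ok)
    print("kaskada importable:", package_found)
```

If the package is reported as missing even though the install succeeded, pip may have installed it into a different interpreter. Running pip as `python -m pip install kaskada` installs into the interpreter that `python` resolves to.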
Pip and pip3 and permissions
Depending on your Python installation and configuration you may have
If you get a permission error when running the
Kaskada is now installed. Kaskada can be used locally by creating a local session.
from kaskada.api.session import LocalBuilder

session = LocalBuilder().build()
Creating a local session will download and run the Kaskada service as a non-persistent Python subprocess. To connect to an existing Kaskada process, initialize the client with the appropriate endpoint.
from kaskada import client

client.init(
    # The host and port of the Kaskada manager API.
    endpoint="localhost:50051",
    # If true, TLS encryption will be required for the API connection.
    is_secure=False,
    # An (optional) string identifying the client.
    client_id="python-client",
)
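Before connecting to an existing process, it can help to confirm that something is actually listening at the endpoint. The following is a minimal sketch using only the Python standard library. The helper `endpoint_reachable` is hypothetical, not part of the Kaskada client, and it only checks that a TCP connection can be opened, not that the listener is a Kaskada manager.

```python
import socket


def endpoint_reachable(endpoint, timeout=2.0):
    """Return True if a TCP connection to "host:port" can be opened."""
    host, _, port = endpoint.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except (OSError, ValueError):
        # OSError: refused, unreachable, or timed out; ValueError: port is not a number.
        return False
```

For example, `endpoint_reachable("localhost:50051")` would be expected to return `False` until a manager is running on that port.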
For more information, please check out the Kaskada Python Client documentation.
IPython is an interactive Python runtime used by Jupyter and other notebooks to evaluate Python code blocks. IPython supports "magic extensions" for customizing how code blocks are interpreted. Kaskada provides a magic extension that simplifies querying Kaskada.
To use Kaskada within a Jupyter notebook you’ll need to have the following pieces of software installed
Once you have both prerequisites installed, ensure that you can run them. Open a terminal (the command prompt on Windows) and check the output of the following commands:
python --version
# Python 3.10.6
jupyter --version
# Selected Jupyter core packages...
# IPython : 7.34.0
# ipykernel : 6.17.0
# ipywidgets : 8.0.2
# jupyter_client : 7.4.4
# jupyter_core : 4.11.2
# jupyter_server : 1.21.0
# jupyterlab : 3.6.1
# nbclient : 0.7.0
# nbconvert : 7.2.3
# nbformat : 5.7.0
# notebook : 6.5.2
# qtconsole : 5.3.2
# traitlets : 5.5.0
Kaskada’s Python library includes notebook customizations that let us write queries in the Fenl language and have the results rendered directly in our notebooks. We need to enable these customizations before we can use them.
This will load the extension into the IPython context. You can verify the install worked by initializing the extension:
%fenl [--as-view AS_VIEW] [--data-token DATA_TOKEN] [--debug DEBUG]
      [--output OUTPUT] [--preview-rows PREVIEW_ROWS]
      [--result-behavior RESULT_BEHAVIOR] [--var VAR]

fenl query magic

optional arguments:
  --as-view AS_VIEW     adds the body as a view with the given name to all
                        subsequent fenl queries.
  --data-token DATA_TOKEN
                        A data token to run queries against. Enables
                        repeatable queries.
  --debug DEBUG         Shows debugging information
  --output OUTPUT       Output format for the query results. One of "df"
                        (default), "json", "parquet" or "redis-bulk".
                        "redis-bulk" implies --result-behavior "final-results"
  --preview-rows PREVIEW_ROWS
                        Produces a preview of the data with at least this many
                        rows.
  --result-behavior RESULT_BEHAVIOR
                        Determines which results are returned. Either
                        "all-results" (default), or "final-results" which
                        returns only the final values for each entity.
  --var VAR             Assigns the QueryResponse to a local variable with the
                        given name. The QueryResponse contains result_url,
                        query and dataframe.
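The defaults and the interaction between --output and --result-behavior described in the help text can be illustrated with a small stand-alone parser. This is only a sketch built with Python's argparse that mirrors the documented options. It is not the parser the extension actually uses, the function names are hypothetical, and treating --preview-rows as an integer is an assumption.

```python
import argparse


def build_fenl_parser():
    """Build a stand-alone parser mirroring the options in the help text above."""
    parser = argparse.ArgumentParser(prog="fenl", add_help=False)
    parser.add_argument("--as-view")
    parser.add_argument("--data-token")
    parser.add_argument("--debug")
    parser.add_argument(
        "--output",
        default="df",
        choices=["df", "json", "parquet", "redis-bulk"],
    )
    # Assumption: the row count is an integer.
    parser.add_argument("--preview-rows", type=int)
    parser.add_argument(
        "--result-behavior",
        default="all-results",
        choices=["all-results", "final-results"],
    )
    parser.add_argument("--var")
    return parser


def parse_fenl_args(line):
    """Parse a magic-style argument line and apply the documented implication."""
    args = build_fenl_parser().parse_args(line.split())
    # Per the help text, "redis-bulk" implies --result-behavior "final-results".
    if args.output == "redis-bulk":
        args.result_behavior = "final-results"
    return args
```

With no arguments the sketch yields `output == "df"` and `result_behavior == "all-results"`, matching the documented defaults.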
For more information, please check out the Fenlmagic Client documentation.
To use Kaskada on the command line, you’ll need to install three components:
The Kaskada command-line executable
The Kaskada manager, which serves the Kaskada API
The Kaskada engine, which executes queries
Each of these is available as a pre-compiled binary in the Releases section of Kaskada’s GitHub repository.
This example assumes you have installed
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/kaskada-ai/kaskada/main/install/install.sh)"
To simplify running the Kaskada components you can move them to a directory in your path.
First, print a colon-separated list of the directories in your
Move the Kaskada binaries to one of the listed locations.
This command assumes that the binaries are currently in your working directory and that your
/usr/local/bin, but you can customize it if your locations are different.
mv kaskada-* /usr/local/bin/
For more information about adding binaries to your path, see this StackOverflow article.
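To confirm that the move worked, you can check whether each binary resolves through your path. The sketch below uses only the Python standard library. The binary names are assumptions based on the `kaskada-*` pattern in the command above, and the helper names are hypothetical.

```python
import os
import shutil

# Assumed binary names, inferred from the `kaskada-*` pattern used above.
BINARIES = ("kaskada-cli", "kaskada-manager", "kaskada-engine")


def path_directories(path_value=None):
    """Split a PATH-style string into its non-empty directories."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    return [d for d in path_value.split(os.pathsep) if d]


def locate_binaries(names=BINARIES, path_value=None):
    """Map each binary name to its resolved location, or None if not found."""
    # shutil.which searches the current PATH when path is None.
    return {name: shutil.which(name, path=path_value) for name in names}


if __name__ == "__main__":
    for name, location in locate_binaries().items():
        print(name, "->", location or "not found on PATH")
```

Any name reported as not found is either in a directory that is not on your path or is not marked executable.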
Authorizing applications on OSX
If you’re using OSX, you may need to unblock the applications. OSX prevents applications you download from running as a security feature. You can remove the block placed on the file when it was downloaded with the following command:
You should now be able to run all three components. To verify they’re installed correctly and executable, try running the following command:
You should see output similar to the following:
A CLI tool for interacting with the Kaskada API

Usage:
  cli [command]

Available Commands:
  completion  Generate the autocompletion script for the specified shell
  help        Help about any command
  load        A set of commands for loading data into kaskada
  query       A set of commands for running queries on kaskada
  sync        A set of commands for interacting with kaskada resources as code

Flags:
      --config string              config file (default is $HOME/.cli.yaml)
  -d, --debug                      get debug log output
  -h, --help                       help for cli
      --kaskada-api-server string  Kaskada API Server
      --kaskada-client-id string   Kaskada Client ID
      --use-tls                    Use TLS when connecting to the Kaskada API (default true)
You can start a local instance of the Kaskada service by running the manager and engine:
kaskada-manager > manager.log 2>&1 &
kaskada-engine serve > engine.log 2>&1 &
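If you would rather start the two processes from a script than from the shell, the same redirection can be sketched with Python's subprocess module. The function names here are hypothetical. The binary names, the `serve` subcommand, and the log file names are taken from the command above, and the binaries are assumed to be on your path.

```python
import subprocess


def start_in_background(command, log_path):
    """Start `command` without waiting for it, appending stdout and stderr to `log_path`."""
    log_file = open(log_path, "ab")
    try:
        process = subprocess.Popen(
            command,
            stdout=log_file,
            stderr=subprocess.STDOUT,  # same effect as the shell's 2>&1
        )
    finally:
        # The child holds its own copy of the file descriptor, so the parent can close its handle.
        log_file.close()
    return process


def start_local_service():
    """Start the manager and the engine, returning both process handles."""
    manager = start_in_background(["kaskada-manager"], "manager.log")
    engine = start_in_background(["kaskada-engine", "serve"], "engine.log")
    return manager, engine
```

Calling `start_local_service()` returns two `Popen` handles. Keeping them lets the script stop the service later with `terminate()`.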