Locally
Kaskada is an event-processing engine comprised of two main components:
-
The
kaskada-engine
component is a stateful compute engine for transforming event data. -
The
kaskada-manager
is an (optionally) stateful API for communicating with clients
Using Kaskada with Python
To use Kaskada with Python you’ll need to have Python version 3.6 or above installed. Once you have Python installed ensure that you can run it. Open a terminal in your OS (command line prompt on Windows) and check the output of the following commands
python --version
# Python 3.10.6
pip install kaskada
Pip and pip3 and permissions
Depending on you Python installation and configuration you may have If you get a permission error when running the
|
Kaskada is now installed. Kaskada can be used locally by creating a local session.
from kaskada.api.session import LocalBuilder
session = LocalBuilder().build()
Creating a local session will download and run the Kaskada service as a non-persistent Python subprocess. To connect to an existing Kaskada process, initialize the client with the appropriate endpoint.
from kaskada import client
client.init(
# The host and port of the Kaskada manager API.
endpoint = "localhost:50051",
# If true, TLS encryption will be required for the API connection.
is_secure = False,
# An (optional) string identifying the client.
client_id = "python-client"
)
For more information, please check out the Kaskada Python Client documentation.
Using Kaskada with IPython (Jupyter)
IPython is an interactive Python runtime used by Jupyter and other notebooks to evaluate Python code blocks. IPython supports "magic extensions" for customizing how code blocks are interpreted. Kaskada provides a magic extension that simplifies querying Kaskada.
To use Kaskada within a Jupyter notebook you’ll need to have the following pieces of software installed
Once you have both prerequisites installed ensure that you can run them. Open a terminal in your OS (command line prompt on Windows) and check the output of the following commands
python --version
# Python 3.10.6
jupyter --version
# Selected Jupyter core packages...
# IPython : 7.34.0
# ipykernel : 6.17.0
# bipywidgets : 8.0.2
# jupyter_client : 7.4.4
# jupyter_core : 4.11.2
# jupyter_server : 1.21.0
# jupyterlab : 3.6.1
# nbclient : 0.7.0
# nbconvert : 7.2.3
# nbformat : 5.7.0
# notebook : 6.5.2
# qtconsole : 5.3.2
# traitlets : 5.5.0
Kaskada’s python library includes notebook customizations that allow us to write queries in the Fenl language but also receive and render the results of our queries in our notebooks. We need to enable these customizations first before we can use them.
%load_ext fenlmagic
This will load the extension into the IPython context. You can verify the install worked by initializing the extension:
%%fenl?
%fenl [--as-view AS_VIEW] [--data-token DATA_TOKEN] [--debug DEBUG]
[--output OUTPUT] [--preview-rows PREVIEW_ROWS]
[--result-behavior RESULT_BEHAVIOR] [--var VAR]
fenl query magic
optional arguments:
--as-view AS_VIEW adds the body as a view with the given name to all
subsequent fenl queries.
--data-token DATA_TOKEN
A data token to run queries against. Enables
repeatable queries.
--debug DEBUG Shows debugging information
--output OUTPUT Output format for the query results. One of "df"
(default), "json", "parquet" or "redis-bulk". "redis-
bulk" implies --result-behavior "final-results"
--preview-rows PREVIEW_ROWS
Produces a preview of the data with at least this many
rows.
--result-behavior RESULT_BEHAVIOR
Determines which results are returned. Either "all-
results" (default), or "final-results" which returns
only the final values for each entity.
--var VAR Assigns the QueryResponse to a local variable with the given
name. The QueryResponse contains result_url, query and dataframe.
For more information, please check out the Fenlmagic Client documentation.
Using Kaskada with the command line (CLI)
To use Kaskada on the command line, you’ll need to install three components:
-
The Kaskada command-line executable
-
The Kaskada manager, which serves the Kaskada API
-
The Kaskada engine, which executes queries
Each of these are available as pre-compiled binaries in the Releases section of Kaskada’s Github repository.
This example assumes you have installed curl
.
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/kaskada-ai/kaskada/main/install/install.sh)"
To simplify running the Kaskada components you can move them to a directory in your path.
First, print a colon-separated list of the directories in your PATH
.
echo PATH
Move the Kaskada binaries to one of the listed locations.
This command assumes that the binaries are currently in your working directory and that your PATH`
includes /usr/local/bin
, but you can customize it if your locations are different.
mv kaskada-* /usr/local/bin/
For more information about adding binaries to your path, see this StackOverflow article.
Authorizing applications on OSX
If you’re using OSX, you may need to unblock the applications. OSX prevents applications you download from running as a security feature. You can remove the block placed on the file when it was downloaded with the following command:
|
You should now be able to run all three components. To verify they’re installed correctly and executable, try running the following command:
kaskada-cli -h
You should see output similar to the following:
A CLI tool for interacting with the Kaskada API
Usage:
cli [command]
Available Commands:
completion Generate the autocompletion script for the specified shell
help Help about any command
load A set of commands for loading data into kaskada
query A set of commands for running queries on kaskada
sync A set of commands for interacting with kaskada resources as code
Flags:
--config string config file (default is $HOME/.cli.yaml)
-d, --debug get debug log output
-h, --help help for cli
--kaskada-api-server string Kaskada API Server
--kaskada-client-id string Kaskada Client ID
--use-tls Use TLS when connecting to the Kaskada API (default true)
You can start a local instance of the Kaskada service by running the manager and engine:
kaskada-manager 2>&1 > manager.log 2>&1 &
kaskada-engine serve > engine.log 2>&1 &