Getting Started with FastScore v1.5

This is a guide for installing and running FastScore. It contains instructions for first-time and novice users, as well as reference instructions for common tasks. This guide was last updated for v1.5 of FastScore.

If you need support or have questions, please email us: support@opendatagroup.com

Installing FastScore

This guide will walk you through installing and running Open Data Group's FastScore microservices suite. The following instructions will generally assume that you're working on a Linux machine. There are slight differences if you're running FastScore on MacOS, which will be indicated by a special note. The differences if you're running Windows have not yet been fully charted.

Prerequisites

The FastScore Microservices Suite is hosted on DockerHub (https://hub.docker.com/u/fastscore/). As such, one must first install Docker. For example, on Ubuntu Linux:

$ sudo apt-get install docker-engine

It's also useful (but not mandatory) to have Docker Compose installed. Installation instructions can be found here: docs.docker.com/compose/install/.

FastScore Microservices Suite

If you've installed Docker Compose, you can skip this step—the docker-compose up -d command will automatically pull the correct Docker images. To do it manually, the FastScore microservices can be installed by pulling the images from DockerHub:

$ docker pull fastscore/model-manage:1.5
$ docker pull fastscore/connect:1.5
$ docker pull fastscore/engine:1.5
$ docker pull fastscore/engine-x:1.5
$ docker pull fastscore/model-manage-mysql:1.5
$ docker pull fastscore/dashboard:1.5

Note: MacOS Users

On MacOS, Docker actually runs inside of a virtual machine (see Docker's documentation here: https://docs.docker.com/machine/ ). In order to make sure all of the ports and IP addresses are handled correctly, you'll need to run the commands from inside this virtual machine.

To start the virtual machine and give it the name "default", use the following command:

$ docker-machine create --driver=virtualbox default

This uses VirtualBox as the driver for the virtual machine. If you don't have it already, you should download the VirtualBox client to manage the docker-machine. Among other things, this can be used to set up port forwarding for the virtual machine, which may be needed later.

To switch to this environment for the default virtual machine, use the following command:

$ eval $(docker-machine env default)

The virtual machine's IP address can be retrieved with the docker-machine ip command, e.g.,

$ docker-machine ip
192.168.99.100

This IP address should be used as the FastScore host machine IP address.

It is additionally useful to install the FastScore Command-Line Interface (CLI).

Installing the FastScore Command-Line Interface (CLI)

The FastScore CLI can be downloaded and installed using the following commands:

wget https://s3-us-west-1.amazonaws.com/fastscore-cli/fastscore-cli-1.5.tar.gz
tar xzf fastscore-cli-1.5.tar.gz
cd fastscore-cli-1.5
sudo python setup.py install

This will install the required dependencies. The FastScore CLI is a Python tool, so it doesn't need to be compiled, and the setup script should automatically add the CLI to $PATH.

Note: Linux Users

python-setuptools and python-dev (i.e. header files) are required. These may or may not be already present on your system. If not, you will need to install them.

For example:

$ sudo apt-get install python-setuptools
$ sudo apt-get install python-dev

Once you've installed the FastScore CLI, check that it works by executing the following command in your terminal.

$ fastscore help
FastScore CLI v1.5
Available commands ('help <command>' for more info):
  attachment       Add/remove model attachments
  config           Configure FastScore
  connect          Establish FastScore connection
  fleet            Check FastScore status
  help             Explain commands and options
  job              Run models
  model            Add/remove models
  pneumo           Listen for notifications
  schema           Add/remove Avro schemas
  sensor           Add/remove sensor descriptors
  snapshot         List/restore/remove model snapshots
  stream           Adds/remove stream descriptors
  tap              Install/remove sensors
Use 'help options' to list available options

This displays a list of all of the FastScore CLI commands.

Info

Entering just fastscore (with no arguments) brings up the FastScore interactive command line.

Configuring and Starting FastScore

Once FastScore has been installed, there are only a few steps needed to get it running.

  1. Write a FastScore configuration file (or copy the example below).
  2. Start the FastScore services, either manually, or via Docker Compose (recommended).
  3. Configure FastScore using the FastScore CLI and configuration file.
  4. Connect to the FastScore Dashboard with your browser.

Let's go through each step carefully.

FastScore Configuration Files

FastScore's microservices architecture requires each microservice component to communicate with other components. These communications are managed by the Connect microservice. To do this, Connect has to be given information about the other microservice components in a configuration file. A sample configuration file is shown below:

fastscore:
  fleet:
    - api: model-manage
      host: localhost
      port: 8002
    - api: engine-x
      host: localhost
      port: 8003

  db:
    type: mysql
    host: localhost
    port: 3306
    username: root
    password: root

  pneumo:
    type: kafka
    bootstrap:
      - localhost:9092
    topic: notify

Configuration files are written in YAML. The configuration file above specifies the host machines and ports for the Model Manage container, the MySQL database container used by Model Manage, and one Engine-X container, all hosted on the same machine. Additionally, Pneumo, an asynchronous notification library used by FastScore, is configured to communicate via Kafka.
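As a quick sanity check before handing the file to Connect, the structure described above can be inspected from Python. The sketch below is illustrative only. It mirrors the sample configuration as a plain dictionary (reading the YAML file itself would need a third-party parser such as PyYAML), and fleet_endpoints is a hypothetical helper, not part of FastScore or its CLI.

```python
# Hypothetical helper; not part of FastScore or its CLI.
# The dictionary mirrors the sample configuration file shown above.
CONFIG = {
    "fastscore": {
        "fleet": [
            {"api": "model-manage", "host": "localhost", "port": 8002},
            {"api": "engine-x", "host": "localhost", "port": 8003},
        ],
        "db": {"type": "mysql", "host": "localhost", "port": 3306,
               "username": "root", "password": "root"},
        "pneumo": {"type": "kafka", "bootstrap": ["localhost:9092"],
                   "topic": "notify"},
    }
}


def fleet_endpoints(config):
    """Return a list of (api, "host:port") pairs, one per fleet entry."""
    endpoints = []
    for entry in config["fastscore"]["fleet"]:
        # Every fleet entry in the sample file carries these three keys.
        missing = {"api", "host", "port"} - set(entry)
        if missing:
            raise ValueError("fleet entry %r is missing %s"
                             % (entry, sorted(missing)))
        endpoints.append((entry["api"],
                          "%s:%d" % (entry["host"], entry["port"])))
    return endpoints


for api, address in fleet_endpoints(CONFIG):
    print(api, address)
```

A list of pairs is used rather than a dictionary so that several engine-x entries, each on its own port, would all be reported.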

Starting and Stopping the FastScore Database

FastScore's Model Manage can use a Docker volume to hold its database of current models and data streams. The volume exists independently of the other FastScore microservices, so the stored data persists when the containers are stopped or replaced. Create the volume with the command

docker volume create --name=db

and delete it with

docker volume rm db

(You may name the volume whatever you wish; we use the name db in this document.) Note that docker volume rm deletes the stored data, and that the volume should exist before starting any FastScore services which use the database.

Add the database to Model Manage in the usual way for Docker volumes:

  1. With the -v flag when using docker run, e.g.,
    docker run -it -d --net=host --rm -v db:/var/lib/mysql fastscore/model-manage-mysql:1.5
  2. As an option in the docker-compose.yml file:
version: '2'
services:
[...]
  database:
    image: fastscore/model-manage-mysql:1.5
    network_mode: "host"
    volumes:
      - db:/var/lib/mysql
[...]
volumes:
  db:
    external: true

Using FastScore with Docker Compose (recommended)

Docker-Compose is a utility that streamlines the configuration and simultaneous execution of multiple Docker containers. A Docker Compose file is a YAML file defining services, networks and volumes. The default path for a Compose file is ./docker-compose.yml, but custom filenames are supported with the -f <filename> flag. The interested reader is directed to Docker's documentation for more information on Compose files. Docker Compose files can be used to initialize or halt all of the FastScore microservices components with a single command. For example,

docker-compose up -d

will start a full suite of FastScore services specified in this Docker Compose file:

version: '2'
services:
  dashboard:
    image: fastscore/dashboard:1.5
    network_mode: "host"
    stdin_open: true
    tty: true
    environment:
      CONNECT_PREFIX: https://127.0.0.1:8001

  connect:
    image: fastscore/connect:1.5
    network_mode: "host"
    stdin_open: true
    tty: true

  engine-1:
    image: fastscore/engine-x:1.5
    network_mode: "host"
    stdin_open: true
    tty: true
    environment:
      CONNECT_PREFIX: https://127.0.0.1:8001

  database:
    image: fastscore/model-manage-mysql:1.5
    network_mode: "host"
    volumes:
      - db:/var/lib/mysql
 
  model-manage:
    image: fastscore/model-manage:1.5
    network_mode: "host"
    stdin_open: true
    tty: true
    depends_on:
      - connect
      - database
    environment:
      CONNECT_PREFIX: https://127.0.0.1:8001

volumes:
  db:
    external: true

Similarly, all of the specified microservices can be stopped with the command

docker-compose down -v

(Here, the -v flag instructs docker-compose to remove any lingering volumes created by the compose file. This does not include external Docker volumes, such as those created by docker volume create).

Note

The example file above is designed for use on a Linux machine with all services running in network=host mode; you will need to modify it for use in other settings (e.g. MacOS). And, as discussed above, you must create the db volume with docker volume create db (or remove the volume link from database).

Starting FastScore Manually

Sometimes, whether for testing purposes or to satisfy your own hardy can-do spirit, you may want to start FastScore manually. The process is somewhat more complicated than using Docker Compose because each service must be managed individually.

docker run -it -d --net=host --rm fastscore/connect:1.5
docker run -it -d --net=host --rm -e "CONNECT_PREFIX=https://127.0.0.1:8001" fastscore/dashboard:1.5
docker run -it -d --net=host --rm -e "CONNECT_PREFIX=https://127.0.0.1:8001" fastscore/engine-x:1.5
docker run -it -d --net=host --rm -e "CONNECT_PREFIX=https://127.0.0.1:8001" fastscore/model-manage:1.5
docker run -it -d --net=host --rm -v db:/var/lib/mysql fastscore/model-manage-mysql:1.5

Connecting to and Configuring FastScore with the FastScore CLI

Once the FastScore suite of services is running, we have to configure Connect using the file we created earlier. Before doing this, check that all the Docker containers are running with the docker ps command. The output should look something like this:

CONTAINER ID   IMAGE                              COMMAND                  CREATED
b06bafae34c6   fastscore/model-manage:1.5         "/bin/sh -c 'bin/mode"   38 min
e1f102689af9   fastscore/dashboard:1.5            "httpd-foreground"       38 min
b06234494675   fastscore/model-manage-mysql:1.5   "/bin/sh -c '/sbin/my"   38 min
2fa05d7aa5c5   fastscore/connect:1.5              "/bin/sh -c 'bin/conn"   38 min
53b3f53b75a2   fastscore/engine:1.5               "/bin/sh -c 'java -ja"   38 min

Configure the FastScore CLI using the following command:

$ fastscore connect https://localhost:8000
Proxy prefix set

Then, use the config set command to set the configuration file for Connect:

$ fastscore config set config.yaml
Configuration set

config.yaml is the configuration file described earlier in this document.

We can then check the status of our containers using the fleet command:

$ fastscore fleet
Name            API           Health
--------------  ------------  --------
engine-x-1      engine-x      ok
model-manage-1  model-manage  ok

Now we're ready to start scoring.
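The fleet check above can also be scripted, for example as part of a deployment test. The following is a minimal sketch that assumes the command prints the three-column table shown above (a header row, a dashed separator, then one row per service); parse_fleet and all_healthy are hypothetical helper names, not FastScore functions.

```python
# Sketch only: assumes 'fastscore fleet' prints the table format shown above.
SAMPLE_OUTPUT = """\
Name            API           Health
--------------  ------------  --------
engine-x-1      engine-x      ok
model-manage-1  model-manage  ok
"""


def parse_fleet(text):
    """Map each service name to its reported health string."""
    health = {}
    for line in text.splitlines()[2:]:  # skip header and separator rows
        fields = line.split()
        if len(fields) == 3:
            name, _api, status = fields
            health[name] = status
    return health


def all_healthy(text):
    """True only if at least one service is listed and all report 'ok'."""
    health = parse_fleet(text)
    return bool(health) and all(s == "ok" for s in health.values())


print(parse_fleet(SAMPLE_OUTPUT))
print(all_healthy(SAMPLE_OUTPUT))
```

To run this against a live installation, the sample text could be replaced with the captured output of the command, for instance via subprocess.check_output(["fastscore", "fleet"], universal_newlines=True).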

Using the FastScore Dashboard

FastScore's Dashboard provides a convenient user interface for reviewing engine status and managing models and streams. However, as compared to the FastScore CLI, it requires a few additional setup steps to get things running.

First, if you are not running FastScore on your local machine (for example, if you have FastScore running on a cloud service platform), you will need to allow incoming and outgoing traffic on port 8000 (used by the FastScore Dashboard). You will also need to have configured FastScore as described in the previous section.

To access the Dashboard, point your browser to the FastScore host machine at port 8000. If all goes well, you will be greeted by the Dashboard's main screen.

On the left-hand side of the Dashboard are two tabs: engine-1 and model-manage-1. These correspond to the Engine microservice and the Model Manage microservice. The green dot indicates that they are currently running correctly. If you have configured additional engine containers, they will also appear on the side.


Note

If, instead, you get an "Application Initialization Error," check your configuration file for any errors, and verify that you have followed all of the FastScore CLI configuration steps. If the fastscore fleet command shows both Model Manage and your Engine containers working properly, then the problem most likely has to do with Dashboard's proxy service or your host machine's network traffic settings.

Working with Models and Streams

FastScore is a streaming analytic engine: its core functionality is to read in records from a data stream, score them, and output that score to another data stream. As such, running any model consists of four steps:

  1. Loading the model
  2. Configuring input and output streams
  3. Setting Engine parameters
  4. Running the model

Creating and Loading Models

Version 1.5 of FastScore supports models in Python, R, Java, PFA, PrettyPFA and C formats. Some setup steps differ slightly between Python/R models and PFA, Java, or C models. As a model interchange format, PFA can provide some benefits in performance, scalability, and security relative to R and Python. PrettyPFA is a human-readable equivalent to PFA. However, as the majority of users will be more familiar with R and Python, we focus on these two languages in this section.

Models via FastScore CLI

The FastScore CLI allows a user to load models directly from the command line. The list of models currently loaded in FastScore can be viewed using the model list command:

$ fastscore model list
Name    Type
------  ------
MyModel Python

Models can be added with model add <name> <file>, and removed with model remove <name>. Additionally, the fastscore model show <name> command will display the named model.

Models via the Dashboard

The Dashboard provides functionality to add and manage models. To upload a model, under the Engine tab, select the "Upload model" button, and choose a model from your local machine. Alternatively, "select model", depicted below, allows you to select an existing model from the model manager by name.

Additionally, models can be added, removed, inspected, and edited from the Models tab under Model Manage:

The screenshot above shows the model manager tab, and an existing "neural_net.py" model. Models can be removed, saved, created, or uploaded from this view. Note that after creating or modifying a model in this view, it must still be selected for use from the Engine tab.


Models in Python and R

All models are added to FastScore and executed using the same CLI commands, namely:

fastscore model add <modelname> <path/to/model.extension>

Note that, in order to determine whether a model is Python or R, Engine-X requires that it have an appropriate file extension (.py for Python, .R for R, .pfa for PFA, and .ppfa for PrettyPFA). Also, in order to score a Python/R model, there are certain constraints on the form the model must take.

FastScore includes both a Python2 and a Python3 model runner. By default, .py files are interpreted as Python2 models. To load a Python3 model, use the file extension .py3, or pass the -type:python3 option to fastscore model add:

fastscore model add -type:python3 my_py3_model path/to/model.py

Python Models

Python models must declare a one-argument action() function. The minimal example of a Python model is the following:

# fastscore.input: input-schema
# fastscore.output: output-schema

def action(datum):
    yield 0

This model will produce a 0 for every input.

Additionally, Python models may declare begin() and end() functions, which are called at initialization and completion of the model, respectively. A slightly more sophisticated example of a Python model is the following:

# fastscore.input: input-schema
# fastscore.output: output-schema

import cPickle as pickle

def begin(): # perform any initialization needed here
    global myObject
    with open('object.pkl', 'rb') as f: # pickle files should be opened in binary mode
        myObject = pickle.load(f)
    # do something with the unpickled object here, if needed

def action(datum): # datum is expected to be of the form '{"x":5, "y":6}'
    record = datum
    x = record['x']
    y = record['y']
    yield x + y 

def end():
    pass

This model returns the sum of two numbers. Note that we are able to import Python's standard modules, such as the pickle module. Non-default packages can also be added using Import Policies, as described here. Custom classes and packages can be loaded using attachments, as described in the Gradient Boosting Regressor tutorial.
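Before uploading a model, it can be handy to exercise its functions locally. The sketch below mimics the calling convention described above (begin() once, action() once per input record, end() once at completion). It is a hypothetical test harness, not the FastScore model runner, and run_locally is a name introduced here for illustration.

```python
# Local test harness (hypothetical; this is not the FastScore model runner).

def action(datum):
    # Same scoring logic as the example model above: sum the two fields.
    yield datum["x"] + datum["y"]


def run_locally(records, model_action, begin=None, end=None):
    """Feed each record to the model's action() and collect what it yields."""
    if begin is not None:
        begin()
    scores = []
    for record in records:
        # action() is a generator, so one input may produce several outputs.
        scores.extend(model_action(record))
    if end is not None:
        end()
    return scores


print(run_locally([{"x": 5, "y": 6}, {"x": 1.5, "y": 2.5}], action))
# prints [11, 4.0]
```

Because the harness collects everything the generator yields, it also works for models that emit zero or several scores per input.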

R Models

R models feature much of the same functionality as Python models, as well as the same constraint: the user must define an action function to perform the actual scoring. For example, the analogous model to the Python model above is

# fastscore.input: input-schema
# fastscore.output: output-schema

# Sample input: {"x":5.0, "y":6.0}
action <- function(datum) {
  x <- datum$x
  y <- datum$y
  emit(x + y)
}

Input and Output Schema

FastScore enforces strong typing on both the inputs and outputs of its models using Avro schemas. For R and Python models, this typing is enforced by specifying schema names in smart comments at the top of the model file:

# fastscore.input: array-double
# fastscore.output: double

Python and R models must specify schemas for their inputs and outputs. PrettyPFA and PFA models already contain the input and output schema as part of the model definition, so they do not require a schema attachment.

For example, a model that expects to receive records of two doubles as inputs might have the following schema:

{
  "name": "Input",
  "type": "record",
  "fields" : [
    {"name": "x", "type": "double"},
    {"name": "y", "type": "double"}
  ]
}

The model might then produce a stream of doubles as its output:

{
  "name": "Output",
  "type": "double"
}
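To illustrate what these two schemas accept, here is a minimal Python sketch of a conformance check. It is not a full Avro implementation and not how FastScore performs its type checking: it handles only the "double" primitive and flat records, which is all the two schemas above use, and it treats any JSON number (integer or float) as an acceptable double, which is an assumption.

```python
# Minimal sketch, not a full Avro implementation.
INPUT_SCHEMA = {
    "name": "Input",
    "type": "record",
    "fields": [
        {"name": "x", "type": "double"},
        {"name": "y", "type": "double"},
    ],
}
OUTPUT_SCHEMA = {"name": "Output", "type": "double"}


def is_double(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def conforms(datum, schema):
    """Check a decoded JSON value against a 'double' or flat record schema."""
    kind = schema["type"]
    if kind == "double":
        return is_double(datum)
    if kind == "record":
        fields = schema["fields"]
        if not isinstance(datum, dict):
            return False
        # The record must contain exactly the declared field names.
        if set(datum) != {f["name"] for f in fields}:
            return False
        return all(conforms(datum[f["name"]], {"type": f["type"]})
                   for f in fields)
    raise NotImplementedError("unsupported schema type: %r" % kind)


print(conforms({"x": 5.0, "y": 6.0}, INPUT_SCHEMA))
print(conforms(11.0, OUTPUT_SCHEMA))
```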

Input and output schema must be uploaded separately to FastScore. To upload the schema to FastScore with the CLI, use the following commands:

fastscore schema add input input.avsc
fastscore schema add output output.avsc

Schemas can also be managed from within the Dashboard, using the Model Manage view.

Input and Output Streams

Before a model can be run, it has to have some data to run on. Input and output streams are used to supply the incoming data to the model, and to return the corresponding scores. Currently, eight types of stream transports are supported: file, Kafka, HTTP, TCP, UDP, ODBC, debug, and console streams. All of these types are configured using a Stream Descriptor file.

Stream Descriptors are small JSON files containing information about the stream. An example of a Stream Descriptor for a Kafka stream is displayed below:

{
  "Description": "read Avro-typed JSON records from a Kafka stream",
  "Transport": {
    "Type": "kafka",
    "BootstrapServers": ["127.0.0.1:9092"],
    "Topic": "data-feed-1",
    "Partition": 0
  },
  "Encoding": "json",
  "Schema": { "type": "record", ... }
}

Stream descriptors are documented in more detail on the stream descriptor page. The easiest type of stream to use is a file stream, which reads or writes records directly from/to a file inside of the FastScore engine container. Here is an example of such a stream:

{
  "Description": "read input from the specified file",
  "Loop": false,
  "Transport": {
    "Type": "file",
    "Path": "/root/data/neural_net_input.jsons"
  },
  "Envelope": "delimited",
  "Encoding": "json",
  "Schema": {"type": "array", "items": "double"}
}

This file stream expects each line of the neural_net_input.jsons file to be a vector of doubles, encoded as a JSON array, with records delimited by newlines. The file is located in the /root/data/ directory of the engine container. The "Loop": false line tells FastScore to stop reading after reaching the end of the file, as opposed to looping over the lines in the file.
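A file in this newline-delimited format is easy to produce with Python's standard json module. The sketch below writes one JSON-encoded vector per line and reads the file back. The sample vectors are made up, and a temporary directory stands in for wherever the data file is actually kept.

```python
import json
import os
import tempfile

# Hypothetical sample data: each inner list becomes one line of the file.
vectors = [[0.1, 0.2, 0.3], [1.0, 2.0, 3.0]]


def write_jsons(path, rows):
    """Write one JSON-encoded row per line (newline-delimited JSON)."""
    with open(path, "w") as handle:
        for row in rows:
            handle.write(json.dumps(row) + "\n")


def read_jsons(path):
    """Read the file back, decoding each non-empty line as JSON."""
    with open(path) as handle:
        return [json.loads(line) for line in handle if line.strip()]


# A temporary directory stands in for the directory that holds the data.
data_dir = tempfile.mkdtemp()
input_path = os.path.join(data_dir, "neural_net_input.jsons")
write_jsons(input_path, vectors)
print(read_jsons(input_path))
```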

Streams via FastScore CLI

The FastScore CLI can be used to configure data streams. The stream list command displays a list of existing streams:

$ fastscore stream list
demo-1
demo-2

By default, two demo file streams are included in FastScore. The demo-1 dataset consists of random numbers. The demo-2 dataset consists of lists of JSON records with the following Avro schema:

{
  "type":"array", 
  "items": { 
    "type": "record", 
    "fields": [
      {"name":"x", "type":"double"}, 
      {"name":"y", "type":"string"}] 
  }
}

These demo streams can be used to test whether or not a simple model is working correctly.

Additional streams can be added using the fastscore stream add <stream-name> <stream-descriptor-file> command. Existing streams can be sampled (displaying the most recent items of the stream) with fastscore stream sample <stream-name>.

For filestreams, it is easiest to manage container input and output by linking a directory on the host machine to the engine container. This can be done in the Docker-Compose file by modifying the engine service to the following:

[...]

  engine-1:
    image: fastscore/engine-x:1.5
    network_mode: "host"
    stdin_open: true
    tty: true
    environment:
      CONNECT_PREFIX: https://127.0.0.1:8001
    volumes:                           # new volume section
      - ./data:/root/data


[...]

This will link the ./data directory on the host machine to the /root/data directory of the engine container. A filestream from the file "mydata.jsons" located in data on the host machine can then be accessed by FastScore using the stream descriptor

{
  "Loop": false,
  "Transport": {
    "Type": "file",
    "Path": "/root/data/mydata.jsons"
  },
  "Envelope": "delimited",
  "Encoding": "json",
  "Schema": [...]
}

A similar stream descriptor can be used for the output stream to write the output scores to a file in the same directory.
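Since the input and output descriptors share most of their fields, they can be generated from one small function. The sketch below is an illustration under stated assumptions: file_stream_descriptor is a hypothetical helper, the file names and schemas are placeholders, and the guide does not confirm which fields an output descriptor needs, so "Loop" is left out of the output descriptor here.

```python
import json


def file_stream_descriptor(path, schema, loop=None):
    """Build a file stream descriptor using the fields shown in this guide."""
    descriptor = {
        "Transport": {"Type": "file", "Path": path},
        "Envelope": "delimited",
        "Encoding": "json",
        "Schema": schema,
    }
    if loop is not None:
        # "Loop" controls re-reading of an input file; omitted when unset.
        descriptor["Loop"] = loop
    return descriptor


# Assumed example values: file names and schemas are placeholders.
input_stream = file_stream_descriptor(
    "/root/data/mydata.jsons", {"type": "array", "items": "double"}, loop=False)
output_stream = file_stream_descriptor(
    "/root/data/mydata_scores.jsons", {"type": "double"})

print(json.dumps(input_stream, indent=2))
print(json.dumps(output_stream, indent=2))
```

The printed JSON can be saved to a file and registered with fastscore stream add <stream-name> <stream-descriptor-file>.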

Warning

When using Docker volume linking to link a directory on the host machine to the Engine instance, Docker must have privileges to read and write from the specified directory. Additionally, the directory on the container must be chosen carefully, as its contents will be overwritten with the contents of the corresponding host directory upon linking. /root/data is safe (as it only contains the demo datafiles), but other directories on the container (e.g., /usr) may not be.

Streams via the Dashboard

Analogously to models, streams can also be manipulated from the Dashboard. Selecting the "Streams" tab under Model Manage displays the following view:


On the left, existing Stream Descriptors are displayed. New Stream Descriptors can be added and existing ones edited from this view. The example above displays a simple file stream, which will load the neural_net_input.jsons file located in the /root/data directory of the Engine Docker container.

Engine Parameters

Engine parameters, such as the number of Engine instances currently running, as well as information about the model, are displayed on the Dashboard Engine tab.

Running a Model in FastScore

When using the Dashboard, models will begin scoring as soon as both the model and input/output streams are set from the Engine tab, and no further action from the user is required. Various statistics about performance and memory usage are displayed on the Engine tab.

To run a model using the FastScore CLI, use the fastscore job sequence of commands:

  • fastscore job run <model-name> <input-stream-name> <output-stream-name> runs the model named <model-name> with the specified input and output streams.
  • fastscore job stop halts the currently running model.
  • fastscore job status and fastscore job statistics display various information about the currently running job.
    Some of the statistics displayed by the fastscore job statistics command, such as memory usage, are also shown on the Dashboard.
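Putting the CLI steps from this guide together, the full sequence (add the model, add both streams, run the job) can be assembled programmatically. This sketch only builds the command lines, using the subcommands documented above; scoring_commands is a hypothetical helper and all names and file paths passed to it are placeholders.

```python
def scoring_commands(model, model_file, in_stream, in_file,
                     out_stream, out_file):
    """Return the CLI invocations, in order, as argument lists."""
    return [
        ["fastscore", "model", "add", model, model_file],
        ["fastscore", "stream", "add", in_stream, in_file],
        ["fastscore", "stream", "add", out_stream, out_file],
        ["fastscore", "job", "run", model, in_stream, out_stream],
    ]


# Placeholder names and paths, chosen for illustration only.
commands = scoring_commands(
    "my-model", "models/my_model.py",
    "my-input", "streams/input.json",
    "my-output", "streams/output.json")

for argv in commands:
    print(" ".join(argv))
```

Each argument list can be passed as-is to subprocess.check_call on a machine where the FastScore CLI is installed and connected.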

This concludes the FastScore Getting Started guide. Additional FastScore API documentation is available at http://docs.opendatagroup.com/. Happy scoring!
