This project integrates SQLFlow, Ant Group's open-source machine learning tool, into the web-based notebook Apache Zeppelin, adding SQLFlow to the interpreter languages that Zeppelin supports.
Integrating SQLFlow into Apache Zeppelin:
- Enriches SQLFlow's usage scenarios: users can also run SQLFlow in a notebook environment;
- Expands the interpreter languages supported by Zeppelin;
- Lets SQLFlow be used together with the other languages and data processors on the platform, which demonstrates its convenience in AI applications.
Here are examples of training a TensorFlow DNNClassifier model on the sample data Iris.train and running prediction with the trained model in the Zeppelin environment.
The installation is divided into the following steps:
- Prepare the interpreter project's jar package zeppelin-sqlflow-0.9.0.jar and the configuration file interpreter-setting.json;
- Switch to the Zeppelin installation directory and create the interpreter/sqlflow subdirectory:
mkdir ~/zeppelin-0.9.0-SNAPSHOT/interpreter/sqlflow
- Copy the jar package and the configuration file from step 1 into the directory created in step 2:
# These are the paths on my machine, for reference only.
cp ~/zeppelin-sqlflow-0.9.0.jar ~/zeppelin-0.9.0-SNAPSHOT/interpreter/sqlflow
cp ~/interpreter-setting.json ~/zeppelin-0.9.0-SNAPSHOT/interpreter/sqlflow
- Start Zeppelin:
cd ~/zeppelin-0.9.0-SNAPSHOT/bin/
./zeppelin-daemon.sh start
- Refer to the OperationGuide manual for detailed usage steps.
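The copy steps above can also be wrapped in a small helper so that they work with any installation path. The following is a minimal sketch, not part of the project: the function name install_sqlflow_interpreter and its two arguments (the Zeppelin installation directory and the directory holding the two files from step 1) are hypothetical and chosen only for illustration.

```shell
# Hypothetical helper: copies the SQLFlow interpreter files into a Zeppelin install.
# Usage: install_sqlflow_interpreter ZEPPELIN_HOME ARTIFACT_DIR
install_sqlflow_interpreter() {
  if [ "$#" -ne 2 ]; then
    echo "usage: install_sqlflow_interpreter ZEPPELIN_HOME ARTIFACT_DIR" >&2
    return 2
  fi

  local zeppelin_home="$1"
  local artifact_dir="$2"
  local target="$zeppelin_home/interpreter/sqlflow"
  local f

  # Refuse to continue if the Zeppelin directory does not exist.
  if [ ! -d "$zeppelin_home" ]; then
    echo "not a directory: $zeppelin_home" >&2
    return 1
  fi

  # Both files from step 1 must be present before anything is copied.
  for f in zeppelin-sqlflow-0.9.0.jar interpreter-setting.json; do
    if [ ! -f "$artifact_dir/$f" ]; then
      echo "missing file: $artifact_dir/$f" >&2
      return 1
    fi
  done

  # mkdir -p makes the step safe to repeat if the directory already exists.
  mkdir -p "$target" || return 1
  cp "$artifact_dir/zeppelin-sqlflow-0.9.0.jar" \
     "$artifact_dir/interpreter-setting.json" \
     "$target/" || return 1

  echo "installed SQLFlow interpreter into $target"
}
```

With the paths used in the steps above, the call would be install_sqlflow_interpreter ~/zeppelin-0.9.0-SNAPSHOT ~, followed by ./zeppelin-daemon.sh start from the bin directory. Checking for both files first means a missing jar or configuration file is reported before the interpreter directory is touched.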
The project was initiated by the foundation platform development team of the IT department at PCCC. It aims to extend Zeppelin's capabilities in data exploration and model training, giving data scientists a more comprehensive and richer set of data processing tools.