Simply add basket4py to any Python (especially data science) project, and you will be able to run Jupyter notebooks from a virtual, Vagrant-enabled development environment. Of course, you can also use basket4py to get an easy start with Python and Jupyter notebooks.
This project gives you an easy start with python:
- Develop your python scripts within jupyter notebooks
- Use a full-blown Anaconda stack for data science tasks
- Visualize your data with
  - Matplotlib
  - seaborn
- The whole environment is set up within a virtual Linux box, so your computer won't be impacted by any installation of the Python environment.
- The provisioning is done with chef, so the chef recipes can easily be customized.
- Everything is based on VirtualBox and Vagrant, so the whole setup is portable from one computer to the next and works independently of your OS (i.e. it works as well with OSX as with Windows).
If you know how to install programs on a Mac or PC, you should be able to get everything up and running. If not, ask someone to help you.
Follow these simple steps to install everything you need to start programming:
- Install VirtualBox.
- Install Vagrant.
- Download the zip file from GitHub and extract it somewhere.
- Use a terminal (DOS prompt / cmd) and navigate to the folder that contains the extracted files. You should find a file named vagrantfile.
- Type in vagrant up. This command will prepare a "virtual computer" on your PC or Mac. Everything will be installed within this "virtual computer", so there won't be any interference with other programs on your machine.
- Type in vagrant provision. This command may take even longer (leave it for the night). It will install a modern Python development environment.
Thanks to @zyx954 for these instructions:
- Go to https://github.com/rreben/basket4py
- Download this repo
- Do a vagrant up (just like you did with the other two repos, original and fork).
- Run vagrant provision to get the Python stack installed.
- Now you should have a fully functional anaconda stack.
- Open a browser (e.g. Safari) and type in 192.168.33.12:8888. You should see a Jupyter notebook now.
- Type in vagrant ssh in your terminal. Now the command prompt will change. You are now logged in to your Linux virtual guest machine.
- Use sudo -i pip install twitter at the command line from within the guest machine to add the twitter framework.
- Use sudo -i pip install prettytable to install prettytable as well.
- Now you should be able to use the code examples from the book.
- You can either type them in or copy the notebooks: copy the *.ipynb files (IPython notebook files) from the directory ipnb in the mining-the-social-web folder to the notebooks folder in the basket4py repo.
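The notebook copy step above can also be scripted. The following is only a minimal sketch using the Python standard library; the function name `copy_notebooks` is hypothetical and not part of basket4py, and the two folder paths depend on where you extracted the repos.

```python
import shutil
from pathlib import Path


def copy_notebooks(source_dir, target_dir):
    """Copy every *.ipynb file from source_dir into target_dir.

    Returns the sorted list of file names that were copied.
    Hypothetical helper for illustration; not part of basket4py.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    # Create the target folder if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for notebook in sorted(source.glob("*.ipynb")):
        # copy2 also keeps the file's modification time.
        shutil.copy2(notebook, target / notebook.name)
        copied.append(notebook.name)
    return copied
```

A call such as `copy_notebooks("mining-the-social-web/ipnb", "basket4py/notebooks")` would then copy the notebooks, assuming both folders sit next to each other in the current directory.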
After the installation, open http://192.168.33.12:8888 in your web browser to start the environment. Click on a notebook and run the code blocks in the order in which they occur in the notebook.
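If the page does not load, it can help to check from the host whether anything is listening on that address at all. This is a minimal sketch with the standard `socket` module; the helper name `is_port_open` is hypothetical, and a successful connection only shows that some service answers on the port, not that it is Jupyter.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

On the host you could call `is_port_open("192.168.33.12", 8888)`. If it returns False, the guest machine is probably not running or the Jupyter service did not start.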
- Use vagrant status to check whether the vagrant machine is up and running.
- Start and stop vagrant via vagrant up and vagrant halt (do not use vagrant suspend in most cases).
- Use vagrant destroy if you have to restart completely from scratch or have to reclaim the disk space.
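The status check above can be automated from Python. The sketch below assumes that vagrant is on your PATH and relies on vagrant's `--machine-readable` output, which prints comma-separated lines of the form timestamp,target,type,data. The function names are hypothetical and not part of basket4py.

```python
import subprocess


def parse_vagrant_states(machine_readable_output):
    """Map each machine name to its state.

    Expects the text printed by `vagrant status --machine-readable`.
    """
    states = {}
    for line in machine_readable_output.splitlines():
        fields = line.strip().split(",")
        # Assumed layout: timestamp,target,type,data...
        if len(fields) >= 4 and fields[2] == "state":
            states[fields[1]] = fields[3]
    return states


def vagrant_states(project_dir="."):
    """Run vagrant in project_dir (the folder with the vagrantfile)."""
    result = subprocess.run(
        ["vagrant", "status", "--machine-readable"],
        cwd=project_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_vagrant_states(result.stdout)
```

For a running default machine, `vagrant_states()` should return something like a dictionary with the key `default` and the value `running`; the exact state names come from vagrant and may differ between providers.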
- Vagrant is used to install Python 3, Jupyter and some other tools from the Anaconda ecosystem on a virtual machine.
- Vagrant is instructed to use ansible for the installation.
- The virtual machine is provided via Oracle's VirtualBox.
- A web server is running on the virtual (guest) computer. This server serves the Jupyter notebooks.
- These notebooks can be accessed via port forwarding from the host computer.
- This way all the tutorials are brought to the users browser.
This work is inspired by Matthew A. Russell's work on Mining the Social Web, where I found out about IPython (now Jupyter) and how to use Vagrant and chef to prepare an easy-to-deploy development environment.
- Right now I have switched to ansible, thanks to @fhenri for doing the work of porting the project to ansible.
- The project is based on the anaconda installer from @andrewrothstein
I used the following chef recipes to cook up the development environment in former versions of this repo:
- anaconda
- apt
- bzip2 chef cookbook from John Bellone
- compat chef cookbook from John Keiser
- packagecloud
- runit chef cookbook from the Heavy Water Operations, LLC.
- tar chef cookbook from the Cramer Development, Inc.
- The vagrantfile is done, so setting up the development environment is working.
- The anaconda stack is working
You might see a warning during vagrant up, telling you that the guest additions do not match the version of VirtualBox.
The effect might be that the directories with the Jupyter notebooks are not mounted correctly. In this case you will see that Jupyter is running (192.168.33.12:8888 will show a webpage), but you will not see any meaningful tutorials.
If this happens, you have to update your VirtualBox installation to the newest version. Use vagrant destroy to restart from scratch, then use vagrant up to install again (do this on a strong Wi-Fi network).
In most cases, this should solve your problems. But if the message "The guest additions on this VM do not match the installed version of VirtualBox! ..." persists, you might try to issue vagrant plugin install vagrant-vbguest and restart vagrant. A persisting message might indicate further problems with the guest additions.
Use vagrant ssh to log in to your guest machine. There you might issue ipython notebook --help to learn more about starting the Jupyter service.
If you are stuck with the installation, please create an issue on GitHub. I will try to help you then.
It might happen that the guest machine is not working with the correct system time. This will lead to issues with various APIs, especially the Twitter API. Just do a vagrant halt followed by vagrant up to sync the time of the guest machine again.
On my Windows 7 machine the VirtualBox VM takes up about 1.5 to 5 GB of disk space, and vagrant uses around 750 MB. As I have an SSD as my first disk, I need to move this to my secondary disk. To achieve this:
- Create a new directory on your target disk. Set the VAGRANT_HOME environment variable to point to this directory. On Windows, right-click in Explorer and go to "Advanced system settings" ("Erweiterte Systemeinstellungen" on a German system) -> "Environment Variables" ("Umgebungsvariablen").
- Create a different new directory on your target disk for your VirtualBox. Open the VirtualBox app. In the settings, specify this directory to store the VirtualBox files.
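To see how much disk space the two directories actually use before and after moving them, a small script can add up the file sizes. This is a minimal sketch; the helper name `directory_size_bytes` is hypothetical, and the example paths in the usage note are the usual default locations, which may differ on your system.

```python
import os


def directory_size_bytes(path):
    """Return the total size in bytes of all regular files below path.

    A missing directory counts as 0 bytes; symbolic links are skipped.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total
```

For example, `directory_size_bytes(os.path.expanduser("~/.vagrant.d")) / 1e9` gives the size of the default vagrant home in GB, and the same call with `"~/VirtualBox VMs"` covers the default VirtualBox machine folder (both paths are assumptions about a default installation).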
On my MacBook I need to keep the VirtualBox files on a flash drive. This leads, however, to some obstacles: vagrant will not be able to provision the virtual machine, because the certificate (private key) used to log in to the virtual machine is fully accessible. For security reasons ssh will not accept a fully accessible key file, so vagrant cannot log in to the guest machine it created.
So after using vagrant up to download and install the virtual machine (takes about 20 min), there might be an error about the permissions on the private-key file used for ssh access to the virtual machine.
In this case do the following:
- Let us assume that /project is the folder where the vagrant file lives.
- Go to /project/.vagrant/machines/default/virtualbox and copy the key file to a folder in your local home folder (let us assume /Users/username/certificates/), where you can change the file permissions via
chmod 0600 key_file
- Now set up vagrant to find the file in this folder:
- Open the vagrant file and add the last line below the first two lines, so that this block looks as follows:
override.vm.box = "precise64"
override.vm.box_url = "http://files.vagrantup.com/precise64.box"
config.ssh.private_key_path = "/Users/username/certificates/private_key"
- Note: with version 1.8.5 of vagrant the behaviour changed a little:
- The key file will now be named insecure_private_key
- Vagrant will try to substitute this key file with a different secure key file, however this will lead to unrecoverable errors. So you have to add the following lines to the vagrant file (or uncomment and adapt the lines in the vagrant file).
override.vm.box = "ubuntu/trusty64"
config.ssh.private_key_path = "/Users/username/certificates/insecure_private_key"
config.ssh.insert_key = false
- Note: After using vagrant destroy you first have to deactivate config.ssh.private_key_path again in the Vagrantfile, because the next vagrant up will create a new guest virtual machine with a new and different certificate.
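The copy and chmod 0600 steps described above can be sketched in Python as follows. This is only an illustration under the assumption of a POSIX system (Mac or Linux); the helper names are hypothetical and not part of vagrant or basket4py.

```python
import os
import shutil
import stat
from pathlib import Path


def copy_key_with_owner_only_access(key_file, target_dir):
    """Copy key_file into target_dir and restrict it to the owner.

    Returns the path of the copy. Hypothetical helper for illustration.
    """
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(key_file).name
    shutil.copyfile(key_file, target)
    # Read and write for the owner only, the same as chmod 0600.
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)
    return target


def is_owner_only(path):
    """Return True if neither group nor others have any permission on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

With the folder names assumed above, the call would be `copy_key_with_owner_only_access("/project/.vagrant/machines/default/virtualbox/private_key", "/Users/username/certificates")`. Afterwards `is_owner_only` on the returned path should be True, which is the condition ssh checks before it accepts the key file.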
- Install VirtualBox. Unfortunately, the target directory is fixed (Mac).
- Install Vagrant. The target directory can be changed to point to a flash drive (Mac).
- Install the git client for Windows. Do check the option for the git and bash command line tools (it's guarded by a warning). Otherwise vagrant ssh will not work at all (Windows). Alternatively, you could use PuTTY to log in to your guest machine.
- Clone the repo from GitHub.
- Use vagrant up to download and install the guest machine (also use this to bring the virtual machine up after halt or suspend).
- Use vagrant status to check whether the vagrant machine is up and running.
- You might have to update via vagrant box update.
- Start and stop vagrant via vagrant up and vagrant halt (do not use vagrant suspend in most cases).
- Use vagrant provision to start the provisioning of the machine. In our case this will start the chef machinery to install the Python environment. You can restart this command.
- Use vagrant destroy if you have to restart completely from scratch or have to reclaim the disk space.
- Go to the folder that contains the vagrantfile and issue vagrant plugin install vagrant-vbguest
- see this blog for details.
- Use Github to open tickets for support questions.
- Follow me on Twitter: @r_rbn
- Tweet using #basket4py, or send me a DM.
- Forking, starring, or following the GitHub repo would be great.