Vega renames my columns #1313

Closed
HarvsG opened this issue Jan 27, 2019 · 8 comments

HarvsG commented Jan 27, 2019

My chart code is as below

import altair as alt

# df is a pandas DataFrame loaded earlier in the notebook
cols_for_chart = ['column_A', 'column_B', 'column_C', 'number', 'location']

chart = alt.Chart(df[cols_for_chart]).mark_point().encode(
    x='column_B:Q',
    y='column_A:Q',
    color='location:N',
    tooltip='number:N',
)
chart

However, when I run the code I get a blank chart, and the cause doesn't seem to be any of the issues listed here. When I look at the underlying Vega spec, I see that some, but not all, of the columns have been renamed to the form _N (for example _0 and _1) for no discernible reason.

I am running this install of Jupyter on a Raspberry Pi 3 with Python 3.5:

 "values": [
        {
          "number": 74,
          "_1": 0.407222977488288,
          "location": "XXX_IP",
          "column_C": "severe",
          "_0": 0.868440022280233
        },

However, when I run the same code on my Ubuntu install (Anaconda, Python 3.6), I get a perfect graph :) and the Vega spec is as below:

      "values": [
        {
          "column_A": 0.868443337280233,
          "column_B": 0.407303973338288,
          "column_C": "severe",
          "number": 76,
          "location": "XXX_IP"
        },
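
One way to spot this kind of renaming programmatically is to compare the column names passed to the chart with the keys that actually appear in the spec's "values" records. The sketch below is illustrative only: `diff_field_names` is a hypothetical helper, not part of Altair or Vega, and the sample record is copied from the Raspberry Pi output above.

```python
def diff_field_names(expected, records):
    """Compare expected column names with the keys found in spec records.

    Returns (missing, unexpected): names that were expected but absent,
    and keys that are present but were never requested.
    """
    seen = set()
    for record in records:
        seen.update(record.keys())
    expected = set(expected)
    return sorted(expected - seen), sorted(seen - expected)


cols_for_chart = ["column_A", "column_B", "column_C", "number", "location"]

# Record copied from the Raspberry Pi output above.
pi_values = [
    {"number": 74, "_1": 0.407222977488288, "location": "XXX_IP",
     "column_C": "severe", "_0": 0.868440022280233},
]

missing, unexpected = diff_field_names(cols_for_chart, pi_values)
print("missing:", missing)        # columns that no longer appear
print("unexpected:", unexpected)  # generated replacement names
```

Running the same check against the Ubuntu output should report nothing missing and nothing unexpected, since all five column names survive there.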

When I passed the whole df to see what the Vega output looked like, about two thirds of the columns had been renamed in this fashion. It looked as if every column name containing whitespace was affected, but replacing the whitespace with underscores did not fix the issue.

Edit: it seems to be doing this with any special character - so far I have identified ' ', ')', '(', '/' as characters that cause this behaviour.

Collaborator

jakevdp commented Jan 28, 2019

Several characters have special meanings in Vega-Lite field names. If your column names contain them, the columns should be renamed. See the description of the "field" property here: https://altair-viz.github.io/user_guide/encoding.html#encoding-channel-options
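
A minimal sketch of such a rename, assuming it is acceptable to replace every character other than ASCII letters, digits, and underscores. `safe_column_names` and the sample column names are hypothetical, chosen to include the characters reported above; this is not an Altair function.

```python
import re


def safe_column_names(columns):
    """Map each column name to one containing only letters, digits and '_'.

    Names that sanitize to the same string get a numeric suffix so the
    mapping stays one-to-one.
    """
    mapping = {}
    used = set()
    for name in columns:
        base = re.sub(r"\W", "_", str(name), flags=re.ASCII).strip("_") or "col"
        candidate = base
        suffix = 1
        while candidate in used:
            suffix += 1
            candidate = f"{base}_{suffix}"
        used.add(candidate)
        mapping[name] = candidate
    return mapping


# Hypothetical column names containing ' ', '(', ')' and '/'.
mapping = safe_column_names(["dose (mg)", "in/out", "location"])
print(mapping)
```

With pandas, the mapping could then be applied with `df = df.rename(columns=mapping)` before the DataFrame is passed to `alt.Chart`.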

Collaborator

jakevdp commented Jan 28, 2019

Looking closer at this, the issue seems to come from different versions of the vega/vega-lite JavaScript dependencies. These are packaged with the JupyterLab/Jupyter notebook installation itself, rather than with Altair. I suspect fixing this will require an update of the version of Jupyter notebook used in https://github.com/kleinee/jns.

Author

HarvsG commented Jan 29, 2019

On the pi running jns (graphing error)

(jns) $ jupyter notebook --version
> 5.7.4
(jns) $ jupyter --version
> 4.4.0
(jns) $ jupyter lab --version
> 0.35.4
(jns) $ python --version
> 3.5.3

On my laptop running conda (no graphing error)

(py36) $ jupyter notebook --version
> 5.7.0
(py36) $ jupyter --version
> 4.4.0
(py36) $ jupyter lab --version
> 0.35.4
(py36) $ python --version
> 3.6.4

I'm not sure how to check the versions of the JavaScript dependencies.

Collaborator

jakevdp commented Jan 29, 2019

Unfortunately, it's currently quite difficult to inspect the javascript versions in your currently-installed extensions. See vega/ipyvega#97

But if your vega JS is behaving differently between machines, I think it's reasonable to guess that you have different library versions installed on each machine.

Collaborator

jakevdp commented Jan 29, 2019

More details: if you're using JupyterLab, the vega JS libraries are bundled with JupyterLab itself, and are determined by the version of JupyterLab you are using. If you're using Jupyter notebook, the vega JS libraries must be provided by the vega Python package. Different versions of the vega package imply different versions of the vega/vega-lite JavaScript dependencies:

>>> import vega
>>> vega.__version__
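
As a rough alternative that does not raise when a package is absent, the installed versions can be read from package metadata. This is a sketch under assumptions: `installed_versions` is a hypothetical helper, the distribution names are assumed to match the packages discussed here, and `importlib.metadata` needs Python 3.8 or newer, so it would not run as-is on the Python 3.5 install mentioned earlier.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names assumed to match the packages discussed in this thread.
report = installed_versions(["altair", "vega", "notebook", "jupyterlab"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Note that this reports only the Python package versions; it does not reveal which JavaScript library versions a JupyterLab build bundles.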

Author

HarvsG commented Jan 29, 2019

pi:

import vega
vega.__version__
'2.0.0'

laptop:

import vega
vega.__version__
--------------------------------------------------------------------------
ModuleNotFoundError                      Traceback (most recent call last)
<ipython-input-4-ab83a66286da> in <module>()
----> 1 import vega
      2 vega.__version__

ModuleNotFoundError: No module named 'vega'

weird.

Collaborator

jakevdp commented Jan 30, 2019

I suspect you're using JupyterLab rather than Jupyter notebook on your laptop. The vega package is only required for the classic notebook.

Version 2.0 of the vega Python package contains a prerelease of vegalite.js version 3.0, which has a number of bugs. If you downgrade the vega Python package to 1.4, it should work identically to JupyterLab on your own laptop.

Collaborator

jakevdp commented Jan 30, 2019

So, to summarize the reason for your issue: on the machine where your chart worked, you were rendering the plot with vegalite.js v2.6 via JupyterLab. On the machine where your chart didn't work, you were rendering the plot with the unstable vegalite.js v3.0 prerelease candidate, via a mistakenly-released version of the Jupyter notebook extension.

Downgrade the vega package (which includes the vega notebook extension) to version 1.4, and things should work properly.
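
A small guard along these lines could flag the situation before a chart is rendered. It is only a sketch: `parse_version` and `needs_downgrade` are hypothetical helpers, and treating the whole 2.0 series as affected is an assumption, since this thread only confirms the problem for 2.0.0.

```python
def parse_version(version):
    """Turn '2.0.0' into (2, 0, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def needs_downgrade(vega_version):
    """True if the version is in the 2.0 series flagged in this thread."""
    return parse_version(vega_version)[:2] == (2, 0)


print(needs_downgrade("2.0.0"))  # version reported on the Pi
print(needs_downgrade("1.4"))    # version recommended above
```

In a classic-notebook environment, the string passed in would be `vega.__version__`.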

jakevdp closed this as completed Jan 30, 2019