Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing a dim does not violate the constraint on that dim #64

Open
bencottier opened this issue May 28, 2021 · 2 comments
Open

Removing a dim does not violate the constraint on that dim #64

bencottier opened this issue May 28, 2021 · 2 comments

Comments

@bencottier
Copy link

I thought this would throw an error.

MWE:

julia> ds = KeyedDataset(
           :train => KeyedArray(zeros(5, 2); target=1:5, id=[:a, :b]),
           :predict => KeyedArray(zeros(3, 2); target=6:8, id=[:a, :b]),
           constraints=Pattern[(:_, :id)]
       )
KeyedDataset with:
  2 components
    (:train,) => 5x2 KeyedArray{Float64} with dimension target, id[1]
    (:predict,) => 3x2 KeyedArray{Float64} with dimension target, id[1]
  1 constraints
    [1] (:_, :id)  2-element Vector{Symbol}

julia> map(A -> A(id=:a), ds, (:predict, :_))
KeyedDataset with:
  2 components
    (:train,) => 5x2 KeyedArray{Float64} with dimension target, id[1]
    (:predict,) => 3 KeyedArray{Float64} with dimension target
  1 constraints
    [1] (:_, :id)  2-element Vector{Symbol}

Whereas it throws an error if the dim preserved:

julia> map(A -> A(id=AxisKeys.Interval(:a, :a)), ds, (:predict, :_))
ERROR: KeyAlignmentError: Misaligned dimension keys on constraint Pattern((:_, :id))
  Tuple[(:predict, :id)]  1-element view(::Vector{Symbol}, [1]) with eltype Symbol
  Tuple[(:train, :id)]  2-element Vector{Symbol}

Stacktrace:
 [1] validate(ds::KeyedDataset, constraint::Pattern{Tuple{Symbol, Symbol}}, paths::Set{Tuple})
   @ AxisSets ~/.julia/packages/AxisSets/YTn0q/src/dataset.jl:293
 [2] validate(ds::KeyedDataset)
   @ AxisSets ~/.julia/packages/AxisSets/YTn0q/src/dataset.jl:267
 [3] map!(f::var"#5#6", dest::KeyedDataset, src::KeyedDataset)
   @ AxisSets ~/.julia/packages/AxisSets/YTn0q/src/functions.jl:52
 [4] map(f::Function, ds::KeyedDataset, keys::Tuple{Symbol, Symbol})
   @ AxisSets ~/.julia/packages/AxisSets/YTn0q/src/functions.jl:41
 [5] top-level scope
   @ REPL[9]:1
@rofinn
Copy link
Member

rofinn commented May 29, 2021

Yeah, that's intentional cause it's pretty common to wanna drop dimensions and not have it error. We could have map impose that all shared dims remain... and require users to reconstruct the KeyedDataset from scratch if they want to change that, but that seems like it might get annoying.

@bencottier
Copy link
Author

bencottier commented Jun 2, 2021

it's pretty common to wanna drop dimensions and not have it error. We could have map impose that all shared dims remain...

Dropping dimensions is common but I thought it should error because the :train and :predict components (in my example) become inconsistent. I don't think we should impose that all shared dims remain. Rather the same dim must be dropped across components, if there is a constraint on that dim between components.

So it would force the user to do map(A -> A(id=:a), ds, (:_, :id)) rather than map(A -> A(id=:a), ds, (:predict, :_)).

But maybe there is a good reason to only drop a dim on one component, which would make that annoying?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants