ListArray::try_new rejects {Union,Dictionary}Array incorrectly if field is not nullable #6538

Open
kawadakk opened this issue Oct 10, 2024 · 1 comment
@kawadakk
Contributor

Describe the bug
When field is not marked as nullable, ListArray::try_new() uses Array::is_nullable() to check whether values contains nulls.

if !field.is_nullable() && values.is_nullable() {
    return Err(ArrowError::InvalidArgumentError(format!(
        "Non-nullable field of {}ListArray {:?} cannot contain nulls",
        OffsetSize::PREFIX,
        field.name()
    )));
}

This works for most types, but it can produce a false positive for DictionaryArray and UnionArray, because Array::is_nullable() is allowed to return true when proving the absence of logical nulls would be expensive.

/// Implementations will return `true` unless they can cheaply prove no logical nulls
/// are present. For example a [`DictionaryArray`] with nullable values will still return true,
/// even if the nulls present in [`DictionaryArray::values`] are not referenced by any key,
/// and therefore would not appear in [`Array::logical_nulls`].
fn is_nullable(&self) -> bool {
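
To make the false positive concrete, here is a minimal, std-only sketch of the same situation. ToyDictionary, validate_conservative and validate_exact are hypothetical stand-ins invented for illustration; they are not arrow-rs types or functions, and the real DictionaryArray is more involved. The sketch assumes a dictionary whose values contain a null that no key refers to.

```rust
// Simplified stand-in for a dictionary-encoded array.
// NOTE: hypothetical type for illustration only, not arrow-rs's `DictionaryArray`.
struct ToyDictionary {
    keys: Vec<usize>,
    values: Vec<Option<&'static str>>,
}

impl ToyDictionary {
    /// Cheap, conservative check: true if any dictionary value is null,
    /// whether or not a key actually refers to it.
    fn is_nullable(&self) -> bool {
        !self.keys.is_empty() && self.values.iter().any(|v| v.is_none())
    }

    /// Exact check: counts the slots whose key resolves to a null value.
    fn logical_null_count(&self) -> usize {
        self.keys
            .iter()
            .filter(|&&k| self.values[k].is_none())
            .count()
    }
}

/// Validation in the style of the check quoted earlier in this issue.
fn validate_conservative(field_nullable: bool, values: &ToyDictionary) -> Result<(), String> {
    if !field_nullable && values.is_nullable() {
        return Err("Non-nullable field cannot contain nulls".to_string());
    }
    Ok(())
}

/// Validation that only rejects nulls that are logically present.
fn validate_exact(field_nullable: bool, values: &ToyDictionary) -> Result<(), String> {
    if !field_nullable && values.logical_null_count() > 0 {
        return Err("Non-nullable field cannot contain nulls".to_string());
    }
    Ok(())
}

/// Dictionary with a null value at index 2 that no key references.
fn sample_dictionary() -> ToyDictionary {
    ToyDictionary {
        keys: vec![0, 0, 1],
        values: vec![Some("a"), Some("b"), None],
    }
}

fn main() {
    let dict = sample_dictionary();
    println!("is_nullable:        {}", dict.is_nullable());
    println!("logical_null_count: {}", dict.logical_null_count());
    println!("conservative check: {:?}", validate_conservative(false, &dict));
    println!("exact check:        {:?}", validate_exact(false, &dict));
}
```

In this toy model the conservative check rejects the data although no slot is logically null, which mirrors the behavior reported here. In arrow-rs, the exact information corresponds to what Array::logical_nulls reports, as the doc comment above notes; whether paying that cost in try_new is acceptable is a design question for the maintainers.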

To Reproduce

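use std::sync::Arc;
use arrow::array::builder::UnionBuilder;
use arrow::array::{Array, ListArray};
use arrow::buffer::OffsetBuffer;
use arrow::datatypes::{Field, Int32Type};

// Build a dense union of five Int32 values, none of which is null.
let offsets = OffsetBuffer::new(vec![0, 1, 4, 5].into());
let mut builder = UnionBuilder::new_dense();
builder.append::<Int32Type>("a", 1).unwrap();
builder.append::<Int32Type>("b", 2).unwrap();
builder.append::<Int32Type>("b", 3).unwrap();
builder.append::<Int32Type>("a", 4).unwrap();
builder.append::<Int32Type>("a", 5).unwrap();
let values = builder.build().unwrap();
// The list element field is declared non-nullable.
let field = Arc::new(Field::new("element", values.data_type().clone(), false));
// Panics, because ListArray::new unwraps the result of try_new.
ListArray::new(field.clone(), offsets, Arc::new(values), None);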

This produces:

thread 'foo' panicked at foo.rs:10:46:
called `Result::unwrap()` on an `Err` value: InvalidArgumentError("Non-nullable field of ListArray \"element\" cannot contain nulls")

Expected behavior
Successful execution

Additional context

@kawadakk kawadakk added the bug label Oct 10, 2024
@tustvold
Contributor

This looks to have been introduced for UnionArray in #6303. It is less of a problem for DictionaryArray, given how rare nulls in dictionary values are.
