Merge branch 'bk/multiinsert' of https://github.com/JuliaData/DataFrames.jl into bk/multiinsert
bkamins committed Sep 26, 2023
2 parents 5f57ac8 + f213d3f commit a2c4873
Showing 8 changed files with 67 additions and 22 deletions.
12 changes: 12 additions & 0 deletions CITATION.bib
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
@article{JSSv107i04,
title={DataFrames.jl: Flexible and Fast Tabular Data in Julia},
volume={107},
url={https://www.jstatsoft.org/index.php/jss/article/view/v107i04},
doi={10.18637/jss.v107.i04},
abstract={DataFrames.jl is a package written for and in the Julia language offering flexible and efficient handling of tabular data sets in memory. Thanks to Julia’s unique strengths, it provides an appealing set of features: Rich support for standard data processing tasks and excellent flexibility and efficiency for more advanced and non-standard operations. We present the fundamental design of the package and how it compares with implementations of data frames in other languages, its main features, performance, and possible extensions. We conclude with a practical illustration of typical data processing operations.},
number={4},
journal={Journal of Statistical Software},
author={Bouchet-Valat, Milan and Kamiński, Bogumił},
year={2023},
pages={1--32}
}
4 changes: 4 additions & 0 deletions NEWS.md
@@ -5,6 +5,10 @@
* Allow passing multiple values to add in `push!`, `pushfirst!`,
`append!`, and `prepend!`
([#3372](https://github.com/JuliaData/DataFrames.jl/pull/3372))
* `rename` and `rename!` now allow to apply a function transforming
column names only to a subset of the columns specified by the `cols`
keyword argument
([#3380](https://github.com/JuliaData/DataFrames.jl/pull/3380))
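
The `cols` keyword described in this entry can be illustrated with a minimal sketch. It assumes a DataFrames.jl version that already includes this change; the data frame `df` and its column names `i`, `x`, `y` are illustrative choices, not taken from the release notes.

```julia
using DataFrames

# Illustrative data frame; the column names are assumptions for this sketch.
df = DataFrame(i=1, x=2, y=3)

# Apply `uppercase` only to columns whose name contains 'x'.
# `rename` returns a renamed copy and leaves `df` untouched.
df2 = rename(uppercase, df, cols=contains('x'))

# Any selector accepted by `names` should work; here every column except :i
# is lowercased in place.
rename!(lowercase, df2, cols=Not(:i))
```

Columns not matched by `cols` keep their original names, so the default `cols=All()` reproduces the previous behavior of applying the function to every column.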

# DataFrames.jl v1.6.1 Release Notes

7 changes: 6 additions & 1 deletion README.md
@@ -31,4 +31,9 @@ that is available on GitHub.
[docs-stable-img]: https://img.shields.io/badge/docs-stable-blue.svg
[docs-stable-url]: http://dataframes.juliadata.org/stable/

**Citing**: For now, the best way of citing this package is using the [Zenodo link](https://doi.org/10.5281/zenodo.7632427).
**Citing**: We encourage you to cite our work if you have used DataFrames.jl package.
Starring the DataFrames.jl repository on GitHub is also appreciated.

The citation information may be found in the [CITATION.bib](CITATION.bib) file within the repository:

> Bouchet-Valat, M., & Kamiński, B. (2023). DataFrames.jl: Flexible and Fast Tabular Data in Julia. Journal of Statistical Software, 107(4), 1–32. https://doi.org/10.18637/jss.v107.i04
3 changes: 2 additions & 1 deletion docs/src/index.md
@@ -8,6 +8,7 @@ running with tabular data manipulation using the DataFrames.jl package.
For more illustrations of DataFrames.jl usage, in particular in conjunction with
other packages you can check-out the following resources
(they are kept up to date with the released version of DataFrames.jl):
* [DataFrames.jl: Flexible and Fast Tabular Data in Julia](https://www.jstatsoft.org/article/view/v107i04) article published in the *Journal of Statistical Software*
* [Data Wrangling with DataFrames.jl Cheat Sheet](https://www.ahsmart.com/pub/data-wrangling-with-data-frames-jl-cheat-sheet/)
* [DataFrames Tutorial using Jupyter Notebooks](https://github.com/bkamins/Julia-DataFrames-Tutorial/)
* [Julia Academy DataFrames.jl tutorial](https://github.com/JuliaAcademy/DataFrames)
@@ -277,7 +278,7 @@ missing please kindly report an issue
during which it is deprecated. The situations where such a breaking change
might be allowed are (still such breaking changes will be avoided if
possible):

* the affected functionality was previously clearly identified in the
documentation as being subject to changes (for example in DataFrames.jl 1.4
release propagation rules of `:note`-style metadata are documented as such);
42 changes: 29 additions & 13 deletions src/abstractdataframe/abstractdataframe.jl
@@ -123,7 +123,7 @@ Compat.hasproperty(df::AbstractDataFrame, s::AbstractString) = haskey(index(df),
rename!(df::AbstractDataFrame, (from => to)::Pair...)
rename!(df::AbstractDataFrame, d::AbstractDict)
rename!(df::AbstractDataFrame, d::AbstractVector{<:Pair})
rename!(f::Function, df::AbstractDataFrame)
rename!(f::Function, df::AbstractDataFrame; cols=All())
Rename columns of `df` in-place.
Each name is changed at most once. Permutation of names is allowed.
@@ -132,8 +132,10 @@ Each name is changed at most once. Permutation of names is allowed.
- `df` : the `AbstractDataFrame`
- `d` : an `AbstractDict` or an `AbstractVector` of `Pair`s that maps
the original names or column numbers to new names
- `f` : a function which for each column takes the old name as a `String`
and returns the new name that gets converted to a `Symbol`
- `f` : a function which for each column selected by the `cols` keyword argument
takes the old name as a `String`
and returns the new name that gets converted to a `Symbol`; the `cols`
column selector can be any value accepted as column selector by the `names` function
- `vals` : new column names as a vector of `Symbol`s or `AbstractString`s
of the same length as the number of columns in `df`
- `makeunique` : if `false` (the default), an error will be raised
@@ -194,6 +196,14 @@ julia> rename!(uppercase, df)
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      2      3

julia> rename!(lowercase, df, cols=contains('A'))
1×3 DataFrame
 Row │ a      B      a_1
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      2      3
```
"""
function rename!(df::AbstractDataFrame, vals::AbstractVector{Symbol};
@@ -252,12 +262,8 @@ end

rename!(df::AbstractDataFrame, args::Pair...) = rename!(df, collect(args))

function rename!(f::Function, df::AbstractDataFrame)
    rename!(f, index(df))
    # renaming columns of SubDataFrame has to clean non-note metadata in its parent
    _drop_all_nonnote_metadata!(parent(df))
    return df
end
rename!(f::Function, df::AbstractDataFrame; cols=All()) =
    rename!(df, [n => Symbol(f(n)) for n in names(df, cols)])
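
The new one-line method builds `old => new` pairs only for the selected columns and hands them to the existing pair-based `rename!`. The pair-building step can be sketched in plain Julia without DataFrames.jl. Both helpers below, `select_names` and `rename_pairs`, are hypothetical names introduced for illustration; in the commit the selection is done by `names(df, cols)`, which accepts many more selector kinds than the simple predicate assumed here.

```julia
# Hypothetical stand-in for `names(df, cols)`: keep only the names for which
# the predicate returns true. (Assumption: the selector is a predicate.)
select_names(colnames::Vector{String}, pred) = filter(pred, colnames)

# Mirror of `[n => Symbol(f(n)) for n in names(df, cols)]` from the commit:
# only selected names get a pair, so all other columns are left untouched.
rename_pairs(f, colnames::Vector{String}, pred) =
    [n => Symbol(f(n)) for n in select_names(colnames, pred)]

mapping = rename_pairs(lowercase, ["A", "B", "A_1"], contains('A'))
# mapping == ["A" => :a, "A_1" => :a_1]
```

Because unselected names never appear in the mapping, delegating to the pair-based method leaves them unchanged, which matches the `a B a_1` result shown in the docstring example.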

"""
rename(df::AbstractDataFrame, vals::AbstractVector{Symbol};
@@ -267,7 +273,7 @@ end
rename(df::AbstractDataFrame, (from => to)::Pair...)
rename(df::AbstractDataFrame, d::AbstractDict)
rename(df::AbstractDataFrame, d::AbstractVector{<:Pair})
rename(f::Function, df::AbstractDataFrame)
rename(f::Function, df::AbstractDataFrame; cols=All())
Create a new data frame that is a copy of `df` with changed column names.
Each name is changed at most once. Permutation of names is allowed.
@@ -277,8 +283,10 @@ Each name is changed at most once. Permutation of names is allowed.
only allowed if it was created using `:` as a column selector.
- `d` : an `AbstractDict` or an `AbstractVector` of `Pair`s that maps
the original names or column numbers to new names
- `f` : a function which for each column takes the old name as a `String`
and returns the new name that gets converted to a `Symbol`
- `f` : a function which for each column selected by the `cols` keyword argument
takes the old name as a `String`
and returns the new name that gets converted to a `Symbol`; the `cols`
column selector can be any value accepted as column selector by the `names` function
- `vals` : new column names as a vector of `Symbol`s or `AbstractString`s
of the same length as the number of columns in `df`
- `makeunique` : if `false` (the default), an error will be raised
@@ -350,14 +358,22 @@ julia> rename(uppercase, df)
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      2      3

julia> rename(uppercase, df, cols=contains('x'))
1×3 DataFrame
 Row │ i      X      y
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      2      3
```
"""
rename(df::AbstractDataFrame, vals::AbstractVector{Symbol};
makeunique::Bool=false) = rename!(copy(df), vals, makeunique=makeunique)
rename(df::AbstractDataFrame, vals::AbstractVector{<:AbstractString};
makeunique::Bool=false) = rename!(copy(df), vals, makeunique=makeunique)
rename(df::AbstractDataFrame, args...) = rename!(copy(df), args...)
rename(f::Function, df::AbstractDataFrame) = rename!(f, copy(df))
rename(f::Function, df::AbstractDataFrame; cols=All()) = rename!(f, copy(df); cols=cols)

"""
size(df::AbstractDataFrame[, dim])
2 changes: 0 additions & 2 deletions src/other/index.jl
@@ -108,8 +108,6 @@ function rename!(x::Index, nms::AbstractVector{Pair{Symbol, Symbol}})
return x
end

rename!(f::Function, x::Index) = rename!(x, [(n=>Symbol(f(string(n)))) for n in x.names])

# we do not define keys on purpose;
# use names to get keys as strings with copying
# or _names to get keys as Symbols without copying
13 changes: 13 additions & 0 deletions test/dataframe.jl
@@ -1112,6 +1112,19 @@ end
df = DataFrame(A=1)
asview && (df=view(df, :, :))
@test rename(x -> 1, df) == DataFrame(Symbol("1") => 1)

for cols in (:B, Not("A"), Cols(2), Char, contains('B'))
df = DataFrame(A=1:3, B='A':'C')
asview && (df = view(df, :, :))
@test names(rename(lowercase, df, cols=cols)) == ["A", "b"]
@test names(df) == ["A", "B"]
rename!(lowercase, df, cols=cols)
@test names(df) == ["A", "b"]
end
df = DataFrame(A=1:3, B='A':'C')
asview && (df = view(df, :, :))
@test names(rename(lowercase, df, cols=[:A, :B])) == ["a", "b"]
@test names(rename(lowercase, df, cols=Not(:))) == ["A", "B"]
end

sdf = view(DataFrame(ones(2, 3), :auto), 1:2, 1:3)
6 changes: 1 addition & 5 deletions test/index.jl
@@ -50,7 +50,7 @@ using DataFrames: Index, SubIndex, fuzzymatch
@test_throws ArgumentError i[Not(:x)]
@test_throws ArgumentError i[Not("x")]
@test_throws BoundsError i[Not(1:3)]

@test i[Not([1, 1])] == [2]
@test i[Not([:A, :A])] == [2]
@test i[Not(["A", "A"])] == [2]
@@ -84,10 +84,6 @@ end
@test rename!(copy(i), [:a => :A]) == Index([:A, :b])
@test rename!(copy(i), [:a => :a]) == Index([:a, :b])
@test rename!(copy(i), [:a => :b, :b => :a]) == Index([:b, :a])
@test rename!(x -> Symbol(uppercase(string(x))), copy(i)) == Index([:A, :B])
@test rename!(x -> Symbol(lowercase(string(x))), copy(i)) == Index([:a, :b])
@test rename!(uppercase, copy(i)) == Index([:A, :B])
@test rename!(lowercase, copy(i)) == Index([:a, :b])

@test delete!(i, :a) == Index([:b])
push!(i, :C)
