Releases · JuliaData/DataFrames.jl

DataFrames v1.8.0

DataFrames.jl now requires Julia 1.10 or later
DataFrames.jl supports PrettyTables.jl v3
Data frame hashing is now compatible with the changes in the upcoming Julia 1.13 release. Additionally, column names are now taken into account when hashing a data frame.
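
A minimal sketch of what the hashing note above means in practice. The variable names are illustrative only, and the behavior of `hash` on data frames with different column names is assumed from the release note rather than verified against the implementation, so no result is asserted for that comparison.

```julia
using DataFrames

# Two data frames with identical values but different column names.
df_a = DataFrame(x = [1, 2, 3])
df_b = DataFrame(y = [1, 2, 3])

# isequal on data frames compares column names as well as values.
@show isequal(df_a, df_b)

# With column names included in the hash, these two hashes are expected
# to differ (collisions remain possible, so nothing is asserted here).
@show hash(df_a) == hash(df_b)

# Equal data frames must still produce equal hashes.
df_c = DataFrame(x = [1, 2, 3])
@show isequal(df_a, df_c) && hash(df_a) == hash(df_c)
```

Code that stores data frames in a `Dict` or `Set` relies on the last property: values that are `isequal` must hash identically.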

Merged pull requests:

update hashing to Julia 1.13 and use column names in data frame hashing (#3507) (@bkamins)
PrettyTables.jl v3 (#3510) (@ronisbr)
Bump actions/checkout from 4 to 5 (#3511) (@dependabot[bot])
Prepare for 1.8 release (#3512) (@bkamins)
Adjust codebase to the fact that we require at least Julia 1.10 (#3513) (@bkamins)

Closed issues:

Reconsider hash due to Julia 1.13 changes (#3505)
Column types should be available right away. (#3509)

Contributors

ronisbr, bkamins, and dependabot

DataFrames v1.7.1

Released 25 Aug 17:18 by github-actions (tag v1.7.1, commit 3f2e837)

Diff since v1.7.0

Ecosystem changes:

CompatHelper: bump compat for DataStructures to 0.19, (keep existing compat) (#3503) (@github-actions[bot])
Bump codecov/codecov-action from 4 to 5 (#3481) (@dependabot[bot])

Documentation changes:

Updated Basic Usage of Manipulation Functions (#3360) (@nathanrboyer)
docs for aggregation over grouped array-like elements (#3425) (@huangyxi)
Stabilize random number reproducibility in doctests (#3472) (@nathanrboyer)
Docs: Fix typo (#3474) (@agdestein)
dcast instead of SDcols (#3475) (@tdhock)
typo, df was d (#3477) (@rOsemium)
compare stack/unstack to data.table melt/dcast (#3478) (@tdhock)
Small formatting tweaks to #3360 after reviewing online (#3483) (@nathanrboyer)
Update querying_frameworks.md adding TidierData on introduction (#3488) (@indymnv)
Document DataFrame definition in code file using CSV.jl (#3501) (@MagicMuscleMan)
Update categorical.md after CategoricalArrays.jl release (#3504) (@bkamins)

Contributors

MagicMuscleMan, tdhock, and 7 other contributors

DataFrames v1.7.0

Released 23 Sep 21:37 by github-actions (tag v1.7.0, commit 85815e4)

Diff since v1.6.1

Merged pull requests:

allow push!/pushfirst!/append!/prepend! with multiple values (#3372) (@bkamins)
add cols kwarg to rename/rename! (#3380) (@bkamins)
Add JSS citation information (#3381) (@bkamins)
fix typos (#3384) (@spaette)
Fix @spawn_or_run_task with interactive threads (#3385) (@nalimilan)
add cols to mapcols and mapcols! (#3386) (@bkamins)
add example of using Tables.dictcolumntable (#3387) (@bkamins)
fix nonunique bug (#3393) (@bkamins)
remove unnecessary @time in tests (#3394) (@bkamins)
fix first and last for negative row count (#3402) (@bkamins)
Fix eachrow and eachcol indexing with CartesianIndex (#3413) (@bkamins)
Update for Documenter.jl v1 and Julia v1.10 (#3416) (@hyrodium)
Change big to BigInt calls (#3419) (@bkamins)
Update docs on Juliacon (#3420) (@hyrodium)
Import groupby from DataAPI, remove by and aggregate (#3422) (@bkamins)
Advanced transformation examples (#3433) (@bkamins)
disambiguate allunique signature (#3434) (@bkamins)
do not pass empty vector to Tables.columntable (#3435) (@bkamins)
Explain the role of querying frameworks for DataFrames.jl (#3438) (@bkamins)
Typo fix (#3439) (@nathanrboyer)
Add TidierData to frameworks docs page (#3447) (@drizk1)
add ? suffix to show on all return paths (#3448) (@adienes)
Update ci.yml (#3449) (@ViralBShah)
Create dependabot.yml (#3450) (@ViralBShah)
Bump julia-actions/cache from 1 to 2 (#3453) (@dependabot[bot])
fix vcat type piracy (#3457) (@bkamins)
Remove REPL dependency (#3459) (@topolarity)
Update filter docs, Fixes #3460 (#3461) (@sprig)
fix tests on nightly and 32-bit (#3463) (@bkamins)
Improve names docs (#3464) (@bkamins)
CompatHelper: add new compat entry for Statistics at version 1, (keep existing compat) (#3465) (@github-actions[bot])
Fix codecov badge in README.md (#3466) (@ViralBShah)
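
As a rough illustration of three of the pull requests listed above (#3372, #3380, #3386), the sketch below assumes DataFrames.jl 1.7 or later. The data and variable names are made up for the example, and the exact keyword behavior is inferred from the pull request titles, so treat it as a starting point rather than a reference.

```julia
using DataFrames

df = DataFrame(a = [1], b = ["x"])

# #3372: push! accepts several rows in a single call.
push!(df, (a = 2, b = "y"), (a = 3, b = "z"))

# #3380: apply a renaming function only to the columns given by `cols`.
df_renamed = rename(uppercase, df; cols = :a)

# #3386: transform only the columns given by `cols`, leaving others as they are.
df_doubled = mapcols(x -> 2 .* x, df; cols = :a)

@show nrow(df) names(df_renamed) df_doubled.a
```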

Closed issues:

rand(::GroupedDataFrame) sampler? (#2097)
Investigate performance of innerjoin between large tables (#2974)
Make row lookup easier (#3051)
website: https://juliadata.org (#3338)
Feature Request: Allow naming function in rename operation pairs. (#3361)
Would adding support for JLD2.jl allow Type preservation? (#3364)
Add support for multiple positional arguments in push!/pushfirst!/append!/prepend! (#3371)
"RowNumber by Partition" function (#3374)
Not with non-existing columns (#3375)
leftjoin! is actually copying reference instead of value?! (#3379)
Tests of describe and multithreading fail in Julia-1.10.0-beta3 (#3383)
error when unique! a empty dataframe (#3392)
combine on grouped df return empty df when args is empty (#3399)
Inconsistent Mean Calculation in Grouped DataFrame Compared to Overall DataFrame (#3405)
What is the best way to write large DataFrames efficiently and with high performance in Julia while minimizing memory usage? (#3406)
Segmentation Fault when reading compressed file (#3407)
Better error message when forming a DataFrame from a vector of dictionaries with missing data. (#3410)
describe is slow (#3411)
CartesianIndex error in Julia 1.11 (#3412)
DataFrame(x=Int[], y=Int) (#3414)
unique fails with column-type FixedDecimal (#3418)
Grouped DataFrame with array elements fails to combine (#3424)
error when combining a grouped empty dataframe using first (#3426)
Short circuit && on subset? (#3427)
Document custom generation of column names in manual (#3430)
using propertynames on GroupedDataFrame (#3443)
Very slow to convert DBInterface (DuckDB) result (#3444)
Add Tidier.jl to docs/src/man/querying_frameworks.md (#3446)
Type piracy of reduce(vcat) (#3456)
filter performance (#3460)
[POSSIBLE REGRESSION] DataFrames.jl Currently Failing on Nightly? (#3467)

Contributors

time, ViralBShah, and 10 other contributors

DataFrames v1.6.1

Released 22 Jul 21:53 by github-actions (tag v1.6.1, commit e341cc7)

Diff since v1.6.0

Closed issues:

sort missing data placement (#2267)
Dependency on DataStructures should be explicit and versioned (#3358)

Merged pull requests:

Improve error message when pushing/appending with promote=true (#3356) (@bkamins)
more descriptive error message for only (#3357) (@ssfrr)
Bk/fix faster orderings (#3359) (@bkamins)
Add vector of names method to rename docstring (#3362) (@nathanrboyer)

Contributors

ssfrr, bkamins, and nathanrboyer

DataFrames v1.6.0

Released 10 Jul 06:39 by github-actions (tag v1.6.0, commit 8ba2288)

Diff since v1.5.0

Closed issues:

sort! to give warning if resulting sorting order is not fully determined (#2159)
More flexible Not column selector (#3288)
DataFrame not print correctly (#3292)
transpose method errors (#3295)
juliadata.org website pointing to random blog about martial arts? (#3296)
When partitioned, partition might lose the missingness eltype (in Tables.schema) (#3298)
transform should expand a data frame when it has 0 rows. (#3301)
Base.reduce(::typeof(vcat), ...) on DataFrames does not support init (#3309)
DimensionMismatch when checking if the cell value (not) belong to a collection (#3316)
Rename SubDataFrame columns (#3317)
Accepting array element in rows specified by named tuples, in combine (#3335)
unstack error message for missing values (#3339)
Bounds error when sorting a column after select (#3340)
Don't print all data in huge columns (#3343)
Show problem columns for "ArgumentError: missing values in key columns are not allowed when matchmissing == :error" (#3345)
Don't truncate UUID columns (#3346)
Cannot vcat DataFrames with ReadStatTables.LabeledArrays (#3351)
Join memory usage workaround issues (#3355)

Merged pull requests:

Fix typo in the manual (#3287) (@bkamins)
Use pkgdir instead of pathof (#3289) (@rikhuijzer)
Update README.md (#3297) (@aramirezreyes)
add Iterators.partition for DataFrameRows (#3299) (@bkamins)
add support for Not with multiple positional indices (#3302) (@bkamins)
add :sum to describe (#3303) (@alecloudenback)
deleteat! where drop is a column (#3304) (@gustafsson)
Correct documentation typos (#3305) (@Naunet)
Fix some typos (#3308) (@goggle)
add init kwarg to vcat (#3310) (@bkamins)
add nrow, ncol, and Tables.subset for eachcol and eachrow (#3311) (@bkamins)
Simple uniqueness checks for sorting-related functions (#3312) (@alonsoC1s)
Document use of isequal for comparisons (#3313) (@knuesel)
Add support for renamecols keyword argument in crossjoin (#3314) (@bkamins)
Update reshape.jl (#3319) (@alancummings)
Allow to always pass column names in DataFrame constructor (#3320) (@bkamins)
Allow CI failure on Julia nightly (#3321) (@bkamins)
Use DataAPI.rownumber instead of DataFrames' rownumber (#3322) (@VEZY)
copy more constructors from type doc to getting started (#3323) (@xgdgsc)
[@ref] => (@ref) (#3325) (@likanzhan)
SnoopPrecompile -> PrecompileTools (#3326) (@timholy)
Update documentation of how to disable precompilation (#3329) (@bkamins)
Stop using internal [inv]permute!! as sentinel (#3330) (@LilithHafner)
optimize reverse! for small data frames and factor out _foreach_unique_column (#3332) (@LilithHafner)
Add "Julia for Data Analysis" reference in manual (#3333) (@bkamins)
Add test for issue #3340 which exposed upstream issues with the use of TimSort (#3341) (@LilithHafner)
fix dispatch errors in tests on Julia 1.10 (#3342) (@bkamins)
improve unstack error messages (#3344) (@bkamins)
Do not crop columns with type Base.UUID (#3347) (@ronisbr)
Correctly handle Tables.AbstractRow in operation specification (#3348) (@bkamins)
improve error messages in joins (#3349) (@bkamins)
Fix typo (#3350) (@ronisbr)
Prepare for 1.6 release (#3352) (@bkamins)
fix tests on 32-bit (#3353) (@bkamins)

Contributors

gustafsson, alecloudenback, and 14 other contributors

DataFrames v1.5.0

Released 11 Feb 14:37 by github-actions (tag v1.5.0, commit 1b9fa19)

Diff since v1.4.4

Closed issues:

New contents about handing missing values in DataFrame (#1662)
Functions taking collections of column names always require them to be in AbstractVectors (#1769)
Stack/Melt over multiple sets of variables (#1839)
Allow unstack to take multiple columns to unstack on (#2148)
Feature request: unstack multiple :values columns (#2215)
Add all keyword argument to nonunique (#2238)
special case percentage in combine (#2272)
Add a pushfirst! method (#2275)
add filter example to docs on taking subsets (#2318)
Some code blocks missing syntax highlighting in docs (#2319)
Stacking multiple groups of columns (#2414)
Add more keyword arguments to stack and unstack (#2422)
Add reverse and reverse! functions similar to sort and sort! (#2438)
Allow keeping first or last observation with unique function (#2443)
Add insert! (#2446)
Improve inline documentation of select to include examples of multiple columns not to be included (#2513)
Transposing DataFrame (#2743)
add a keyword to allow specifying target row order in joins (#2753)
Improve flatten (slightly breaking) (#2767)
Add manual part for indexing and selection (#2887)
a new method of the flatten function in DataFrames (#2890)
Generalization of the value parameter in the unstack function (#3066)
resolve circular reference issue when printing (#3148)
Support allunique with column selectors? (#3205)
Add support for Tables.AbstractRow to functions that take row (#3244)
Stack Overflow during type inference with large dataframes (#3246)
innerjoin fast path where join column is allequal? (#3247)
Invalidations when loading CSV (#3248)
Improve groupby sort (#3251)
improve performance of dropmissing (#3254)
Let DataFrame behave more like GroupedDataFrame with one zero-key group (#3257)
Lifecycle annotations (#3259)
String display quotation missing (#3261)
Bool columns are printed as 0/1 in HTML, but not in plain (#3265)
sum doesn't work with Missing column (#3267)
Views of DataFrame design issue (#3272)
Multi-threading hangs combine on Julia nightly (#3275)
Check CompatHelper setup (#3278)
Add get function for AbstractDataFrame (#3281)
Rename Iterators.partition (#3284)

Merged pull requests:

add Iterators.partition (#3212) (@bkamins)
add an option to intersect arguments passed to Cols (#3224) (@bkamins)
Add allunique and improve nonunique and describe (#3232) (@bkamins)
Add an option in joins to specify row order (#3233) (@bkamins)
Improve examples in the manual in basics.md (#3236) (@bkamins)
Add hints to use macro packages for new users (#3238) (@bkamins)
improve error message when used selector is incorrect (#3242) (@bkamins)
add support for Tables.AbstractRow in push!, pushfirst!, and insert! (#3245) (@bkamins)
fix deleteat! and subset! performance (#3249) (@bkamins)
Fix typo in documentation (#3250) (@bkamins)
Mention ReadStatTables.jl in documentation (#3252) (@junyuan-chen)
Add sorting options to groupby (#3253) (@bkamins)
Improve performance of dropmissing (#3256) (@svilupp)
add keep to nonunique, unique, and unique! (#3260) (@bkamins)
document breaking change policy (#3262) (@bkamins)
improve error message in operation specification syntax (#3263) (@bkamins)
Fix bug in subset[!] when handling no conditions case (#3264) (@bkamins)
Fix error in fast aggregation of missing only columns for sum and mean (#3268) (@bkamins)
add information about TableMetadataTools.jl to docs (#3269) (@bkamins)
Update TagBot.yml (#3271) (@bkamins)
correctly index into a SubDataFrame with no columns (#3273) (@bkamins)
Reduce size of multi-threading enablement to 100_000 (#3274) (@bkamins)
Improve allcombinations docstring + minor cleanups after #3256 (#3276) (@bkamins)
Allow to pass multiple predicates in Cols and mix them with other selectors (#3279) (@bkamins)
update CompatHelper.jl setup (#3280) (@bkamins)
add haskey and get support for DataFrameColumns (#3282) (@bkamins)
Add scalar keyword argument to flatten (#3283) (@bkamins)
improve precompilation coverage (#3285) (@bkamins)
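
A short sketch of two additions from the list above: the `keep` keyword for `unique` (#3260) and `allunique` with a column selector (#3232). It assumes DataFrames.jl 1.5 or later, and the data is invented for the example.

```julia
using DataFrames

df = DataFrame(id = [1, 1, 2], v = [10, 20, 30])

# #3260: keep the last row of each group of duplicates in :id
# (the default keeps the first).
last_per_id = unique(df, :id; keep = :last)

# #3232: check whether the values in the selected column are all distinct.
ids_distinct = allunique(df, :id)

@show last_per_id.v ids_distinct
```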

Contributors

bkamins, junyuan-chen, and svilupp

DataFrames v1.4.4

Released 01 Dec 17:46 by github-actions (tag v1.4.4, commit 6fab523)

Diff since v1.4.3

Closed issues:

Segmentation fault Julia 1.8.2, DataFrames v1.4.3 (#3227)
sizeof() not working correctly with Dataframes (#3229)
subset / subset! AbstractVector restriction inconvenient (#3230)

Merged pull requests:

Explain column-independent operations (#3225) (@bkamins)
Fix unstack docstring (#3226) (@bkamins)
fix select bug with copycols=false on SubDataFrame (#3231) (@bkamins)
fix markdown tests (#3234) (@bkamins)

Contributors

bkamins

DataFrames v1.4.3

Released 13 Nov 07:09 by github-actions (tag v1.4.3, commit 3935888)

Diff since v1.4.2

Closed issues:

docs for groupindices has wrong example (#3210)
(Possible) Bug with shuffle when shuffling DataFrame rows (#3211)
Improve combine documentation (#3214)
ERROR: AssertionError: length(res) > 0 (#3217)
Column metadata anchored to wrong column after insertion of new columns (#3218)

Merged pull requests:

Make sure we use MIME when calling repr in GroupedDataFrame printing (#3213) (@bkamins)
add default style to metadata! and colmetadata! (#3216) (@bkamins)
fix insertcols! bug (not shifting column metadata) (#3220) (@bkamins)
fix HTML printing tests after PrettyTables.jl 2.2 release (#3221) (@bkamins)
make aggregation of empty GroupedDataFrame correct with AsTable (#3222) (@bkamins)

Contributors

bkamins

DataFrames v1.4.2

Released 27 Oct 10:44 by github-actions (tag v1.4.2, commit abbed11)

Diff since v1.4.1

Closed issues:

Make docstrings method specific (#2015)
Additional functions supported for DataFrame.jl (#2088)
OffsetArray Compatibility (#2123)
Return data frame unaltered when Not only includes columns that are not in data frame (#2197)
Kwarg to choose missing values for unstack (#2205)
Allow DF() as a selector in select and combine (#2220)
no method matching InvertedIndex(::String, ::String) (#2227)
add view::Bool kwarg to first and last (#2845)
Inconsistency in push!ing an empty row into a DataFrame (#2953)
Flatten errors on empty dataframe (#3197)
10 seconds to show(df) of size (120764, 22) (#3202)
Ignoring ENV["LINES"] in 1.4.x (#3203)
JET.JL problem with v1.4.1 (#3204)
Speed of filter (#3208)
Allow end to select last column. (#3209)

Merged pull requests:

Mention DataFrameMacros.jl in the docs (#3195) (@jkrumbiegel)
make sure flatten works correctly on a data frame with zero rows (#3198) (@bkamins)
improve manual entry of assignment to a data frame (#3201) (@bkamins)

Contributors

bkamins and jkrumbiegel
