Releases: JuliaData/DataFrames.jl
v1.8.1
DataFrames v1.8.1
Merged pull requests:
- avoid defining a one arg hash since it has some invalidation issues (#3516) (@KristofferC)
v1.8.0
DataFrames v1.8.0
Important changes in this release
- DataFrames.jl now requires Julia 1.10 or later
- DataFrames.jl supports PrettyTables.jl v3
- Data frame hashing is now compatible with changes in the upcoming Julia 1.13 release. Additionally, column names are now taken into account when hashing a data frame.
Merged pull requests:
- update hashing to Julia 1.13 and use column names in data frame hashing (#3507) (@bkamins)
- PrettyTables.jl v3 (#3510) (@ronisbr)
- Bump actions/checkout from 4 to 5 (#3511) (@dependabot[bot])
- Prepare for 1.8 release (#3512) (@bkamins)
- Adjust codebase to the fact that we require at least Julia 1.10 (#3513) (@bkamins)
Closed issues:
v1.7.1
DataFrames v1.7.1
Ecosystem changes:
- CompatHelper: bump compat for DataStructures to 0.19, (keep existing compat) (#3503) (@github-actions[bot])
- Bump codecov/codecov-action from 4 to 5 (#3481) (@dependabot[bot])
Documentation changes:
- Updated Basic Usage of Manipulation Functions (#3360) (@nathanrboyer)
- docs for aggregation over grouped array-like elements (#3425) (@huangyxi)
- Stabilize random number reproducibility in doctests (#3472) (@nathanrboyer)
- Docs: Fix typo (#3474) (@agdestein)
- dcast instead of SDcols (#3475) (@tdhock)
- typo, df was d (#3477) (@rOsemium)
- compare stack/unstack to data.table melt/dcast (#3478) (@tdhock)
- Small formatting tweaks to #3360 after reviewing online (#3483) (@nathanrboyer)
- Update querying_frameworks.md adding TidierData on introduction (#3488) (@indymnv)
- Document `DataFrame` definition in code file using `CSV.jl` (#3501) (@MagicMuscleMan)
- Update categorical.md after CategoricalArrays.jl release (#3504) (@bkamins)
v1.7.0
DataFrames v1.7.0
Merged pull requests:
- allow push!/pushfirst!/append!/prepend! with multiple values (#3372) (@bkamins)
- add cols kwarg to rename/rename! (#3380) (@bkamins)
- Add JSS citation information (#3381) (@bkamins)
- fix typos (#3384) (@spaette)
- Fix `@spawn_or_run_task` with interactive threads (#3385) (@nalimilan)
- add cols to mapcols and mapcols! (#3386) (@bkamins)
- add example of using Tables.dictcolumntable (#3387) (@bkamins)
- fix nonunique bug (#3393) (@bkamins)
- remove unnecessary @time in tests (#3394) (@bkamins)
- fix first and last for negative row count (#3402) (@bkamins)
- Fix eachrow and eachcol indexing with CartesianIndex (#3413) (@bkamins)
- Update for Documenter.jl v1 and Julia v1.10 (#3416) (@hyrodium)
- Change big to BigInt calls (#3419) (@bkamins)
- Update docs on Juliacon (#3420) (@hyrodium)
- Import groupby from DataAPI, remove by and aggregate (#3422) (@bkamins)
- Advanced transformation examples (#3433) (@bkamins)
- disambiguate allunique signature (#3434) (@bkamins)
- do not pass empty vector to Tables.columntable (#3435) (@bkamins)
- Explain the role of querying frameworks for DataFrames.jl (#3438) (@bkamins)
- Typo fix (#3439) (@nathanrboyer)
- Add TidierData to frameworks docs page (#3447) (@drizk1)
- add `?` suffix to show on all return paths (#3448) (@adienes)
- Update ci.yml (#3449) (@ViralBShah)
- Create dependabot.yml (#3450) (@ViralBShah)
- Bump julia-actions/cache from 1 to 2 (#3453) (@dependabot[bot])
- fix vcat type piracy (#3457) (@bkamins)
- Remove REPL dependency (#3459) (@topolarity)
- Update filter docs, Fixes #3460 (#3461) (@sprig)
- fix tests on nightly and 32-bit (#3463) (@bkamins)
- Improve names docs (#3464) (@bkamins)
- CompatHelper: add new compat entry for Statistics at version 1, (keep existing compat) (#3465) (@github-actions[bot])
- Fix codecov badge in README.md (#3466) (@ViralBShah)
Closed issues:
- rand(::GroupedDataFrame) sampler? (#2097)
- Investigate performance of innerjoin between large tables (#2974)
- Make row lookup easier (#3051)
- website: https://juliadata.org (#3338)
- Feature Request: Allow naming function in `rename` operation pairs. (#3361)
- Would adding support for JLD2.jl allow Type preservation? (#3364)
- Add support for multiple positional arguments in push!/pushfirst!/append!/prepend! (#3371)
- "RowNumber by Partition" function (#3374)
- Not with non-existing columns (#3375)
- leftjoin! is actually copying reference instead of value?! (#3379)
- Tests of `describe` and `multithreading` fail in Julia-1.10.0-beta3 (#3383)
- error when unique! a empty dataframe (#3392)
- `combine` on grouped df return empty df when args is empty (#3399)
- Inconsistent Mean Calculation in Grouped DataFrame Compared to Overall DataFrame (#3405)
- What is the best way to write large DataFrames efficiently and with high performance in Julia while minimizing memory usage? (#3406)
- Segmentation Fault when reading compressed file (#3407)
- Better error message when forming a DataFrame from a vector of dictionaries with missing data. (#3410)
- `describe` is slow (#3411)
- CartesianIndex error in Julia 1.11 (#3412)
- `DataFrame(x=Int[], y=Int)` (#3414)
- unique fails with column-type FixedDecimal (#3418)
- Grouped DataFrame with array elements fails to combine (#3424)
- error when combining a grouped empty dataframe using `first` (#3426)
- Short circuit && on subset? (#3427)
- Document custom generation of column names in manual (#3430)
- using propertynames on GroupedDataFrame (#3443)
- Very slow to convert DBInterface (DuckDB) result (#3444)
- Add Tidier.jl to docs/src/man/querying_frameworks.md (#3446)
- Type piracy of `reduce(vcat)` (#3456)
- filter performance (#3460)
- [POSSIBLE REGRESSION] DataFrames.jl Currently Failing on Nightly? (#3467)
v1.6.1
v1.6.0
DataFrames v1.6.0
Closed issues:
- sort! to give warning if resulting sorting order is not fully determined (#2159)
- More flexible `Not` column selector (#3288)
- DataFrame not print correctly (#3292)
- transpose method errors (#3295)
- juliadata.org website pointing to random blog about martial arts? (#3296)
- When partitioned, partition might lose the missingness eltype (in Tables.schema) (#3298)
- `transform` should expand a data frame when it has 0 rows. (#3301)
- `Base.reduce(::typeof(vcat), ...)` on DataFrames does not support `init` (#3309)
- DimensionMismatch when checking if the cell value (not) belong to a collection (#3316)
- Rename SubDataFrame columns (#3317)
- Accepting array element in rows specificed by named tuples, in `combine` (#3335)
- `unstack` error message for missing values (#3339)
- Bounds error when sorting a column after `select` (#3340)
- Don't print all data in huge columns (#3343)
- Show problem columns for "ArgumentError: missing values in key columns are not allowed when matchmissing == :error" (#3345)
- Don't truncate UUID columns (#3346)
- Cannot `vcat` DataFrames with `ReadStatTables.LabeledArray`s (#3351)
- Join memory usage workaround issues (#3355)
Merged pull requests:
- Fix typo in the manual (#3287) (@bkamins)
- Use `pkgdir` instead of `pathof` (#3289) (@rikhuijzer)
- Update README.md (#3297) (@aramirezreyes)
- add Iterators.partition for DataFrameRows (#3299) (@bkamins)
- add support for Not with multiple positional indices (#3302) (@bkamins)
- add `:sum` to `describe` (#3303) (@alecloudenback)
- deleteat! where drop is a column (#3304) (@gustafsson)
- Correct documentation typos (#3305) (@Naunet)
- Fix some typos (#3308) (@goggle)
- add init kwarg to vcat (#3310) (@bkamins)
- add nrow, ncol, and Tables.subset for eachcol and eachrow (#3311) (@bkamins)
- Simple uniqueness checks for sorting-related functions (#3312) (@alonsoC1s)
- Document use of isequal for comparisons (#3313) (@knuesel)
- Add support for `renamecols` keyword argument in `crossjoin` (#3314) (@bkamins)
- Update reshape.jl (#3319) (@alancummings)
- Allow to always pass column names in DataFrame constructor (#3320) (@bkamins)
- Allow CI failure on Julia nightly (#3321) (@bkamins)
- Use DataAPI.rownumber instead of DataFrames' `rownumber` (#3322) (@VEZY)
- copy more constructors from type doc to getting started (#3323) (@xgdgsc)
- `[@ref]` => `(@ref)` (#3325) (@likanzhan)
- SnoopPrecompile -> PrecompileTools (#3326) (@timholy)
- Update documentation of how to disable precompilation (#3329) (@bkamins)
- Stop using internal [inv]permute!! as sentinel (#3330) (@LilithHafner)
- optimize reverse! for small data frames and factor out _foreach_unique_column (#3332) (@LilithHafner)
- Add "Julia for Data Analysis" reference in manual (#3333) (@bkamins)
- Add test for issue #3340 which exposed upstream issues with the use of TimSort (#3341) (@LilithHafner)
- fix dispatch errors in tests on Julia 1.10 (#3342) (@bkamins)
- improve unstack error messages (#3344) (@bkamins)
- Do not crop columns with type Base.UUID (#3347) (@ronisbr)
- Correctly handle Tables.AbstractRow in operation specficiation (#3348) (@bkamins)
- improve error messages in joins (#3349) (@bkamins)
- Fix typo (#3350) (@ronisbr)
- Prepare for 1.6 release (#3352) (@bkamins)
- fix tests on 32-bit (#3353) (@bkamins)
v1.5.0
DataFrames v1.5.0
Closed issues:
- New contents about handing missing values in DataFrame (#1662)
- Functions taking collections of column names always require them to be in AbstractVectors (#1769)
- Stack/Melt over multiple sets of variables (#1839)
- Allow unstack to take multiple columns to unstack on (#2148)
- Feature request: unstack multiple :values columns (#2215)
- Add `all` keyword argument to `nonunique` (#2238)
- special case `percentage` in `combine` (#2272)
- Add a `pushfirst!` method (#2275)
- add `filter` example to docs on taking subsets (#2318)
- Some code blocks missing syntax highlighting in docs (#2319)
- Stacking multiple groups of columns (#2414)
- Add more keyword arguments to `stack` and `unstack` (#2422)
- Add reverse and reverse! functions similar to sort and sort! (#2438)
- Allow keeping first or last observation with unique function (#2443)
- Add `insert!` (#2446)
- Improve inline documentation of select to include examples of multiple columns not to be included (#2513)
- Transposing DataFrame (#2743)
- add a keyword to allow specifying target row order in joins (#2753)
- Improve flatten (slightly breaking) (#2767)
- Add manual part for indexing and selection (#2887)
- a new method of the flatten function in DataFrames (#2890)
- Generalization of the value parameter in the unstack function (#3066)
- resolve circular reference issue when printing (#3148)
- Support `allunique` with column selectors? (#3205)
- Add support for Tables.AbstractRow to functions that take row (#3244)
- Stack Overflow during type inference with large dataframes (#3246)
- `innerjoin` fast path where join column is allequal? (#3247)
- Invalidations when loading CSV (#3248)
- Improve groupby sort (#3251)
- improve performance of dropmissing (#3254)
- Let DataFrame behave more like GroupedDataFrame with one zero-key group (#3257)
- Lifecycle annotations (#3259)
- `String` display quotation missing (#3261)
- Bool columns are printed as 0/1 in HTML, but not in plain (#3265)
- sum doesn't work with Missing column (#3267)
- Views of DataFrame design issue (#3272)
- Multi-threading hangs combine on Julia nightly (#3275)
- Check CompatHelper setup (#3278)
- Add `get` function for AbstractDataFrame (#3281)
- Rename Iterators.partition (#3284)
Merged pull requests:
- add Iterators.partition (#3212) (@bkamins)
- add an option to intersect arguments passed to Cols (#3224) (@bkamins)
- Add allunique and improve nonunique and describe (#3232) (@bkamins)
- Add an option in joins to specify row order (#3233) (@bkamins)
- Improve examples in the manual in basics.md (#3236) (@bkamins)
- Add hints to use macro packages for new users (#3238) (@bkamins)
- improve error message when used selector is incorrect (#3242) (@bkamins)
- add support for Tables.AbstractRow in push!, pushfirst!, and insert! (#3245) (@bkamins)
- fix deleteat! and subset! performance (#3249) (@bkamins)
- Fix typo in documentation (#3250) (@bkamins)
- Mention ReadStatTables.jl in documentation (#3252) (@junyuan-chen)
- Add sorting options to groupby (#3253) (@bkamins)
- Improve performance of dropmissing (#3256) (@svilupp)
- add keep to nonunique, unique, and unique! (#3260) (@bkamins)
- document breaking change policy (#3262) (@bkamins)
- improve error message in operation specification syntax (#3263) (@bkamins)
- Fix bug in subset[!] when handling no conditions case (#3264) (@bkamins)
- Fix error in fast aggregation of missing only columns for sum and mean (#3268) (@bkamins)
- add information about TableMetadaTools.jl to docs (#3269) (@bkamins)
- Update TagBot.yml (#3271) (@bkamins)
- correctly index into a SubDataFrame with no columns (#3273) (@bkamins)
- Reduce size of multi-threading enablement to 100_000 (#3274) (@bkamins)
- Improve allcombinations docstring + minor cleanups after #3256 (#3276) (@bkamins)
- Allow to pass multiple predicates in `Cols` and mix them with other selectors (#3279) (@bkamins)
- update CompatHelper.jl setup (#3280) (@bkamins)
- add haskey and get support for DataFrameColumns (#3282) (@bkamins)
- Add `scalar` keyword argument to `flatten` (#3283) (@bkamins)
- improve precompilation coverage (#3285) (@bkamins)
v1.4.4
v1.4.3
DataFrames v1.4.3
Closed issues:
- docs for `groupindices` has wrong example (#3210)
- (Possible) Bug with `shuffle` when shuffling `DataFrame` rows (#3211)
- Improve combine documentation (#3214)
- ERROR: AssertionError: length(res) > 0 (#3217)
- Column metadata anchored to wrong column after insertion of new colums (#3218)
Merged pull requests:
- Make sure we use MIME when calling repr in GroupedDataFrame printing (#3213) (@bkamins)
- add default style to metadata! and colmetadata! (#3216) (@bkamins)
- fix insertcols! bug (not shifting column metadata) (#3220) (@bkamins)
- fix HTML printing tests after PrettyTables.jl 2.2 release (#3221) (@bkamins)
- make aggregation of empty GroupedDataFrame correct with AsTable (#3222) (@bkamins)
v1.4.2
DataFrames v1.4.2
Closed issues:
- Make docstrings method specific (#2015)
- Additional functions supported for DataFrame.jl (#2088)
- OffsetArray Compatibility (#2123)
- Return data frame unaltered when Not only includes columns that are not in data frame (#2197)
- Kwarg to choose missing values for unstack (#2205)
- Allow DF() as a selector in select and combine (#2220)
- no method matching InvertedIndex(::String, ::String) (#2227)
- add view::Bool kwarg to first and last (#2845)
- Inconsistency in `push!`ing an empty row into a DataFrame (#2953)
- Flatten errors on empty dataframe (#3197)
- 10 seconds to `show(df)` of size (120764, 22) (#3202)
- Ignoring ENV["LINES"] in 1.4.x (#3203)
- JET.JL problem with v1.4.1 (#3204)
- Speed of filter (#3208)
- Allow `end` to select last column. (#3209)
Merged pull requests: