Open
Description
I'm using
- vegafusion 2.0.1
- altair 5.5.0
- duckdb 1.1.3
After loading a duckdb table from a CSV file (some 20K lines),
import duckdb
housing = duckdb.read_csv("housing.csv")
I encountered the MaxRowsError when trying to draw a histogram with altair, even after enabling the "vegafusion" data transformer. The code works when I convert the DuckDBPyRelation to a polars dataframe (alt.Chart(housing.pl())), though.
import altair as alt
alt.data_transformers.enable("vegafusion")
alt.renderers.enable("jupyter")
(
    alt.Chart(housing).mark_bar()
    .encode(
        alt.X('population').bin(maxbins=50).title(None),
        alt.Y('count()').title(None))
    .properties(width=200, height=100)
)
The error message was:
---------------------------------------------------------------------------
MaxRowsError Traceback (most recent call last)
File ~/.cache/pypoetry/virtualenvs/lab-home-wtptIbf4-py3.12/lib/python3.12/site-packages/altair/vegalite/v5/api.py:1998, in TopLevelMixin.to_dict(self, validate, format, ignore, context)
1995 except TypeError:
1996 # Non-narwhalifiable type supported by Altair, such as dict
1997 data = original_data
-> 1998 copy.data = _prepare_data(data, context)
1999 context["data"] = data
2001 # remaining to_dict calls are not at top level
File ~/.cache/pypoetry/virtualenvs/lab-home-wtptIbf4-py3.12/lib/python3.12/site-packages/altair/vegalite/v5/api.py:283, in _prepare_data(data, context)
281 elif not isinstance(data, dict) and _is_data_type(data):
282 if func := data_transformers.get():
--> 283 data = func(nw.to_native(data, pass_through=True))
285 # convert string input to a URLData
286 elif isinstance(data, str):
File ~/.cache/pypoetry/virtualenvs/lab-home-wtptIbf4-py3.12/lib/python3.12/site-packages/altair/utils/_vegafusion_data.py:105, in vegafusion_data_transformer(data, max_rows)
100 return {"url": VEGAFUSION_PREFIX + table_name}
101 else:
102 # Use default transformer for geo interface objects
103 # # (e.g. a geopandas GeoDataFrame)
104 # Or if we don't recognize data type
--> 105 return default_data_transformer(data)
File ~/.cache/pypoetry/virtualenvs/lab-home-wtptIbf4-py3.12/lib/python3.12/site-packages/altair/vegalite/data.py:42, in default_data_transformer(data, max_rows)
39 return pipe
41 else:
---> 42 return to_values(limit_rows(data, max_rows=max_rows))
File ~/.cache/pypoetry/virtualenvs/lab-home-wtptIbf4-py3.12/lib/python3.12/site-packages/altair/utils/data.py:165, in limit_rows(data, max_rows)
162 values = data
164 if max_rows is not None and len(values) > max_rows:
--> 165 raise_max_rows_error()
167 return data
File ~/.cache/pypoetry/virtualenvs/lab-home-wtptIbf4-py3.12/lib/python3.12/site-packages/altair/utils/data.py:148, in limit_rows.<locals>.raise_max_rows_error()
135 def raise_max_rows_error():
136 msg = (
137 "The number of rows in your dataset is greater "
138 f"than the maximum allowed ({max_rows}).\n\n"
(...)
146 "on how to plot large datasets."
147 )
--> 148 raise MaxRowsError(msg)
MaxRowsError: The number of rows in your dataset is greater than the maximum allowed (5000).
Try enabling the VegaFusion data transformer which raises this limit by pre-evaluating data
transformations in Python.
>> import altair as alt
>> alt.data_transformers.enable("vegafusion")
Or, see https://altair-viz.github.io/user_guide/large_datasets.html for additional information
on how to plot large datasets.
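My reading of the traceback, offered as a guess rather than a confirmed diagnosis: the vegafusion transformer reached its "don't recognize data type" branch for the DuckDBPyRelation and handed the data to the default transformer, which enforces the 5000-row limit. The polars dataframe presumably takes the other branch, which would explain why alt.Chart(housing.pl()) works. The sketch below is a simplified, standalone model of that control flow. It is not Altair's source: RecognizedFrame, vegafusion_like_transformer, default_transformer and the placeholder URL are hypothetical names used only for illustration.

```python
# Simplified, hypothetical model of the control flow seen in the traceback.
# This is NOT Altair's code; every name below is illustrative only.

MAX_ROWS = 5000  # the limit quoted in the error message


class MaxRowsError(Exception):
    """Stand-in for Altair's MaxRowsError."""


class RecognizedFrame(list):
    """Hypothetical stand-in for an input type the transformer recognizes."""


def default_transformer(data, max_rows=MAX_ROWS):
    # Mirrors the limit_rows check shown in the traceback.
    if max_rows is not None and len(data) > max_rows:
        raise MaxRowsError(f"more than {max_rows} rows")
    return {"values": list(data)}


def vegafusion_like_transformer(data):
    if isinstance(data, RecognizedFrame):
        # Recognized input: passed along by reference, no row limit applied.
        return {"url": "table://placeholder"}
    # Unrecognized input: falls back to the row-limited default transformer.
    return default_transformer(data)


if __name__ == "__main__":
    rows = range(20_000)
    # A recognized type goes down the URL branch.
    print(vegafusion_like_transformer(RecognizedFrame(rows)))
    # An unrecognized type of the same size hits the row limit instead.
    try:
        vegafusion_like_transformer(list(rows))
    except MaxRowsError as exc:
        print("fallback hit the row limit:", exc)
```

If this reading is right, the fix would be for the vegafusion transformer to recognize a DuckDBPyRelation, rather than for the row limit to be raised.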