Spice v0.14-alpha (June 17, 2024)
Categories:
The v0.14-alpha release focuses on enhancing accelerated dataset performance and data integrity, with support for configuring primary keys and indexes. Additionally, the GraphQL data connector been introduced, along with improved dataset registration and loading error information.
Highlights
Accelerated Datasets: Ensure data integrity using primary key and unique index constraints. Configure conflict handling to either upsert new data or drop it. Create indexes on frequently filtered columns for faster queries on larger datasets.
GraphQL Data Connector: Initial support for using GraphQL as a data source.
Example Spicepod showing how to use primary keys and indexes with accelerated datasets:
datasets:
- from: eth.blocks
name: blocks
acceleration:
engine: duckdb # Use DuckDB acceleration engine
primary_key: '(hash, timestamp)'
indexes:
number: enabled # same as `CREATE INDEX ON blocks (number);`
'(number, hash)': unique # same as `CREATE UNIQUE INDEX ON blocks (number, hash);`
on_conflict:
'(hash, number)': drop # possible values: drop (default), upsert
'(hash, timestamp)': upsert
Primary Keys, constraints, and indexes are currently supported when using SQLite, DuckDB, and PostgreSQL acceleration engines.
Learn more with the indexing quickstart and the primary key sample.
Read the Local Acceleration documentation.
Breaking Changes
None.
Contributors
- @phillipleblanc
- @ewgenius
- @sgrebnov
- @Jeadie
- @digadeesh
- @gloomweaver
- @y-f-u
- @lukekim
- @edmondop
What’s Changed
Dependencies
- Apache DataFusion: Upgraded from 38.0.0 to 39.0.0
- Apache Arrow/Parquet: Upgraded from 51.0.0 to 52.0.0
- Rust: Upgraded from 1.78.0 to 1.79.0
Commits
- Update Helm chart for v0.13.3-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1671
- Bump version to v0.14.0-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1673
- Dependency upgrades: DataFusion 39, Arrow/Parquet 52, object_store 0.10.1, arrow-odbc 11.1.0 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1674
- Generate unique runtime instance name and store in
runtime.metrics
table by @ewgenius in https://github.com/spiceai/spiceai/pull/1678 - Proper support for Snowflake TIMESTAMP_NTZ by @sgrebnov in https://github.com/spiceai/spiceai/pull/1677
- Enable tpch_q2 and tpch_q21 in the benchmark queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1679
- Start runtime metrics recorder after loading secrets and extensions by @ewgenius in https://github.com/spiceai/spiceai/pull/1680
- Validate table constraints (Primary Keys/Unique Index) on accelerated tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1658
- Store labels as JSON string in
runtime.metrics
by @ewgenius in https://github.com/spiceai/spiceai/pull/1681 - Atomic updates for DuckDB tables with constraints by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1682
- Rename metrics column
labels
toproperties
and make it nullable by @ewgenius in https://github.com/spiceai/spiceai/pull/1686 - Fix federation_optimizer_rule schema error for
tpch_q7
,tpch_q8
,tpch_q9
,tpch_q14
by @sgrebnov in https://github.com/spiceai/spiceai/pull/1683 - Better prompt for /v1/assist by @Jeadie in https://github.com/spiceai/spiceai/pull/1685
- Support stream in
v1/assist
by @Jeadie in https://github.com/spiceai/spiceai/pull/1653 - Fix cache hit rate chart loading for Grafana v9.5 by @sgrebnov in https://github.com/spiceai/spiceai/pull/1691
- Update ROADMAP.md to include data connector statuses by @digadeesh in https://github.com/spiceai/spiceai/pull/1684
- Support
primary_key
in Spicepod and create in accelerated table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1687 - Datasets with schema support for availability monitoring by @sgrebnov in https://github.com/spiceai/spiceai/pull/1690
- Improve dataset registration output by @sgrebnov in https://github.com/spiceai/spiceai/pull/1692
- Readme: update dataset registration traces by @sgrebnov in https://github.com/spiceai/spiceai/pull/1694
- Improved error logging for datasets load error by @edmondop in https://github.com/spiceai/spiceai/pull/1695
- Improve
ArrayDistance
scalar UDF by @Jeadie in https://github.com/spiceai/spiceai/pull/1697 - Implement
on_conflict
behavior for accelerated tables with constraints by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1688 - Fix datasets live update (Spice file watcher) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1702
- Grafana Dashboard: replace Quantile with Percentile filter by @sgrebnov in https://github.com/spiceai/spiceai/pull/1703
- refresh with append overlap by @y-f-u in https://github.com/spiceai/spiceai/pull/1706
- Fix error message on DuckDB constraint violation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1709
- Add warning when configuring indexes/primary_key/on_conflict for Arrow engine. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1710
- ensure schema to be existing when query timestamp during refresh by @y-f-u in https://github.com/spiceai/spiceai/pull/1711
- Improve README clarity and add comparison table by @lukekim in https://github.com/spiceai/spiceai/pull/1713
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1716
- Update README.md to include GraphQL data connector in supported table by @digadeesh in https://github.com/spiceai/spiceai/pull/1717
- Fix quoting issue for databricks identifier by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1718
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.3-alpha...v0.14.0-alpha
Resources
Community
Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.
- Twitter: @spice_ai
- Discord: https://discord.gg/kZnTfneP5u
- Telegram: Spice AI Discussion
- Reddit: https://www.reddit.com/r/spiceai
- Email: [email protected]