Announcing the release of v0.6-alpha

Announcing the release of v0.6-alpha! 🏹 This release scales to datasets 10-100x larger, enabling new classes of use cases and applications! 🚀 We’ve completely rebuilt the data processing and transport layers on Apache Arrow, a high-performance platform that uses an in-memory columnar format. The project joins other major projects, including Apache Spark, pandas, and InfluxDB, in being powered by Apache Arrow. This also paves the way for high-performance data connections to the runtime using Apache Arrow Flight and for import/export of data using Apache Parquet. We’re incredibly excited about the potential this architecture has for building intelligent applications on top of a high-performance transport between application data sources and the AI engine.

Highlights in v0.6-alpha

Massive improvement in data loading performance and dataset scale

From data connectors, to REST API, to AI engine, we’ve now rebuilt the data processing and transport on the Apache Arrow project, specifically using the Apache Arrow for Go implementation. Many thanks to Matt Topol for his contributions to the project and guidance on using it.

This release changes the transport between the runtime and the AI engine from text CSV sent over gRPC to Apache Arrow Records sent over IPC (Unix sockets).
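To make the idea of a binary transport over Unix sockets concrete, here is a minimal sketch using only the Python standard library. It is not the Arrow IPC format and not the project's actual code: the helper names (`send_column`, `recv_column`, `roundtrip`) and the simple length-prefixed framing are assumptions chosen for illustration. It sends one column of float64 values as raw bytes instead of formatting and re-parsing text.

```python
import socket
import struct
from array import array


def send_column(sock, values):
    """Send one float64 column as a length-prefixed binary frame (hypothetical framing)."""
    payload = array("d", values).tobytes()
    # 4-byte little-endian length header, followed by the raw column bytes.
    sock.sendall(struct.pack("<I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes from the socket, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-frame")
        buf.extend(chunk)
    return bytes(buf)


def recv_column(sock):
    """Receive one frame written by send_column and return it as a list of floats."""
    (size,) = struct.unpack("<I", recv_exact(sock, 4))
    column = array("d")
    column.frombytes(recv_exact(sock, size))
    return column.tolist()


def roundtrip(values):
    """Send a small column through a connected socket pair and read it back."""
    # On Unix, socketpair() returns two connected AF_UNIX sockets.
    # Small payloads only: both ends live in one thread in this sketch.
    left, right = socket.socketpair()
    try:
        send_column(left, values)
        return recv_column(right)
    finally:
        left.close()
        right.close()
```

Because the values travel as fixed-width binary, the receiver recovers them exactly and skips text parsing entirely. Arrow's IPC format adds schemas, multiple columns, and richer types on top of this basic idea.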

This is a breaking change to the Data Processor interface, as it now uses arrow.Record instead of Observation.
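For readers adapting to the new interface, the shift is from row-at-a-time values to column-oriented batches. The sketch below illustrates that difference in plain Python. It is an assumption-laden illustration, not the actual `Observation` or `arrow.Record` types: the dict-based observation shape and the `observations_to_columns` helper are hypothetical.

```python
def observations_to_columns(observations):
    """Pivot row-oriented observations (hypothetical dicts) into named columns.

    Fields missing from a row are filled with None so that every
    column ends up with one entry per row.
    """
    columns = {}
    for row_index, observation in enumerate(observations):
        for name, value in observation.items():
            # A column first seen at this row is back-filled with None.
            columns.setdefault(name, [None] * row_index).append(value)
        for column in columns.values():
            # Pad columns that this row did not mention.
            if len(column) <= row_index:
                column.append(None)
    return columns
```

For example, two rows with `time` and `price` fields become one `time` column and one `price` column, which is the layout a columnar record batch holds in memory.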

Benchmarking v0.6

Before v0.6, the project would not scale into the hundreds of thousands of rows.

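| Format | Row Number | Data Size | Process Time | Load Time | Transport Time | Memory Usage |
|--------|------------|-----------|--------------|-----------|----------------|--------------|
| csv    | 200,000    | 16.31MiB  | 0.2778s      | 0.0000s   | NA (error)     | 0.000MiB     |
| csv    | 2,000,000  | 164.97MiB | 0.2573s      | 0.0050s   | NA (error)     | 0.000MiB     |
| json   | 200,000    | 29.85MiB  | 0.2782s      | 0.0010s   | NA (error)     | 0.000MiB     |
| json   | 2,000,000  | 300.39MiB | 0.3353s      | 0.0080s   | NA (error)     | 0.000MiB     |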

After building on Arrow, the project now easily scales beyond millions of rows.

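| Format | Row Number | Data Size | Process Time | Load Time | Transport Time | Memory Usage |
|--------|------------|-----------|--------------|-----------|----------------|--------------|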

New in this release


Community

The project started with the vision to make AI easy for developers. We are building in the open and with the community. Reach out on Discord or by email to get involved. We will also be starting a community call series soon!