Invalid Input Error: There is no query result #70

kimmolinna · 2025-03-14T07:34:29Z

import duckdb
import quak
df = duckdb.sql("from read_parquet('https://github.com/uwdata/mosaic/raw/main/data/athletes.parquet')")
quak.Widget(df)

If I run the last command a second time in Jupyter Notebook or marimo, I get the following error message:

in Widget.__init__(self, data, table)
     52     arrow_table = data.to_arrow()
     53 elif has_pycapsule_stream_interface(data):
     54     # NOTE: for now we materialize the input into an in-memory Arrow table,
     55     # so that we can perform repeated queries on that. In the future, it may
     56     # be better to keep this Arrow stream non-materalized in Python and
     57     # create a new DuckDB table from the stream.
     58     # arrow_table = pa.RecordBatchReader.from_stream(data)
---> 59     arrow_table = pa.table(data)
     60 elif is_arrow_ipc(data):
     61     arrow_table = arrow_table_from_ipc(data)

File ...\site-packages\pyarrow\table.pxi:6159, in pyarrow.lib.table()

I'm using quak 0.2.2, pyarrow 19.0.1, and Python 3.13.2. The operating system is a managed Windows 11.


kimmolinna · 2025-03-14T08:21:14Z

An easy workaround is to use a polars DataFrame:

df = duckdb.sql("from read_parquet('https://github.com/uwdata/mosaic/raw/main/data/athletes.parquet')").pl()

kylebarron · 2025-03-14T14:28:06Z

This is a known "problem" with the pycapsule interface, where DuckDB exposes an arrow stream, but that stream can only be consumed once. There's still discussion about this upstream, but it's currently intentional that the last line would fail the second time you run it, because the query has already been consumed.
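To make the single-use behaviour concrete, here is a minimal pure-Python analogy. It is only a sketch: `OneShotStream` and `read_all` are hypothetical names invented for illustration and are not part of DuckDB, pyarrow, or quak. It shows why a second read fails and why materializing the result once sidesteps the problem.

```python
from typing import Iterable, Iterator, List


class OneShotStream:
    """Hypothetical stand-in for a stream that can be consumed only once."""

    def __init__(self, rows: Iterable[int]) -> None:
        self._rows: Iterator[int] = iter(rows)
        self._consumed = False

    def read_all(self) -> List[int]:
        # Mimic the reported behaviour: a second read raises instead of
        # handing out the rows again.
        if self._consumed:
            raise RuntimeError("There is no query result")
        self._consumed = True
        return list(self._rows)


stream = OneShotStream(range(3))
first = stream.read_all()  # the first read consumes the stream

try:
    stream.read_all()  # the second read has nothing left to return
    second_read_failed = False
except RuntimeError:
    second_read_failed = True

# Materializing once and reusing the in-memory copy avoids the problem.
materialized = first
reuse_a = list(materialized)
reuse_b = list(materialized)
```

Under that analogy, converting the query result to an in-memory object first (for example with `.pl()` as in the workaround above) plays the role of `materialized`.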

manzt · 2025-03-14T16:27:23Z

Some more discussion here:

Upgrade to duckdb 1.1 & remove required pyarrow dependency? #54

One option to support out-of-core use is to create a view of the data source and pass in your own DuckDB connection:

import duckdb
import quak

conn = duckdb.connect(":memory:")
conn.sql("CREATE VIEW df AS SELECT * FROM 'https://github.com/uwdata/mosaic/raw/main/data/athletes.parquet'")
quak.Widget(conn, table="df")

With the remote dataset, though, the latency here isn't very good, and creating a TABLE in memory is probably the best user experience. On local disk, views over larger Parquet datasets work quite well.

wget https://github.com/uwdata/mosaic/raw/main/data/athletes.parquet

import duckdb
import quak

conn = duckdb.connect(":memory:")
conn.sql("CREATE VIEW df AS SELECT * FROM 'athletes.parquet'")
quak.Widget(conn, table="df")

kimmolinna · 2025-03-14T20:21:53Z

I'm not sure what @mscolnick did with marimo, but if I have two different cells

import duckdb
import quak

df = duckdb.sql("""from read_parquet("https://github.com/uwdata/mosaic/raw/main/data/athletes.parquet")""")
widget = quak.Widget(df)

and the second one is just

widget

I don't have any problem running the cells. BTW, I really like the widget.data() command for a quick view.
