---
title: 'URL Tables'
sidebar_label: 'URL Tables'
description: 'Query object store files directly using URLs without pre-registering datasets'
sidebar_position: 3
tags:
---
URL tables enable querying files in object stores directly using their URLs, without pre-registering datasets in a Spicepod. This provides an ad-hoc query capability for exploring data stored in S3, Azure Blob Storage, or HTTP endpoints.
URL tables are disabled by default and must be explicitly enabled in the Spicepod configuration:
| Scheme | Description | Example |
|---|---|---|
| `s3://` | Amazon S3 | `s3://bucket/path/file.parquet` |
| `abfs://` | Azure Blob Storage | `abfs://container@account/path/file.parquet` |
| `abfss://` | Azure Data Lake Storage Gen2 | `abfss://container@account.dfs.core.windows.net/path/` |
| `https://` | HTTPS endpoints | `https://example.com/data.parquet` |
| `http://` | HTTP endpoints | `http://localhost:8080/data.csv` |
Query a single file by specifying its full URL:
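As a minimal sketch, a single-file query can be built and submitted from Python. The quoted-URL table syntax, the `http://localhost:8090/v1/sql` endpoint, and the bucket and file names are assumptions made for illustration, so adjust them to match your runtime.

```python
import json
import urllib.request

# Assumed local HTTP SQL endpoint; change it to match your runtime.
SPICE_SQL_ENDPOINT = "http://localhost:8090/v1/sql"


def single_file_query(url: str, limit: int = 10) -> str:
    """Build a query that reads one file, using its URL as a quoted table name."""
    if "'" in url:
        raise ValueError("URL must not contain single quotes")
    return f"SELECT * FROM '{url}' LIMIT {limit}"


def run_sql(sql: str, endpoint: str = SPICE_SQL_ENDPOINT):
    """POST the SQL text to the endpoint and decode the JSON response."""
    request = urllib.request.Request(
        endpoint,
        data=sql.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Hypothetical bucket and object key.
query = single_file_query("s3://my-bucket/data/sales.parquet")
print(query)
# rows = run_sql(query)  # requires a running runtime with URL tables enabled
```

The network call is left commented out so the snippet runs without a live runtime.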
Query all files under a directory or prefix by including a trailing slash:
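The trailing slash is what marks the URL as a directory rather than a single object. A small helper, sketched here with a hypothetical bucket and prefix and assuming the quoted-URL table syntax, can normalize the prefix before building the query:

```python
def directory_query(prefix_url: str) -> str:
    """Build a row-count query over every file under a prefix."""
    # Ensure the trailing slash so the URL is treated as a directory.
    if not prefix_url.endswith("/"):
        prefix_url += "/"
    return f"SELECT COUNT(*) AS row_count FROM '{prefix_url}'"


# Hypothetical bucket and prefix.
print(directory_query("s3://my-bucket/data/events"))
```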
Use glob patterns to match specific files:
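To preview which objects a pattern would select, the sketch below checks a hypothetical list of object keys against a glob using Python's `fnmatch`. This only approximates glob matching: `fnmatch` lets `*` match `/` as well, and the runtime's own matcher may behave differently. The bucket name and the quoted-URL syntax are likewise assumptions.

```python
from fnmatch import fnmatchcase

# Hypothetical object keys in a bucket.
keys = [
    "data/2024/sales-01.parquet",
    "data/2024/sales-02.parquet",
    "data/2024/returns-01.parquet",
    "data/2024/sales-01.csv",
]

pattern = "data/2024/sales-*.parquet"

# Keep only the keys that the glob pattern selects.
matched = [key for key in keys if fnmatchcase(key, pattern)]
print(matched)

# The same pattern embedded in a URL table query (hypothetical bucket).
query = f"SELECT * FROM 's3://my-bucket/{pattern}'"
print(query)
```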
Hive-style partitions are automatically inferred from the path structure, enabling partition pruning:
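Hive-style paths encode partition columns as `key=value` directory segments. The following sketch shows how such segments map to column values and how a filter on those columns could be written. The path, the column names, and the assumption that partition values compare as strings are illustrative only.

```python
from urllib.parse import urlparse


def hive_partitions(url: str) -> dict:
    """Extract key=value path segments from a URL, in path order."""
    partitions = {}
    for segment in urlparse(url).path.strip("/").split("/"):
        key, sep, value = segment.partition("=")
        if sep and key:
            partitions[key] = value
    return partitions


def partition_filter(partitions: dict) -> str:
    """Render the partition values as a WHERE-clause predicate."""
    return " AND ".join(f"{key} = '{value}'" for key, value in partitions.items())


# Hypothetical partitioned layout.
url = "s3://my-bucket/events/year=2024/month=01/part-0.parquet"
parts = hive_partitions(url)
print(parts)

# Filtering on partition columns lets files outside the partition be skipped.
print(f"SELECT * FROM 's3://my-bucket/events/' WHERE {partition_filter(parts)}")
```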
URL tables use the same authentication mechanisms as the corresponding data connectors. Credentials are loaded automatically from environment variables or cloud provider defaults.
For S3, credentials are loaded from:
- Environment variables: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`
- The AWS credentials file (`~/.aws/credentials`)

For public buckets, no authentication is required.
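Before querying a private bucket, it can help to check which of these sources is present locally. This is a rough diagnostic sketch, not the runtime's actual resolution logic. The precedence shown (environment variables before the credentials file) is an assumption based on the order listed above.

```python
import os


def s3_credential_source(env=None, credentials_file=None) -> str:
    """Report which credential source listed above is available locally."""
    env = os.environ if env is None else env
    if credentials_file is None:
        credentials_file = os.path.expanduser("~/.aws/credentials")
    # Assumed precedence: environment variables first, then the credentials file.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment"
    if os.path.isfile(credentials_file):
        return "credentials-file"
    return "none"


print(s3_credential_source())
```

A result of `none` is still fine for public buckets, which need no authentication.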
For Azure, set the storage account name via environment variable:
Alternatively, include the account name in the URL:
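The URL shape with an embedded account follows the `abfss://container@account.dfs.core.windows.net/path/` form from the scheme table. A minimal sketch for building and inspecting such a URL, using hypothetical container and account names:

```python
from urllib.parse import urlparse


def abfss_url(container: str, account: str, path: str) -> str:
    """Build an abfss:// URL that carries the storage account in its host."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# Hypothetical container, account, and path.
url = abfss_url("data", "mystorageaccount", "sales/2024/")
print(url)

# The container sits before the '@' and the account is the first host label.
parsed = urlparse(url)
container = parsed.username
account = parsed.hostname.split(".")[0]
print(container, account)
```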
Additional authentication options:
- `AZURE_STORAGE_KEY` for access key authentication

Set the account via environment variable:
Or include the account in the URL:
URL tables can be combined with registered datasets in federated queries:
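As an illustration, the sketch below composes a join between a URL table and a registered dataset. The dataset name `customers`, the column names, the bucket path, and the quoted-URL syntax are all hypothetical.

```python
def federated_query(url: str, dataset: str) -> str:
    """Join rows read from a URL table with a registered dataset."""
    return (
        "SELECT c.name, SUM(o.amount) AS total\n"
        f"FROM '{url}' AS o\n"
        f"JOIN {dataset} AS c ON o.customer_id = c.id\n"
        "GROUP BY c.name"
    )


# Hypothetical order files on S3 joined with a registered 'customers' dataset.
sql = federated_query("s3://my-bucket/orders/", "customers")
print(sql)
```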