Deploying a Standalone Databend
Databend works with both self-hosted and cloud object storage solutions. This topic explains how to deploy Databend with your object storage. For a list of supported object storage solutions, see Understanding Deployment Modes.
It is not recommended to deploy Databend on top of MinIO for production environments or performance testing.
Setting up Your Object Storage
- Amazon S3
- Google GCS
- Azure Blob
- Tencent COS
- Alibaba OSS
- QingCloud QingStor
- Huawei OBS
- Wasabi
- MinIO
- WebHDFS
Before deploying Databend, make sure you have set up your object storage environment in the cloud and completed the tasks listed for your provider.
Amazon S3, Azure Blob, Tencent COS, Alibaba OSS, QingCloud QingStor, Huawei OBS, and Wasabi:
- Create a bucket or container named databend.
- Get the endpoint URL for connecting to the bucket or container you created.
- Get the Access Key ID and Secret Access Key for your account.
For information about how to manage buckets and Access Keys for your cloud object storage, refer to the user manual from the solution provider.
Google GCS:
- Create a bucket named databend.
- Get the Google Cloud Storage OAuth2 credential of your account.
For information about how to manage buckets and OAuth2 credentials in Google Cloud Storage, refer to the user manual from the solution provider.
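For the providers that use an endpoint URL and access keys, the checklist above can be turned into a quick sanity check before you edit any configuration. The following is a minimal sketch, not part of Databend: the function name check_storage_prereqs and the sample key values are hypothetical, and the check only confirms that the values look usable, not that the bucket exists.

```shell
# Hypothetical helper: succeeds only if all four connection values look usable.
check_storage_prereqs() {
  bucket="$1"; endpoint="$2"; key_id="$3"; secret="$4"
  [ -n "$bucket" ] || { echo "missing bucket name" >&2; return 1; }
  case "$endpoint" in
    http://*|https://*) ;;  # endpoint must be a full URL
    *) echo "endpoint URL must start with http:// or https://" >&2; return 1 ;;
  esac
  [ -n "$key_id" ] || { echo "missing Access Key ID" >&2; return 1; }
  [ -n "$secret" ] || { echo "missing Secret Access Key" >&2; return 1; }
  echo "ok"
}

# Placeholder credentials, for illustration only.
check_storage_prereqs "databend" "https://s3.amazonaws.com" "example-key-id" "example-secret"
```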
MinIO:
a. Follow the MinIO Quickstart Guide to download and install the MinIO package to your local machine.
b. Open a terminal window and navigate to the folder where MinIO is stored.
c. Run the command vim server.sh to create a file with the following content:
~/minio$ cat server.sh
export MINIO_ROOT_USER=minioadmin
export MINIO_ROOT_PASSWORD=minioadmin
./minio server --address :9900 ./data
d. Run the following commands to start the MinIO server:
chmod +x server.sh
./server.sh
e. In your browser, go to http://127.0.0.1:9900 and enter the credentials (minioadmin / minioadmin) to log in to the MinIO Console.
f. In the MinIO Console, create a bucket named databend.
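If you prefer not to use an editor, the script from step c can also be written non-interactively. The following is a sketch under two assumptions that are not in the steps above: it writes into a scratch directory created with mktemp (in practice you would write into the folder that contains the minio binary), and it adds a #!/bin/sh line at the top of the script.

```shell
# Scratch directory so the sketch does not touch a real MinIO folder.
workdir="$(mktemp -d)"

# Write server.sh with the same three commands shown in step c.
cat > "$workdir/server.sh" <<'EOF'
#!/bin/sh
export MINIO_ROOT_USER=minioadmin
export MINIO_ROOT_PASSWORD=minioadmin
./minio server --address :9900 ./data
EOF

# Make the script executable, as in step d.
chmod +x "$workdir/server.sh"
echo "wrote $workdir/server.sh"
```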
WebHDFS:
Before deploying Databend, make sure you have set up your Hadoop environment and completed the following tasks:
- Enable WebHDFS support on Hadoop.
- Get the endpoint URL for connecting to WebHDFS.
- Get the delegation token used for authentication (if needed).
For information about how to enable and manage WebHDFS on Apache Hadoop, refer to the WebHDFS manual.
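One way to confirm that the WebHDFS endpoint is reachable is to list a directory through its REST API. The sketch below only builds the request URL; the function name webhdfs_url is hypothetical, and the host and path are the example values used later in this topic. The /webhdfs/v1 prefix, the op parameter, and the delegation parameter follow the WebHDFS REST API.

```shell
# Hypothetical helper: build a WebHDFS REST URL for a path and operation.
webhdfs_url() {
  endpoint="${1%/}"   # strip one trailing slash, if any
  path="$2"           # absolute HDFS path, for example /analyses/databend/storage
  op="$3"             # WebHDFS operation, for example LISTSTATUS
  token="${4:-}"      # optional delegation token
  url="${endpoint}/webhdfs/v1${path}?op=${op}"
  if [ -n "$token" ]; then
    url="${url}&delegation=${token}"
  fi
  echo "$url"
}

webhdfs_url "https://hadoop.example.com:9870" "/analyses/databend/storage" "LISTSTATUS"
```

You could then pass the printed URL to curl -i to check that the directory is listed.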
Downloading Databend
a. Create a folder named databend in the directory /usr/local.
b. Download and extract the latest Databend release for your platform from GitHub Release. First, download the archive:
- Linux(x86): curl -LJO https://github.com/datafuselabs/databend/releases/download/${version}/databend-${version}-x86_64-unknown-linux-musl.tar.gz
- Linux(arm): curl -LJO https://github.com/datafuselabs/databend/releases/download/${version}/databend-${version}-aarch64-unknown-linux-musl.tar.gz
- MacOS(x86): curl -LJO https://github.com/datafuselabs/databend/releases/download/${version}/databend-${version}-x86_64-apple-darwin.tar.gz
- MacOS(arm): curl -LJO https://github.com/datafuselabs/databend/releases/download/${version}/databend-${version}-aarch64-apple-darwin.tar.gz
Then extract it:
- Linux(x86): tar xzvf databend-${version}-x86_64-unknown-linux-musl.tar.gz
- Linux(arm): tar xzvf databend-${version}-aarch64-unknown-linux-musl.tar.gz
- MacOS(x86): tar xzvf databend-${version}-x86_64-apple-darwin.tar.gz
- MacOS(arm): tar xzvf databend-${version}-aarch64-apple-darwin.tar.gz
c. Move the extracted folders bin, configs, and scripts to the folder /usr/local/databend.
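The download commands differ only in the target part of the file name, so the right one can be picked from the output of uname. The following is a minimal sketch: the function names databend_target and databend_url are hypothetical, and v0.0.0 is a placeholder rather than a real release tag, so set version to the release you want before downloading.

```shell
# Map `uname -s` / `uname -m` output to the target used in the archive names.
databend_target() {
  os="$1"; arch="$2"
  case "$arch" in
    x86_64|amd64)  cpu="x86_64" ;;
    aarch64|arm64) cpu="aarch64" ;;
    *) echo "unsupported architecture: $arch" >&2; return 1 ;;
  esac
  case "$os" in
    Linux)  echo "${cpu}-unknown-linux-musl" ;;
    Darwin) echo "${cpu}-apple-darwin" ;;
    *) echo "unsupported OS: $os" >&2; return 1 ;;
  esac
}

# Build the release URL in the same shape as the curl commands above.
databend_url() {
  version="$1"; target="$2"
  echo "https://github.com/datafuselabs/databend/releases/download/${version}/databend-${version}-${target}.tar.gz"
}

version="v0.0.0"  # placeholder, not a real release tag
target="$(databend_target "$(uname -s)" "$(uname -m)")" && databend_url "$version" "$target"
```

The printed URL can be passed to curl -LJO, and the downloaded archive to tar xzvf, exactly as in step b.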
Deploying a Meta Node
a. Open the file databend-meta.toml in the folder /usr/local/databend/configs, and replace every occurrence of 127.0.0.1 with 0.0.0.0.
b. Open a terminal window and navigate to the folder /usr/local/databend/bin.
c. Run the following command to start the Meta node:
./databend-meta -c ../configs/databend-meta.toml > meta.log 2>&1 &
d. Run the following command to check if the Meta node was started successfully:
curl -I http://127.0.0.1:28101/v1/health
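Steps a and d can be scripted. The sketch below is one possible approach, not an official tool: the function names bind_all_interfaces and wait_for_health are hypothetical. The first rewrites the addresses without relying on sed -i, whose syntax differs between GNU and BSD sed. The second repeats the health check until it succeeds or the attempts run out, which is useful because the node may need a moment to start.

```shell
# Replace every 127.0.0.1 with 0.0.0.0 in a config file (step a).
bind_all_interfaces() {
  file="$1"
  tmp="$(mktemp)" || return 1
  # Write to a temporary file first, then copy back so permissions are kept.
  sed 's/127\.0\.0\.1/0.0.0.0/g' "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Poll a health endpoint until curl succeeds or attempts run out (step d).
wait_for_health() {
  url="$1"; attempts="${2:-10}"; delay="${3:-1}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if curl -fsSI -o /dev/null --max-time 2 "$url" 2>/dev/null; then
      echo "healthy: $url"
      return 0
    fi
    i=$((i + 1))
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  echo "not healthy after $attempts attempts: $url" >&2
  return 1
}

# Example usage (not run here):
#   bind_all_interfaces /usr/local/databend/configs/databend-meta.toml
#   wait_for_health http://127.0.0.1:28101/v1/health
```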
Deploying a Query Node
a. Open the file databend-query.toml in the folder /usr/local/databend/configs, and replace every occurrence of 127.0.0.1 with 0.0.0.0.
b. In the file databend-query.toml, set the parameter type in the [storage] block to match your object storage: s3 for S3-compatible object storage, azblob for Azure Blob storage, gcs for Google Cloud Storage, obs for Huawei OBS, or webhdfs for WebHDFS.
[storage]
# fs | s3 | azblob | gcs | obs | webhdfs
type = "s3"
c. Comment out the [storage.fs] block, and then uncomment the block that matches your storage type: [storage.s3] if you're using S3-compatible object storage, [storage.azblob] if you're using Azure Blob storage, and so on.
# Set a local folder to store your data.
# Comment out this block if you're NOT using local file system as storage.
#[storage.fs]
#data_path = "benddata/datas"
# To use S3-compatible object storage, uncomment this block and set your values.
[storage.s3]
bucket = "<your-bucket-name>"
endpoint_url = "<your-endpoint>"
access_key_id = "<your-key-id>"
secret_access_key = "<your-account-key>"
# To use Azure Blob storage, uncomment this block and set your values.
# [storage.azblob]
# endpoint_url = "https://<your-storage-account-name>.blob.core.windows.net"
# container = "<your-azure-storage-container-name>"
# account_name = "<your-storage-account-name>"
# account_key = "<your-account-key>"
# To use Google Cloud Storage, uncomment this block and set your values.
# [storage.gcs]
# bucket = "<your-bucket-name>"
# credential = "<your-credential>"
# To use Huawei Cloud OBS Storage, uncomment this block and set your values.
# [storage.obs]
# bucket = "<your-bucket-name>"
# endpoint_url = "<your-endpoint>"
# access_key_id = "<your-key-id>"
# secret_access_key = "<your-account-key>"
# To use WebHDFS Storage, uncomment this block and set with your values
# [storage.webhdfs]
# endpoint_url = "<your-endpoint>"
# root = "<your-working-directory>"
# delegation = "<delegation-token-for-authentication>"
d. Set your values in the [storage.s3], [storage.azblob], [storage.gcs], [storage.obs], or [storage.webhdfs] block. Note that the field endpoint_url refers to the service URL of your storage region and varies depending on the object storage solution you use:
Amazon S3:
[storage]
# s3
type = "s3"
[storage.s3]
# https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html
bucket = "databend"
endpoint_url = "https://s3.amazonaws.com"
# How to get access_key_id and secret_access_key:
# https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html
access_key_id = "<your-key-id>"
secret_access_key = "<your-access-key>"
Google GCS:
[storage]
# gcs
type = "gcs"
[storage.gcs]
# How to create a bucket:
# https://cloud.google.com/storage/docs/creating-buckets
bucket = "databend-1.048596"
# GCS also supports changing the endpoint URL
# but the endpoint should be compatible with GCS's JSON API
# default:
# endpoint_url = "https://storage.googleapis.com/"
# working directory of GCS
# default:
# root = "/"
credential = "<your-credential>"
Azure Blob:
[storage]
# azblob
type = "azblob"
[storage.azblob]
endpoint_url = "https://<your-storage-account-name>.blob.core.windows.net"
# https://docs.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-portal#create-a-container
container = "<your-azure-storage-container-name>"
account_name = "<your-storage-account-name>"
# https://docs.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage?tabs=azure-portal#view-account-access-keys
account_key = "<your-account-key>"
Tencent COS:
[storage]
# s3
type = "s3"
[storage.s3]
# How to create a bucket:
# https://cloud.tencent.com/document/product/436/13309
bucket = "databend-1253727613"
# You can get the URL from the bucket detail page.
endpoint_url = "https://cos.ap-beijing.myqcloud.com"
# How to get access_key_id and secret_access_key:
# https://cloud.tencent.com/document/product/436/68282
access_key_id = "<your-key-id>"
secret_access_key = "<your-access-key>"
In this example, the COS region is ap-beijing.
Alibaba OSS:
[storage]
# s3
type = "s3"
[storage.s3]
# How to create a bucket:
bucket = "databend"
# You can get the URL from the bucket detail page.
# https://help.aliyun.com/document_detail/31837.htm
# https://<bucket-name>.<region-id>[-internal].aliyuncs.com
endpoint_url = "https://oss-cn-beijing-internal.aliyuncs.com"
enable_virtual_host_style = true
# How to get access_key_id and secret_access_key:
# https://help.aliyun.com/document_detail/53045.htm
access_key_id = "<your-key-id>"
secret_access_key = "<your-access-key>"
In this example, the OSS region ID is oss-cn-beijing-internal.
QingCloud QingStor:
[storage]
# s3
type = "s3"
[storage.s3]
bucket = "databend"
# You can get the URL from the bucket detail page.
# https://docsv3.qingcloud.com/storage/object-storage/intro/object-storage/#zone
endpoint_url = "https://s3.pek3b.qingstor.com"
# How to get access_key_id and secret_access_key:
# https://docs.qingcloud.com/product/api/common/overview.html
access_key_id = "<your-key-id>"
secret_access_key = "<your-access-key>"
In this example, the QingStor region is pek3b.
Huawei OBS:
[storage]
# obs
type = "obs"
[storage.obs]
# How to create a bucket:
# https://support.huaweicloud.com/intl/en-us/usermanual-obs/en-us_topic_0045853662.html
bucket = "databend"
# You can get the URL from the bucket detail page.
endpoint_url = "https://obs.cn-north-4.myhuaweicloud.com"
# How to get access_key_id and secret_access_key:
# https://support.huaweicloud.com/intl/en-us/api-obs/obs_04_0116.html
access_key_id = "<your-key-id>"
secret_access_key = "<your-access-key>"
In this example, the OBS region is cn-north-4.
Wasabi:
[storage]
# s3
type = "s3"
[storage.s3]
# How to create a bucket:
bucket = "<your-bucket>"
# You can get the URL from:
# https://wasabi-support.zendesk.com/hc/en-us/articles/360015106031-What-are-the-service-URLs-for-Wasabi-s-different-regions-
endpoint_url = "https://s3.us-east-2.wasabisys.com"
# How to get access_key_id and secret_access_key:
access_key_id = "<your-key-id>"
secret_access_key = "<your-access-key>"
In this example, the Wasabi region is us-east-2.
MinIO:
[storage]
# s3
type = "s3"
[storage.s3]
bucket = "databend"
endpoint_url = "http://127.0.0.1:9900"
access_key_id = "minioadmin"
secret_access_key = "minioadmin"
WebHDFS:
[storage]
type = "webhdfs"
[storage.webhdfs]
endpoint_url = "https://hadoop.example.com:9870"
root = "/analyses/databend/storage"
# if your webhdfs needs authentication, uncomment and set with your value
# delegation = "<delegation-token>"
e. Open a terminal window and navigate to the folder /usr/local/databend/bin.
f. Run the following command to start the Query node:
./databend-query -c ../configs/databend-query.toml > query.log 2>&1 &
g. Run the following command to check if the Query node was started successfully:
curl -I http://127.0.0.1:8080/v1/health
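The S3-compatible providers in step d share the same four fields, so the [storage] section can be generated from your values instead of edited by hand. The following is a minimal sketch: the function name render_s3_storage is hypothetical, the sample values are the MinIO ones from this topic, and the function does not escape quotes inside values.

```shell
# Print a [storage] section for an S3-compatible backend.
render_s3_storage() {
  bucket="$1"; endpoint="$2"; key_id="$3"; secret="$4"
  cat <<EOF
[storage]
type = "s3"

[storage.s3]
bucket = "${bucket}"
endpoint_url = "${endpoint}"
access_key_id = "${key_id}"
secret_access_key = "${secret}"
EOF
}

# MinIO values from this topic; replace them with your provider's values.
render_s3_storage "databend" "http://127.0.0.1:9900" "minioadmin" "minioadmin"
```

Review the printed section before pasting it into databend-query.toml in place of the existing [storage] and [storage.s3] blocks.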
Verifying Deployment
In this section, we will run some queries against Databend to verify the deployment.
a. Download and install a MySQL client on your local machine.
b. Create a connection to 127.0.0.1 from your SQL client. In the connection, set the port to 3307, and set the username to root.
Note: Create new users before connecting remotely. The root user only works when you access Databend from localhost, so you need to create new users and grant them proper privileges first. For example:
-- Create a user named "eric" with the password "databend"
CREATE USER eric IDENTIFIED BY 'databend';
-- Grant the ALL privilege on all existing tables in the default database to the user eric:
GRANT ALL ON default.* TO eric;
For more information about creating new users, see CREATE USER.
c. Run the following commands and check if the query is successful:
CREATE TABLE t1(a int);
INSERT INTO t1 VALUES(1), (2);
SELECT * FROM t1;
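With the command-line mysql client, steps b and c can be run in one go. The sketch below saves the verification queries to a temporary file and defines a function that feeds them to the client; the function name run_verification is hypothetical, and it is not called here because it needs a running Query node. The -h, -P, and -u options are the mysql client's host, port, and user options.

```shell
# Save the verification queries from step c to a temporary file.
sql_file="$(mktemp)"
cat > "$sql_file" <<'SQL'
CREATE TABLE t1(a int);
INSERT INTO t1 VALUES(1), (2);
SELECT * FROM t1;
SQL

# Hypothetical helper: run the saved queries against Databend.
# Defaults match step b: host 127.0.0.1, port 3307, user root.
run_verification() {
  mysql -h "${1:-127.0.0.1}" -P "${2:-3307}" -u "${3:-root}" < "$sql_file"
}

# Example usage (requires a running Query node):
#   run_verification
```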
Starting and Stopping Databend
To start or stop Databend, run the corresponding script in the folder /usr/local/databend/scripts:
# Start Databend
./scripts/start.sh
# Stop Databend
./scripts/stop.sh
If you encounter the following error messages while attempting to start Databend:
==> query.log <==
: No getcpu support: percpu_arena:percpu
: option background_thread currently supports pthread only
Databend Query start failure, cause: Code: 1104, displayText = failed to create appender: Os { code: 13, kind: PermissionDenied, message: "Permission denied" }.
Run the following commands and try starting Databend again:
sudo mkdir /var/log/databend
sudo mkdir /var/lib/databend
sudo chown -R $USER /var/log/databend
sudo chown -R $USER /var/lib/databend
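The commands above fail if a directory already exists, because mkdir is used without -p. A repeatable variant is sketched below; the function name ensure_owned_dir is hypothetical, and it falls back to sudo only when the current user cannot create or write to the directory.

```shell
# Hypothetical helper: create a directory if needed and make sure the
# current user can write to it, using sudo only as a fallback.
ensure_owned_dir() {
  dir="$1"
  if ! mkdir -p "$dir" 2>/dev/null; then
    sudo mkdir -p "$dir" || return 1
  fi
  if [ ! -w "$dir" ]; then
    sudo chown -R "$(id -un)" "$dir" || return 1
  fi
}

# Example usage (may prompt for your sudo password):
#   ensure_owned_dir /var/log/databend
#   ensure_owned_dir /var/lib/databend
```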