Skip to main content

Connect beeline

beeline is a client for running Spark SQL queries on a Thrift Server via the command line with a JDBC connection.

  1. By default, SQL Clusters only accept traffic from within your VPC. To submit queries with SQL clients outside the VPC, such as your local machine, follow these steps to connect through a VPN or bastion host.
  2. On the machine you will use to submit queries, download the beeline client included in the Spark 3.5.2 (with Hadoop3) distribution
  3. Find your SQL Cluster endpoint. In the Onehouse console, open the Clusters page, then click into your SQL Cluster. sql_endpoint_url
  4. Set up the beeline CLI for interactive queries with the following command (make sure to specify a database):
    $ beeline -u jdbc:hive2://<HOST>:10000/<DATABASE-NAME>
    • For the HOST:
      • If connecting through a VPN, use the SQL endpoint URL from the Onehouse console
      • If connecting through a bastion host, use localhost
  5. Run commands with the beeline CLI:
    1. Find an existing table in the Onehouse console or with SQL:
      $ show databases;
      $ use <database>;
      $ show tables;
    2. Query the existing table with beeline
      $ select * from <database>.<table>;
      beeline_query
  6. Use beeline to execute queries from a local file using the following command (make sure to specify a database):
        $ beeline -u jdbc:hive2://<SQL-CLUSTER-ENDPOINT>:10000/<DATABASE-NAME> -f <SQL-FILEPATH-TO-EXECUTE>
    Note: For best performance, run beeline in interactive mode (step 4) to avoid new session startup time with each query.