SAP Datasphere and Snowflake: An (im)possible marriage? (Part II)

 

A few weeks ago I shared part one of this blog, in which I made a connection from SAP S/4HANA to Snowflake via SAP Datasphere and Azure Data Factory (the latter as an ETL tool), with the purpose of loading SAP data into Snowflake. In this second part, I will demonstrate how to set up a direct connection from SAP Datasphere to Snowflake and vice versa.

Why would you want a direct connection between two cloud data warehouses? First of all, such a connection is beneficial because there are no components (e.g. ETL tools) between the two applications, so this approach results in a simplified architecture. A possible downside of this setup is that query performance depends on the data source and the available bandwidth. Hence, if you have a large dataset, this is not the best way to integrate the two systems. Additionally, there are some limitations to this connection, which I will bring up later in this blog.

[Image: architectural overview of the setup]

Let us get started.

Prerequisites for this setup:

  • SAP Datasphere
  • Snowflake
  • Snowflake JDBC driver and the SAP HANA JDBC driver
  • A functional SAP Data Provisioning Agent (including some configuration)
  • Configuration in Snowflake (network rule, secret, external access integration and a function)

Datasphere to Snowflake connection

To establish the connection from Datasphere to Snowflake we need to configure the SAP Data Provisioning Agent, create a user in Snowflake with the appropriate privileges, and create a connection from Datasphere to Snowflake. Because neither SAP Datasphere nor Snowflake offers a ready-made connection wizard for the other, we will have to create the connection ourselves via JDBC drivers.

Configuring the Data Provisioning Agent

The Data Provisioning agent needs to be configured; I will walk you through the steps below.

  • Download the Snowflake JDBC driver and put it in the /camel/lib directory


[Screenshot: Snowflake JDBC driver in the camel/lib directory]
  • Uncomment the Camel JDBC section in /camel/adapters.xml
[Screenshot: Camel JDBC section in adapters.xml]

Also make sure that CAP_LIMIT is added to the Capabilities section; see SAP Note 3348461 for more information.

  • Also, check that delimident=true is set in the configfile-jdbc.properties file
[Screenshot: delimident setting in configfile-jdbc.properties]
  • Next, check /camel/samples/sample-jdbc-dialect.xml and adjust it according to SAP Note 3428356
  • After these adjustments, we can create a connection to Datasphere in the DP agent.
[Screenshot: creating the connection to Datasphere in the DP agent]
  • Restart the agent and check whether it is working in Datasphere. Remember to enable the adapters you want to use; in our case this is the CamelJDBCAdapter (shown below).
[Screenshot: CamelJDBCAdapter enabled in Datasphere]

Create a user in Snowflake

Our next step is to create a dedicated user in Snowflake with the appropriate privileges. You can do this either via SQL commands or via the user interface. Make sure that you grant USAGE and OPERATE privileges on the warehouse (and access to the relevant database) to a role, and assign that role to this user. Otherwise, you will get errors such as this one:

camelnet.snowflake.client.jdbc.SnowflakeSQLException: No active warehouse selected in the current session. Select an active warehouse with the 'use warehouse' command or internal error: Remote execution error Camel execution failed.
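To give you an idea, the statements could look like the sketch below. All object names and the password are placeholders (assumptions on my side), and the exact grants depend on what you want to expose, so adjust them to your own environment.

CREATE ROLE IF NOT EXISTS DATASPHERE_ROLE;

-- USAGE and OPERATE on the warehouse prevent the "No active warehouse selected" error above
GRANT USAGE, OPERATE ON WAREHOUSE MY_WH TO ROLE DATASPHERE_ROLE;
GRANT USAGE ON DATABASE MY_DB TO ROLE DATASPHERE_ROLE;
GRANT USAGE ON SCHEMA MY_DB.MY_SCHEMA TO ROLE DATASPHERE_ROLE;
GRANT SELECT ON ALL TABLES IN SCHEMA MY_DB.MY_SCHEMA TO ROLE DATASPHERE_ROLE;

-- Dedicated user for the Datasphere connection
CREATE USER IF NOT EXISTS DATASPHERE_USER
  PASSWORD = '<strong-password>'
  DEFAULT_ROLE = DATASPHERE_ROLE
  DEFAULT_WAREHOUSE = MY_WH;

GRANT ROLE DATASPHERE_ROLE TO USER DATASPHERE_USER;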

Create the connection in Datasphere

The next action is to create a connection in your desired space. Go to Connections, select your space, click Create, choose Connection Type: Generic JDBC, and fill in your connection information.

[Screenshot: Generic JDBC connection settings in Datasphere]

It is better to create one connection per Snowflake database, because otherwise you will run into difficulties later. For example, if your user has a default database configured, you may end up connected to that database instead of the desired one. Hence, specify the database and warehouse as parameters in the JDBC URL. Another possible issue is that if you connect without specifying a database, you will see multiple schemas with the same name and cannot tell which database each of them belongs to. Finally, choose your Business Name wisely, as my example in the screenshot (Snowflake) is not the best name to use, for obvious reasons.

The JDBC driver Class is as follows: net.snowflake.client.jdbc.SnowflakeDriver

The JDBC URL should be in this format: jdbc:snowflake://<account_identifier>.snowflakecomputing.com/?db=DATABASE_NAME&warehouse=WAREHOUSE_NAME

Validate the connection to check whether everything is correct at the connection level. As you can see under Features, only Remote Tables are supported, which is a limitation of this connection. We also cannot use Transformation Flows or Data Flows.

Next, in the Data Builder section we can select our connection to Snowflake under Sources.

[Screenshot: Snowflake connection under Sources in the Data Builder]

Now you can use tables in, for example, graphical views.

Snowflake to Datasphere connection

Now that we have a connection from SAP Datasphere to Snowflake, we will configure the same connection the other way around. Although the architecture is once again very simple, since there are no intermediate components, it will take a bit more effort to get this working.

We will use External Network Access in Snowflake to access the SAP HANA Cloud database underneath SAP Datasphere directly; the required database access is configured in your Datasphere space.

Configuration in Datasphere

  • Creation of a space (which I will skip in the context of this blog)
  • A database user in this space
  • IP addresses from Snowflake to add to the IP allowlist in Datasphere

Database user in space

In our space we must configure Database Access. To achieve this, go to Space Management in SAP Datasphere, click Edit on the space for which you want Database Access configured, and go to the Database Access section. Fill in the mandatory information and press Create. For the configuration in Snowflake we will need the database username (the one containing the #), the hostname and port, and of course the password of this user.

[Screenshot: creating the database user under Database Access]

After creating the database user, edit its privileges. Some of them will not be necessary, but I had to select them for troubleshooting.

[Screenshot: privileges of the database user]

IP addresses from Snowflake in the IP allowlist

Add the Snowflake IP addresses to the allowlist (whitelist) in Datasphere. You can find them via SYSTEM$ALLOWLIST.

[Screenshot: output of SYSTEM$ALLOWLIST]
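If you want to retrieve that list yourself, a query such as the one below does the trick. The host/port/type field names are what SYSTEM$ALLOWLIST normally returns as JSON, but double-check the output in your own account.

-- Flatten the JSON array returned by SYSTEM$ALLOWLIST into rows
SELECT t.value:"type"::STRING AS type,
       t.value:"host"::STRING AS host,
       t.value:"port"         AS port
FROM TABLE(FLATTEN(INPUT => PARSE_JSON(SYSTEM$ALLOWLIST()))) t;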

What do we need to configure in Snowflake?

  • SAP HANA JDBC driver in an internal stage
  • Network rule with the hostname of the database of Datasphere
  • Secret to hold the credentials of the user we use in Datasphere
  • External access integration
  • A user-defined function (UDF)

SAP HANA JDBC driver in an internal stage

Download the SAP HANA JDBC driver and upload it to an internal stage. We will need to use this driver in the UDF:

[Screenshot: SAP HANA JDBC driver in the internal stage]
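As a sketch (the stage name JDBC_STAGE and the local file path are assumptions), the upload could be done from a client such as SnowSQL:

-- Internal stage to hold the driver jar
CREATE STAGE IF NOT EXISTS JDBC_STAGE;

-- Upload the SAP HANA JDBC driver; AUTO_COMPRESS = FALSE stores the jar uncompressed
PUT file:///tmp/ngdbc.jar @JDBC_STAGE AUTO_COMPRESS = FALSE;

-- Verify the upload
LIST @JDBC_STAGE;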

Network rule with the hostname of the Datasphere database

Create a network rule with the hostname and port from the database access configured in Datasphere:

[Screenshot: creating the network rule]
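A sketch of the statement, assuming the rule name DATASPHERE_NETWORK_RULE; the hostname and port are the values shown under Database Access in your space (typically port 443 for SAP HANA Cloud):

CREATE OR REPLACE NETWORK RULE DATASPHERE_NETWORK_RULE
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('<hostname-from-database-access>:443');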

Configure the secret to hold the credentials of the user we use in Datasphere

Create a secret with the username and password from Datasphere:

[Screenshot: creating the secret]
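A sketch, with DATASPHERE_SECRET as an assumed name; the username is the database user created under Database Access (the one containing the #):

CREATE OR REPLACE SECRET DATASPHERE_SECRET
  TYPE = PASSWORD
  USERNAME = '<SPACE>#<DATABASE_USER>'
  PASSWORD = '<password-of-that-user>';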

External access integration

Create an external access integration as shown below:

[Screenshot: creating the external access integration]
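A sketch that ties the network rule and the secret together, using the names assumed in the sketches above:

CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION DATASPHERE_ACCESS_INTEGRATION
  ALLOWED_NETWORK_RULES = (DATASPHERE_NETWORK_RULE)
  ALLOWED_AUTHENTICATION_SECRETS = (DATASPHERE_SECRET)
  ENABLED = TRUE;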

User Defined Function

Next, we need to create a function. The screenshot below shows a portion of it:

[Screenshot: part of the user-defined function]

(disclaimer: only a small portion of the function)
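Because the screenshot only shows part of it, here is a minimal, hypothetical sketch of what such a function could look like. It is not the actual function from this setup: it assumes a Java handler, the stage, integration and secret names from the sketches above, and a config object with just 'driver' and 'url' keys; it simply serializes the result set to a JSON string.

CREATE OR REPLACE FUNCTION DATASPHERE_CONNECT(CONFIG OBJECT, QUERY STRING)
RETURNS STRING
LANGUAGE JAVA
RUNTIME_VERSION = '11'
IMPORTS = ('@JDBC_STAGE/ngdbc.jar')
EXTERNAL_ACCESS_INTEGRATIONS = (DATASPHERE_ACCESS_INTEGRATION)
SECRETS = ('cred' = DATASPHERE_SECRET)
HANDLER = 'DatasphereConnect.run'
AS
$$
import java.sql.*;
import java.util.Map;
import java.util.Properties;
import com.snowflake.snowpark_java.types.SnowflakeSecrets;

class DatasphereConnect {
    public static String run(Map<String, String> config, String query) throws Exception {
        // Credentials come from the Snowflake secret referenced above, not from the arguments
        var cred = SnowflakeSecrets.newInstance().getUsernamePassword("cred");
        Properties props = new Properties();
        props.put("user", cred.getUsername());
        props.put("password", cred.getPassword());

        // Driver class and JDBC URL are passed in via the OBJECT argument
        Class.forName(config.get("driver"));

        // Run the query against the Datasphere (SAP HANA Cloud) database and return the rows as JSON
        StringBuilder json = new StringBuilder("[");
        try (Connection conn = DriverManager.getConnection(config.get("url"), props);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(query)) {
            ResultSetMetaData md = rs.getMetaData();
            boolean firstRow = true;
            while (rs.next()) {
                if (!firstRow) json.append(",");
                firstRow = false;
                json.append("{");
                for (int i = 1; i <= md.getColumnCount(); i++) {
                    if (i > 1) json.append(",");
                    Object value = rs.getObject(i);
                    json.append("\"").append(md.getColumnName(i)).append("\":")
                        .append(value == null ? "null"
                                : "\"" + value.toString().replace("\"", "\\\"") + "\"");
                }
                json.append("}");
            }
        }
        return json.append("]").toString();
    }
}
$$;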

After creating the function, you can finally call it in SQL. The function DATASPHERE_CONNECT has two inputs:

  • An object containing the driver string and the JDBC URL and port from Datasphere (and some additional parameters).
  • A string with the SQL query you want to execute in Datasphere.
[Screenshot: calling the DATASPHERE_CONNECT function]
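Sticking with the sketch above (the space, view and hostname are placeholders), such a call could look like this; wrap it in PARSE_JSON if you want a VARIANT instead of a string:

SELECT DATASPHERE_CONNECT(
         OBJECT_CONSTRUCT(
           'driver', 'com.sap.db.jdbc.Driver',
           'url',    'jdbc:sap://<hostname-from-database-access>:443/?encrypt=true'),
         'SELECT * FROM "MY_SPACE"."MY_VIEW"'
       ) AS result;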

Now we have the end result of our configuration. As you can see, the result is returned as JSON and can be combined with data in Snowflake for further use.

Now let's do something else as well: let's create another function to retrieve the metadata.

[Screenshot: the metadata function]

Now put this function in a SQL query again:

[Screenshot: SQL query using the metadata function]

When you execute this, you get the metadata of the table. This can be handy if you don't have access to the resource itself.

[Screenshot: metadata of the table]

With these results, we can execute another query to get the data in a more "standard", tabular form.

[Screenshot: query returning the results in tabular form]
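One way to do that, again following the sketch above, is to parse the returned JSON and flatten it into columns. The column names CUSTOMER_ID and AMOUNT are made up for illustration; replace them with the columns of your own view.

WITH raw AS (
  SELECT PARSE_JSON(
           DATASPHERE_CONNECT(
             OBJECT_CONSTRUCT('driver', 'com.sap.db.jdbc.Driver',
                              'url', 'jdbc:sap://<hostname-from-database-access>:443/?encrypt=true'),
             'SELECT * FROM "MY_SPACE"."MY_VIEW"')) AS payload
)
SELECT r.value:"CUSTOMER_ID"::STRING AS customer_id,
       r.value:"AMOUNT"::NUMBER      AS amount
FROM raw,
     LATERAL FLATTEN(INPUT => raw.payload) r;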

Curious if this or a similar scenario could be helpful for your organization? At Expertum, we look beyond your SAP solutions in order to realize your data ambitions. Contact us to discuss your plans and enjoy a cup of coffee together.

About the author

Dirk-Jan Kloezeman

Dirk-Jan is an SAP HANA Consultant and SAP Basis Consultant at Expertum.

