
Table Distribution impact on Query execution in a HANA Scale out landscape


Background:

We wanted to review our current table distribution in the HANA sidecar and see whether the resident reports/models/queries would be negatively or positively affected if we started "grouping" similar tables on the same node. Many of the tables replicated from ECC are OLTP transactional tables, so even basic reports need 5-10 tables joined or merged together. Before spending a lot of time on "where used" type investigation for the logical grouping, we wanted to do some basic analysis of how much overhead is caused by not having all the sources of a report reside on the same node.

 

 

Our "Test/Perf" Scale out Landscape

 

 

dbversion.PNG

 

Up to now, table distribution was handled by the default round robin assignment. There is also an option in Studio to redistribute tables using a wizard-type function, shown here. Please also save the current distribution prior to making any changes.
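If you just want a quick feel for how evenly the tables are spread before touching anything, a simple count per host against M_CS_TABLES is enough (a rough sketch; ECC_REPORTING is the schema used later in this blog):

-- Rough check of how many column store tables (and how much memory) each node carries
SELECT host,
       COUNT(*) AS "Table Count",
       ROUND(SUM(memory_size_in_total)/(1024*1024*1024), 2) AS "Used Memory (GB)"
FROM m_cs_tables
WHERE schema_name = 'ECC_REPORTING'
GROUP BY host
ORDER BY host;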


Execute_Optimize_Distribution.PNG




At this stage it's a good time to look at the reference document linked below. It's a bit outdated with regard to Studio links etc., but I don't want to repeat all of the same material here.

 

Table Distribution in Hana

 

 

I'm going to focus on just one schema for now, the one all the ECC tables are replicated to, so let's see how our current configuration matches up.


Landscape_Qry_Results1.PNG


From the above, node 02 looks to be carrying the bulk of the tables, so I'm not sure how efficiently the default distribution is working. In any case, please also note that the 04 node is kept exclusively for failover and isn't allocated any tables currently.

 

SELECT a.host, a.schema_name, a.table_name, a.record_count, a.loaded,
       indexserver_actual_role AS "Server Role",
       'Column Store Tables' AS Component1,
       round(ESTIMATED_MAX_MEMORY_SIZE_IN_TOTAL/(1024*1024*1024), 2) AS "ESTIMATED MEMORY SIZE(GB)",
       round(MEMORY_SIZE_IN_TOTAL/(1024*1024*1024), 2) AS "Currently used MEMORY SIZE(GB)",
       round(MEMORY_SIZE_IN_DELTA/(1024*1024*1024), 2) AS "Currently used Delta Store size(GB)"
FROM M_CS_TABLES a, public.m_landscape_host_configuration b
WHERE a.host = b.host
AND a.schema_name IN ('ECC_REPORTING')
AND ESTIMATED_MAX_MEMORY_SIZE_IN_TOTAL > 0
order by 8 desc

 

I'm going to focus specifically on 3 of the top 10.

 

Table_Qry_Results_new.PNG

 

An initial problem was that HF1 didn't look anything like the table distribution in Production, so let's execute the following script on Prod to generate the statements needed to sync HF1 to match.

 

select 'ALTER TABLE '||a.schema_name||'.'||a.table_name||' MOVE TO '||     replace(a.host,'PROD_HOST_NAME','PERF_HOST_NAME')||':37003'
from M_CS_TABLES a, public.m_landscape_host_configuration b
WHERE a.host = b.host
AND a.schema_name IN ('ECC_REPORTING')

Syntax for re-distribution


  ALTER TABLE table_name MOVE TO '<host:port>'


Note: Port is the port of the target index server and not the SQL port.
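If you're unsure which port that is, the index server port per host can be read from M_SERVICES (a small sketch, not specific to this landscape):

-- The internal indexserver port per host is the one to use in MOVE TO
SELECT host, port, service_name
FROM sys.m_services
WHERE service_name = 'indexserver';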

 

The actual ALTER TABLE statement returns immediately, as during redistribution it is the link to the table, not the physical table, that is moved. You'll also note the table is fully removed from memory and shows Loaded = No, so the next merge or access will take some additional time.
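To verify the load state after a move, you can check M_CS_TABLES before issuing an explicit LOAD (a sketch using the EQUZ table moved later in this blog):

-- LOADED shows NO right after the move, until the table is accessed or loaded again
SELECT host, table_name, loaded
FROM m_cs_tables
WHERE schema_name = 'ECC_REPORTING'
AND table_name = 'EQUZ';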

 

A simple Calc View for our analysis

 

calcView.PNG

 

JEST, being our largest table, has already been hash partitioned equally across the 3 nodes on HF1, so let's work with what we have. The three tables basically reside on 3 different nodes. Will I see any significant performance overhead due to this configuration?


PriorAlter1.PNG


Qry1:

 

Let's just select equipment number for now, which should prune the right hand side of the calc view above and not hit the partitioned JEST table at all.

 

Select top 1000 equnr
from "_SYS_BIC"."abc.mfg.genealogy.models/CV_EQUIP_GEN_NEW_STATUS"
where erdat > '20140301'

Let's see what the VizPlan has to offer. I do expect to see some Network Data Transfer entries based on the current set-up.



qry1_viz_plan2.PNG

Note, there are a few Transfers noted above, however the overhead is relatively small compared to the overall run time.


Qry2:

While we have the table allocation as is, let's add in the partitioned table and see if we get more of these Network Data Transfer entries in the VizPlan. I'll also expand the query with no where clause, but keep the top 1000 stop clause. This is a Perf box after all, and I want to see if having to move around much larger volumes of records will increase the time spent on network data transfer.

 

Select top 1000 equnr, stat
from "_SYS_BIC"."abc.mfg.genealogy.models/CV_EQUIP_GEN_NEW_STATUS"

qry2_viz_plan2.PNG

 

From the above, we can see there is an increase in the number of Network Data Transfer steps; however, even with the larger volume, the overhead seems acceptable, again compared to the overall runtime. We also see that both paths of the calc view are now executed and the JEST partitions are hit in parallel.

 

qry2_viz_plan3.PNG

 

 

Qry3:

 

Let's see if moving EQUI & EQUZ to the same node removes the network data transfer entries.

 

ALTER TABLE ECC_REPORTING.EQUZ MOVE TO 'HF1_HOSTNAMEX02:37003'    ;

Statement 'ALTER TABLE ECC_REPORTING.EQUZ MOVE TO 'hf1hana02:37003''

successfully executed in 259 ms 793 µs  (server processing time: 150 ms 915 µs) - Rows Affected: 0


AfterAlter1.PNG


Note that EQUZ is no longer resident in memory; let's do a LOAD so this won't be an issue when we execute the query.


LOAD ECC_REPORTING.EQUZ ALL  
Started: 2014-05-09 15:19:49  
Statement 'LOAD ECC_REPORTING.EQUZ ALL'  
successfully executed in 1:05.285 minutes  (server processing time: 1:05.181 minutes) - Rows Affected: 0


Execute the same qry1 again and check the VizPlan:

 

Select top 1000 equnr  
from "_SYS_BIC"."abc.mfg.genealogy.models/CV_EQUIP_GEN_NEW_STATUS"  
where erdat > '20140301'

 

qr3_results.PNG

As expected, the Network Data Transfer steps seen in Qry1 VizPlan are now gone.

 

 

Summary:

So I hope this blog is a start for folks that are looking at table distribution: how to check the current configuration, how to identify potential issues, etc. The results themselves are not very dramatic or resounding, but hopefully this may kick off further discussions. There's also room for a follow-up specifically on partitioning and how it can improve or reduce query efficiency. I only used 3 tables in the queries above, so the relatively small impact of having the tables distributed may increase with even larger tables or queries involving a larger number of tables.

So, all comments and observations are welcome.

 


How to securely mask or hide column data using SQL MAP Function in SAP HANA views


The need for column level security at the database layer can be avoided by masking or hiding data at the application level. However, as a DBA, you may prefer to set up a model where

  1. Only DBAs have access to the physical model
  2. Access to data is only exposed through base views for which DBAs securely manage access.  Once the base views are in place and secured for your highly sensitive data, you can have the confidence you need to sleep well at night

 

SAP HANA offers the ability to manage column security by allowing the creation of additional (secured) views to expose more sensitive columns. In the event that this approach does not fit for your project due to specific administrative requirements, I offer here an alternate approach that may be considered.

 

Benefit of this approach

Lower maintenance due to fewer views

 

Overview

The following is intended to be a simple example demonstrating how to hide or mask column data based on a HANA user’s assigned privileges or roles. We will store sensitive employee data (social security number) in a HANA table that is not directly exposed to users. The employee names and SSNs will be exposed to users by a single view that exposes the SSN to some users and not others.

Step 1: Create schema and employee table

CREATE SCHEMA MYSCHEMA;

 

CREATE COLUMN TABLE "MYSCHEMA"."EMPLOYEE" (
        "FIRST_NAME" NVARCHAR(32),
        "LAST_NAME" NVARCHAR(32),
        "SSN" NVARCHAR(12),
        "EMPLOYEE_ID" INTEGER NOT NULL,
        PRIMARY KEY ("EMPLOYEE_ID"));

 

insert into "MYSCHEMA"."EMPLOYEE" values ('LOU','JOHNSON','456-78-9123',1);
insert into "MYSCHEMA"."EMPLOYEE" values ('BOB','THOMPSON','345-67-8912',2);
insert into "MYSCHEMA"."EMPLOYEE" values ('CINDY','BENSON','234-56-7891',3);

 

Step 2: Create privilege table

In this example we create a privilege table for users where a one (1) in the HAS_PRIV column indicates that a user has this privilege. So USER1 has the privilege to access social security numbers.

CREATE COLUMN TABLE "MYSCHEMA"."PRIVS" (
        "USER_ID" NVARCHAR(32) NOT NULL,
        "PRIV_NAME" NVARCHAR(32) NOT NULL,
        "HAS_PRIV" TINYINT NOT NULL,
        PRIMARY KEY ("USER_ID"));

 

insert into "MYSCHEMA"."PRIVS" values ('USER1', 'READ_SSN', 1);
insert into "MYSCHEMA"."PRIVS" values ('USER2', 'READ_SSN', 0);

 

Step 3: Create privilege view

This view uses the SQL MAP function to list the session_user’s granted privileges from the privilege table as columns.

 

Important Note: You must use the system variable session_user instead of current_user. See the explanation at the end of this post for the reason.

 

CREATE VIEW "MYSCHEMA"."V_PRIVS" AS
select user_id,
       MAX(READ_SSN_PRIV) AS READ_SSN_PRIV
from ( select p.user_id,
              MAP(p.PRIV_NAME, 'READ_SSN', MAP(p.HAS_PRIV, 1, p.HAS_PRIV, NULL), NULL) AS READ_SSN_PRIV
       from "MYSCHEMA"."PRIVS" p
       WHERE p.user_id = session_user )
GROUP BY user_id;

 

When I am logged in as USER1, I see the following privileges when I query the view.

pic1.png

Step 3: Create employee view (option 1)

The employee view will use the privilege view and return a null if the session_user does not have the required privilege granted in the privilege table.

CREATE VIEW "MYSCHEMA"."V_EMPLOYEE" AS
select "FIRST_NAME",
       "LAST_NAME",
       "EMPLOYEE_ID",
       MAP(p.READ_SSN_PRIV, 1, e.SSN, NULL) AS SSN
from "MYSCHEMA"."EMPLOYEE" e,
     "MYSCHEMA"."V_PRIVS" p;

 

When I am logged in as USER1, I see the complete SSN data when I query the view.

pic2.png

When I am logged in as USER2, I see nulls for the SSN data when I query the view.

pic3.png

 

Step 4: Create employee view (option 2)

Instead of returning nulls, we could mask the first 5 digits of the SSN and display only the last four digits for users without the required privilege.
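As a quick sanity check of the masking expression itself, independent of the view, you can evaluate it against DUMMY:

-- Masks everything but the last four digits
SELECT 'XXX-XX-' || SUBSTR('456-78-9123', 8) AS MASKED_SSN FROM DUMMY;
-- Returns: XXX-XX-9123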

 

CREATE VIEW "MYSCHEMA"."V_EMPLOYEE_MASK_SSN" ( "FIRST_NAME",
        "LAST_NAME",
        "EMPLOYEE_ID",
        "SSN") AS
select "FIRST_NAME",
       "LAST_NAME",
       "EMPLOYEE_ID",
       MAP(p.READ_SSN_PRIV, 1, e.SSN, 'XXX-XX-' || SUBSTR(e.SSN, 8)) AS SSN
from "MYSCHEMA"."EMPLOYEE" e,
     "MYSCHEMA"."V_PRIVS" p;

 

When I am logged in as USER1, I see the full SSN data when I query the view.

pic4.png

 

When I am logged in as USER2, I see only the last four digits of the SSN data when I query the view.

pic5.png

 

Option: Tie the privileges to assigned roles

You can also create a privilege view using the assigned role of a user by querying the sys.granted_roles table and matching the grantee column to the session_user.

CREATE VIEW "MYSCHEMA"."V_PRIVS" AS
select user_id,
       MAX(READ_SSN_PRIV) as READ_SSN_PRIV
from ( select r.grantee as user_id,
              MAP('SSN_ROLE', r.role_name, 1, 0) AS READ_SSN_PRIV
       from sys.granted_roles r
       WHERE r.grantee = session_user )
group by user_id;

 

Considerations

  • In this example I show a SQL view. If using a column view you are limited to using a scripted calc view. As such, any dependent column views must also be calc views, as opposed to attribute or analytic views
  • You may have some performance degradation compared to just adding new views for the columns that need to be secured. But depending on your requirements and the size of your data, this should be tolerable. In my example above, I am able to select a single row from the employee view in 23ms where the underlying employee table has 1 million rows
  • This approach only works on tables with primary keys (tables must have a unique identifier)
  • Field names are visible. To completely hide field names, creating additional (secured) views exposing the hidden columns would be the best approach

 

Session_User vs Current_User

Using this approach, you must use the session_user variable, not the current_user variable, to filter access. The current_user value returned from HANA column views is not the invoker's user ID but that of the definer. Even though the definer's ID is returned as current_user, HANA secures column views based on the invoker of the view.
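A quick way to see both values side by side is to select them directly; from a plain SQL console they are normally identical, and the difference only shows up when the expression is evaluated inside a definer-mode column view:

SELECT SESSION_USER, CURRENT_USER FROM DUMMY;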

 

MAP Function

More on the map function in the SAP HANA SQL Script Help Guide.

http://help.sap.com

-> SAP In-Memory Computing -> SAP HANA Platform -> Reference Information -> SAP HANA SQL and System Views Reference

The MAP function is documented in the SQL Functions-> Miscellaneous Functions section.
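For reference, a minimal standalone example of MAP (an expression, one or more search/result pairs, and an optional default):

SELECT MAP(2, 0, 'NO_PRIV', 1, 'HAS_PRIV', 'UNKNOWN') AS RESULT FROM DUMMY;
-- 2 matches neither 0 nor 1, so the default 'UNKNOWN' is returned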

 

Disclaimer

This post describes a possible approach to consider when securing column data in HANA but your specific implementation should be validated by security experts within your organization.

Developers in the Bay Area: Join SAP experts for a hands-on Developer Day on SAP HANA


I-DeveloperDay-PaloAlto-300dpi.jpg

JUNE 11    |    SAP HANA

 

 

Here is an exclusive opportunity for you to learn and get your hands dirty with SAP HANA - SAP’s column-based in-memory platform for big data, real-time apps - with live support from SAP experts. During the event, the experts will show you how the platform works and how to get started. You’ll get hands-on coding experience developing on top of SAP HANA. You’ll have the opportunity to explore the building blocks needed to create SAP HANA apps such as:

 

  • Creating tables and views using Core Data Services
  • Modeling HANA views
  • Creating SQLScript stored procedures
  • Leveraging services such as OData and server-side JavaScript
  • Building UIs using SAPUI5

 

This is a bring-your-own-laptop event that will be supported by SAP HANA development experts who will be on hand to provide live support while you’re building your apps. In the days prior to the event, we will send you a checklist including links to download SAP HANA tools. If you run into problems with the installations, experts will be available from 7 AM to 9 AM the day of the event to help you troubleshoot and install the tools. 

 

The event will take place on Wednesday, June 11 at the SAP Labs campus in Palo Alto (3410 Hillview Ave, Bldg 1, Palo Alto, CA 94304). Our leading experts will be developers and product managers Thomas Jung and Rich Heilman. For more details and for an overview of the agenda, click here.

 

This event is free of charge, but space is limited, so REGISTER NOW!


We look forward to seeing you there!


Access to Hive from HANA - Section 1 Hadoop Installation


To access Hive from HANA, we first need to have Hadoop and Hive installed. This first section covers the installation of Hadoop, and the second section covers Hive.


1. Download Hadoop and move to directory

Download Hadoop from apache Hadoop mirror: http://hadoop.apache.org/releases.html#Download

In this case, we choose Hadoop-2.2.0.

Unzip the downloaded Hadoop package and put the Hadoop folder in the directory where you want it to be installed.

tar -zxvf  hadoop-2.2.0.tar.gz

Switch to your HANA server user:

su hana_user_name

We need to install Hadoop under the HANA user, because the HANA server needs to communicate with Hadoop as the same user.

If you just want to set up Hadoop without accessing it from HANA, you can simply create a dedicated Hadoop account with "addgroup" and "adduser" (these two commands depend on the system; SUSE and Ubuntu seem to have different command lines).

 

2. Check Java

Before we install Hadoop, we should make sure we have Java installed.

Use:

java -version

to check Java, and find the Java path with:

whereis java

Then add the following lines to $HOME/.bashrc to set your Java path:

export JAVA_HOME=/java/path/

export PATH=$PATH:/java/path/


3. SSH passwordless

Install ssh first if you don't have it.

Type the following commands in the console to create a public key and add it to the authorized keys:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

 

4. Add path to Hadoop

Write the following script in $HOME/.bashrc if you want to add the Hadoop path permanently.

Open the .bashrc file by

vi $HOME/.bashrc

Add the following script

export HADOOP_INSTALL=/hadoop/path/

For the hadoop path, I put the Hadoop folder under /usr/local,

so I use /usr/local/hadoop instead of /hadoop/path/ in my case

export PATH=$PATH:$HADOOP_INSTALL/bin

export PATH=$PATH:$HADOOP_INSTALL/sbin

 

5. Hadoop configuration

Find the configuration files core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml and hadoop-env.sh in the Hadoop folder. These files are in $HADOOP_INSTALL/etc/hadoop/ under your Hadoop folder. You may simply rename the template files in the folder if you cannot find the xml files. For example:

cp mapred-site.xml.template mapred-site.xml

Some other tutorials say you can find them under the /conf/ directory; I guess /conf/ is for older Hadoop versions, but in hadoop-2.2.0 the files are under /etc/hadoop/.

 

Modify the configuration files as follows:

vi core-site.xml

Put the following between the <configuration> tags:

<property>
  <name>fs.default.name</name>
  <value>hdfs://computer name or IP (localhost would also work):8020</value>
</property>

 

vi hdfs-site.xml

Put the following between the <configuration> tags:

<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/namenode/dir</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/datanode/dir</value>
</property>
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>

 

vi yarn-site.xml

Put the following between the <configuration> tags:

<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>your computer name or IP</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

 

vi mapred-site.xml

Put the following between the <configuration> tags:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

 

For more information about all of these properties, please check:

http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-common/core-default.xml

 

http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

 

http://hadoop.apache.org/docs/r2.2.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

 

http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

 

vi hadoop-env.sh

Add the following two statements at the end of this file:

export HADOOP_COMMON_LIB_NATIVE_DIR=/hadoop/path/lib/native

export HADOOP_OPTS="-Djava.library.path=/hadoop/path/lib"

 

6. Start Hadoop

       The last thing to do before starting Hadoop is to format your namenode and datanode, simply by:

            hadoop namenode -format

            hadoop datanode -format

In the end, you can start Hadoop by calling "start-all.sh"; you may find this file in /hadoop/path/sbin.

To check your Hadoop has started, type

jps

You should see NameNode, NodeManager, DataNode, SecondaryNameNode and ResourceManager are running.

 

JPS.PNG

Alternatively, you can also check if Hadoop is running by visiting localhost:50070 to check Hadoop file system information


namenode.PNG

and localhost:8088 to check cluster information.

cluster.PNG


 

You may find that localhost:50030 contains JobTracker info in some tutorials. However, localhost:50030 does not exist in hadoop-2.2.0, because hadoop-2.2.0 splits the two major functions of the JobTracker (resource management and job life-cycle management) into separate components. Don't worry about localhost:50030 not working.

Access to Hive from HANA - Section 2 Setup Hive


After installing Hadoop on the machine in the first section, we now need to install Hive.


1. Download Hive

Download Hive 0.13.0 from http://hive.apache.org/downloads.html, unzip it and put the Hive package alongside Hadoop (it is not necessary to put Hive together with Hadoop, but it makes it easier to manage in the future).


2. Add path to Hive

Add the following statements to $HOME/.bashrc to set the path:

export HIVE_HOME=/hive/path

export PATH=$PATH:$HIVE_HOME/bin

export PATH=$PATH:$HIVE_HOME/lib


3. Make directories on the Hadoop file system

Create the directories on the Hadoop file system for the Hive database:

hadoop fs -mkdir /user/hive/warehouse

hadoop fs -mkdir /temp


4. Config.sh file

Go to hive/bin, find config.sh and add:

export HIVE_CONF_DIR=$HIVE_CONF_DIR

export HIVE_AUX_JARS_PATH=$HIVE_AUX_JARS_PATH

export HADOOP_INSTALL=/hadoop/path   (the same path as in Section 1)

Start Hive by typing "hive" in the console; you will see the Hive CLI and can run queries with HiveQL.

 

Notice: the Hive metadata is stored by Derby by default. You can only access your previous database from the same location where you started Hive last time; otherwise, you will not be able to see your previous database. Also, Hive will create a metastore_db directory and a log file wherever you start hive. To fix this, you can set up the Hive metastore with MySQL; I will write that instruction later.

SAP HANA - ADVANCED MODELING FEATURES

  • Hierarchies
  • Restricted & calculated measures
  • Input Parameters
  • Currency conversion
  • Filter operations and variables

 

HIERARCHIES: Hierarchies are used to structure and define the relationships among attributes of attribute views used for business analysis. HANA supports two types of hierarchy.

  • Level Hierarchies are hierarchies that are rigid in nature, where the root and the child nodes can be accessed only in the defined order. For example, organizational structures, and so on.
  • Parent/Child Hierarchies are value hierarchies, that is, hierarchies derived from the value of a node. For example, a Bill of Materials (BOM) contains Assembly and Part hierarchies, and an Employee Master record contains Employee and Manager data. The hierarchy can be explored based on a selected parent; there are also cases where the child can be a parent.

This discussion will help us create a Level Hierarchy or a Parent/Child Hierarchy in order to structure and define relationships between view attributes.

  • In the Hierarchy Type dropdown, select the required option as follows:
    • Level Hierarchy
    • Parent Child Hierarchy
  • In the Node tab page, perform the following based on your selection of hierarchy type:
    • For a Level Hierarchy you can specify the Node Style that determines the unique node ID composition. Also, add various levels and assign attributes to each of them with the Level Type that specifies the formatting instructions for the level attributes.  Specify Order By to control the hierarchy members ordering, and Sort Direction to sort the hierarchy members display in ascending or descending order.
    • For a Parent Child Hierarchy specify the parent and the child attribute. Also, in the Step Parent node specify where to place the orphan parent-child pair.
  • In the Advanced tab page, specify the other properties of the hierarchy which are common to both hierarchy types as follows:

You can set Aggregate All Nodes to true if there is a value posted on the aggregate node and you want to compute that value while aggregating data.

  • Specify the default member
  • You can select Add a Root Node if you want to create a root node if the hierarchy does not have any
  • Specify how to handle orphan nodes using Orphan Nodes dropdown
  • Select Multiple Parent if the hierarchy needs to support multiple parents for its members

LEVEL HIERARCHIES:

  1. Select the Semantics node.
  2. In the Hierarchies panel, choose Create option.
  3. Enter a name and description for the hierarchy.
  4. In the Hierarchy Type dropdown, select Level Hierarchy.
  5. In the Node tab page do the following:
    1. Select the required value from the Node Style dropdown list. Note Node style determines the composition of a unique node ID. The different values for node styles are as:
      • Level Name - the unique node ID is composed of level name and node name, for example "[Level 2].[B2]".
      • Name Only - the unique node ID is composed of the node name only, for example "B2".
      • Name Path - the unique node ID is composed of the result node name and the names of all ancestors apart from the (single physical) root node. For example "[A1].[B2]".
    2. Add the required columns as levels from the drop-down list. Note You can select columns from the required table fields in the drop-down list to add to the view.
    3. Select the required Level Type. Note The level type is used to specify formatting instructions for the level attributes. For example, a level of the type LEVEL_TYPE_TIME_MONTHS can indicate that the attributes of the level should have a text format such as "January", and LEVEL_TYPE_REGULAR indicates that a level does not require any special formatting.
    4. To control how the members of the hierarchy are ordered, select the required column in the OrderBy drop-down list. Note In the MDX client tools, the members will be sorted on this attribute.
    5. To sort the display of the hierarchy members in the ascending or descending order, select the required option from the Sort Direction drop-down list.
  6. In the Advanced tab page do the following:
    1. Select the required value in the Aggregate All Nodes. Note This option indicates that data is posted on aggregate nodes and should be shown in the user interface. For example, suppose you have the members A with value 100, A1 with value 10, and A2 with value 20, where A1 and A2 are children of A. By default the value is set to false, and you will see a value of 30 for A. With the value set to true, you will count the posted value 100 for A as well and see a result of 130. If you are sure that there is no data posted on aggregate nodes, you should set the option to false. The engine will then calculate the hierarchy faster than when the option is set. Note that this flag is only interpreted by the SAP HANA MDX engine. In the BW OLAP engine the node values are always counted.
    2. Enter a value for the default member.
    3. To specify how to handle the orphan nodes in the hierarchy, select the required option as described below from the dropdown.
      Option - Description
      Root Node - Treat them as root nodes
      Error - Stop processing and show an error
      Ignore - Ignore them
      Step Parent - Put them under a step-parent node. Note: This enables you to create a text node and place all the orphaned nodes under this node.
  • Optional Step: If you have selected Step Parent in the Orphan Nodes drop-down, enter a value to create the step-parent node.
  • Select the Add a Root Node check-box if required. Note If a hierarchy does not have a root node but needs one for reporting use case, select this option. This will create a root node with the technical name “ALL” .
  • If the level hierarchy needs to support multiple parents for its elements for example, Country 'Turkey' to be assigned to two regions 'Europe' and 'Asia', select the Multiple Parent check-box.
  • Choose OK.

Let us take a scenario to explain Level Hierarchies. In the eFashion package we created an Attribute View for the Article_Lookup table; below are the table fields:

            • ARTICLE_ID
            • ARTICLE_LABEL
            • CATEGORY
            • SALE_PRICE
            • FAMILY_NAME
            • FAMILY_CODE

Here we can define a hierarchy by using field CATEGORY & ARTICLE_LABEL.

  • Open AV_AL (Attribute View-Article_Lookup)
  • Select Semantics.
  • Click on PLUS icon on Hierarchy box.
  • Provide Hierarchy definition(Name, Type, Node)
  • In Hierarchy Type –select Level Hierarchy.
  • OK,Validate & Save and Activate.
  • Activate Analytic View wherein AV_AL consumed.

AV_HR.jpg

AV_HR1.jpg

CONSUMED HIERARCHY USING MDX PROVIDER:

  • Open Excel workbook > Data > From Other
    Sources > Select “From Data Connection Wizard”
  • Select Other/Advanced from the Data Connection Wizard.
  • Select “SAP HANA MDX PROVIDER”, NEXT
  • Provide SAP HANA login credential, OK.
  • Select Package and cube (AV_SHOP_FACT), NEXT

AV_HR2.jpg

 

PARENT/CHILD HIERARCHY:

  1. Select the Semantics node.
  2. In the Hierarchies panel, choose Create option .
  3. Enter a name and description for the hierarchy.
  4. In the Hierarchy Type dropdown, choose Parent Child Hierarchy.
  5. In the Node tab page, add the parent and child nodes by selecting the Parent Node and Child Node from the drop-down list. Note In case you decide to place the orphaned parent-child pair under a node called Step Parent from the Advanced tab page, you can specify its value in the Step Parent column. The step-parent node can only be one of the columns or calculated columns of the current view. You can specify different step-parent values for all the parent-child pairs. These values appear as a comma-separated list in the Advanced tab page Step Parent field. In case of a single parent-child node, you can also specify the value for the step-parent node in the Advanced tab page. The same value appears in the Node tab page.
  6. In the Advanced tab page, do the following, and then choose OK:
    1. Select the required value in the Aggregate All Nodes. Note This option indicates that data is posted on aggregate nodes and should be shown in the user interface. For example, suppose you have the members A with value 100, A1 with value 10, and A2 with value 20, where A1 and A2 are children of A. By default the value is set to false, and you will see a value of 30 for A. With the value set to true, you will count the posted value 100 for A as well and see a result of 130. If you are sure that there is no data posted on aggregate nodes, you should set the option to false. The engine will then calculate the hierarchy faster than when the option is set. Note that this flag is only interpreted by the SAP HANA MDX engine. In the BW OLAP engine the node values are always counted.
    2. Enter a value for the default member.
    3. To specify how to handle the orphan nodes in the hierarchy, select the required option as described below from the dropdown.
      Option - Description
      Root Node - Treat them as root nodes
      Error - Stop processing and show an error
      Ignore - Ignore them
      Step Parent - Put them under a step-parent node. Note: This enables you to create a text node and place all the orphaned nodes under this node.
    4. Optional Step: If you have selected Step Parent in the Orphan Nodes dropdown, enter a value to create the step-parent node.
    5. Select the Add Root Node checkbox if required as described below. Note If a hierarchy does not have a root node but needs one for reporting use case, set the option to true. This will create a root node.
    6. If the hierarchy needs to support multiple parents for its elements, for example, Country 'Turkey' to be assigned to the two regions 'Europe' and 'Asia', select the Multiple Parent checkbox.

 

Note: The hierarchies belonging to an attribute view are available in an analytic view that reuses the attribute view, in read-only mode. However, the hierarchies belonging to an attribute view are not available in a calculation view that reuses the attribute view.


Let us take a scenario of a Parent-Child relationship in the ITEM_MASTER table. Here, for item IDs 2 and 3 the parent ID is 1, i.e. items 2 and 3 fall under CONSUMABLES. Similarly, items 5 to 8 fall under the STEEL category. The Parent/Child Hierarchy type can be used to define a hierarchy in this scenario.

 

ITEM_ID  PARENT_ITEM_ID  ITEM_DESCRIPTION
1                        CONSUMABLES
2        1               Cutting Disc 4”
3        1               Grinding Disc 4”
4                        STEEL
5        4               Plate 10mm
6        4               Beam
7        4               Angle
8        4               Channel

 

 

AV_HR4.jpg

RESTRICTED COLUMN:

Restricted Columns are used to filter the value based on the user defined rules on the attribute values.

Restricted Column dialog helps to create a restricted column and filter its value based on the columns that you select in the Restrictions view. In the Column dropdown, you can select a column of type measure for which you want to apply filter. In the Restrictions view, to apply a filter, you need to choose Add Restriction. Select a Column, an Operator, and enter a value. For example, you can create a restricted column to view the Revenue of a particular country, where Revenue is the measure and Country is the attribute having a list of countries. If you have added many restrictions, and do not want to apply all of them but want to retain them, deselect the Include checkbox.
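Conceptually, a restricted column corresponds to a conditional aggregation in plain SQL. The following is only an illustration of that idea (the SALES table and its columns are made up for this sketch, not part of the eFashion model):

-- "Revenue (US only)" as a restricted column is roughly equivalent to:
SELECT SUM(CASE WHEN country = 'US' THEN revenue END) AS revenue_us
FROM sales;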

Creating a Restricted Column

You use this procedure to create a restricted column to filter the value based on the user-defined restrictions for the attribute values.

 

For example, to filter the sales of a product based on a category you can create a restricted column Sales based on a measure Total Sales amount, and attribute category where you can restrict the value for category.

 

1.     In the Output panel of the Logical Join, right-click Restricted Columns, and choose New.

 

2.     Enter a name and description for the restricted column.

 

3.     From the Column dropdown, select a measure.

 

4.      In the Restrictions view, choose Add Restriction.

 

1.      In the Column dropdown, select column to define filter.

 

2.      Select the required operator.

 

3.      Enter the filter value.

 

4.      If you want to remove a particular filter on the column, deselect its corresponding Include checkbox.

 

5.      Choose OK.

REST_COL.jpg

Data preview shows the Sales amount only for the restricted attributes, i.e. category "Jewelry" and category "Pants".

REST_COL1.jpg

Let's review the scenario below, in which the "Jewelry" category has been excluded.

REST_COL2.jpg

Here you can see the sales amount for all categories except "Jewelry".

REST_COL3.jpg

 

CALCULATED COLUMN: Calculated columns are used to derive some meaningful information, in the form of columns, from existing columns.

The Calculated Column dialog helps you to derive a calculated column of type attribute or measure based on existing columns, calculated columns, restricted columns, and input parameters. You can write the formula in the Expression panel or assemble it using the available Elements, Operators and Functions.

You can specify how to aggregate row data for a calculated column of type measure using the Calculate Before Aggregation checkbox and specifying the Aggregation Type. If you select Calculate Before Aggregation, the calculation happens as per the expression specified and then the results are aggregated as SUM, MIN, MAX or COUNT. If Calculate Before Aggregation is not selected, the data is not aggregated but it gets calculated as per the calculation expression (formula), and the Aggregation is shown as FORMULA. After writing the expression, you can validate it using Validate.
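To make the Calculate Before Aggregation behaviour concrete, the two modes correspond roughly to the following plain-SQL aggregations (ORDER_ITEMS, PRICE and QUANTITY are illustrative names only, not part of the model above):

-- Calculate Before Aggregation selected: evaluate the expression per row, then aggregate
SELECT SUM(price * quantity) AS amount FROM order_items;

-- Calculate Before Aggregation not selected: aggregate the inputs first,
-- then apply the formula (shown as aggregation type FORMULA)
SELECT SUM(price) * SUM(quantity) AS amount FROM order_items;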

You can also associate a calculated column with Currency or Unit of Measure using the Advanced tab page.

CREATING CALCULATED COLUMN

You use calculated columns to derive some meaningful information, in the form of columns, from existing columns, calculated columns, restricted columns and input parameters. 

 

For example:

  • To derive postal address based on the existing attributes.
  • To prefix the customer contact number with the country code based on the input parameter country.
  • To write a formula in order to derive values, for example:
    if("PRODUCT" = 'ABC', "DISCOUNT" * 0.10, "DISCOUNT")
    If the attribute PRODUCT equals the string 'ABC', then DISCOUNT multiplied by 0.10 should be returned. Otherwise the original value of the attribute DISCOUNT should be used.

 

Procedure

       
1.     In the Output panel of the Logical Join, right-click Calculated Columns, and choose New.

2.     Enter a name and description for the calculated column.

3.     Select the data type, and enter length and scale for the calculated column.

4.     Select the Column Type to specify the calculated column as attribute or measure.

5.     In case of measure column type, if you select Calculate Before Aggregation, select the aggregation type.

Note: If you select Calculate Before Aggregation, the calculation happens as per the expression specified and then the results are aggregated as SUM, MIN, MAX or COUNT. If Calculate Before Aggregation is not selected, the data is not aggregated but it gets calculated as per the calculation expression (formula), and the Aggregation is shown as FORMULA. If the aggregation is not set, then it will be considered an attribute.

6.     In the expression editor enter the expression or assemble it using the menus in the below window.

7.     If you want to associate the calculated column with currency and unit of measuring quantity, select the Advanced tab page and select the required type.

8.     Choose OK.

 

calculated_column.png

 


INPUT PARAMETER:

You use this procedure to allow you to provide input for the parameters within stored procedures, to obtain a desired functionality when the procedure is executed.

In an analytic view you use input parameters as placeholders during currency conversion and in formulas such as calculated columns, where the calculation of the formula is based on the input you provide at runtime during data preview. Input parameters are not used for filtering attribute data in an analytic view; that is achieved using variables.

 

In calculation views you can use input parameters during currency conversion, in calculated measures, as input parameters of the script node, and to filter data as well.

You can apply input parameters in analytic and calculation views. If a calculation view is created using an analytic view with input parameters, those input parameters are also available in the calculation view but you cannot edit them.
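At query runtime the value of an input parameter is supplied through the PLACEHOLDER syntax. A hedged example, assuming a calculation view CV_SALES in a package mypackage with an input parameter P_TARGET_CURRENCY (names are illustrative only):

SELECT *
FROM "_SYS_BIC"."mypackage/CV_SALES"
     ('PLACEHOLDER' = ('$$P_TARGET_CURRENCY$$', 'USD'));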

The following types of input parameters are supported:              

 

 


Type - Description

Attribute Value/Column - Use this when the value of a parameter comes from an attribute.

Currency (available in Calculation View only) - Use this when the value of a parameter is in a currency format, for example, to specify the target currency during currency conversion.

Date (available in Calculation View only) - Use this when the value of a parameter is in a date format, for example, to specify the date during currency conversion.

Static List - Use this when the value of a parameter comes from a user-defined list of values.

Derived From Table (available in Analytic View and Graphical Calculation View) - Use this when the value of a parameter comes from a table column based on some filter conditions and you need not provide any input at runtime.

Empty - Use this when the value of a parameter could be anything of the selected data type.

Direct Type (available in Analytic View) - To specify an input parameter as currency and date during currency conversion.

 

 

Each type of input parameter can be either mandatory or non-mandatory. For a mandatory input parameter, it is necessary to provide a value at runtime. However, for a non-mandatory input parameter, if you have not specified a value at runtime, the data for the column where the input parameter is used remains blank.

 

Note: You can check whether an input parameter is mandatory or not from the properties of the input parameter in the Properties pane.

  • If you want to create a formula to analyze the annual sales of a product in various regions, you can use Year and Region as input parameters.
  • If you want to preview a sales report with data for various countries in their respective currency for a particular date for correct currency conversion, you can use Currency and Date as input parameters.

Procedure

In Analytic View

  1. In the Output panel of the Data Foundation or Logical Join node, right-click the Input Parameters node.
    • Note: You can also create input parameters at the Semantics node level, using the Create Input Parameter option in the Variables/Input Parameters panel.
  2. From the context menu, choose New.
  3. Enter a name and description.
  4. Select the type of input parameter from the Parameter Type drop-down list.
    1. For the Column type of input parameter, you need to select the attribute from the drop-down list. At runtime the value for the input parameter is fetched from the selected attribute's data.
    2. For an input parameter of type Derived from Table, you need to select a table and one of its columns as the Return Column, whose value is used as input for the formula calculation. You can also define conditions to filter the values of the Return Column in the Filters panel. For example, to calculate Discount for specific clients, you can create an input parameter based on the Sales table and return column Revenue with a filter set on the Client_ID.
    3. For a Direct Type input parameter, specify the Semantic Type that describes the use of the parameter as a currency or date, for example, to specify the target currency during currency conversion.
  5. If required, select a data type.
  6. Enter length and scale for the input parameter.
  7. Choose OK.

    In Calculation View

     

    1. In the Output panel,right-click the Input Parameters node.
    2. From the context menu, choose New.
    3. Enter a name and description.
    4. Select the type of input parameter from the drop-down list.
      1. For the Attribute Value type of input parameter, you need to select the attribute from the drop-down list. At runtime the value for the input parameter is fetched from the selected attribute's data.
      2. For an input parameter of type Derived from Table, you need to select a table and one of its columns as the Return Column, whose value is used as input for the formula calculation. You can also define conditions to filter the values of the Return Column in the Filters panel. For example, to calculate Discount for specific clients, you can create an input parameter based on the Sales table and return column Revenue with a filter set on the Client_ID.
    5. Select a data type.
    6. Enter length and scale for the input parameter.
    7. Choose OK.

    CURRENCY & UNIT OF MEASURE:

    Measures used in analytic views and calculation views can be defined as amount or weight in the analytical space using currency and unit of measure. You can also perform currency conversion and unit of measure conversion.

    For example, you need to generate a sales report for a region in a particular currency, and you have sales data in database tables in a different currency. You can create an analytic view by selecting the table column containing the sales data in this other currency as a measure, and perform currency conversion. Once you activate the view, you can use it to generate reports.

    Similarly, if you need to convert the unit of a measure from cubic meters to barrels to perform some volume calculation and generate reports, you can convert quantity with unit of measure.

    To simplify the process of conversion, system provides the following:

    • For currency conversion - a list of currencies, and exchange rates based on the tables imported for currency
    • For quantity unit conversion - a list of quantity units and conversion factors based on the tables imported for units.

    Currency conversion is performed based on the source currency, target currency, exchange rate, and date of conversion. You can also select currency from the attribute data used in the view.

    Similarly, quantity unit conversion is performed based on the source unit,target unit, and conversion factor.

    You can also select the target currency or unit of measure at query runtime using input parameters. If you use this approach then, you have to first create an input parameter with the desired currency/unit specified, and use the same input parameter as target in the conversion dialog.
    Note
    Currency conversion is enabled for analytic views and base measures of calculation views.

     

    Prerequisites

    You have imported tables T006, T006D, and T006A for Unit of Measure.

    You have imported TCURC, TCURF, TCURN, TCURR, TCURT, TCURV, TCURW, and TCURX for currency.

     

    Procedure

    1. Select a measure.
    2. In the Properties pane, select Measure Type.
    3. If you want to associate the measure with a currency, perform the following substeps:

                 a. In the Measure Type dropdown list, select the value Amount with Currency.

                 b. In the Currency Dialog, select the required Type as follows:


      Type - Purpose
      Fixed - To select currency from the currency table TCURC.
      Attribute - To select currency from one of the attributes used in the view.

                c. Select the required value, and choose OK.

                d. If you want to convert the value to another currency, choose Enable for Conversion.

                            i. To select the source currency, choose Currency.

                            ii. Select the target currency.

                                Note: For currency conversion, in addition to the types Fixed and Attribute, you can select Input Parameter to provide the target currency at runtime. If you select an input parameter for specifying the target currency and deselect the Enable for Conversion checkbox, the target currency field gets cleared, because input parameters can be used only for currency conversion.

                             iii. To specify the exchange rate type, in the Exchange Rate Types dialog, select the Type as follows:


      Type - Purpose
      Fixed - To select the exchange rate from the currency table TCURW.
      Input Parameter - To provide the exchange rate at runtime as an input parameter.


                             iv.To specify the date for currency conversion, in the Conversion Date dialog, select the Type as follows:

     


      Type - Purpose
      Fixed - To select the conversion date from the calendar.
      Attribute - To select the conversion date from one of the attributes used in the view.
      Input Parameter - To provide the conversion date at runtime as an input parameter.


                           v. To specify the schema where the currency tables are located for conversion, in the Schema for currency conversion field, select the required schema.

                           vi. To specify the client for which the conversion rates are to be looked up, in the Client for currency conversion field, select the required option.

     

                  e. From the dropdown list, select the required value that is used to populate the data if the conversion fails:

     


    Option - Result
    Fail - In data preview, the system displays an error for conversion failure.
    Set to NULL - In data preview, the value for the corresponding records is set to NULL.
    Ignore - In data preview, you view the unconverted value for the corresponding records.


    4.If you want to associate a measure with a unit of measure other than currency, perform the following substeps:

        a. Select the value Quantity with Unit of Measure in the Measure Type drop-down list.

        b. In the Quantity Units dialog, select the required Type as follows:

     


      Type - Purpose
      Fixed - To select a unit of measure from the unit tables T006 and T006A.
      Attribute - To select a unit of measure from one of the attributes used in the view.


        c. Select the required value, and choose OK.

    5.Choose OK.

    Note: You can associate Currency or Unit of Measure with a calculated measure, and perform currency conversion for a calculated measure by editing it.

    Currency_Con_Fixed.png

    To Be or Not To Be: HANA Text Analysis CGUL rules has the answer


    To set the context, let's first do things in HANA TA without the CGUL rules.

     

    1. Let's create a small table with texts

     

    So let's create a table which looks something like this:

    TableDefinition.PNG

     

    Now let's create two texts in it:

    insert into "S_JTRND"."TA_TEST" values(1,'EN','TO BE','');

    insert into "S_JTRND"."TA_TEST" values(2,'EN','NOT TO BE','');

     

    So now the table entries look like:

    TableData.PNG

     

    2. Text Analysis Via Dictionary

    Now let's say we want to do text analysis where we say:

    1. if the text is "TO BE" it is to be treated as POSITIVE_CONTEXT
    2. if the text is "NOT TO BE" it is to be treated as NEGATIVE_CONTEXT

     

    Let's create a dictionary with these two values.

    So in the XSJS project we create an english-Contextdict.hdbtextdict and its content will be as follows (also attached):

    <dictionary xmlns="http://www.sap.com/ta/4.0">
      <entity_category name="POSITIVE_CONTEXT">
        <entity_name standard_form="TO BE">
          <variant name="TO BE" />
        </entity_name>
      </entity_category>
      <entity_category name="NEGATIVE_CONTEXT">
        <entity_name standard_form="NOT TO BE">
          <variant name="NOT TO BE" />
        </entity_name>
      </entity_category>
    </dictionary>

     

    Now we use the dictionary above to create a configuration file (also attached).

    So, pick the content from any .hdbtextconfig and add the path to the above dictionary in it:

      <configuration name="SAP.TextAnalysis.DocumentAnalysis.Extraction.ExtractionAnalyzer.TF" based-on="CommonSettings">
        <property name="Dictionaries" type="string-list">
          <string-list-value>JTRND.TABlog.dictonary::english-Contextdict.hdbtextdict</string-list-value>
        </property>
      </configuration>

     

     

    3. Create Full text index on the Table using this configuration

    CREATE FULLTEXT INDEX "IDX_CONTEXT" ON "S_JTRND"."TA_TEST" ("TEXT")
      LANGUAGE COLUMN "LANG"
      CONFIGURATION 'JTRND.TABlog.cfg::JT_TEST_CFG' ASYNC
      LANGUAGE DETECTION ('en','de')
      PHRASE INDEX RATIO 0.000000
      FUZZY SEARCH INDEX OFF
      SEARCH ONLY OFF
      FAST PREPROCESS OFF
      TEXT MINING OFF
      TEXT ANALYSIS ON;

     

     

    Check the TA results:

    TA_1.PNG
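    The screenshot above is simply the generated text-analysis result table for the index. Assuming the standard $TA_<index name> naming, it can be queried directly:

    SELECT ta_rule, ta_token, ta_type
    FROM "S_JTRND"."$TA_IDX_CONTEXT";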

     

    Note: for NOT TO BE, we did not get both POSITIVE (for the substring TO BE) and NEGATIVE. Although this is good, it's a fluke: TA took the longest matching string, so for NOT TO BE and its substring TO BE we got only a Negative, but this could create problems.

     

    Now moving on, let's add more to this context: let's add the text NOT-TO BE as another possibility for NEGATIVE_CONTEXT. In fact, NOT followed by TO BE in the same sentence is to be a NEGATIVE_CONTEXT.

     

    Without changing anything, let's insert some more values and see how they look:

    insert into "S_JTRND"."TA_TEST" values(3,'EN','NOT-TO BE','');

    insert into "S_JTRND"."TA_TEST" values(4,'EN','NOT, TO BE','');

    insert into "S_JTRND"."TA_TEST" values(5,'EN','NOT, Negates TO BE','');

     

    Check the TA results:

    TA_2.PNG

     

    So you see we now have a problem. Also, we could have NOT, -, NEG, etc. as possible predecessors before TO BE to indicate that it's a NEGATIVE_CONTEXT.

     

    Solution 1: Let's have synonyms of NOT as one category and TO BE as a "CONTEXT" category, and in post-processing of the TA results check whether we have TA_TYPE values of CONTEXT and NEGATIVE in the same sentence; if so, it's a NEGATIVE_CONTEXT.

     

    But wouldn't it be great if the index could do this on its own?

     

    CGUL Rules save the day:

     

    So here we go:

     

    4. CREATE A .rul file

    CONTEXT.rul (also attached), containing the following rule:

    #group NEGATIVE_CONTEXT (scope="sentence") : { <NOT> <>*? <TO> <>*? <BE> }

     

    We need to compile this rule to get a .fsm file and put it on the server under ...lexicon/lang (out of scope for this blog; I have attached the compiled file here).

     

    Now enhance your configuration file with a reference to this .fsm file.

     

    <configuration name="SAP.TextAnalysis.DocumentAnalysis.Extraction.ExtractionAnalyzer.TF" based-on="CommonSettings">
      <property name="Dictionaries" type="string-list">
        <string-list-value>JTRND.TABlog.dictonary::english-Contextdict.hdbtextdict</string-list-value>
      </property>
      <property name="ExtractionRules" type="string-list">
        <string-list-value>CONTEXT.fsm</string-list-value>
      </property>
    </configuration>

     

     

    5. Restart the indexserver process so that the newly compiled rule file is picked up by the system.

    indexServerProcessRestart.PNG

     

     

    6. Recreate the index using the same statement as above and check the TA table:

     

    TA_3.PNG

     

    So, as you see, the highlighted values come from the rule and mark the extracted NEGATIVE_CONTEXT. Below it I kept the dictionary value which wrongly identified a POSITIVE_CONTEXT for comparison; this should ideally not be handled by dictionaries.

     

    So, in this context: To Be or Not To Be: HANA Text Analysis CGUL rules indeed has the answer!!

     

    Hope this helps,

    Bricks and Bats are Welcome

    TIPS AND MY EXPERIENCE - SAP HANA CERTIFICATION


    I would like to share my SAP HANA certification exam experience and some of the TIPS for the preparation of SAP-HANA certification.

     

    First, I would like to say a bit about myself. I started my career as a programmer and then got interested in database development. I am a non-SAP professional with hands-on experience in MSSQL and MS SSAS. I then started reading about in-memory database technology and was quite impressed with the SAP HANA technology and platform as an in-memory database which is dual in nature (OLTP as well as OLAP). This generated a huge interest in learning this new technology. I started my learning journey with SAP HANA at the start of 2014, and today, on 22nd May 2014, I became a SAP CERTIFIED APPLICATION ASSOCIATE - SAP HANA by qualifying in the certification exam with a score of 97%.

     

    My MSSQL knowledge helped me a lot to understand the SQLScript, procedure and database development parts of SAP HANA.

    My MS SSAS knowledge helped me understand the data modeling concepts of SAP HANA.

    My MS SSIS knowledge helped me understand the ETL technology of SAP (BO Data Services) used with SAP HANA.

     

    EXAM PREPARATION GUIDELINE:

     

    1. The first step is to know the exam course content and topic areas as described on the SAP training site:

     

              Data Provisioning                                                                                                                                > 12%

              Describe possible scenarios and tools for replicating and loading data into SAP HANA from different data sources (e.g. SAP Landscape Transformation (SLT), SAP Data Services, or Direct Extractor Connection (DXC)).

    Security and Authorization                                                                                                               8% - 12%

    Describe the concept of authorization in SAP HANA, and implement a security model using analytic privileges, SQL privileges, pre-defined roles and schemas. Perform basic security and authorization troubleshooting.

    Data modeling - Analytical views                                                                                                    8% - 12%

    Implement data models with SAP HANA using analytical views. Advise on the modeling approach and best practices.

    Data modeling - Calculation views                                                                                                  8% - 12%

    Implement data models with SAP HANA using calculation views, and advise on the modeling approach and best practices.

    Advanced data modeling                                                                                                                  8% - 12%

    Apply advanced data modeling techniques, including currency conversion, variables and input parameters. Implement decision automation using business rules.

    Optimizion of data models and reporting                                                                                       8% - 12%

    Monitor, investigate and optimize data models and reporting performance on SAP HANA. Advise on modeling approach and tools to achieve optimum performance. Evaluate the impact of different implementation options such as table joins, aggregation, or filters. Understand the implication on performance of the various reporting tools and connetion types.

    Administration of data models                                                                                                         8% - 12%

    Administer data models in SAP HANA, including setting of information validation rules, managing schemas, the importing/exporting and transporting of data models.

    Reporting                                                                                                                                              < 8%

    Provide advice on reporting strategies, and perform appropriate reporting solutions with SAP HANA. Build reports using various tools, for example, Microsoft Excel or SAP BusinessObjects BI tools.

    Data modeling - SQL Script                                                                                                               < 8%

    Apply SQL Script to enhance the data models in SAP HANA using AFL, CE functions, and ANSI-SQL.

    Data modeling - Attribute views                                                                                                       < 8%

    Implement data models with SAP HANA using attribute views, and advise on the modeling approach and best practices.

    Deployment scenarios of SAP HANA                                                                                             < 8%

    Describe the deployment scenarios for SAP HANA and evaluate appropriate system configurations.

    SAP HANA Live & Rapid Deployment Solutions for SAP HANA                                                 < 8%

    Describe the value of HANA and identify scenarios for SAP delivered content for SAP HANA, such as SAP HANA Live and Rapid Deployment Solutions.

     

       2. Analyse which topic areas cover the largest proportion of the course content. As stated above, DATA PROVISIONING and DATA MODELING cover almost 70% of the course content, so getting a good hold on these covers 70% of the exam topics.

     

       3. Now that we know the focus areas of the exam, start collecting and reading content on the above topics. There are complete resources available on the SAP site which give a broad understanding of SAP HANA data provisioning, modeling and the other topic areas.

              http://help.sap.com/hana_appliance#section5

     

       4. The SAP HANA Academy - YouTube video tutorials are a huge library for gaining further knowledge of the above topics.

     

       5. The openSAP course materials and videos help a lot in getting a grip on SAP HANA.

     

       6. Refer to my blog posts on DATA MODELING:

       

              Data Modeling in SAP HANA with sample eFashion Database-Part I

              Data Modeling in SAP HANA with sample eFashion Database-Part II

              SAP HANA- ADVANCE MODELING FEATURES

        

       7. Read the content and watch the video tutorials topic by topic; for example, start with DATA PROVISIONING and go through all the content and videos on that topic. Note down the key concepts, methods and flow of each topic. Focus on graphical representations of the content area, which will help you memorize the key concepts easily.

     

       8. Practical experience is quite important, as most of the questions are based directly on the HANA Studio UI. SAP provides 30 days of free access to HANA on Amazon cloud services; please refer to SAP HANA One on AWS Marketplace.

     

    QUESTION PATTERN AND TYPE:

     

    1. All questions are multiple choice. Most are single selection, but some questions have more than one correct answer (multiple select).
    2. There is no negative marking for wrong answers. If a question has more than one right answer, all right options must be selected for the question to count as correctly answered; otherwise it is treated as wrong, even if one option was selected correctly.
    3. Questions are grouped by the topics described above. For example, the Reporting section contains 4 questions grouped under the heading Reporting, all in sequential order, then the questions for the next topic follow, and so on.
    4. DATA PROVISIONING question patterns cover how the different provisioning tools connect to HANA; for example, one question asked which connection is involved in DXC. Also the types of replication, such as which tools support real-time replication, which tools support ETL, etc.
    5. MODELING question patterns cover the functions of the different views (attribute, analytic and calculation), variables, input parameters, hierarchies, the different joins, CE functions, SQLScript and procedures, etc.
    6. The Reporting section generally covers how SAP HANA talks to the different reporting tools, mainly the connectivity of SAP HANA to reporting tools: how Crystal Reports, WebI, Dashboards, Explorer, Analysis edition for OLAP, Analysis for MS Office, MS Excel, etc. can be connected to HANA.
    7. The different types of privileges (object, system, package, SQL), which privilege does what in HANA authorization, user and role assignment, etc.

     

    Hope this blog helps those who have set their mind on going for the HANA certification.

     

    Good Luck.

     

    Mohammad Shafiullah


    Introduction to HANA XS application development (Part 1): BlogProject creation


    Hi all,

     

    This is the first post of a series that will engage with the development, from start to end, of a simple application on the SAP HANA platform, accompanied of course by the underlying theory. To offer a well-rounded presentation of the platform, I tried to use a variety of HANA capabilities, but of course some things will be missing. Further information can be found in the developer guide.

     

    • Application

      To begin with, let’s see what our application is about. The application that I will present here is called BlogProject and I think that what it will do is pretty obvious considering the name, but I am going to describe it anyway.  So, in the app a user will be able to login/register, search for and view a post, write one or write a comment on a certain post.  Simple as that. I may extend the application, adding a user profile, for a user to be able to see his/her own posts and comments, or some kind of statistics page, for example a geographical distribution of the posts.

     

    • Prerequisites

    1.   You have access to an SAP HANA system. If you don’t have, then explore your options and choose the right one for

           you: http://scn.sap.com/community/hana-in-memory/blog/2014/05/21/get-access-to-a-sap-hana-instance

    2.   Add your HANA system to your HANA studio

     

    • Studio explanation

      Let’s take a glimpse to the HANA studio. The studio includes different perspectives, each giving a different view.

    1.png10.png3.png

     

    Project Explorer: here we can see all our projects and files included in each.

    Repositories: here we see all your local workspaces

    Systems: in this view we can add and manage our HANA systems and database objects

     

    • Repository Workspace

    First, we have to create a workspace for our project. In the “Repositories” view right click -> New Repository Workspace. In the window choose your user (SYSTEM in this case). Then give a name to your workspace. In our case “workspace1” (don’t ask why…) and a location where to save it.

     

    • Project

      After the workspace creation we have to create our project. To do so we have to go to the “Project Explorer”, right click and then “New” -> “Project”. Then find and choose XS Project (SAP HANA -> Application Development).

     

    After the project is created, it is a good practice to create the following folders, in order for our project to be organized: data, model, procedures, services and ui. We will see later what files we create in each folder.

     

    Share and activate the project: To be able to activate our project we have to share it first, adding it to our workspace. To share it follow this procedure: right click on your project-> “Team” -> “Share Project” -> Choose “SAP HANA Repository” -> Choose (or add) your Workspace and click “Finish”.

     

    We must always activate our project when we make any changes, for them to be committed and shared. To do so, right click on your project or the file(s) you want to activate -> “Team” -> ”Activate”.

     

    .xsapp .xsaccess files: These files are necessary for exposing any content via the XSEngine web server. Create the two files but without a name, just the extension. The .xsapp file does not contain any content. In the .xsaccess file paste the following:

    {

    "exposed":true

    }

     

    • Schema

    For our application to be able to create and access database objects we have to create a schema, which will hold all the database objects we will create, such as tables, SQL views and stored procedures. By default we are granted a schema called “SYSTEM”, but it is a better practice to create a separate schema for our application, so that your database is more organized.

     

    The procedure is very easy. We just have to create in the “data” folder of our project a new file with the extension .hdbschema. The definition of the schema is the following

     

    schema_name="BlogProject";

     

    Or we can open the SQL console from the “Systems” view and execute the following create statement:

     

    CREATE SCHEMA BlogProject [OWNED BY SYSTEM]

     

    • Authorization model

    Authorization is about providing access to specific resources only to certain users. The basic entity of the authorization model is a privilege. Privileges can be granted to a role, which is a group of privileges, or a user. Additionally, a user can be granted certain roles. The best practice it to assign privileges to roles and roles to users. That way the authorization model will be more organized.

     

    First, let’s create a privilege to our application. We have to create an .xsprivileges file without a name. In the .xsprivileges paste the following:

    {

    "privileges":[

    {

    "name":"Basic",

    "description":"Basic usage privilege"

    },

    {

    "name":"Admin",

    "description":"Administration privilege"

    } ]

    }

     

    Now that we have created the privileges to our application, we must grant them to a role so that we can have access. To create a role we just create a new file with the extension .hdbrole. Inside the file we type the following definition.

     

    role MyBlogProject::model_access {

           application privilege: MyBlogProject::Admin;

    }     

     

    Now our role has Admin rights upon our application. Then to assign the role to our user “SYSTEM” we have to go to the “Systems” view -> “Security” -> “Users” -> SYSTEM -> “Granted Roles”  and then add the role MyBlogProject::model_access we created.

     

    Next, for our user and application to be able to access and write to the schema we created, we have to go again to the security tab -> “Roles” -> MyBlogProject::model_access -> object privileges and add our schema “BlogProject”. Now that we have added this privilege to our role, all users who have this role will also have the same privileges on the schema.

     

    As you may have noticed, I granted our role certain privileges twice, the first time via the .hdbrole file in the repository and the second via the “Systems” view. We can edit the authorization model using both ways.
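    If you prefer SQL over the “Systems” view, an activated repository role can also be granted with the helper procedure owned by _SYS_REPO. A minimal sketch, assuming the role and user names used in this post:

    -- Grant the activated repository role to a database user (role and user names taken from this example)
    CALL "_SYS_REPO"."GRANT_ACTIVATED_ROLE"('MyBlogProject::model_access', 'SYSTEM');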

     

     

     

    This concludes this first post of the series. Next we will see how to create our data definition (aka persistence model) beyond the creation of the schema that was illustrated above, creating tables and sequences:

    http://scn.sap.com/community/developer-center/hana/blog/2014/05/27/introduction-to-hana-xs-application-development-part-2-blogproject-persistence-model

    Introduction to HANA XS application development (Part 2): BlogProject persistence model


    Hi all,

     

    This is the second post of a series that talks about the BlogProject application development with HANA XS. If you missed the first one make sure to have a look:

    http://scn.sap.com/community/developer-center/hana/blog/2014/05/27/introduction-to-hana-xs-application-development-part-1-blogproject-creation

     

    Given that I will develop the application in parallel with the posts, I will update them when and if new requirements arise, so forgive me if there are any inconsistencies between what I am proposing and what I actually end up doing.

     

    • Application schema

    1.png.jpg

    The above schema represents the data model of the application. Some things that might be confusing are the “subject”, “latitude” and “longitude” columns and the POST2POST table. Let’s start with the columns. The “subject” column of the POST table is going to hold the subject of a post, which will be the result of using the text analysis capabilities of SAP HANA. The “latitude” and “longitude” columns will help us with the geospatial statistics. Lastly, the POST2POST table will save all the link actions between posts, storing, for each link, the post that includes the link and the post that is referenced.

     

    • Tables

    Column or row?

    In most cases the tables we create in HANA are columnar, but you can of course use row tables, depending on your application’s needs. Each table type benefits specific processes.

     

    Row stores:

    a) Easier to insert and update

    b) Better if you need the complete row, because reconstructing rows is one of the most expensive column store operations

    c) Preferred if the table has a small number of rows (e. g. configuration tables). 

    d) If your application needs to only process a single record at one time

    Column stores:

    a) Only affected columns have to be read during the selection process of a query.

    b) All the columns can serve as an index

    c) Facilitates higher compression rates

    d) Better if the table holds huge amounts of data that should be aggregated and analyzed

    e) Better if the table has a large number of columns. 

    f) Preferred if calculations are typically executed on single or a few columns only. 

    To sum up, the basic difference is the type of processes for which we use each table type. In OLAP processes it is better to use column stores because for analysis we query certain columns and column stores provide a much better compression, thus minimizing the querying time. On the other hand, row stores are better for OLTP processes facilitating fast inserts and updates.
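    For illustration, the storage type is simply declared in the CREATE statement. A minimal sketch (the table and column names below are made up for this example, not part of the BlogProject data model):

    -- Columnar table for data that will be aggregated and analyzed (illustrative names)
    CREATE COLUMN TABLE "BlogProject"."POST_STATS" (POST_ID INTEGER, VIEW_COUNT INTEGER);

    -- Row table for a small configuration-style table (illustrative names)
    CREATE ROW TABLE "BlogProject"."APP_CONFIG" (PARAM NVARCHAR(50), VALUE NVARCHAR(200));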

     

    Notes and suggestions:

    To enable fast on-the-fly aggregations, ad-hoc reporting, and to benefit from compression mechanisms on transactional data it is recommended to store them in a column-based table.

    If you need to join tables avoid using different storage types, since using both storage engines reduces performance.

    Attribute, Analytic and Calculation Views are only supported on columnar tables.

    Enabling search is only possible on column tables.

     

    How to create

    There are two ways to create a table:

     

    Via repository

    We can create a table via the repository by creating a new file in the “data” folder, with the extension .hdbtable. (If we choose to create a new “Database Table” and not a “File”, the repository understands the file extension and we don’t need to add it. This applies to all file types.) Then all we have to do is provide the table definition in the file.
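    As a rough sketch, assuming the design-time table-definition syntax of XS classic (the column names here are illustrative, not the actual BlogProject tables), an .hdbtable file could look like this:

    // Minimal design-time table definition (illustrative columns)
    table.schemaName = "BlogProject";
    table.tableType  = COLUMNSTORE;
    table.columns = [
        {name = "ID";       sqlType = INTEGER;  nullable = false;},
        {name = "USERNAME"; sqlType = NVARCHAR; length = 50; nullable = false;}
    ];
    table.primaryKey.pkcolumns = ["ID"];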

     

    Via catalog

    To create a table, simply create a “New Table” in the “tables” folder of the schema and then add the columns, types etc. in a graphical manner. If we create our tables that way, we will not be able to add a foreign key to the table. To do so we will have to add it with a simple SQL statement.

     

    For our application I have created the 5 tables below via the “Systems” view (Catalog):

    2.png

    3.png

    4.png

    5.png

    6.png

     

    • Sequences

    HANA does not support the “autoincrement” of a column (e.g. id), so we need to create a sequence to do that, which also provides additional capabilities.

     

    Sequence variables

    A sequence definition includes the below variables, of which only the schema is compulsory, some have default values and the rest are optional:

     

    schema = "";

    increment_by = integer; //the incrementation value (default = 1)

    start_with = integer; //the first value of the sequence (default = -1)

    maxvalue = integer; //the maximum value of the sequence

    nomaxvalue = boolean; //if the sequence has a max value or not (default = false)

    minvalue = integer; // the minimum value of the sequence

    nominvalue = boolean; // if the sequence has a min value or not (default = false)

    cycles = boolean;//if the sequence starts with the minvalue after the maxvalue has been reached, or the opposite

    reset_by = ""; //the query that will be used on server reboot to find the value that the sequence will start with

    public = boolean; //(default = false)

    depends_on_table = ""; //the dependency to a specific table

    depends_on_view = ""; // the dependency to a specific view

     

    How to create

    Via repository

    Create a new file inside the “data” folder of the repository with the extension .hdbsequence. In our application, we just want the ID columns to increment by 1 so I used a few variables. For example, for the USER table:

     

    schema= "BlogProject";

    start_with= 1;

    depends_on_table= "BlogProject::USER";

     

    Via catalog

    Create a “New Sequence” in the “Sequences” folder of the schema and add the values of the sequence in a graphical manner as shown below:

    7.png

    For the BlogProject I created the below sequences. The POST2POST table does not need a sequence because it does not have an ID column, but two columns that are foreign keys referencing the POST table’s ID.

     

    8.png

    Note: A single sequence can be used for all tables, but its value increments regardless of the table. For example, if we get the value 5 from a call of the sequence and then call it for another table, we get the value 6 and not the next value we were probably expecting for the specific column of the table. If we want a single incremental value for each table, then we must create different sequences, one for each.

     

    Calling a sequence

    Sequences are not associated with tables, but can only be restricted to apply to certain tables (with the “depends_on_table” variable). In fact, they are used by applications through SQL statements, which can use CURRVAL to get the current value of the sequence, or NEXTVAL to get the next value.

     

    For example if we want to insert a new user to our USER table we execute this statement:

     

    insertinto"BlogProject"."USER"

    values ("BlogProject"."MyBlogProject.data::userID".NEXTVAL, 'user', 'user', 'user', 1)

     

    • Accessing database objects

    When we access a table from a file in the repository we write “schema::table_name”, and when we call it from the SQL console we write “schema”.”table_name”. The same rule applies to all database objects.
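    For example, using the USER table from above (purely illustrative):

    -- From the SQL console (catalog notation):
    SELECT * FROM "BlogProject"."USER";
    -- From a repository artifact such as an .hdbsequence (repository notation):
    -- depends_on_table = "BlogProject::USER";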

     

     

     

     

     

    And now we have come to the end of our persistence model creation. In the next post we will talk about modeling our data using information views and other data-centric application logic enablers (triggers, functions and procedures):

     

    Thanks for reading!

    How to install SAP HANA studio on Mac OS?


    This may be useful for any Mac users who are developing or administering SAP HANA: as you may have noticed, a Mac version of SAP HANA Studio used to be available for download here: https://hanadeveditionsapicl.hana.ondemand.com/hanadevedition/ . Unfortunately, the Mac version recently disappeared and the only versions available for download are the Windows and Linux versions.

     

    Don't worry.

     

    You can easily get the most recent version of HANA Studio running on your Mac OS.

     

    Follow these steps to install SAP HANA Studio on your Mac:

     

    1. Download Eclipse Kepler from http://www.eclipse.org
    2. Unzip and move to Applications folder
    3. Start Eclipse
    4. Help -> Install New Software ...
    5. Add... to add repository, use this URL: https://tools.hana.ondemand.com/kepler
    6. Use this repository to see available software list
    7. Pick SAP HANA Tools (or more, depending on your needs)
    8. Finish installation (you will be asked to restart Eclipse)
    9. After restart, switch to HANA perspective and you are ready to start!

     

    How to install HANA studio on Mac - Eclipse Update Site.png

    How to install HANA studio on Mac - Eclipse Update Site 2.png

     

    HANA Studio - Switch to HANA Perspective - Eclipse Kepler.png

    SAP LT REPLICATION-PART 1


    SAP LANDSCAPE TRANSFORMATION REPLICATION:

    slt.png

    • SAP LT is a trigger-based, real-time replication method.
    • SAP LT Replication Server for SAP HANA leverages proven SLO (System Landscape Optimization) technology. SAP LT Replication Server is the ideal solution for all HANA customers who need real-time or schedule-based replication sourcing from SAP and non-SAP sources.
    • The SLT replication server can be installed as a standalone server or can run on any SAP system with the SAP NetWeaver 7.02 ABAP stack (Kernel 7.02EXT).
    • Add-on DMIS_2010_1_700 with SP5-7, NW 7.02 and SAP Kernel 7.20EXT need to be installed for SLT replication to apply to SPS04 for SAP HANA 1.0.
    • The SLT replication system is connected to the source SAP ECC system by an RFC (Remote Function Call) connection and to the target HANA system by a DB connection. In the case of a non-SAP source system, SLT connects with a DB connection.
    • The SAP source system contains the application tables, logging tables, triggers and the read module. The SLT server contains the control module (structure mapping & transformation) and the write module. In the case of a non-SAP source system, the read module resides in the SLT system.
    • When replication starts, logging tables and triggers are created in the source system and data gets replicated to the target system via the SLT replication server; any changes in the source system are automatically replicated to the target system.

     

     

    SLT CONFIGURATION:


    Configuration steps for SAP LT Replication Server

    • Define a schema for each source system
    • Define connection to source system
    • Define DB connection into SAP HANA
    • Define replication frequency (real-time; frequency for scheduled replication)
    • Define maximum number of background jobs for data replication

     

    STEP 1: Open Configuration and Monitoring Dashboard by using transaction code “LTR”. Click on New button to start SLT configuration.

    SLT1.png

    STEP 2: Provide a name for the SLT configuration. Please note that a schema with the same name will be created in HANA after the configuration settings are completed.

    SLT2.png

    STEP 3: Specify the source system details. If the source system is SAP (ECC), select RFC Connection. Select DB Connection for a non-SAP source system.

    SLT3.png

    STEP 4: Specify the target system details. The target HANA system is connected through a DB Connection. Provide the HANA system connection parameters; administrator user privileges are required.

    SLT4.png

    When the popup to create a new configuration is closed by pressing the OK button, the following actions are performed automatically:

     

    • Configuration settings are saved on the LT Replication Server
    • New user and schema are created on the HANA system with the defined target schema name (not performed if an existing schema is reused).
    • Replication control tables (RS_* tables) are created in target schema.

              RS_LOG_FILES, RS_MESSAGES, RS_ORDER, RS_ORDER_EXT, RS_SCHEMA_MAP, RS_STATUS

    • User roles for the target schema are created:
      • <target_schema>_DATA_PROV -> Role to manage data provisioning
      • <target_schema>_POWER_USER -> Contains all SQL privileges of the target schema
      • <target_schema>_USER_ADMIN -> Role to execute authority
    • Procedures to grant (RS_GRANT_ACCESS) or revoke (RS_REVOKE_ACCESS) access are created in the target schema
    • Replication of tables DD02L (stores the table list), DD02T (stores the table short descriptions) and DD08L (R/3 DD: relationship definitions) is started automatically. Once those tables are replicated, the HANA studio knows which tables are available in the source system.
    • SYS_REPL and table RS_REPLICATION_COMPONENTS are created (If they don’t exist already based on a previous configuration)
    • Replication is registered in table RS_REPLICATION_COMPONENTS

    SLT4a.png

    SAP LT REPLICATION OPTIONS:

    SLT8a.png

    Use transaction code LTRC to open the LT Replication Server Cockpit. We can use the SLT options from the SLT replication server as well as from HANA data provisioning. Select a table and a replication option in the LT Replication Server to start/replicate/stop/suspend/resume replication.

    SLT5.png

    HANA data provisioning option can be seen in Quick launch view.

    SLT7.png

    Select table to start replication.

    SLT8.png



    SAP HANA Smart Data Access(1): A brief introduction to SDA


    We have a Chinese version (”SAP HANA Smart Data Access(一)——初识SDA”) of this blog.


    Introduction

          In the application scenarios of SAP HANA, it is common to analyze and process data located in other systems. Usually, customers replicate the data from other systems into SAP HANA and then do the analysis and processing in SAP HANA. However, data replication not only costs time and memory, it usually also requires another replication system to be deployed, which is not always easy. SDA, short for Smart Data Access, provides customers a new way to access the data in remote data sources.

    What is SDA

    SDA is a new method for SAP HANA to access data stored in remote data sources. With the help of SDA, SAP HANA can create so-called “virtual tables” mapping to tables located in remote data sources, and then SAP HANA can access the data directly through the virtual table. A virtual table can be manipulated by SAP HANA just like an ordinary table, which means operations such as select, update, insert, delete, and so on are all available for it. Besides, a join operation between a local table and a virtual table is supported. When such a join is executed, the optimizer of SAP HANA sends the relevant operations to the remote data source for processing, and the result set is then sent back to SAP HANA for further processing.
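    As a rough sketch of such a join (all schema, table and column names below are made up for illustration):

    -- Join a local HANA table with a virtual table; the remote part is pushed to the source where possible
    SELECT r."REGION_NAME", COUNT(*) AS "CANDIDATE_COUNT"
    FROM "SDA"."VT_CANDIDATES" AS c
    INNER JOIN "LOCAL_SCHEMA"."REGIONS" AS r ON c."REGION_ID" = r."REGION_ID"
    GROUP BY r."REGION_NAME";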

    SDA was introduced in SAP HANA SPS06. At that time, the data sources supported by SAP HANA SDA included SAP HANA, SAP Sybase ASE, SAP Sybase IQ, Teradata database and Apache Hadoop, and only read operations were permitted on virtual tables. In SAP HANA SPS07, both the data sources and the operations supported by SAP HANA SDA were extended: MS SQL Server and Oracle were added to the list of supported data sources, and write operations are now permitted. The comparison of SDA in SPS06 and SPS07 is below:

     

    SPS06
    • Supported data sources: SAP HANA, SAP Sybase ASE 15.7 ESD#4, SAP Sybase IQ version 15.4 ESD#3 and 16.0, Teradata database version 13.0, Intel Distribution for Apache Hadoop version 2.3
    • Supported operations for virtual tables: select

    SPS07
    • Supported data sources: all data sources supported in SPS06, plus Oracle Database 12c and Microsoft SQL Server version 11 for SQL Server 2012
    • Supported operations for virtual tables: select, insert, update, delete

    Note: the data sources officially supported by SAP HANA are limited to the specific versions above; other versions are not guaranteed to work well.

    Creating Data Source

         The first step of accessing a remote data source is to create the remote data source in SAP HANA. The communication between SAP HANA and the remote data source is based on the ODBC protocol. The subsequent blogs of this series will talk about how to deploy the remote data source on the SAP HANA server side. Here, let’s briefly look at how to create a remote data source in SAP HANA Studio.

          In SAP HANA Studio, there are two ways to create remote data sources: one is via the GUI, the other is via an SQL statement.

    (1) Create remote data source with GUI:

    1.png

    Firstly, open the folder called “Provisioning”. Then right-click “Remote Sources” and select “New Remote Source…”:

    2.png

    Secondly, choose one adapter from the adapter list in the popup dialog, and fill in corresponding connection and authentication information of the remote data source.

    Lastly, press the run button 5.png to create data source.

    (2) Create remote data source with SQL:

    CREATE REMOTE SOURCE  <src_name>

    ADAPTER <adapter_name> [CONFIGURATION FILE 'filename']

    CONFIGURATION <connection_info_string>

    [opt_credentials_clause]


    Example:

    CREATE REMOTE SOURCE ORCL_11g_LNX

    ADAPTER "odbc"

    CONFIGURATION FILE 'property_orcl.ini'

    CONFIGURATION 'DSN=oral11g_lnx'

    WITH CREDENTIAL TYPE 'PASSWORD'

    USING 'user=OUTLN;password=Aa111111';


    In the above SQL statement, <adapter_name> can be one of: ASEODBC, IQODBC, TDODBC, HIVEODBC, ODBC. ASEODBC is for Sybase ASE as the data source, IQODBC is for Sybase IQ, TDODBC is for Teradata Database, and HIVEODBC is for Hadoop. The ODBC adapter is for other generic data sources. <connection_info_string> is used to specify the connection information for the data source; the name of the DSN is usually given here. <opt_credentials_clause> is used to specify the authentication information of the data source. Please note that only the ODBC adapter requires the CONFIGURATION FILE; the functionality of the configuration file is introduced in the next section.

    Generic Adapter Framework

    With the help of SDA, SAP HANA can communicate with data sources that support the ODBC protocol. However, as discussed above, the data sources supported by SAP HANA SDA are still limited. For the supported, specialized data sources, SAP HANA provides native code to support their operations, but SAP HANA SDA can’t guarantee that other data sources work well. What stops SAP HANA SDA from supporting more ODBC data sources is that some operations and configuration of these ODBC data sources can’t be handled through the standard ODBC interface. For example, preparing a transaction for Sybase ASE requires some additional code which is not included in the standard ODBC protocol. As SAP HANA provides such code, the operation for Sybase ASE is supported.

    In order to reduce the impact of this issue, SAP HANA SDA applies a Generic Adapter Framework to implement the communication with those unsupported ODBC data sources instead of calling specialized native code for the data source in SAP HANA. With the help of the Generic Adapter Framework, you can customize the features and behavior of a data source through a configuration file. For example, you can specify the supported operations, function mappings and data type mappings of the data source in the configuration file. For convenience, we call this configuration file the “property configuration file” in the rest of this blog.

    When creating a data source, SAP HANA SDA will use the Generic Adapter Framework to communicate with the remote data source if the ODBC adapter is chosen. SAP HANA SDA searches for the property configuration file in the folder specified by the environment variable DIR_EXECUTABLE, and the file name is specified by the CONFIGURATION FILE option. As of SPS07, SAP HANA SDA provides templates of the property configuration file for MS SQL Server and Oracle. They are called property_mss.ini and property_orcl.ini, and they are both located in the folder $DIR_EXECUTABLE/config.

    After the data source is created, SAP HANA SDA parses the relevant property configuration file; all the features, function mappings, data type mappings and other properties are linked to the data source and influence the communication between SAP HANA and the data source.

      A part of the content of property_orcl.ini is shown below; it illustrates the format and function of a property configuration file:

    3.png

    Typical process of creating data source

          Creating a remote data source in SAP HANA usually involves steps below

    1. Check whether SAP HANA provides a specialized adapter for the data source, such as ASEODBC, IQODBC, TDODBC;
    2. If a specialized adapter is available, just use it to create the data source;
    3. If a specialized adapter is not available, check whether there is a specialized property configuration template file, such as the templates for Oracle and MS SQL Server;
    4. If a specialized property configuration template exists, you can change the property configuration file according to your requirements and then create the data source using the modified file. For example, as long as the correctness of the modification is ensured, you can disable unnecessary functions or modify the mapping of data types or functions based on your requirements.
    5. If no specialized property configuration template exists, you have to create a brand-new property configuration file from scratch. To create such a file, you must be familiar with the properties of the data source and the driver it uses;
    6. Create the data source in SAP HANA Studio using the specialized adapter or the common adapter (i.e. the ODBC adapter). When using the common adapter, you need to specify the property configuration file for the data source.

    Note: when modifying or creating the property configuration file, only the properties that differ from their default values need to be set. A mistake in the property configuration file may result in incorrect behavior or results from the data source.

    Creating virtual table

                    After the data source is created in SAP HANA Studio, a “virtual table” mapping to data in the remote data source can be created. Similar to creating a data source, there are two ways to create a virtual table:

    (1) Create virtual table with GUI:

    4.png

    (2) Create virtual table with SQL Statement below:

       CREATE VIRTUAL TABLE <table_name> <remote_location_clause>

    Example:

      CREATE VIRTUAL TABLE sda.vt AT "ORCL_11G_WIN"."NULL"."OUTLN"."CANDIDATES";
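    Once created, the virtual table can be queried like an ordinary table, for example:

    -- Simple usage of the virtual table created above
    SELECT COUNT(*) FROM sda.vt;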

    Conclusion

        In this blog, we have covered the basics of SAP HANA SDA. In the subsequent blogs of this series, we will talk further about how to deploy an SDA data source on the SAP HANA server, how to access Hadoop with the help of SAP HANA SDA, and so on. Stay tuned.

    Reference

    1. What’s New SAP HANA Platform Release Notes for SPS07:

       http://help.sap.com/hana/Whats_New_SAP_HANA_Platform_Release_Notes_en.pdf

    2. Section 6.1.1 of SAP HANA Administrator Guide

      http://help.sap.com/hana/SAP_HANA_Administration_Guide_en.pdf

    SAP HANA Smart Data Access(2): How to install and configure ODBC driver for SDA data source


    We have a Chinese version(SAP HANA Smart Data Access(二)——SDA数据源驱动的安装与配置) of this blog.

    Introduction

          In the blog “SAP HANA Smart Data Access (1): A brief introduction to SDA”, we introduced the architecture of SDA and talked about how to add a remote data source for SDA in SAP HANA Studio. Before adding a remote data source for SDA, it is necessary to install and configure the ODBC driver manager and the ODBC driver for the SDA data source on the SAP HANA server side. For different SDA data sources, the process of installing and configuring the ODBC driver is similar. In this article, we take the Oracle data source as an example to show how to install and configure the ODBC driver.

    Installation of unixODBC driver manager

          Since SAP HANA SDA communicates with remote data sources using the ODBC protocol, an ODBC driver manager must be installed on the SAP HANA server side. Usually, unixODBC is chosen as the driver manager for SAP HANA SDA. The unixODBC software package can be downloaded from http://www.unixodbc.org/ . Please note that the unixODBC version needs to be 2.3.0 for the SQL Server data source, while 2.3.1 or a newer version is required for the other data sources. The process of installing unixODBC is as follows:

    1. Download the corresponding version of the unixODBC package; the package name is unixODBC-x.x.x.tar.gz, where x stands for the version number.
    2. Log in to the SAP HANA server as the root user and decompress the unixODBC package to a chosen folder.
    3. Enter the folder from step 2, then execute the commands below in order:

    ./configure

    make

    make install

       4. unixODBC should now be installed; you can execute “isql --version” to check whether it was installed successfully.

    Installation of ODBC driver for data source

          So far, the data sources supported by SAP HANA SDA include: SAP HANA, SAP Sybase ASE, Teradata database, Oracle, MS SQL Server and Hadoop. The ODBC drivers for the database products, such as Sybase ASE and Oracle, can be downloaded from the official website of the respective database. For example, you can download the ODBC driver for Oracle from http://www.oracle.com/technetwork/database/features/instant-client/index-097480.html . As for the Hadoop data source, SAP HANA SDA can communicate with it through Hive; more details about the connection between SAP HANA SDA and Hadoop follow in subsequent blogs of this series. The SAP official recommendation for the Hive driver is the HiveODBC driver provided by Simba Technologies. Simba HiveODBC can be obtained from the Simba website: http://www.simba.com/connectors/apache-hadoop-hive-odbc.

          After downloading the ODBC driver, it can be installed according to the relevant installation guide. Taking Oracle as an example, two zip packages should be downloaded: instantclient-basic-linux.x64-xx.x.x.x.x.zip and instantclient-odbc-linux.x64-xx.x.x.x.x.zip, where x stands for the version number. Then decompress the two packages into the same folder with the unzip command. The default unzipped directory is instantclient_xx_x. If everything goes well, you can find the ODBC driver file for Oracle in the unzipped directory; the file name is libsqora.so.xx.x. With that, the ODBC driver for the Oracle database is installed.

    Configuration of ODBC data source

          The ODBC configuration file needs to be created after installing the ODBC driver, and some environment variables should be set accordingly. Let’s again take Oracle as an example:

    1. Log in to the SAP HANA server as sidadm, where sid is the ID of the SAP HANA instance.
    2. Enter the home directory of sidadm and create the ODBC configuration file named “.odbc.ini”.
    3. Edit .odbc.ini with vim; the content should look like this:

     

    [ORCL_DSN]

    Driver=/path/to/driver/libsqora.so.xx.x

    ServerName=ORCL

     

    ORCL_DSN is the name of the ODBC data source, which is used by the ODBC manager to find the connection information. Driver is a keyword used to specify the path of the ODBC driver of the data source. For Oracle, ServerName is the name of the Oracle database, which is defined in the file called “tnsnames.ora” located in the home directory of sidadm. For other kinds of databases, “ServerName” is replaced by other keywords, such as “ServerNode” for SAP HANA, or “Server” and “Database” for MS SQL Server.
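    For comparison, a DSN entry for MS SQL Server could look roughly like the sketch below; the driver path and the placeholder values are assumptions, not taken from an actual installation:

    [MSS_DSN]
    Driver=/path/to/mssql/odbc/driver.so
    Server=<host_ip>
    Database=<database_name>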

        4. For the Oracle data source, create a file called “tnsnames.ora” in the home directory of sidadm, then edit it with vim; the content should look like this:

     

    ORCL=

    (DESCRIPTION =

                            (ADDRESS = (PROTOCOL = TCP)(HOST = <host_ip>)(PORT = 1521))

                            (CONNECT_DATA =

                                            (SERVER = DEDICATED)

                                            (SERVICE_NAME =orcl)

    )

    )

     

        5. Set some environment variables in the shell script file “hdbenv.sh” of SAP HANA, this shell script file is located in the folder specified by environment variable “DIR_INSTANCE”. Add the commands below to this shell script file:

     

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:your_oracle_driver_dir/instantclient_12_1

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

    export ODBCINI=$HOME/.odbc.ini

    export TNS_ADMIN=~/   (For Oracle only)

     

    Here, please pay particular attention to the “LD_LIBRARY_PATH” variable: all the libraries the ODBC driver depends on are searched, in order, in the folder list specified in “LD_LIBRARY_PATH”. If a folder containing a required library is not included in this list, the ODBC manager will report that it cannot find the library file.
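    A quick way to see which libraries a driver needs is the ldd command; the driver path below is just an example:

    # List the shared libraries the Oracle ODBC driver depends on; lines marked "not found"
    # point at folders that still have to be added to LD_LIBRARY_PATH
    ldd /path/to/instantclient_12_1/libsqora.so.12.1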

         6. Restart SAP HANA, log in to the SAP HANA server as sidadm, and execute the command “isql -v <DSN>” to test the connection to the data source. If the connection succeeds, the configuration is finished. If the connection fails, analyze and resolve the error according to the error message. Some tips for error handling are given in the next section.

    Trouble shooting

       1. Error message:

    [08004][unixODBC][Oracle][ODBC][Ora]ORA-12154: TNS:could not resolve the connect identifier specified

    Analysis:

    This error is raised by the ODBC manager and is easily misunderstood as an Oracle error. However, the reason for this error is that the environment variable TNS_ADMIN is not set or is set incorrectly. TNS_ADMIN tells the driver where the file tnsnames.ora is located. So, if TNS_ADMIN is not set correctly, the ServerName specified in .odbc.ini can’t be resolved, and the error is raised.

    Solution:

    Set the environment variable TNS_ADMIN to the home directory of sidadm in the hdbenv.sh.

       2. Error message:

    [01000][unixODBC][Driver Manager]Can't open lib '/path/to/driver/libsqora.so.12.1' : file not found [ISQL]ERROR: Could not SQLConnect

    Analysis:

    The error message says the file libsqora.so.12.1 can’t be found, but this file does exist. Let’s use the ldd command to check the dependencies of this file; we get:

    1.png

    We can see that libsqora.so.12.1 depends on the file libclntsh.so.12.1, which can’t be found. Although libclntsh.so.12.1 is in the same folder as libsqora.so.12.1, that folder is not in the folder list specified by LD_LIBRARY_PATH, so it is not searched.

    Solution:

          Add the folder which contains libsqora.so.12.1 to LD_LIBRARY_PATH in the hdbenv.sh.

    Create remote data source

           After all the configuration is finished, create a remote data source following the procedure introduced in “SAP HANA Smart Data Access (1): A brief introduction to SDA”. The tables in the remote data source can be viewed through SAP HANA Studio after the remote data source is created, as shown below:

    2.png

    Conclusion

          In this blog, we took Oracle as an example to illustrate how to install and configure the ODBC manager and the ODBC driver of a remote data source, and briefly discussed some errors which may occur during installation and configuration. The installation and configuration of the other data sources supported by SAP HANA SDA is similar; the small differences will be introduced in subsequent blogs of this series.

    Reference

    1. SAP HANA Smart Data Access(1)——A brief introduction to SDA
    2. Section 6.1.1 of SAP HANA Administrator Guide

            http://help.sap.com/hana/SAP_HANA_Administration_Guide_en.pdf

    SAP HANA Smart Data Access(3)—How to access Hadoop data through Hive with SDA


    We have a Chinese version(SAP HANA Smart Data Access(三)——如何利用SDA通过Hive访问Hadoop数据) of this blog.

    Introduction

          In the previous blogs of this series, we talked about how to install and configure the data source of SDA on the SAP HANA server side. As most data sources supported by SAP HANA SDA are databases, the procedure of installation and configuration is similar. But for the Hadoop data source, things are a bit different. As a distributed data processing platform, Hadoop usually stores data in the HDFS file system, or in the NoSQL database HBase, which is itself usually based on HDFS. However, neither HDFS nor HBase supports the ODBC protocol. So we need another member of the Hadoop family to solve this problem: Hive. Hive implements an SQL interface for HDFS and HBase, and a HiveODBC driver is also provided. In this blog, we’ll talk about how SDA accesses Hadoop data through Hive.

    Deploy Hadoop and Hive

          The official version of Hadoop supported by SAP HANA SDA is “Intel Distribution for Apache Hadoop version 2.3” (including Apache Hadoop version 1.0.3 and Apache Hive 0.9.0). Although there’s only one version in the official support list, the experiment in this blog shows that SDA can also access data stored in an ordinary Apache distribution of Hadoop. For this blog’s experiment, a Hadoop cluster of 3 nodes was built, with Apache Hadoop 1.1.1 and Apache Hive 0.12.0.

          As guides for deploying Hadoop and Hive can easily be found on the internet, we don’t discuss deployment here. After deploying Hadoop and Hive, some data for the experiment needs to be prepared. Here, we use a user information table with the following structure:

    Column Name     Data Type
    USERID          VARCHAR(20)
    GENDER          VARCHAR(6)
    AGE             INTEGER
    PROFESSION      VARCHAR(20)
    SALARY          INTEGER

          Data can be imported from a csv file into the Hive table. Firstly, create the table using the Hive shell:

    create table users(USERID string, GENDER string, AGE int, PROFESSION string, SALARY int)

    row format delimited

    fields terminated by '\t';

          Then, import data from csv file to the users table:

    load data local inpath '/input/file/path'

    overwrite into table users;

    Here, the data is imported from the local file system; Hive can also import data from HDFS. In this experiment, the number of records in the users table is 1,000,000. After importing, count the record number:

    1.png

    As shown in the picture above, Hive calls MapReduce to query the data, and it takes 14.6 seconds to count the record number of the users table. Afterwards, select the top 10 records of the table:

    2.png

    As we see, it takes 0.1 second.

    Installing and configuring HiveODBC Driver

          As with installing drivers for other data sources, installing the HiveODBC driver requires unixODBC to be installed on the SAP HANA server side. HiveODBC requires unixODBC 2.3.1 or a newer version. For more details about installing unixODBC, please see reference [2].

          With unixODBC installed, we can begin installing the HiveODBC driver. As introduced in reference [2], we use the HiveODBC driver provided by Simba Technologies. The installation procedure is as follows:

    1. Download the Simba HiveODBC driver package and decompress it to a certain directory. Then enter the directory /<DRIVER_INSTALL_DIR>/simba/hiveodbc/lib/64 (use 32 instead of 64 for a 32-bit system) to check the driver file libsimbahiveodbc64.so.
    2. Log in to the SAP HANA server as sidadm.
    3. Execute “HDB stop” to stop SAP HANA.
    4. Copy the file “/<DRIVER_INSTALL_DIR>/simba/hiveodbc/Setup/simba.hiveodbc.ini” to the home directory of sidadm.
    5. Edit ~/.simba.hiveodbc.ini with vim.
    6. If there’s a row “DriverManagerEncoding=UTF-32”, change it to UTF-16.
    7. Check that ErrorMessagePath = /<DRIVER_INSTALL_DIR>/simba/hiveodbc/ErrorMessages; correct it if it doesn’t point to the right path.
    8. Comment out the row ODBCInstLib=libiodbcint.so and add a new row: ODBCInstLib=libodbcinst.so.
    9. Edit the .odbc.ini file in the home directory of sidadm and add a new DSN for Hive; the default port for Hive is 10000. Here’s an example:

    [hive1]

    Driver=/<DRIVER_INSTALL_DIR>/simba/hiveodbc/lib/64/libsimbahiveodbc64.so

    Host=<IP>

    Port=10000

       10. Edit the file $HOME/.customer.sh to set some environment variable:

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/<DRIVER_INSTALL_DIR>/simba/hiveodbc/lib/64/

    export ODBCINI=$HOME/.odbc.ini

       11. Use isql to check whether SAP HANA Server can connect to remote data source successfully:

    isql -v hive1

       12. If connect successfully, execute “HDB start” to start SAP HANA.

    Create hive data source

          When installing and configuring HiveODBC is finished, create the Hive data source in SAP HANA Studio following the steps introduced in reference [1]. Here, you need to choose HIVEODBC as the adapter.

          After the Hive data source is created, you can view the tables in Hive, as shown in the picture below:

    3.png

    Query Hive virtual table

          Add a new virtual table which maps to table users in Hive following the steps introduced in Reference [1]. Then count the record number of the virtual table:

    4.png

    As shown above, it takes 14.1 seconds to count the virtual table in SAP HANA Studio, which is close to the time it took on the Hive side. The result tells us that SAP HANA SDA doesn’t noticeably influence the performance of operations in the remote data source when little data transmission is involved.
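    The count above boils down to an ordinary SQL statement against the virtual table. A sketch, assuming the virtual table was created as SDA.VT_USERS (the actual name depends on how you created it):

    -- Count the Hive-backed virtual table from the SAP HANA side (names assumed)
    SELECT COUNT(*) FROM "SDA"."VT_USERS";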

    Conclusion

          In this blog, we illustrated how SAP HANA SDA accesses a Hive table stored in Hadoop using a simple example. Hive is a tool which provides an SQL interface for Hadoop. From the experiment result, querying the virtual table in SAP HANA Studio and querying the Hive table on the Hive side are very close in performance when little data transmission is involved.

    Reference

    1. SAP HANA Smart Data Access (1): A brief introduction to SDA
    2. SAP HANA Smart Data Access (2): How to install and configure ODBC driver for SDA data source
    3. Section 6.1.1 of SAP HANA Administrator Guide: http://help.sap.com/hana/SAP_HANA_Administration_Guide_en.pdf

    SAP River(1)—SAP River overview


    We have a Chinese version (SAP River ()SAP River概述) of this blog.

    Introduction

    The most common architecture of SAP HANA application is like below:

    1.png

    Figure 1: Traditional SAP HANA application architecture

       With the traditional architecture shown in Figure 1, the application developer is responsible for both creating the data model at the database level and implementing the control logic at the XS level. SQL and SQLScript are required to create the data model in the SAP HANA database, while XSJS is needed to implement the control logic at the XS level. Therefore, the developer of an SAP HANA application needs to master at least two technologies to finish an application; sometimes, developing an SAP HANA application needs the cooperation of two or more developers. SAP River is an option to avoid this problem.

       SAP River is a brand new method to develop SAP HANA applications. SAP River consists of a programming language, a programming model and a suite of development tools. With the help of SAP River, the developer can concentrate on the design of the business intent, ignoring how it is implemented and optimized in the SAP HANA database. SAP River only exists at design time: when SAP River objects are activated, all SAP River code is compiled into SQLScript or XSJS code and then transferred to the XS engine and index server for execution.

    2_en.png

    Figure 2: Function model of SAP River

    The function model is shown in Figure 2. SAP River integrates all the segments involved in the development of an SAP HANA application, including data modeling, control logic and access control, which makes a developer capable of accomplishing an SAP HANA application with only one single technology. Firstly, the developer can design the data model for the SAP HANA application with the SAP River language. During compilation, SAP River creates corresponding database objects for the data model designed in the SAP River language; for example, an entity in a SAP River program is mapped to a table in the SAP HANA database. Secondly, using the SAP River language, the developer can define methods for data objects, which implement the business logic. Last but not least, SAP River provides the developer a way to design the access control for data and methods.

    SAP River Language

          As a new generation of SAP HANA application development, SAP River provides a strongly typed and declarative programming language. With this programming language, even a developer without a computer science background can easily develop an SAP HANA application. In addition, SAP River also supports embedding SQLScript and XSJS code in a SAP River program, which is useful for some complicated logic.

          SAP River Language mainly includes:

    1. Application: The application is the largest object in the SAP River language; all other objects must be included in an application. Objects in an application can be exposed to external applications in several ways, such as OData.
    2. Entity: An entity is used to store data in the database. Usually, one entity is mapped to a table in the SAP HANA database. An entity consists of one or more elements, and usually one element is mapped to a column of a table. Each element has its own type; the type can be a fundamental type such as integer or string, or a custom type or entity.
    3. Types: A type defines the size and validity of data. SAP River supports different kinds of types, including fundamental types, structured types and stream types. Each entity automatically defines a corresponding implicit type, which has the same name as the entity.
    4. Action: An action is similar to a function or method in other programming languages. Usually, the business logic of a SAP River application is defined in actions.
    5. View: With the help of a view, you can create a data stream using a select statement; this data stream is a subset of a target data set. The data of the stream is dynamically extracted from the target data set when the stream is used.
    6. Role: Roles can be created in SAP River code and privileges assigned to them; in this way, access control is implemented.

    3.png

    Figure 3. SAP River example program: Hello World

          As a programming language, SAP River also provides some libraries, which contain many common functions. These libraries can be divided into categories according to their functionality:

    1. Math Functions: mathematic calculation functions, such as absolute, ceil, floor, etc.
    2. Integer Functions: process the data of integer type, such as toDecimalFloat, toString, etc.
    3. String Functions: deal with strings, such as length, substring, etc.
    4. Date Functions: date functions, such as day, month, etc.
    5. Session Functions: session functions, such as getUserName.
    6. Logging Functions: logging functions, such as log.
    7. Utility Functions: some utility functions, such as parseDecimalFloat, parseInt, etc.
    8. HTTP Calls: Http library is used to send Http request in SAP River code, such as OData or REST services.

    OData Calls

          SAP River can expose data and business logic to the client via the OData protocol, either at the application level or at the namespace level. If exposed at the namespace level, then every entity, view and action in that namespace is exposed to the client. If exposed at the application level, then only the objects that are tagged for exposure are exposed. Here, let’s take the application level as an example to illustrate how to expose data via OData.

    4.png

    Figure 4. Expose SAP River application via OData

       ① When exposing objects at the application level, you must use the keyword export to specify which objects are to be exposed to the client, such as Employee in TestSyntax here.

       ② There is more than one way to expose a SAP River object; OData is one of them. So it is necessary to add the notation “@OData” to declare the way of exposure. SAP River will create a corresponding OData service for the application or namespace.

       ③ By default, a SAP River object is private. To let specified users access a certain object, you need to use the keyword “accessible by” to tell which role or privileges are required to access the object. If all users are allowed to access it, you can just use “accessible by sap.hana.All”.

    How to learn SAP River

          SAP HANA began to support SAP River with SPS07, and there are already many materials for studying it. Here are some of them:

    1. SAP River Datasheet
    2. SAP River Help Docs
    3. Videos and PPTs introducing SAP River
    4. Series of videos introducing SAP River in SAP Academy

    Conclusion

            In this blog, we talked about the functionality, advantages and structure of SAP River, including the advantages of SAP River over the traditional development framework, the main features of SAP River, and how to expose data to the client via OData.

    Reference

    1. SAP River Language Reference
    2. SAP River Developer Guide

    Update HANA table and column descriptions from ABAP Dictionary.


    If you are modelling HANA views based on a Suite (like ERP, CRM, SRM) on HANA system, you would probably like to have the table and table field descriptions from the ABAP Dictionary available to you in the HANA Studio Modeler/Editor. Once this ABAP report has been run, the table and column descriptions are shown in the HANA Studio editor and automatically copied to the semantic layer:

     

    at_view.PNG


    Run the following ABAP report in the background to update the descriptions in the HANA database based on the ABAP Dictionary. It only works on a NW 7.4 (ERP EhP7, BW 7.4 or CRM 7 EhP3) system:

     

    *&---------------------------------------------------------------------*
    *& Report  Z_HANA_SET_TABLE_COMMENTS
    *&
    *&---------------------------------------------------------------------*
    REPORT z_hana_set_table_comments.
    DATA: lt_tables       TYPE TABLE OF dd02t,
          lv_table        TYPE dd02t,
          lt_dfies        TYPE TABLE OF dfies,
          lv_dfies        TYPE dfies,
          lv_fieldcomment TYPE string.

    " Read all transparent tables and their descriptions from the ABAP Dictionary
    SELECT t~tabname t~ddtext
      INTO CORRESPONDING FIELDS OF TABLE lt_tables
      FROM dd02t AS t
      INNER JOIN dd02l AS l ON l~tabname = t~tabname
      WHERE t~as4local = 'A'
        AND t~ddlanguage = sy-langu
        AND l~tabclass = 'TRANSP'.

    LOOP AT lt_tables INTO lv_table.
      TRY.
          " Set the table comment in the HANA catalog
          NEW cl_sql_statement( )->execute_query(
            ` COMMENT ON TABLE "` && lv_table-tabname && `" is '` && lv_table-ddtext && `' ` ).

          " Read the field descriptions of the current table
          CALL FUNCTION 'DDIF_FIELDINFO_GET'
            EXPORTING
              tabname        = lv_table-tabname
              langu          = sy-langu
            TABLES
              dfies_tab      = lt_dfies
            EXCEPTIONS
              not_found      = 1
              internal_error = 2
              OTHERS         = 3.

          IF sy-subrc = 0.
            LOOP AT lt_dfies INTO lv_dfies.
              TRY.
                  lv_fieldcomment = cl_abap_dyn_prg=>escape_quotes( CONV string( lv_dfies-fieldtext ) ).
                  " Set the column comment in the HANA catalog
                  NEW cl_sql_statement( )->execute_query(
                    ` COMMENT ON COLUMN "` && lv_table-tabname && `"."` && lv_dfies-fieldname && `" IS '` && lv_fieldcomment && `' ` ).
                CATCH cx_sql_exception INTO DATA(oreffield).
                  WRITE: / 'Error: ', oreffield->get_text( ).
                  COMMIT WORK.
              ENDTRY.
            ENDLOOP.
          ENDIF.
        CATCH cx_sql_exception INTO DATA(oref).
      ENDTRY.
      COMMIT WORK.
    ENDLOOP.
    WRITE: / 'Table and field comments updated.'.
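
    For illustration, these are the kind of SQL statements the report issues against the HANA catalog. The table, column and description texts below are just well-known ABAP Dictionary examples (MARA / MATNR), not output captured from the report:

    COMMENT ON TABLE "MARA" IS 'General Material Data';
    COMMENT ON COLUMN "MARA"."MATNR" IS 'Material Number';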

    Reading and Writing to HADOOP HBASE with HANA XS


    This is a companion to my earlier blog, where I demonstrated HADOOP HBASE records being read by HANA and presented using SAPUI5.

    Reading and Writing to HADOOP HBASE with HANA XS

     


    In this blog I expand on this topic further by enabling HANA to read and write to HBASE.

    I’ve created a simple application where HBASE is used to capture the change log of a HANA table.


    But why, you might say?


    HADOOP is designed for BIG Data (e.g. at the Petabyte scale); at its core it uses disks for storage. It’s still very much a ‘do it yourself’ technology with few Enterprise-ready applications built on top of it yet.

    HANA is arguably designed for LARGE Data (e.g. at the Terabyte scale) and makes use of the latest In-Memory technology. By contrast, it has a huge catalog of SAP software now running on it.


    The two technologies are complementary in an Enterprise Data Architecture.


    Many modern websites (e.g. Facebook, Twitter, LinkedIn) use a complex combination of HADOOP, traditional RDBMS, custom development and a wide collection of website scripting tools. The aspiration for many of these websites is millions or billions of users.


    For Enterprises that don’t consider IT their core business, venturing down this path may not be the most straightforward option. HANA is more than just a database; it also provides the platform for building and deploying custom Enterprise Applications for Desktop and Mobile. In this case a far simpler BIG Data Architecture might be to use just HANA & HADOOP. Enterprise applications may only be targeted at 100’s to 10K’s of users.


    In the following example I’ve built a very simple application on HANA, which enables a table to be maintained on a HTML5 webpage. For audit and control purposes I wanted to keep a log of all the changes to column values by the users. I could have implemented this purely in HANA, but to demonstrate the integration potential of HANA and HADOOP HBASE, I have opted to also write the changes to HBASE. On a small scale the benefit of this is negligible, but on a larger scale there may be considerable cost savings in storing low-value data on an open-source disk-based solution such as HADOOP.


    The following diagram represents the technical data flow of my example:

     



    Now for an example.

    The following shows the DOCUMENTS table in HANA, and HBASE Table Log prior to a change:


    In HBASE the change log (for Doc_Name DOC1 & Row 1) is stored as:



    Now I make a change, e.g. changing the Free Text from ‘APPLE’ to ‘APPLE CIDER’.

    Update Successful!  (Changes written to HANA and HBASE)


    From SAPUI5 the HBASE Change Log appears as:

    Above you can now see the history of the 'Free Text' field.



    In HBASE the Change log Table appears as:

    NOTE: in the HADOOP User Interface (HUE) only the latest change is shown; however, behind the scenes I’ve defined the HBASE table to store up to 100 changes (versions) of an individual column.


    I can also check these in Hbase Stargate directly, though they are BASE64 encoded:


    OR by checking the HbaseTableLog xsjs I created, using the GET Method (which decodes):

    NOTE: I used POSTMAN here to help format the returned JSON to make it a bit easier to test. Above you’ll see the history for the Free_Text field.



    The key feature to get this prototype working is the HANA SPS07 functionality which enables XS JavaScript libraries (xsjslib) to be called on ODATA CRUD events.


    E.g. DOCUMENTS.xsodata

    service {

      "HADOOP"."HanaHbase::DOCUMENTS"  as "DOCUMENTS"

         update using "HanaHbase:DOCUMENTS_table_exits.xsjslib::update_instead"

         ;

    }


    NOTE: For the comparison of the Before and After records, to determine which field changed, I’ve made use of the example code provided by Thomas Jung for converting a SQL record set to a JSON object; see http://scn.sap.com/thread/3447784


    During the ODATA PUT (UPDATE) I’ve modified both HANA & HBASE with the most recent change.


    The HANA table is only set up to store the current value.

    The equivalent HBASE table is defined to keep the most recent 100 changes per COLUMN.


    The HBASE Log table was defined as:

    create 'HanaTableLog', {NAME => 'table', VERSIONS => 100}


    The complete code for this prototype is available on github.

    https://github.com/AronMacDonald/HanaHbase

    HANA Artifacts Part3.1 (SQLScript Debugger SP7)


    Please refer http://scn.sap.com/docs/DOC-48851 for debugging procedure in HANA SP6.

     

    Before SPS 07, anyone with the SYS.DEBUG role was authorized to debug everything; this authorization is no longer required in SPS 07.

    This now restricts the debugging of procedures owned by other users.

     

    With the following grants:

    Grant debug on <proc> to <user>

    Grant debug on schema <schema> to <user>

    By granting these privileges, a user can be allowed to debug a particular procedure or all procedures in a schema, for example:
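
    (The user, schema and procedure names below are only placeholders for illustration.)

    GRANT DEBUG ON "_SYS_BIC"."mypackage/MY_PROCEDURE" TO DEV_USER;
    GRANT DEBUG ON SCHEMA MYSCHEMA TO DEV_USER;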

     

    Let's start debugging now...

     

    First create a wrapper procedure and set the breakpoint (open it in the _SYS_BIC schema):

    1.png
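
    A minimal sketch of such a wrapper procedure, assuming a generated catalog procedure "_SYS_BIC"."mypackage/GET_MARA" with a tabular output parameter OUT_MARA (all names here are assumptions; the screenshot above shows the real one):

    CREATE PROCEDURE "MYSCHEMA"."Z_WRAPPER_GET_MARA" LANGUAGE SQLSCRIPT AS
    BEGIN
      -- call the procedure to be debugged and bind its tabular output to a table variable
      CALL "_SYS_BIC"."mypackage/GET_MARA"(OUT_MARA);
      -- return the result so it is visible once the breakpoint has been passed
      SELECT * FROM :OUT_MARA;
    END;

    Set the breakpoint inside the wrapper and run it once the debug configuration below is in place.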

    Open the Debug perspective and configure a new debug configuration.

     

    2.png

    Now select the Catalog schema and pick the procedure to debug from that schema by clicking the 'Browse' button on the right.

     

    3.png

     

    5.png

    4.png

     

    OUT_MARA has one row selected.

    Introduction to HANA XS application development (Part 3): BlogProject data modeling with information views


    In this third part of the BlogProject, we will see how to create the data model so that we can manage and analyze our data in a more robust manner and add our application logic on top of the data.

    If you missed the previous posts about the creation of an XS project and the persistence model (especially the data schema in the 2nd part) make sure to have a look:

    http://scn.sap.com/community/developer-center/hana/blog/2014/05/27/introduction-to-hana-xs-application-development-part-1-blogproject-creation

    http://scn.sap.com/community/developer-center/hana/blog/2014/05/27/introduction-to-hana-xs-application-development-part-2-blogproject-persistence-model

     

    The modeling components that HANA provides are SQL views and information views. Of course HANA provides the capabilities of standard SQL views, but what seems to be the most interesting part of HANA modeling is information views. When we talk about information views, we refer to three types of views: attribute views, analytic views and calculation views. These views are non-materialized, leading to increased agility and lower storage requirements.

     

       info views.png

                                Image 1: Information views architecture


    Information Views


    Attribute views

    Attribute views represent dimensions, BW characteristics or (mostly) master data and are used to join dimensions or other attribute views, creating logical entities (like Product, Employee, Business Partner) based on multiple objects. However, if the needs of the use case require it, attribute views can be created on transactional data too. Attribute views are re-usable and often shared in analytic and calculation views. Lastly, attribute views contain only attribute columns (not measures), plus user-defined calculated columns that apply certain functions to attribute columns.


    Analytic views

    Analytic views can be thought of as OLAP views, representing star schemas, where one or more fact tables are surrounded by dimensions. Analytic views are used for multidimensional data analysis and cube queries. They are typically defined on at least one fact table that contains transactional data, along with a number of tables or attribute views that play the role of dimensions to the fact table(s). In addition, in this type of view you can define calculated columns based on other columns, or restricted columns, which enable you to filter the data on certain columns. Apart from that, there is the capability of adding input parameters and variables to affect the result of the view. Furthermore, in addition to attribute columns, an analytic view can contain measures, which can be aggregated.

    Calculation views

    Calculation views are composite views built on top of analytic and attribute views, as indicated by Image 1, and are used to perform more complex calculations that are not possible with other views. They are usually used when working with multiple fact tables, or when joins are not sufficient. They can use any combination of any number of tables, attribute and analytic views, using union, join, aggregation and projection operators. A calculation view can be defined in a graphical manner, which is usually the case, producing graphical calculation views, or in a scripted manner with the use of SQLScript, producing scripted calculation views. Lastly, a calculation view allows the definition of attributes and measures, as well as the creation of calculated and restricted columns, input parameters and variables.

    When to use each view

    Searching the web and communities, I found the picture below, which captures the whole concept of data analysis in HANA and which component to use depending on the requirements of your application. In this part we will deal only with the information views, leaving the scripting methods for later analysis.

     

                                  HANA analyze.png

                                                                          Image 2: Information views usage

    Information views and BlogProject

    In our use case we will only use attribute views for joining our master data tables and calculation views to provide multiple joins and aggregations. We will not use analytic views because in this particular application the data do not require a star schema representation. In fact what we want to do here is create collections of attributes, measures, calculations and aggregations that will be exposed to the user interface and provide the data that a user will need. So, if we think about it, what a user wants (in fact what we want our user to want) is to read and write posts, read and write comments, see his/her profile and see country statistics.

      all views.png

    Create attribute views

    • AT_COMMENT


    First we will create the comment entity that we want to expose to a user. First of all the user must see the text of the comment, the date and the user that wrote it. If we consider creating a profile screen for the user, then his/her comments will also be shown there, but it would be great for the user to be able to see the corresponding post of a certain comment, too.

    So, in the attribute view below we join the tables POST and USER with the COMMENT on the columns that play the role of foreign keys. We keep both the ID and name (Username, PostTitle) columns because it is easier for us to manage ids, but easier for the user to manage names.

    To create this view you have to drag your mouse from the foreign key to the primary key, for example from COMMENT.UserID to USER.ID, and then choose the type of join and cardinality from the properties pane at the bottom right. In this case we have chosen a “referential” join with cardinality 1 (USER, POST) to many (COMMENT).

             AT_COMMENT.png
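
    To make the result of this view more concrete, here is a rough SQL equivalent of AT_COMMENT. The schema and column names are assumptions based on the data model described in part 2, and a referential join behaves like an inner join when attributes from both sides are requested:

    SELECT c."ID", c."Text", c."Date",
           c."UserID", u."Username",
           c."PostID", p."Title" AS "PostTitle"
    FROM "BLOGPROJECT"."COMMENT" c
    INNER JOIN "BLOGPROJECT"."USER" u ON c."UserID" = u."ID"
    INNER JOIN "BLOGPROJECT"."POST" p ON c."PostID" = p."ID";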

     

     

     

    • AT_POST


    For the central entity, the POST, and following the previous logic, we want to show the user not the UserID but the Username. So, same as before, we drag from the foreign to the primary key and choose from the properties a “referential” join with 1 (USER) to many (POST) cardinality. Again we keep the UserID for our own purposes. Then we add the columns we want to the output by clicking on the sphere next to each column.

      AT_POST.png

    In this example we extend our attribute view logic by adding a Calculated Column. Let’s say we want a column to give us a clue about a post’s popularity, apart from Views. To do that we create a column called LikesDislikesRatio by right-clicking on the Calculated Columns folder, choosing “New” and following the wizard. The formula is simple: the fraction Likes/Dislikes. Let’s make the convention that when Dislikes is zero (0) the result is the number of Likes, and when Likes is zero (0) the result is zero (0), given that we only care about the most popular posts.
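
    The same convention can be expressed as plain SQL, which may make the wizard formula easier to follow. The schema and column names are assumptions, and in the modeler the expression is entered in the calculated column editor rather than as SQL:

    SELECT "ID", "Title", "Likes", "Dislikes",
           CASE
             WHEN "Dislikes" = 0 THEN "Likes"
             WHEN "Likes" = 0 THEN 0
             ELSE "Likes" / "Dislikes"  -- cast to DECIMAL first if both columns are integers
           END AS "LikesDislikesRatio"
    FROM "BLOGPROJECT"."POST";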




    Create calculation views

    We could keep on creating attribute views, but at some point we will need some aggregations. In our case we want to know how many comments are written for a single post, how many incoming links there are for that post, the number of posts and comments written by a certain user, and how many posts are made at the country level.


    • CALC_POST


    Let’s start with the POST. We would like to count the number of comments and inbound links for each post. To do so we follow the procedure shown below.


    First, we have to join each COMMENT with the corresponding POST. For the post entity we use the attribute view AT_POST we created previously, which holds the additional information we wanted. To join, after we have created a join (JOIN_1) and added the two objects, we drag the mouse from COMMENT.PostID to AT_POST.ID and choose cardinality 1 (AT_POST) to many (COMMENT). In contrast to the joins we made in the attribute views, this time we choose a Left Outer join, because we want to keep all the rows of AT_POST, even those that don’t have comments or links.

      CALC_POST(Join_1).png

    As a result we have for each post, the ids of all its comments.


    Then we create another join (JOIN_2) to join the previous join (JOIN_1) with the POST2POST table. Again we drag the mouse, choose cardinality 1 to 1 and change the join type to Right Outer, because we want to keep all the rows in the result and not only the ones that match in both sources.

      CALC_POST(Join_2).png

    Doing this, we added to each post the ids of the posts that link to it (its inbound links).


    The next step is to actually create the aggregations. To do this we have to connect our last join (JOIN_2) to the Aggregation operator. In the Aggregation, we will create our two aggregated columns that will hold the number of comments and incoming links for each post. We simply right-click on the Calculated Columns folder, but this time we choose “New counter” and follow the wizard choosing which column we want to count. We do this twice, creating the two columns “CommentsNo” and “InLinks”.

      CALC_POST(Aggregation).png
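
    As a rough SQL equivalent of CALC_POST (the table and column names, such as POST2POST."FromPostID" and "ToPostID", are assumptions), the counters behave like distinct counts, and the outer joins keep posts without comments or inbound links:

    SELECT p."ID", p."Title",
           COUNT(DISTINCT c."ID")         AS "CommentsNo",
           COUNT(DISTINCT l."FromPostID") AS "InLinks"
    FROM "BLOGPROJECT"."POST" p
    LEFT OUTER JOIN "BLOGPROJECT"."COMMENT"   c ON c."PostID"   = p."ID"
    LEFT OUTER JOIN "BLOGPROJECT"."POST2POST" l ON l."ToPostID" = p."ID"
    GROUP BY p."ID", p."Title";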

     

     

    • CALC_COUNTRY


    For the Country entity we will follow the same procedure, first joining the country with the user, then the result of the join with the post. In the properties we can see the details of each join.

      CALC_COUNTRY(Join 1).png

    CALC_COUNTRY(Join 2).png

     

    The result is, for each country, all the user ids and, for each user id, the ids of the posts that the user has written.

    Then, in the Aggregation operator, we again create two new counters, counting the number of posts and the number of users for each country.

      CALC_COUNTRY(Aggregation).png

     

     

     

    • CALC_USER


    Lastly, we want to create a view that holds all the important information about a user. In addition, we want to know for each user how many posts and comments he/she has written.


    Again following a similar procedure, at the beginning we join the USER with the COUNTRY, to replace the countryID with the actual name of the country.

      CALC_USER(Join 1).png

    Then we join the result with the POST table to get the postIDs of each user’s posts.

      CALC_USER(Join 2).png

    After we have joined JOIN_1 with the POST, we join the resulting JOIN_2 with the COMMENT table to get the commentIDs of each user’s comments.

      CALC_USER(Join 3).png

    Lastly, we create the two columns (counters) that will hold the number of posts (PostsNo) and comments (CommentsNo) of each user. The properties tab shows the details.

      CALC_USER(Aggregation).png
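
    A rough SQL equivalent of CALC_USER's counters follows; the names are again assumptions, and the DISTINCT counts mirror the counter columns so a user's posts and comments are not double counted when both tables are joined:

    SELECT u."ID", u."Username", co."Name" AS "Country",
           COUNT(DISTINCT p."ID") AS "PostsNo",
           COUNT(DISTINCT c."ID") AS "CommentsNo"
    FROM "BLOGPROJECT"."USER" u
    LEFT OUTER JOIN "BLOGPROJECT"."COUNTRY" co ON u."CountryID" = co."ID"
    LEFT OUTER JOIN "BLOGPROJECT"."POST"    p  ON p."UserID"    = u."ID"
    LEFT OUTER JOIN "BLOGPROJECT"."COMMENT" c  ON c."UserID"    = u."ID"
    GROUP BY u."ID", u."Username", co."Name";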

     

     

    Having done all the previous procedures, we have completed our data model and we are ready to expose it to our UI application. The data will be exposed using the OData protocol, and more specifically via a number of OData services, in the next post.


    Hope you enjoyed!!
