Introduction
In June 2018 Microsoft released version 2.0 of the connector for SAP BW. You can read about the announcement here. As you can see there, one of the new features is:
"Ability to retrieve several million rows of data, and fine-tuning through the batch size parameter."
I decided to give it a try.
Keep in mind that SAP says "BEx queries are not made for mass data extraction", as stated in SAP Note 1968598 - Mass data in Bex queries Options and Workarounds.
Also note that you may need to fine-tune some memory parameters on your application server to avoid memory problems, which will at some point occur anyway, since BEx queries are not meant for mass data extraction.
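Which parameters matter depends on your system, but (as an assumption on my part, not something I changed for this test) the ABAP extended memory and heap limits, for example em/initial_size_MB, ztta/roll_extension and the abap/heap_area* parameters, are the usual candidates; check with your Basis team before touching them.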
The configuration
The source system
SAP BW/4HANA SP5, deployed from CAL.SAP.COM to Google Cloud Platform. The dataset consists of 2.4 million records, with one characteristic (dimension) that is a unique ID. This unique ID is generated by ABAP code in the transformation and is the key of the DSO. This happens in real-life scenarios: dimensions with millions of distinct values, such as billing document numbers, accounting document numbers, etc.
I loaded the data into two targets, both advanced DSOs: one where the unique key (the dimension with 14 million different values) was an InfoObject (which means SID generation), and another where the unique key was a HANA field (no SID generation).
On top of those advanced DSOs, I created queries, and Power BI connects to those queries.
The data I loaded is from here.
The Target
I was using Power BI Desktop 2.59.5135.781 64-bit (June 2018). Note: since this new connector is in beta, I did not test it using the Power BI gateway, which is a must (in most cases) for production use.
The first parameter tested - Batch Size
There is not much documentation from Microsoft about the execution mode, but transaction MDXTEST exposes the same parameters if you press "Execute...", and from there you can do some research.
Image 2 - Correlation between PowerBI parameters and MDXTEST parameters
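For reference, this is roughly how those options surface in the Power Query (M) code that Power BI Desktop generates. The server, system number and client below are placeholders (not my test system), and the exact contents of the options record can vary by connector version, so treat this as a sketch:

```
let
    // Placeholder connection details - replace with your own system
    Source = SapBusinessWarehouse.Cubes(
        "bw4hana.example.com",   // application server host (placeholder)
        "00",                    // system number (placeholder)
        "100",                   // client (placeholder)
        [
            // Selects the version 2.0 implementation of the connector
            Implementation = "2.0",
            // Execution mode, corresponding to the modes visible in MDXTEST
            ExecutionMode = SapBusinessWarehouseExecutionMode.BasXmlGzip,
            // Maximum number of rows fetched per call - the parameter tested in this post
            BatchSize = 50000
        ]
    )
in
    Source
```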
My intuition for this parameter was as follows: a smaller batch size means more RFC calls and lower memory consumption; a bigger batch size means fewer RFC calls and higher memory consumption.
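To make that concrete with the numbers from this test (a rough back-of-the-envelope estimate, not something I measured separately): pulling roughly 2.4 million records at about 50 000 records per batch works out to around 48 RFC calls, while a few hundred records per batch means several thousand calls, each with its own round-trip overhead.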
With batch size 50 000
Each batch was around 50k records, which is just a coincidence; depending on the width of your dataset, this can vary.
Here is a short video (30 seconds) where you can see the load:
And here is a screenshot from transaction ST04:
In (1) we can see the memory consumption; in (3) (R=30) we can see that there have been 30 RFC calls (30 x 50 000 = 1 500 000 records at the moment of the screenshot).
With batch size 100
Each batch was around 600 records. Here is a short video (30 seconds) where you can see the load:
And here is a screenshot from transaction ST04:
This load was noticeably slower.
With batch size 1 000 000
With this batch size, the load ended with an out-of-memory (TSV_TNEW_PAGE_ALLOC_FAILED) error in BW.
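If you run into this, the corresponding short dump can be inspected in transaction ST22, which also shows how much memory the work process had allocated when it failed.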
You can see the video here:
This load was noticeably faster but did not finish successfully.
Conclusion
Start with a small batch size and increase it to get better performance; you will know it is too big when the load fails. Always experiment in a sandbox system, never in production!
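If you want to experiment, the BatchSize value in the options record (see the sketch above) can be edited directly in Power BI Desktop's Advanced Editor and the load re-run, which makes it easy to step the value up between tests.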
This parameter is a great addition to the connector!
In a future blog post, I will test the other parameters!
