After completing these steps, you will have created a new dataset using self-service data preparation. This new dataset will help to isolate invalid claims.
Click ‘More Actions’.

Select ‘Prepare Data’.

The self-service data preparation room opens.

The first record of the data is actually the column header.

Check ‘Use first row as header’.

Click ‘Continue’.

The application automatically creates a new sample with the updated metadata structure.

The dataset now has the proper header.
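
As a rough illustration of what ‘Use first row as header’ does, here is a minimal Python sketch. It is not the application's implementation; the sample data and the column names CLAIM_ID and DRUG_NAME are hypothetical.

```python
import csv
import io

# Hypothetical raw sample: the first record holds the column names.
raw = "CLAIM_ID,DRUG_NAME\n1001,Aspirin\n1002,Ibuprofen\n"

rows = list(csv.reader(io.StringIO(raw)))

# Without header promotion, every record (including the names) is data,
# and the columns only have generic names.
generic_columns = [f"C{i}" for i in range(len(rows[0]))]

# With header promotion, the first record names the columns and is
# removed from the data records.
header, records = rows[0], rows[1:]
table = [dict(zip(header, record)) for record in records]

for row in table:
    print(row)
```

After promotion, the sample contains two data records instead of three, and each value can be addressed by its column name.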

Click ‘Actions’.

Click ‘Enrich Preparation’.

The enrich preparation user interface opens.

Click ‘+’ to add a new data source to merge with.

Click ‘Browse’.

Select ‘HANA_HC’ as a ‘Connection’.

Click ‘PHARMA’.

Select ‘PHARMA_CLAIMS’.

Click ‘OK’.

The application is acquiring a sample of the newly selected dataset.

The newly selected dataset can now be joined to the existing dataset.

Drag and drop ‘PHARMA_CLAIMS’ onto the cell on the left-hand side of the main dataset.

Select ‘Left Join’.

Scroll down the list of output columns and uncheck ‘ORIG_PRODUCT’, ‘POTENCY’, ‘DOSAGE’, ‘ROUTE_ADMINISTERED’, ‘NOTES’.

Click ‘Apply’.

The application displays a preview of the merged data.

The merged data now shows a null value in the column ‘DRUG_NAME_0’ whenever a record from the claim data refers to a drug that is not in the list of supported drugs.
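
The behavior of the left join can be sketched in a few lines of Python. This is an illustration only, not how the application performs the merge. The sample records, the join key DRUG_NAME, and the helper `left_join` are assumptions made for the example; only the output column name ‘DRUG_NAME_0’ comes from the steps above.

```python
def left_join(left_rows, right_rows, key, right_suffix="_0"):
    """Keep every left row; attach the matching right-hand key or None."""
    # Index the right-hand rows by join key (assumes the keys are unique).
    lookup = {row[key]: row for row in right_rows}
    joined = []
    for row in left_rows:
        match = lookup.get(row[key])
        out = dict(row)
        # The right-hand key column gets a suffix to avoid a name clash.
        out[key + right_suffix] = match[key] if match else None
        joined.append(out)
    return joined


# Hypothetical claim records (left side of the join).
claims = [
    {"CLAIM_ID": 1, "DRUG_NAME": "Aspirin"},
    {"CLAIM_ID": 2, "DRUG_NAME": "Asprin"},  # misspelled drug name
    {"CLAIM_ID": 3, "DRUG_NAME": None},      # missing drug name
]
# Hypothetical reference list of supported drugs (right side).
supported_drugs = [{"DRUG_NAME": "Aspirin"}, {"DRUG_NAME": "Ibuprofen"}]

merged = left_join(claims, supported_drugs, "DRUG_NAME")
for row in merged:
    print(row)
```

Because the join keeps every claim, unmatched claims are not dropped. They appear with a null in the suffixed column, which is what makes them easy to isolate later.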

Click ‘Apply Enrichment’.

The main self-service data preparation room now shows the enriched dataset.

The enriched dataset now contains null values in the field ‘DRUG_NAME_0’ for the records in the claim dataset whose drug name does not exist in our reference.
There are several possible reasons for this: the drug name may be misspelled, the drug may not be covered by the insurance company, or the drug name in the claim may be null.
You can now use this enriched dataset to isolate the data quality issues and better understand the data.
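
To make the three possible causes concrete, the following Python sketch sorts drug names into those categories. It is an assumption-laden illustration, not part of the product: the function `classify_claim_name`, the sample names, and the use of fuzzy matching from the standard library's `difflib` to guess at misspellings are all hypothetical choices.

```python
import difflib


def classify_claim_name(claim_name, supported_names):
    """Guess why a claim's drug name did or did not match the reference."""
    if claim_name is None:
        return "missing name"
    if claim_name in supported_names:
        return "matched"
    # A close fuzzy match suggests a spelling mistake (default cutoff 0.6).
    if difflib.get_close_matches(claim_name, supported_names, n=1):
        return "possible misspelling"
    return "not in reference"


# Hypothetical reference list and claim drug names.
supported = ["Aspirin", "Ibuprofen"]
claim_names = ["Aspirin", "Asprin", "Xyzol", None]

labels = [classify_claim_name(name, supported) for name in claim_names]
for name, label in zip(claim_names, labels):
    print(name, "->", label)
```

A fuzzy match is only a hint. A name that resembles a supported drug may still be a different, unsupported drug, so the result would need a manual review.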
Click ‘Actions’.

Click ‘Add Columns’.

Type ‘ValidClaim’ for the ‘Column Name’.

Click ‘Expression’.

Type the following expression:

Click ‘OK’.

Click ‘Apply’.

A new column is now created.
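
The logic of such a derived column can be sketched in Python. This is not the expression syntax used by the application, and it does not reproduce the expression from this step. It assumes that ‘ValidClaim’ simply flags whether ‘DRUG_NAME_0’ is non-null; the helper `add_valid_claim` and the sample records are hypothetical.

```python
def add_valid_claim(rows, source_column="DRUG_NAME_0", new_column="ValidClaim"):
    """Return copies of the rows with a boolean validity flag added."""
    flagged = []
    for row in rows:
        out = dict(row)
        # Assumption: a claim is valid when the join found a reference drug.
        out[new_column] = row[source_column] is not None
        flagged.append(out)
    return flagged


# Hypothetical enriched records, as they might look after the join.
enriched = [
    {"CLAIM_ID": 1, "DRUG_NAME_0": "Aspirin"},
    {"CLAIM_ID": 2, "DRUG_NAME_0": None},
]

with_flag = add_valid_claim(enriched)
for row in with_flag:
    print(row)
```

Once the flag exists, the helper column it was derived from no longer carries extra information, which is why it can be removed afterwards.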

Select the column ‘DRUG_NAME_0’.

Click ‘Remove’.

The column ‘DRUG_NAME_0’ has been deleted.

Click ‘<’ to navigate back to the ‘Actions’ menu.

Click ‘Run Preparation’.

Type ‘PHARMA_CLAIMS_ENRICHED_’.

Click ‘Apply’.

Click ‘Data Intelligence Metadata Explorer’.

Select ‘Monitor’ and click ‘Monitor Tasks’.

The ‘Monitoring’ application displays all tasks. Click on the ‘Preparation’ tab to filter on preparation tasks.

Filter on your task (e.g. DRUG_). Wait for your task to complete.

The task is completed.

Click ‘Data Intelligence Metadata Explorer’, and click ‘Home’.

Click ‘Browse Connections’.

Click ‘DI_DATA_LAKE’.

Navigate to the ‘shared’ directory. Click ‘Pharma’.

Type ‘’ in the Filter field.

Click .

Click ‘More Actions’ on the newly created dataset named ‘PHARMA_CLAIMS_ENRICHED_’.

Select ‘View Fact Sheet’, then click ‘Overview’.

The fact sheet shows that the dataset is not profiled and not published.

Click the ‘Profiling’ icon.

Click ‘Yes’.

Wait for the profiling to finish (there will be two notifications, which you can check by clicking the notification icon). Then click ‘Refresh’.

The dataset is now profiled.

Click ‘Data Intelligence Metadata Explorer’ and click ‘Home’.

You have returned to the Metadata Explorer home page.

Well done! You’ve now used Metadata Explorer to create a new enriched dataset using self-service data preparation. This new dataset helps to isolate invalid claims.