
Kaggle


Henri-Francois Chadeisson

Director, Sales Engineering • MicroStrategy


This low-code approach helps Data Scientists send data from Kaggle to MicroStrategy, whether or not the dataset has been enriched. Kaggle is the world's largest community of data scientists and machine learners, with over 1,000,000 users in 194 countries.

Starting with the release of Strategy ONE (March 2024), dossiers are also known as dashboards.
Goal
Before mstrio, importing a dataset from Kaggle (or another open data / data science website) required downloading the target data as a CSV or Excel file and uploading it manually with Data Import.
Kaggle provides a web-based Python console to interact with Kaggle Datasets. This is where mstrio steps in! With very few lines of code, you can take an existing dataset and push it to Strategy at the click of a button. We do not need a Strategy Connector for Kaggle when we can have a Kaggle Connector for Strategy!
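To give a rough idea of what "very few lines of code" can mean, here is a minimal sketch that pushes one pandas DataFrame to Strategy as a cube. It is not the attached script. It assumes a recent mstrio-py release that provides the `Connection` and `SuperCube` classes, and the function names, the `cube_cls` parameter and all argument values are illustrative assumptions.

```python
def connect(base_url, username, password, project_name):
    """Open a session against the Strategy REST API (requires mstrio-py)."""
    # Imported lazily so this file can be loaded without mstrio-py installed.
    from mstrio.connection import Connection
    return Connection(base_url, username, password, project_name=project_name)


def push_dataframe(connection, data_frame, cube_name, cube_cls=None):
    """Create a cube named cube_name that holds data_frame as a single table.

    cube_cls defaults to mstrio's SuperCube. It is a parameter only so the
    function can be tried with a stand-in class and no server (an assumption
    of this sketch, not part of mstrio-py).
    """
    if cube_cls is None:
        from mstrio.project_objects.datasets import SuperCube as cube_cls
    cube = cube_cls(connection=connection, name=cube_name)
    # "replace" overwrites the table content on each upload.
    cube.add_table(name=cube_name, data_frame=data_frame, update_policy="replace")
    cube.create()
    return cube
```

In a Kaggle console this would typically be called as `push_dataframe(connect(...), pd.read_csv(path), "my_cube")`, with your own server URL, credentials and project name.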
How to use the connector
1. Open a Dataset on Kaggle

ka0PW0000001JRCYA2_0EM44000000R7MO.png
ka0PW0000001JRCYA2_0EM44000000R7MT.png

 
To create a Dossier from this dataset, you would normally download the CSV file and run a Data Import, then repeat both for every update. This is where mstrio helps streamline data transfers to Strategy.
2. Open a Python Console in Kaggle using the button shown here (ka0PW0000001JRCYA2_0EM44000000R7MY.png) and select the Script option. You should see the following Console.

ka0PW0000001JRCYA2_0EM44000000R7Md.png

 
Kaggle runs in a closed network by default. You need to enable internet in order to communicate with your Strategy Server.
3. From the Settings, enable Internet BETA (note: this might require phone verification)
4. From the Settings again, install the mstrio package as in the screenshot below. Enter the package name “mstrio-py” and click the small right arrow next to the package field

ka0PW0000001JRCYA2_0EM44000000R7Ms.png

Once installation is complete, you should see the following screen. We can now use mstrio.

ka0PW0000001JRCYA2_0EM44000000R7Mx.png

5. Click the “Restart” button as requested

ka0PW0000001JRCYA2_0EM44000000R7N2.png

The console will reappear and go through the Stop, Reinitialize, Start and Run states:

ka0PW0000001JRCYA2_0EM44000000R7N7.png

6. Copy all the code in the attached mstr-kaggle-python.py (also available here: https://github.com/hchadeisson/mstr-kaggle-python)
7. Delete the code that is currently in the console
8. Paste the code from mstr-kaggle-python.py in the console

ka0PW0000001JRCYA2_0EM44000000R7NC.png

We are almost done. Skim the code and its comments. Even if you are not a programmer, it should be straightforward to understand.
9. Configure your Strategy connectivity by adjusting the variables on lines 6 to 8 of the code

ka0PW0000001JRCYA2_0EM44000000R7NH.png

10. By default, the script executes step by step. To run it in full, select the entire code in the editor (click in the editor and press Ctrl/Cmd + A)

ka0PW0000001JRCYA2_0EM44000000R7NM.png

11. Run the script using the button shown here (ka0PW0000001JRCYA2_0EM44000000R7NR.png). It will grab all the datasets added to your Console and push them as individual cubes to your Strategy environment
12. If all goes well, the output should look like this, and your dataset should appear in the Strategy project you specified on your Server (by default, in the My Reports folder)

ka0PW0000001JRCYA2_0EM44000000R7Nb.png
ka0PW0000001JRCYA2_0EM44000000R7Ng.png

All ready to create a nice Dossier!

ka0PW0000001JRCYA2_0EM44000000R7Nl.png
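Steps 3 and 9 both depend on the Kaggle console being able to reach your Strategy Server. If the script fails to connect, a quick reachability check can help tell a network problem from a credentials problem. The sketch below uses only the Python standard library and assumes that your REST API base URL ends in `/api` and exposes the `status` endpoint. The function names are illustrative and not part of the attached script.

```python
import urllib.error
import urllib.request


def status_url(base_url):
    """Build the REST status endpoint URL from the API base URL."""
    return base_url.rstrip("/") + "/status"


def server_reachable(base_url, timeout=5.0):
    """Return True if the status endpoint answers with a 2xx response.

    Any network, HTTP or URL error is reported as False, which is also
    what you would see while Internet is still disabled in the Settings.
    """
    try:
        with urllib.request.urlopen(status_url(base_url), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        return False
```

For example, `server_reachable("https://your-server.example.com/MicroStrategyLibrary/api")` (a placeholder URL) should return True once Internet is enabled and the address is correct.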

12 steps! Seriously? Why all the trouble?
Agreed, this might seem like overkill for just one simple dataset. But people mainly use Kaggle for two things: 1/ loading many datasets at once in the console and 2/ running training algorithms and predictions, all in Python.
1/ loading many datasets at once in the console
Loading 20 datasets with the console is about as fast as loading one: a couple of minutes. With Data Import, it would take an hour.
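A loop over the console's input folder is what makes this scale. The sketch below is one possible way to write it, not a copy of the attached script. It assumes the datasets are CSV files under a folder such as `/kaggle/input`, a recent mstrio-py release with `Connection` and `SuperCube`, and hypothetical helper names.

```python
from pathlib import Path


def find_csv_files(input_root):
    """Return every CSV file below input_root, sorted for a stable order."""
    return sorted(Path(input_root).rglob("*.csv"))


def cube_name_for(csv_path, input_root):
    """Name a cube after the file's path relative to input_root.

    'titanic/train.csv' becomes 'titanic_train', so files that share a
    name in different datasets do not collide.
    """
    relative = Path(csv_path).relative_to(input_root).with_suffix("")
    return "_".join(relative.parts)


def push_all(input_root, base_url, username, password, project_name):
    """Push each CSV file below input_root as its own cube."""
    # Imported lazily so the helpers above work without these packages.
    import pandas as pd
    from mstrio.connection import Connection
    from mstrio.project_objects.datasets import SuperCube

    conn = Connection(base_url, username, password, project_name=project_name)
    try:
        for csv_path in find_csv_files(input_root):
            name = cube_name_for(csv_path, input_root)
            cube = SuperCube(connection=conn, name=name)
            cube.add_table(name=name, data_frame=pd.read_csv(csv_path),
                           update_policy="replace")
            cube.create()
    finally:
        conn.close()
```

Adding a twentieth dataset to the console then costs one more loop iteration rather than one more manual import.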
2/ running training algorithms and predictions, all in Python
Data Scientists get insights and make predictions with Python. Most of the time, their output is just a JSON, CSV or Excel file that they send via email or drop on a Big Data cluster that no one has access to.
These few lines of code let them push the result straight to Strategy while relying on Kaggle's computing power. This will ultimately help make these outputs popular and spread them to the world with Strategy.
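As an illustration of pushing a model's output rather than a raw dataset, the sketch below fits a toy least-squares line in pure Python and sends the predictions to a cube. The toy model stands in for whatever training code you actually run. The push again assumes a recent mstrio-py release with `Connection` and `SuperCube`, and all names and arguments are placeholders.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def prediction_rows(xs, slope, intercept):
    """Build one row per input value, ready to become a DataFrame."""
    return [{"x": x, "prediction": slope * x + intercept} for x in xs]


def push_predictions(rows, cube_name, base_url, username, password, project_name):
    """Upload the prediction rows as a single-table cube."""
    # Imported lazily so the functions above work without these packages.
    import pandas as pd
    from mstrio.connection import Connection
    from mstrio.project_objects.datasets import SuperCube

    conn = Connection(base_url, username, password, project_name=project_name)
    try:
        cube = SuperCube(connection=conn, name=cube_name)
        cube.add_table(name=cube_name, data_frame=pd.DataFrame(rows),
                       update_policy="replace")
        cube.create()
    finally:
        conn.close()
```

The result lands in Strategy as a cube that anyone with access to the project can build a Dossier on, instead of a file attached to an email.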



Details

Example

Published: October 22, 2018

Last Updated: March 21, 2024