When I first saw Jupyter Notebook at IBM Insight (at World of Watson) in 2016, I knew it was something I had to try. I have been excited about the possibility of making greater use of it. I’ve finally found the time to work with it a bit more and wanted to share the basics on how to install and use Jupyter Notebook with DB2.
What is Jupyter Notebook?
Jupyter Notebook is a Python based tool that is often used in data science. It is open source and web-based. The real power with Jupyter Notebook is that it allows you to combine cells of formatted text with cells of code that can be executed right inline. The data can then be easily displayed in powerful ways including graphs and charts with very little effort.
For the DB2 DBA, this can offer several advantages. For one thing, being familiar with a data science tool is never a bad thing for a data professional. For another thing, this format offers a powerful way to explain things in DB2 in a way that others can easily try things out or visualize the data. Just a few things I can see using Jupyter Notebook for:
- Team documentation
- Common troubleshooting procedures or cookbook
- Database health check
- Reorg/runstats script with details teaching what is being done and why
- Documentation of repetitive database task – report generation, etc
- Presentation or blog entry with SQL built in for an awesome takeaway
Installing Jupyter Notebook on Windows
This is not a one-step install, but it is also not impossible. I am listing the steps for Windows here because it is sometimes my first sandbox for trying new things and is a very accessible place (because it is the host OS for my laptop) for trying new things. I expect to add a blog entry with details for Linux in the future.
The steps I took using Windows 10:
- Download Anaconda
- Install Anaconda, accepting the defaults
- If you don’t have it already, install a DB2 client or server on the same machine
- Download and install visual C++ tools
- Open a DB2 command window (as administrator) and launch Jupyter Notebook:
- A browser window will open. Select New, and Python 3:
Congratulations, you have created your first Jupyter Notebook. Proceed to the next section to learn how to configure and use it with DB2.
Configuring Jupyter Notebook to Run with the SQL Magic
Enter the following in a cell, and then execute it (by clicking play at the top, or by hitting shift+enter). You will need to change the path to your SQLLIB directory for the DB2 installation, if it is different.
import sys,os,os.path os.environ['IBM_DB_HOME']='C:\Program Files\IBM\SQLLIB' !pip install ipython-sql !pip install ibm_db !pip install ibm_db_sa
At this point, you need to restart the Jupyter Notebook Kernel – the following steps will fail if you do not. This can be accomplished by clicking the little refresh like button at the top:
Once the Jupyter Notebook Kernel has been restarted, enter the following in a cell and execute it.
import ibm_db import ibm_db_sa import sqlalchemy %load_ext sql
You now have the basics you need to connect to any local or cataloged DB2 database.
Jupyter Notebooks are often shared using github. Check out my basic connection Jupyter Notebook. It has all the details above plus a connection statement, and can be used as a great starting point.
Summary
This blog entry doesn’t even touch on the power of Jupyter Notebook and why you should consider using it. That will come. Play with it and let me know what you think and what ideas you have.
Worked, I am able to connect database and run queries . You need to add %sql in front of query. I had to restart system after installing visual C++ tools to work
Thank you
Ember, you should correct this in the 1st cell: !pip install ibm_db
Should be: !pip install “ibm_db==2.0.8a”
Otherwise. errors described here occur ==>> https://github.com/ibmdb/python-ibmdb/issues/286
Cheers, Thom
I followed the above steps :-
Getting below mentioned error
Collecting ipython-sql
Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘ConnectTimeoutError(, ‘Connection to pypi.python.org timed out. (connect timeout=15)’)’: /simple/ipython-sql/
Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘ConnectTimeoutError(, ‘Connection to pypi.python.org timed out. (connect timeout=15)’)’: /simple/ipython-sql/
Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘ConnectTimeoutError(, ‘Connection to pypi.python.org timed out. (connect timeout=15)’)’: /simple/ipython-sql/
Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘ConnectTimeoutError(, ‘Connection to pypi.python.org timed out. (connect timeout=15)’)’: /simple/ipython-sql/
Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘ConnectTimeoutError(, ‘Connection to pypi.python.org timed out. (connect timeout=15)’)’: /simple/ipython-sql/
Could not find a version that satisfies the requirement ipython-sql (from versions: )
No matching distribution found for ipython-sql
Do you have an internet connection available for it to be able to download and install ipython-sql? It looks like it cannot install it for you.
Hi Ember,
great article. I am also trying to get it done (on Windows 10), but I get this on
!pip install ibm_db==2.0.5
(it’s 2.0.5 in dsdrivers in 11.1.3.3)
…
C:\Users\MYUSER\AppData\Local\Temp\pip-install-l0wpx_u0\ibm-db\setup.py:52: UserWarning: Detected usage of IBM Data Server Driver package. Ensure you have downloaded 64-bit package of IBM_Data_Server_Driver and retry the ibm_db module install
…
error: Microsoft Visual C++ 14.0 is required. Get it with “Microsoft Visual C++ Build Tools”: http://landinghub.visualstudio.com/visual-cpp-build-tools
I installed Microsoft Build tools 15.0 and rebootet.
Any idea?
Cheers,
Christian
Have you tried this? https://github.com/ibmdb/python-ibmdb/issues/303
There are some issues with ibm_db and 11.1.3.3
I have installed IBM_DB on Python in Windows. But, when I am trying to create a connection it’s failing with the following message:
import ibm_db as db
conn = db.connect(“DATABASE=MEMWMS2;HOSTNAME=MEMWMS2;PORT=446;PROTOCOL=TCPIP;UID=VERIDIAN;PWD=rsq3prfs;”, “”, “”)
Traceback (most recent call last): File “”, line 1, in SQLCODE=-1598M][CLI Driver] SQL1598N An attempt to connect to the database server failed because of a licensing problem. SQLSTATE=42968
I tried adding the license in the license folder of CLIDriver.
Are you trying to connect to Db2 on z/OS or Db2 on LUW?
https://www.idug.org/p/fo/et/thread=45720
Hi Ember,
I have installed the IBM Data Server Client on my Windows 10 Laptop, and have catalogued a remote database entry in it. Installed Anaconda, and imported the following : ibm_db, ibm_db_sa,sqlalchemy,%load_ext sql . While the connection to the remote db is successful using the ibm_db.connect() feature, I am unable to use SQL Magic to establish a db connection to my remote database and query objects .
%sql ibm_db_sa://userid:password@localhost:50013/remote db name
SQL30081N A communication error has been detected. Communication protocol being used: “TCP/IP”. Communication API being used: “SOCKETS”. Location where the error was detected: “127.0.0.1”. Communication function detecting the error: “connect”. Protocol specific error code(s): “10061”, “*”, “*”. SQLSTATE=08001\r SQLCODE=-30081′)
NOTE : Here, 50013 is the port number on the remote database server where my database is created locally.
https://www-01.ibm.com/support/docview.wss?uid=swg21164785
Have you set DB2COMM and SVCENAME to allow tcpip connections?
Hi Ember,
The issue was with using ‘localhost’ in the connection string – replacing it with the IP address of the remote server resolved the issue. As I’d mentioned earlier, I was attempting to connect to a remote database catalogued on my local DB2 client. When I changed localhost to IP address of the remote server, the connection was successful.
Here’s how the connect string looks like now :
%sql ibm_db_sa://userid:password@:
/remote db name
Hi Ember ,
I have followed the steps and it give me this error always
(ibm_db_dbi.ProgrammingError) ibm_db_dbi::ProgrammingError: [IBM][CLI Driver] SQL1042C An unexpected system error occurred. SQLSTATE=58004\r SQLCODE=-1042
(Background on this error at: http://sqlalche.me/e/f405)
Connection info needed in SQLAlchemy format, example:
postgresql://username:password@hostname/dbname
or an existing connection: dict_keys([])
SQL1042C is a serious Db2 system error. Have you verified that connecting directly on the database server works, and that there isn’t a serious issue on the database server?