The Python/Jupyter ecosystem: today's problem-solving environment for computational science
2021
The March/April issue of CiSE's inaugural year (1999) carried an essay by eminent computer science professor John R. Rice (who at the time was area editor for Software, together with Matlab inventor Cleve Moler) titled A Perspective on Computational Science in the 21st Century \cite{Rice_1999}. In it, he looked at the development directions for the future of computational science and engineering, and threaded across these was what he called "problem-solving environments." This routine-sounding term hides a vision that was ambitious for its time. Rice imagined a software system for tackling problems within a science domain without the agonizing toil of programming every solution method by hand. He and Ronald Boisvert had a previous article (1996) explaining the idea in more detail \cite{Rice_1996}. A problem-solving environment would include a collection of mathematical and domain-specific software libraries, offer (semi-)automatic selection of a solution method for a given problem, help check the problem formulation, display or assess the correctness of solutions, allow extensibility to add new methods, and even manage the overall computational process. They envisaged an environment that could be "all things to all people," meaning: it is effective when solving simple or complex problems, it supports rapid prototyping and detailed analysis, and it can be used both in introductory teaching and in productive research at the edges of knowledge. An ideal problem-solving environment would even make decisions for the user by means of an integrated knowledge base. What fabulous ambition!

Prof. Rice led a research group at Purdue University that worked to develop early problem-solving environments. The Ellpack system for solving elliptic boundary value problems, developed in the early 1980s, included dozens of software modules implementing solution methods and a descriptive language to formulate problems (today, we might call it a domain-specific language, DSL).
For example, the line:

    equation. uxx + uyy + 3 * ux - 4 * u = exp(x+y) * sin(pi * x)

would be used in an Ellpack program for defining the differential equation to be solved. Similarly expressive statements would define the boundary conditions and the grid parameters to discretize the domain (a full example is available at https://www.cs.purdue.edu/ellpack/example.html). Later versions of the system offered parallel solvers and a graphical user interface (screenshots of the Ellpack system from Prof. John R. Rice's website at Purdue can be found in the Internet Archive at https://web.archive.org/web/19990506040312/https://www.cs.purdue.edu/research/cse/index.html).

While Ellpack was licensed by Purdue University for a modest yearly fee, this project did not branch off commercially or otherwise. Perhaps a few hundred copies were distributed, mostly for use in university settings, and the project wound down by the early 2000s. By contrast, three commercial software packages for high-productivity scientific and engineering computation (Maple, Mathematica, and Matlab) had by then become very popular \cite{Chonacky_2005}. These systems continue to be widely used in education, industry, and government settings. Their purchase price and proprietary implementations, however, led many champions of open-source software to conceive alternatives, oftentimes closely imitating their functionality.

In March/April 2011, twelve years after Prof. Rice's Perspective essay, CiSE ran a special issue on Python for Scientific Computing, showcasing a maturing stack of tools and a highly productive environment for researchers. The issue included one of the most widely cited articles in the history of the magazine, discussing the high-level multidimensional array structure at the core of NumPy \cite{van_der_Walt_2011}.
By this time, the scientific community had expanded Python for its purposes, and the four keystone libraries had been put in place in the first half of the decade:

- SciPy was consolidated as a standard collection of modules for common mathematical and statistical functions.
- The first version of IPython, an enhanced interactive shell for Python, was created by Fernando Perez.
- Matplotlib, the rich 2D visualization and now standard Python plotting library, was released by John Hunter.
- Travis Oliphant created NumPy from a rewrite of the early Python array library Numeric, adding functionality from the competing array package called numarray.

CiSE had previously featured the developing Python support for scientific workflows in an issue organized by Paul F. Dubois, who was the project lead for Numeric from 1997 to 2002. Paul had been an editor for Computers in Physics (which was merged into CiSE) since 1993, and joined CiSE at its founding. He wrote and edited for the Scientific Programming department until 2006, and continued with the column "Cafe Dubois" until 2008. The issue he led included the Hunter piece on Matplotlib \cite{Hunter_2007}, the Perez and Granger article about IPython \cite{Perez_2007}, and Travis Oliphant's general overview of the Python language and its extensions with NumPy and SciPy \cite{Oliphant_2007}. Other articles in the issue highlighted applications in various science contexts: space observation, systems biology, robotics, nanophotonics, and more. An author team from the Simula Research Laboratory in Norway discussed new Python tooling for solving partial differential equations with finite element methods, in what was the early development of the Fenics project (http://www.fenics.org/) \cite{Mardal_2007}. This work heralded the compelling combination of symbolic mathematics and code generation, which Ellpack anticipated.