If you want to master, or even just use, data analysis, Python is the place to do it. Python is easy to learn, it has vast and deep support, and nearly every data science library and machine learning framework has a Python interface.
Over the past few months, several data science projects for Python have released new versions with major feature updates. Some are about actual number-crunching; others make it easier for Pythonistas to write fast code optimized for those jobs.
Python data science essential: SciPy 1.7
Python users who want a fast and powerful math library can use NumPy, but NumPy by itself isn’t very task-focused. SciPy uses NumPy to provide libraries for common math- and science-oriented programming tasks, from linear algebra to statistical work to signal processing.
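As a rough illustration of that task-focused layering, here is a minimal sketch that touches three of the areas named above: linear algebra, statistics, and signal processing. The matrix, the random sample, and the short signal are made-up illustrative data, not drawn from any real workload.

```python
import numpy as np
from scipy import linalg, signal, stats

# Linear algebra: solve the system A @ x = b for x.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)

# Statistics: summarize a synthetic, normally distributed sample.
rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=1000)
summary = stats.describe(sample)

# Signal processing: find the indices of local maxima in a short signal.
wave = np.array([0.0, 1.0, 0.0, 2.0, 0.0])
peaks, _ = signal.find_peaks(wave)

print("solution:", x)
print("sample mean:", summary.mean)
print("peak indices:", peaks)
```

In each case the inputs and outputs are ordinary NumPy arrays; SciPy supplies the domain-specific routines on top of them.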
How SciPy helps with data science
SciPy has long provided convenient, widely used tools for working with math and statistics. But for years it had no proper 1.0 release, even though it maintained strong backward compatibility across versions.
The trigger for bringing the SciPy project to version 1.0, according to core developer Ralf Gommers, was chiefly a consolidation of how the project was governed and managed. But it also included a process for continuous integration for the macOS and Windows builds, as well as proper support for prebuilt Windows binaries. This last feature means Windows users can now use SciPy without having to jump through additional hoops.
Since the SciPy 1.0 release in 2017, the project has delivered seven major point releases, with many improvements along the way:
- The end of Python 2.7 support (the SciPy 1.2 series was the last to support it), and a subsequent modernization of the code base.
- Constant improvements and updates to SciPy’s submodules, with more functionality, better documentation, and many new algorithms. One example is the new fast Fourier transform module, scipy.fft, which offers better performance and modernized interfaces.
- Better support for functions in LAPACK, a Fortran package for solving common linear equation problems.
- Better compatibility with the alternative Python runtime PyPy, which includes a JIT compiler for faster long-running code.
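To make the fast Fourier transform item in the list above more concrete, here is a minimal sketch using the scipy.fft module. The sampling rate and the 5 Hz test tone are arbitrary values chosen for illustration.

```python
import numpy as np
from scipy import fft

# Illustrative assumptions: a 100 Hz sampling rate and one second of data.
sample_rate = 100
t = np.arange(sample_rate) / sample_rate

# A pure 5 Hz sine wave as the test signal.
tone = np.sin(2 * np.pi * 5 * t)

# rfft computes the FFT of real-valued input and returns only the
# non-negative frequency bins; rfftfreq gives the matching frequencies in Hz.
spectrum = fft.rfft(tone)
freqs = fft.rfftfreq(tone.size, d=1 / sample_rate)

# The bin with the largest magnitude identifies the dominant frequency.
dominant = freqs[np.argmax(np.abs(spectrum))]
print("dominant frequency (Hz):", dominant)
```

Because the tone completes a whole number of cycles within the sampled window, its energy falls into a single frequency bin, so the dominant frequency is easy to read off.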